Website Crawler and XML Sitemap Generator


Your privacy is important to us. We will never share your information with third parties.
If you would like to prevent this free tool from crawling your website, please add the following lines to your robots.txt file:
User-agent: NinjaBot
Disallow: /
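
To check that the rule above has the intended effect before publishing it, the robots.txt text can be run through Python's standard `urllib.robotparser`. This is only a minimal sketch: the `is_allowed` helper, the `OtherBot` agent name, and the example.com URL are illustrative assumptions, and it says nothing about how the crawler itself reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The two lines from the text above, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: NinjaBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL and second agent name, used only for illustration.
print(is_allowed(ROBOTS_TXT, "NinjaBot", "https://example.com/page"))
print(is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/page"))
```

With only this rule in place, the first check reports that NinjaBot is blocked and the second that other crawlers are unaffected.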

See how to use Website Crawler and XML Sitemap Generator

The Ninja Website Crawler Tool, which also works as a Google sitemap generator, crawls your whole website, helps you spell check your pages, creates your Google sitemap, and points out errors on your pages (e.g. broken links, 404 pages, etc.).

Check All of the Links on a Website

About the Tool

Links are critical to webpages, both for informing the reader and for following SEO best practices. The Find Broken Links, Redirects & Site Crawl Tool lets you review the status of both external and internal links on a website. The report it generates provides insight into the link structure of the site and identifies link redirects and errors, all of which helps in planning a link optimization strategy. The tool also offers downloadable results and a sitemap feature, and it is free for anyone to use.

Getting Started: How it Works

  • Type in or copy/paste the website’s homepage URL into the specified text box.
  • Select the range of URLs from the input website that you wish to be scanned.
  • Select ‘Check

Keep in mind that this tool generates its results in real time and may take up to 30 minutes for data collection and processing, depending on the number of URLs being reviewed. For your convenience, the tool can also email you the results once the crawl is complete.

A Note to Tool Users: Due to the resources this tool uses, there is a limit of 5 tool runs per day, per user. To continue benefiting from this tool, please bookmark this page and come back after 24 hours.

Tool Results Bar

The data in the results tables is interactive; most of it is linked, either to the URLs referenced or to details about the data. For most cells that contain non-URL data, position the cursor over the cell to see the full results. While the tool is running, a results bar appears, displaying the following information:

  • Status of the tool (Crawling or Done)
  • Number of Internal URLs crawled
  • Number of External links found
  • Number of Internal HTTP Redirects found
  • Number of External HTTP Redirects found
  • Number of Internal HTTP error codes found
  • Number of External HTTP error codes found

Once the tool has finished crawling the input website, an XML sitemap and downloadable results become available. The results can be downloaded as an Excel spreadsheet or an HTML file, using the correspondingly titled buttons at the top of the page.
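
For readers curious what such an XML sitemap contains, here is a minimal sketch that builds one from a list of crawled URLs using the sitemaps.org `urlset` format. The `build_sitemap` function, the choice to include only pages that returned 200, and the example URLs are assumptions made for illustration; the tool's own selection rules are not documented here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(crawled):
    """Build sitemap XML from (url, http_status) pairs.

    Assumption: only pages that returned 200 belong in the sitemap.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, status in crawled:
        if status != 200:
            continue
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawl results.
pages = [
    ("https://example.com/", 200),
    ("https://example.com/about", 200),
    ("https://example.com/missing", 404),
]
print(build_sitemap(pages))
```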

The tool outputs six tables providing detailed information regarding the following areas, each of which can be downloaded as XLS and HTML files:

  1. Internal links
  2. External links
  3. Internal errors (a subset of Internal Links)
  4. Internal redirects (another subset of Internal Links)
  5. External errors (a subset of External Links)
  6. External redirects (another subset of External Links)
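
The subset relationship between these six tables can be sketched as a few filters over one list of crawled links. This is an illustration only: the `LinkRecord` type and `build_tables` function are hypothetical names, and the status ranges follow the table descriptions on this page (4xx and 5xx for errors, 301 and 302 for redirects).

```python
from dataclasses import dataclass


@dataclass
class LinkRecord:
    url: str
    status: int      # HTTP status code returned for the URL
    internal: bool   # True if the URL belongs to the crawled site


def is_error(status: int) -> bool:
    # 4xx and 5xx level codes count as errors.
    return 400 <= status <= 599


def is_redirect(status: int) -> bool:
    # Only 301 and 302 are listed as redirects in the table descriptions.
    return status in (301, 302)


def build_tables(records):
    """Split crawl records into the six tables; four are subsets of the first two."""
    internal = [r for r in records if r.internal]
    external = [r for r in records if not r.internal]
    return {
        "internal_links": internal,
        "external_links": external,
        "internal_errors": [r for r in internal if is_error(r.status)],
        "internal_redirects": [r for r in internal if is_redirect(r.status)],
        "external_errors": [r for r in external if is_error(r.status)],
        "external_redirects": [r for r in external if is_redirect(r.status)],
    }


# Hypothetical crawl records.
records = [
    LinkRecord("https://example.com/", 200, True),
    LinkRecord("https://example.com/old", 301, True),
    LinkRecord("https://example.com/gone", 404, True),
    LinkRecord("https://other.example.org/", 302, False),
    LinkRecord("https://other.example.org/down", 500, False),
]
tables = build_tables(records)
for name, rows in tables.items():
    print(name, len(rows))
```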

The following is a breakdown of each of the different tables included.

Internal Links Table

Includes the following fields of information:

  • URLs crawled on the site
  • URL’s level from the domain root
  • URL’s returned HTTP status code
  • Number of internal links pointing to the URL from within the site (click to see the list of URLs)
  • Link text used for the URL
  • Number of internal links on the page (click to see the list of URLs)
  • Number of external links on the page (click to see the list of URLs)
  • Size of the page in kilobytes (click to see page load speed test results for this URL from Google)
  • The title tag text from the URL’s page
  • The description tag text from the URL’s page
  • The keywords tag text from the URL’s page
  • Contents, if used, of the anchor tag’s “rel=” attribute

External Links Table

Includes the following data fields:

  • URL’s returned HTTP status code
  • Number of times that URL is linked to from within the site (click to see the list of affected URLs)
  • External URL used in the link
  • Link text used for the URL
  • Internal page URL on which the link was first found

Internal HTTP Code Errors Table

The Internal errors table gathers all of the pages returning HTTP code errors (4xx and 5xx level codes) in one place to help organize the effort to resolve the problems. It includes the following data fields:

  • URL’s returned HTTP status code
  • Number of times that URL is linked to from within the site (click to see the list of affected URLs)
  • Internal URL used in the link
  • Link text used for the URL
  • Internal page URL on which the link was first found

The Internal Errors table is a subset of the Internal Links table showing just those pages returning HTTP status code errors.

Internal HTTP Redirects Table

This table combines all of the pages returning HTTP redirects in one list, so you can easily review them. You should not have to rely on redirects internally; instead, you can fix the source code containing the redirected link. This table contains the following data fields:

  • URL’s returned HTTP status code
  • Number of times that URL is linked to from within the site (click to see the list of affected URLs)
  • Internal URL used in the link
  • Link text used for the URL
  • Redirect’s target URL
  • Internal page URL on which the link was first found

The Internal Redirects table is a subset of the Internal Links table, showing just those pages returning 301 and 302 HTTP status code redirects.

External HTTP Code Errors Table

The External errors table gathers all of the pages returning HTTP code errors (4xx and 5xx level codes) in one place to help organize the effort to resolve the problems. It includes the following data fields:

  • URL’s returned HTTP status code
  • Number of times that URL is linked to from within the site (click to see the list of affected URLs)
  • External URL used in the link
  • Link text used for the URL
  • Redirect’s target URL
  • Internal page URL on which the link was first found

The External Errors table is a subset of the External Links table showing just those pages returning HTTP status code errors.

External HTTP Redirects Table

The External Redirects table combines all of the pages returning HTTP redirects in one list so you can easily review them. Because the redirect to the targeted page does not affect your own page, fixing these URLs is a lower priority. This table contains the following data fields:

  • URL’s returned HTTP status code
  • Number of times that URL is linked to from within the site (click to see the list of affected URLs)
  • External URL used in the link
  • Link text used for the URL
  • Redirect’s target URL
  • Internal page URL on which the link was first found

The External Redirects table is a subset of the External Links table showing just those pages returning 301 and 302 HTTP status code redirects.

(Idea and specs by Jim Boykin)


User comments:

anns
2014-01-06
Please try now!
Anonymous
2013-12-26
Export to HTML and XLS only showed header rows and none of the data.
2013-08-16
I can't believe this tool is free!