Possible reasons why your site is not indexed

If indexing of your site fails, the robot will attempt to display a message describing the probable cause. To help our robot index your site without problems, please review the list of common errors below that may prevent indexing.

1. Check your robots.txt file

An incorrectly formatted robots.txt file can block your site from all crawlers, including search engine spiders and our robot.

A possible error is the presence of the following instructions in your robots.txt:

User-agent: *
Disallow: /

If you intend to keep search engines from indexing the site but still want the MySitemapGenerator robot to crawl it, uncheck "Follow the rules set in the robots.txt file".
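To check how a given set of robots.txt rules is interpreted before starting a crawl, a minimal sketch using Python's standard urllib.robotparser module may help. The user agent name "ExampleBot" and the helper is_allowed are hypothetical and used only for illustration; this does not claim to reproduce how the MySitemapGenerator robot itself evaluates the file.

```python
from urllib.robotparser import RobotFileParser

# The rules shown above: every crawler is barred from every path.
BLOCKING_RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a made-up crawler name for this illustration.
    print(is_allowed(BLOCKING_RULES, "ExampleBot", "http://website.tld/page"))
```

If the function reports that a page you expect to be crawled is not allowed, the robots.txt rules are the likely cause.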

2. Make sure that your pages return the status "HTTP 200 OK"

"HTTP 200 OK" means that the request for a resource was successful. Any other response from your server is treated as an error.

Our robot also supports handling of server redirects 301, 302, 303 and 307.
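The status rules above can be summarized in a small sketch. The function name classify_status and the three labels are assumptions made for this illustration; only the codes themselves (200 and the redirects 301, 302, 303, 307) come from the text, so any other code, including redirect codes not listed above, is treated here as an error.

```python
from http import HTTPStatus

# Redirect codes that the text above lists as supported by the robot.
FOLLOWED_REDIRECTS = {301, 302, 303, 307}


def classify_status(code: int) -> str:
    """Classify an HTTP status code as 'ok', 'redirect' or 'error'."""
    if code == HTTPStatus.OK:
        return "ok"
    if code in FOLLOWED_REDIRECTS:
        return "redirect"
    # Everything else is assumed to count as an error for indexing.
    return "error"


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(code, classify_status(code))
```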

3. Check the "Content-Type" header returned by the pages of your site

MySitemapGenerator searches for URLs only on pages that contain HTML code. Such pages must therefore return the "Content-Type" header with the value "text/html".

An example of a correct response header for an HTML page:

Content-Type: text/html; charset=utf-8
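A header value like the one above can be checked with a short sketch. It uses Python's standard email.message module to split the media type from its parameters; the helper name is_html is hypothetical, and the comparison is case-insensitive on the assumption that "Text/HTML" and "text/html" are equivalent.

```python
from email.message import Message


def is_html(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes an HTML page."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_type() drops parameters such as charset and lowercases
    # the media type, so "Text/HTML; charset=utf-8" becomes "text/html".
    return msg.get_content_type() == "text/html"


if __name__ == "__main__":
    print(is_html("text/html; charset=utf-8"))
    print(is_html("application/pdf"))
```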

4. Page size and load time

The robot does not limit the size of scanned pages, but every page of your site must be generated within 30 seconds. Otherwise, the URL is marked as unavailable.

5. Important: the robot takes into account only local links within the specified domain

The same domain with and without the www prefix is considered a mirror. Any other subdomains or URLs outside the domain are not counted.

For example, if you request indexing of http://website.tld, links with an absolute URL such as http://www.website.tld/page are also taken into account. Likewise, if you specify the site URL as http://www.website.tld, links of the form http://website.tld/page are taken into account. Links such as http://subdomain.website.tld, however, are not counted as local.
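The www mirror rule described above could be sketched as follows. The helper names strip_www and is_local_link are hypothetical, and resolving relative links against the site URL is an assumption of this sketch; the robot's actual matching logic is not documented here and may differ in detail.

```python
from urllib.parse import urljoin, urlsplit


def strip_www(host: str) -> str:
    """Lowercase a host name and drop a leading 'www.' prefix."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def is_local_link(site_url: str, link_url: str) -> bool:
    """Return True if link_url is on the same domain as site_url.

    The www and non-www variants count as the same site; any other
    subdomain or foreign domain does not.
    """
    # Resolve relative links such as "/page" against the site URL first.
    resolved = urljoin(site_url, link_url)
    site_host = urlsplit(site_url).hostname or ""
    link_host = urlsplit(resolved).hostname or ""
    return strip_www(site_host) == strip_www(link_host)


if __name__ == "__main__":
    site = "http://website.tld"
    for link in ("http://www.website.tld/page",
                 "http://subdomain.website.tld"):
        print(link, is_local_link(site, link))
```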

6. For websites running on a CMS with built-in access restriction systems

Please note that during indexing the robot sends a large number of requests to your website. Some CMSs, depending on their settings, may block our crawler's requests for security reasons or to limit the load on the web server. We recommend that you disable such protection while your site is being indexed.