Is it possible for Google or any other crawler to crawl and index a page that returns a 301 status code?
I have seen a page in Google's index that has returned a 301 for months. However, the cache date of that page in the index is from a few days ago.
Can Google just ignore the 301 and crawl the contents of the page?
Google always crawls the target of a redirect, and HTTP 301 is no exception. I could not find a better source than one employee’s discussion post, though. The Google Search Appliance documentation says the same, and I don’t see why GSA and Googlebot should handle redirects differently.
Normally Google crawls the page that’s redirected to. Two possible explanations for the site you saw:
- The site just displayed a 301 message in the page body instead of sending a proper HTTP 301 status and Location header.
- The site redirected to another 301, which redirected to another 301, …
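Both explanations can be checked by requesting the URL without following redirects automatically and recording the status code at each hop. Below is a minimal sketch using only the Python standard library. The names `fetch_status` and `trace_redirects`, the `redirect-check/0.1` user agent, and the hop limit of 10 are made up for illustration and are not anything Google publishes.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_status(url, user_agent="redirect-check/0.1"):
    """Send one GET without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_status, max_hops=10):
    """Follow redirects by hand and return a list of (url, status) pairs.

    Stops at the first non-redirect response, at a redirect loop,
    or after max_hops requests (an arbitrary limit for this sketch).
    """
    chain = []
    seen = set()
    for _ in range(max_hops):
        if url in seen:
            break  # redirect loop: this URL was already requested
        seen.add(url)
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL
        url = urljoin(url, location)
    return chain
```

If the chain contains a single entry with status 200, the "301" is only text in the page body. If it contains a long run of 301 entries, the site is chaining redirects.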
Google visits URLs forever irrespective of what response code you return. They do this just in case a URL ever comes back to life with real content.
The 301 is the best response. Google will drop those URLs from the SERPs eventually. Don’t force a quicker drop unless you want fewer visitors to your site for the next three to six months.
According to Matt Cutts, the head of the webspam team, people have used 301s to abuse rankings by forwarding a bunch of domains to a new one, so Google has improved how it handles 301 pages. Let us say you moved to a new domain and 301-redirected all of your pages from the old domain to the respective pages on the new domain. In this case, Google will eventually phase the old domain out of the index and bring the new one in.
What you describe is rare, and if you are worried about it you can let Google know via the Google Webmaster Forums. They are pretty quick at things like this once it gets someone’s attention. One possible reason, however, is that the page occasionally removes the 301 and then puts it back. Or it could be that the 301 is not shown to Googlebot.
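One rough way to test the last possibility is to request the same URL with a browser-like user agent and with Googlebot's user agent string, then compare the status codes. The sketch below assumes the site decides by the User-Agent header alone. A site that checks the requesting IP address will not be fooled by this, so matching results prove nothing. The function names and the browser string are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit

USER_AGENTS = {
    # Hypothetical browser-like string, used only for this sketch
    "browser": "Mozilla/5.0 (X11; Linux x86_64) redirect-check/0.1",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def status_for(url, user_agent):
    """Return the HTTP status of one GET request, without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        return conn.getresponse().status
    finally:
        conn.close()


def compare_user_agents(url, fetch=status_for, agents=USER_AGENTS):
    """Map each user agent label to the status code the server returned for it."""
    return {name: fetch(url, ua) for name, ua in agents.items()}


def responses_differ(results):
    """True if at least two user agents received different status codes."""
    return len(set(results.values())) > 1
```

A result such as `{"browser": 301, "googlebot": 200}` would suggest the redirect is hidden from crawlers that identify as Googlebot.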
You can use Google Webmaster Tools:
There is a robots-analysis tool where you can test your domain’s URLs and see for yourself whether a 301-redirected page is being crawled or not 😉