301 Redirects and 404 Pages


Many webmasters wonder how to properly handle missing pages or obsolete content on their websites. There are two common ways to handle missing and invalid pages. One is to redirect the request, and the other is to present an error message indicating that the requested page is missing.

While reading the methods mentioned here, keep in mind that redirects or error handling may take place inside the web application. In many cases this is preferable to hard-coded rules inside server scripts, which aren't always portable.

With 301 redirects there is a good chance of retaining the traffic from old pages, whereas 404 error pages discard the old pages' web traffic altogether. The question is which method is better. Let's see in detail what's really going on, because many webmasters have the wrong impression about redirects.

Upon a request for a missing page, the server will generate an error response. If the web application is aware of the missing page for a given request, it can send the visitor to a related page. A redirect delegates the request by instructing the browser or search engine to visit another page. With browsers this technique is usually transparent to users, and this transparency is the major advantage of redirects, as we will see in this document.

Let's begin by analyzing the possible replies a server may send for a missing web page.

1. 301 Permanent Redirect
A 301 response, sent via the HTTP headers, instructs the client to repeat the current request at another URL. It also informs the client that the move is permanent, implying that links pointing to the old page should now be updated to point to the new one. For search engines, the 301 redirect invalidates the old links and shifts indexing and ranking to the destination page once they process the redirect.

Page indexing and ranking for 301 redirects are handled by search engines, while the redirection itself can be automatic and generates the page transition on the fly for regular visitors.
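To make the shape of such a response concrete, here is a minimal sketch in Python. The function name `permanent_redirect` and the target path are illustrative assumptions, not part of any product mentioned in this article. It shows that a 301 response needs little more than a status line and a `Location` header.

```python
from http import HTTPStatus
from typing import List, Tuple


def permanent_redirect(location: str) -> Tuple[str, List[Tuple[str, str]], bytes]:
    """Build a minimal 301 response: status line, headers, and an empty body."""
    code = HTTPStatus.MOVED_PERMANENTLY
    status = f"{code.value} {code.phrase}"
    # The Location header carries the destination; no page content is needed.
    headers = [("Location", location), ("Content-Length", "0")]
    return status, headers, b""


status, headers, body = permanent_redirect("/new-page.html")
print(status)
print(headers)
```

Because the body is empty, a client that never follows the redirect costs the server only a few header bytes.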

There is one problem with a 301 redirect, and that is the selection of the destination page. Redirecting everything to a single page or to a few standard pages may affect indexing and ranking of the destination pages. After all, the objective is to make use of the web traffic. For now, let's summarize the advantages of the 301 redirect response.

  • A 301 permanent redirect within the same domain is transparent to browsers
  • Web traffic of old pages is channeled to new or selected web pages
  • Search engine indexing and ranking are updated automatically
  • No maintenance is necessary for the web content
  • Automated requests are served with only a few bytes of content, so little bandwidth is wasted if the response location is never followed

Before going through possible implementations for handling 301 redirects, let's examine the 404 response traditionally used for missing pages.

2. 404 Page Not Found

A 404 response, sent via the HTTP headers, tells the client that the server found nothing at the requested URL. It does not say whether the page was removed, is temporarily unavailable, or never existed. By default, servers have their own handlers for error pages, with some errors funneled through a single error handler. A webmaster can always override the default behavior of an error page and set up custom page content.

When search engines access invalid or missing pages and see a 404 response, they update their index by removing the page from the search results. Visitors, however, only see the page content the server sends back. If a webmaster sets up custom 404 content, visitors may be able to navigate to other areas of the website, unaware that the page is missing, and therefore may never notice or update their bookmarks.

404 responses have several drawbacks.

  • When the 404 page presents the regular website layout, bookmarks stored in the visitor's browser may never be updated, because visitors may not realize the page is missing - they can still navigate to other pages. This applies to custom error pages, which some wrongly believe are the proper solution.
  • Default 404 server pages offer no navigation to other pages, which discourages human visitors because the layout looks totally different from the site's layout. Expect a high bounce rate, as visitors will click the back button in their browsers.
  • Search engines remove old pages from their index, so the traffic of the old pages is lost.
  • Complicated maintenance procedures are required to locate and rectify links quickly within the web content, as any link mismatch automatically generates an error page.
  • Custom 404 pages waste bandwidth when automated requests are made, because the application may have to output content the same way as with a regular 200 OK response. It is worth noting that some popular CMS packages hook the default error handler and load the whole framework to process a request that leads to a 404 error page, wasting resources. For instance, if the site has no favicon, the code will render the whole site's theme for the resulting 404 error.
  • When a page and its associated resources are removed - but still exist in the cache of a spider - a single page rendering may trigger dozens of customized 404 pages.
  • 404 and other 4xx HTTP status codes imply the site doesn't know what to do with the request and show that very little effort has gone into error management.

Therefore a 404 response should be avoided in this context. Above all, it leaves no way for the old traffic to be used or maintained. Worse, a 404 status does not say whether the page is gone for good or only temporarily unavailable. A 410 (Gone) response could replace the 404 when the removal is permanent, but in any case it's an ugly experience for your visitors to see errors.

301 Redirects are the way to go
Given the advantages of 301 redirects summarized above, you may wonder how to solve the problem of choosing the proper destination page when a request for a missing page is redirected. The answer lies within the framework deployed for a website. This is the fundamental element missing from articles elsewhere that claim 301 redirects may hurt ranking and indexing of destination pages, which is false when the destination is chosen properly.

For websites with a framework that generates content dynamically, like osCommerce, setting up 301 redirects that point to the right place is easy and efficient. First of all, there are SEO components, like SEO-G, specifically designed to handle a 301 redirect by analyzing the failed page request and targeting the most appropriate page. This SEO component controls the redirection based on the URLs stored in the database.

When a request is served and a missing page is detected, the module is invoked and retrieves the request query from the server parameters. The query part of the link is then normalized and compared against the stored URL records by a search algorithm that analyzes the link keywords, which increases the chances of finding a matching or equivalent page before the final redirect is issued.

The search process returns a list of alternative links to guide the SEO-G redirection code, and the best URL is selected from that list. Finally a permanent 301 redirect is issued, guiding both visitors and search engines to the most relevant page.
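The matching step can be illustrated with a rough Python sketch. This is not SEO-G's actual algorithm, which this article does not describe in detail. The function names, the stop-word list, and the simple keyword-overlap score are all assumptions chosen for illustration.

```python
import re
from typing import List, Optional, Set
from urllib.parse import urlsplit

# Hypothetical stop words: file extensions carry no meaning for matching.
STOP_WORDS = {"html", "htm", "php", "www"}


def keywords(url: str) -> Set[str]:
    """Split the path and query of a URL into lowercase keywords."""
    parts = urlsplit(url)
    text = f"{parts.path} {parts.query}".lower()
    words = re.split(r"[^a-z0-9]+", text)
    return {w for w in words if len(w) > 2 and w not in STOP_WORDS}


def best_match(requested: str, stored_urls: List[str]) -> Optional[str]:
    """Return the stored URL sharing the most keywords with the request."""
    wanted = keywords(requested)
    scored = [(len(wanted & keywords(u)), u) for u in stored_urls]
    # Sort by score (descending), then by URL, so ties always resolve
    # to the same target for the same request.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    if not scored or scored[0][0] == 0:
        return None  # nothing relevant found; the caller decides what to do
    return scored[0][1]


stored = ["/blue-widgets.html", "/red-widgets.html", "/contact-us.html"]
print(best_match("/old/blue-widget-sale.html", stored))
```

A production search would likely also handle word stems and weighting; the point here is only that the destination is derived from the request rather than fixed in advance.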

The most recent versions of I-Metrics CMS handle error requests more efficiently. Not only do they break the request down into relevant keywords and retrieve the most relevant page before forming the final link, but they also explicitly handle known URL extensions. In other words, side resources like scripts and images are not redirected.
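A check of this kind could look like the following Python sketch. The extension list and the function name `should_redirect` are hypothetical; the article does not list which extensions I-Metrics CMS recognizes.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical list; adjust to the resource types the site actually serves.
STATIC_EXTENSIONS = {".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico"}


def should_redirect(request_url: str) -> bool:
    """Return False for side resources, which are not worth redirecting."""
    path = urlsplit(request_url).path
    suffix = PurePosixPath(path).suffix.lower()
    return suffix not in STATIC_EXTENSIONS


print(should_redirect("/favicon.ico"))
print(should_redirect("/old-product.html?id=5"))
```

Filtering on the extension first means a missing favicon or image never triggers the keyword search or a full page render.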

We also mentioned that there is no need for web content maintenance using a permanent 301 redirect. Here is the reason.

With older methods of 404 error handling and missing page detection, the webmaster had to go through the server error log or an equivalent reporting program, run it for each page of the website, and then update each web page separately to rectify the broken links. Even if a bot generates the report of missing pages automatically, the webmaster still needs to edit the web content.

In contrast, with advanced web engines like osCommerce and the I-Metrics Layer extensions, this can be done with a single click from the administration panel. You may now wonder how this is possible.

The SEO-G dynamic-to-static link management system employs a web content parser for the database elements used to construct the page content. A fully automated process can be selected for content updates, targeting specific HTML elements like anchor links. The recorded links appearing throughout each page of the website can therefore be updated, because valid links are prerecorded for each web page. A 301 redirect can be sent in response to an error request without side effects, as the search process will find the most appropriate page.
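As a rough illustration of updating anchor links in stored content, the Python sketch below rewrites `href` values from a map of old to new URLs. It is an assumption-laden stand-in, not the SEO-G parser: the function name and the redirect map are hypothetical, and the regular expression only handles double-quoted `href` attributes.

```python
import re
from typing import Dict

# Matches the href value of <a> tags only; other tags such as <img> are ignored.
ANCHOR_HREF = re.compile(r'(<a\b[^>]*?\bhref=")([^"]*)(")', re.IGNORECASE)


def update_anchor_links(html: str, redirects: Dict[str, str]) -> str:
    """Replace anchor href values that appear in the redirect map."""

    def swap(match: "re.Match[str]") -> str:
        old = match.group(2)
        # Links missing from the map (for example external links) stay as they are.
        return match.group(1) + redirects.get(old, old) + match.group(3)

    return ANCHOR_HREF.sub(swap, html)


content = '<a href="/old.html">Old</a> <a href="http://example.com/">Ext</a>'
print(update_anchor_links(content, {"/old.html": "/new.html"}))
```

Running such an update over the stored content removes the internal links that would otherwise keep triggering redirects.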

This search and update method is efficient because it only targets internal links. External links remain as previously set: they point outside the website, so no 301 redirect is needed for them. It is also worth noting that 301 redirects should be implemented consistently. Once a specific request has been redirected, the same target URL should be used every time. Do not send the visitor to a random page.
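One simple way to keep redirects consistent is to remember the target chosen for each request. The Python sketch below assumes a hypothetical `find_target` search function and keeps the mapping in memory; a real site would more likely store it in the database.

```python
from typing import Callable, Dict


def make_consistent_resolver(find_target: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a search function so each request path always maps to one target."""
    resolved: Dict[str, str] = {}

    def resolve(request_path: str) -> str:
        # Search only the first time; afterwards reuse the recorded target.
        if request_path not in resolved:
            resolved[request_path] = find_target(request_path)
        return resolved[request_path]

    return resolve


resolve = make_consistent_resolver(lambda path: "/catalog" + path)
print(resolve("/old.html"))
print(resolve("/old.html"))
```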

Are 404s helpful for anything?

Yes. 404s, and 4xx HTTP status codes in general, can be used to prevent indexing of existing pages, and they can be very effective and independent of the known spider indexing methods. At present, to tell spiders which pages should not be indexed, webmasters may add the nofollow attribute to a link, use noindex/nofollow in the robots meta tag, set up robots.txt to restrict access to certain pages or folders, or rely on other spider identification methods.

Unfortunately, these methods do not apply to all spiders, and they have various side effects that complicate matters. For example, serving different content to a spider than to a human may be seen as cloaking. Maintaining nofollow attributes or robots meta tags can be tricky and requires ongoing maintenance with questionable results. This is where a 404 or 410 header can be very helpful. For instance, the shopping cart page and the account login page have no SEO value. Setting up these pages to return a 404 header instead of the default 200 OK discourages spiders from indexing them. Everything else about the content of these pages can stay the same, so normal visitors see no difference: the status code is not shown to them, and browsers render the page as normal.
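The idea of sending a 404 status with unchanged content can be sketched as a small WSGI application in Python. The paths in `NO_INDEX_PATHS` and the placeholder page body are assumptions for illustration; a real site would render its usual cart or login page here.

```python
# Hypothetical paths; replace with the site's own cart and login URLs.
NO_INDEX_PATHS = {"/shopping_cart.php", "/login.php"}


def status_for(path: str) -> str:
    """Pick the status line: 404 for pages that should stay out of indexes."""
    return "404 Not Found" if path in NO_INDEX_PATHS else "200 OK"


def app(environ, start_response):
    """WSGI app that serves the same kind of body whatever the status line is."""
    path = environ.get("PATH_INFO", "/")
    body = f"<html><body>Content for {path}</body></html>".encode("utf-8")
    start_response(status_for(path), [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the app can be served with the standard library, for example `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`.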

Links to those internal pages, which have no SEO value and are considered thin content, may be exposed via external JavaScript files. The click event can be hooked transparently on virtually any HTML tag, so spiders cannot see the links, because the HTML source contains no hard link and not even a hint of one. The additional 4xx header is a precaution, so that even if one of these web pages is accidentally crawled it will not become part of the spider's index.

If you are interested in learning more about how SEO-G redirects to the proper page, visit our SEO-G redirects page. Hopefully this document will help you make the right decision the next time you need to handle missing pages.

I-Metrics Layer by Asymmetric Software
E-Commerce Engine Copyright © 2003 osCommerce (MS2.2)
Copyright © 2003-2012 Asymmetric Software - All rights reserved.
 
 