All major search engines, including Google, use spider programs to crawl through indexed websites at regular intervals, gathering information about them in order to provide fast, accurate and up-to-date search results. To ensure your site is crawled effectively (and is therefore known to the search engines), you should understand how search engine spiders work, what they expect to find and what will cause them problems.
The most important factor when considering search engine spiders is to ensure that they have plenty of basic links to follow when they crawl your site. Search engine spiders will begin by indexing the text on your homepage and will then attempt to follow all available links to other pages on your website. It is important to realise that website elements such as frames, dynamic URLs, Flash movies and image-based homepages all hinder the work of spiders and can result in your website not being indexed, and therefore not showing up in search engine results pages. If you decide to include these features nonetheless, it is vital that you generate an XML sitemap that will tell a spider where dynamic content can be found and guide the spider through your website. Even if your site does not include such features, placing a text link to your sitemap in your homepage footer is strongly recommended. Not only will the sitemap guide the spider through indexing your site, it will also give users a valuable way of navigating through your site quickly.
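As a rough illustration of the XML sitemap mentioned above, the following minimal Python sketch builds one from a list of page addresses. The example.com URLs and the build_sitemap helper are hypothetical names chosen for illustration; the namespace is the one published by the sitemaps.org protocol. A real site would normally generate the list of pages from its own database or CMS.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serialising.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, including a dynamic URL a spider might otherwise miss.
pages = [
    "https://www.example.com/",
    "https://www.example.com/products?id=42&view=full",
]

sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting text would typically be saved as sitemap.xml in the site's root directory, with a plain text link to it (or to a human-readable equivalent) in the homepage footer.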
Finally, it’s important to have a robots.txt file present in your root directory, as spiders request this file before crawling and some may not crawl your website if they cannot retrieve it. To ensure your robots.txt is acceptable to Google, check the status of the file regularly in Google Webmaster Tools.
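Before uploading a robots.txt file, it can help to check locally how a spider is likely to interpret it. The sketch below uses Python's standard urllib.robotparser module; the file contents, the example.com addresses and the /admin/ path are all assumptions made for illustration, not rules your site needs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: allow everything except /admin/
# and point spiders at the sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

home_allowed = parser.can_fetch("Googlebot", "https://www.example.com/")
admin_allowed = parser.can_fetch("Googlebot", "https://www.example.com/admin/login")
sitemaps = parser.site_maps()  # available in Python 3.8 and later

print("Homepage crawlable:", home_allowed)
print("Admin area crawlable:", admin_allowed)
print("Sitemaps listed:", sitemaps)
```

This only shows how Python's parser reads the rules; individual search engines may interpret edge cases differently, so the status reported in Google Webmaster Tools remains the authoritative check for Google.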

