Preventing crawling


Post by flowrencemary on Wed Apr 22, 2009 11:33 am

To keep undesirable content out of search indexes, webmasters can instruct spiders not to crawl certain files or directories through the standard robots.txt file in the root directory of the domain. A page can also be explicitly excluded from a search engine's database with a robots meta tag, such as noindex. When a search engine visits a site, robots.txt is normally the first file it requests. The crawler parses the file to learn which pages it should not crawl. Because a crawler may keep a cached copy of robots.txt, it may occasionally crawl pages the webmaster has since disallowed. Pages typically blocked from crawling include login-specific pages such as shopping carts and user-specific content such as results from internal searches.
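As a rough illustration of how a crawler might apply these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the paths (/cart/, /search), and the "ExampleBot" user agent are all made-up assumptions for the example, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set a crawler can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given (hypothetical) agent may crawl the path."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/cart/checkout", "/search?q=shoes", "/products/widget"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

A well-behaved crawler would run a check like is_allowed before requesting each URL. Note that this only covers crawling; keeping an already-discovered page out of the index is the job of the robots meta tag placed in the page itself.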

