# robots.txt for http://www.advancedpump.com/
# =====================================================================
# Directions:
# This robots.txt file would allow "googlebot", the search engine spider
# of Google, to retrieve every page from your site except for files in the
# "formbot" directory. All files in the "formbot" directory will be ignored
# by googlebot.
#
# If you want a bot to search your site, don't include it in the robots.txt
# file. It is up to us to figure out which folders we want indexed and which
# we don't. googlebot is removed from this file so that it will index all
# files. If you have a folder that shouldn't be indexed, then it needs to be
# added. The 3 main ones are below:

User-agent: googlebot
Disallow: /formbot

User-agent: Slurp
Disallow: /formbot

User-agent: msnbot/1.1
Disallow: /formbot

# The ones below are all the bots we don't want to crawl the site.
#
# **** Remove everything between the separator line above and the one below,
# and any comments, before you go live ****
# Comments in this file are not needed and only increase file size.
# =====================================================================

User-agent: NPBot
Disallow: /

User-agent: TurnitinBot
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: EmailCollector
Disallow: /

# Crawlers that are kind enough to obey, but which we'd rather not have
# unless they're feeding search engines.
User-agent: UbiCrawler
Disallow: /

User-agent: DOC
Disallow: /

User-agent: Zao
Disallow: /

# Some bots are known to be trouble, particularly those designed to copy
# entire sites. Please obey robots.txt.
User-agent: Zealbot
Disallow: /

User-agent: MSIECrawler
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: Fetch
Disallow: /

User-agent: Offline Explorer
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: linko
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: Microsoft.URL.Control
Disallow: /

User-agent: larbin
Disallow: /

User-agent: libwww
Disallow: /

User-agent: ZyBORG
Disallow: /

User-agent: Download Ninja
Disallow: /

User-agent: grub-client
Disallow: /

#
# Doesn't follow robots.txt anyway, but...
#
User-agent: k2spider
Disallow: /

User-agent: NPBot
Disallow: /

# A capture bot, downloads gazillions of pages with no public benefit
# http://www.webreaper.net/
User-agent: WebReaper
Disallow: /