The Web Robots Pages

The Web Robots Database

The List of Active Robots has been changed to a new format, called The Web Robots Database. Note that now that robot technology is being used in an increasing number of end-user products, this list is becoming less useful and less complete.

The robot information is now stored in individual files, with several HTML tables providing different views of the data.

The combined raw data in machine-readable format is available in a text file. Feel free to email any feature requests (such as better sorting and a plain HTML view) -- the more I get, the more likely I am to implement them.

To add a new robot, fill in this empty template, using this schema description, and email it to m.koster@webcrawler.com.

Others

There are robots out there that the database contains no details on. If/when I get those details they will be added; otherwise they'll remain on the list below, as unresponsive or unknown sites.

Services with no information

These services must use robots, but haven't replied to requests for an entry...

User Agents

These look like new robots, but have no contact info...

BizBot04 kirk.overleaf.com HappyBot (gserver.kw.net) CaliforniaBrownSpider EI*Net/0.1 libwww/0.1 Ibot/1.0 libwww-perl/0.40 Merritt/1.0 StatFetcher/1.0 TeacherSoft/1.0 libwww/2.17 WWW Collector processor/0.0ALPHA libwww-perl/0.20 wobot/1.0 from 206.214.202.45 Libertech-Rover www.libertech.com? WhoWhere Robot ITI Spider w3index MyCNNSpider SummyCrawler OGspider linklooker CyberSpyder (amant@www.cyberspyder.com) SlowBot heraSpider Surfbot Bizbot003 WebWalker SandBot EnigmaBot spyder3.microsys.com www.freeloader.com.

Hosts

These have no known user-agent, but have requested /robots.txt repeatedly or exhibited crawling patterns.

205.252.60.71 194.20.32.131 198.5.209.201 acke.dc.luth.se dallas.mt.cs.cmu.edu darkwing.cadvision.com waldec.com www2000.ogsm.vanderbilt.edu unet.ca murph.cais.net (rapid fire... sigh) spyder3.microsys.com www.freeloader.com.

Some other robots are mentioned in a list of Japanese Search Engines.
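
The Hosts criterion (a host with no known user-agent that requests /robots.txt repeatedly) can be approximated with a small scan of a web server's access log. The following is only a rough sketch, not the method used to compile this list: it assumes the log is in Common Log Format, the function name `hosts_requesting_robots_txt` and the threshold `min_requests` are hypothetical, and the sample addresses are reserved documentation addresses rather than real robots.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line:
#   host ident authuser [date] "METHOD path protocol" ...
# Assumption: the server writes its access log in this format.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+)')


def hosts_requesting_robots_txt(lines, min_requests=3):
    """Return, sorted, the hosts that requested /robots.txt at least
    min_requests times. The threshold is an arbitrary illustrative value."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == "/robots.txt":
            counts[match.group(1)] += 1
    return sorted(host for host, n in counts.items() if n >= min_requests)


# Made-up sample data for illustration only.
sample_log = [
    '192.0.2.10 - - [10/Oct/1996:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    '192.0.2.10 - - [10/Oct/1996:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.10 - - [10/Oct/1996:14:10:02 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    'crawler.example.org - - [10/Oct/1996:14:12:45 -0700] "GET /robots.txt HTTP/1.0" 200 120',
    '192.0.2.10 - - [10/Oct/1996:15:01:19 -0700] "GET /robots.txt HTTP/1.0" 200 120',
]

suspects = hosts_requesting_robots_txt(sample_log)
print(suspects)
```

A host flagged this way is only a candidate: repeated /robots.txt requests suggest a robot, but confirming it still requires looking at the host's wider request pattern.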