Posted by: Brent McCulloch (BrentMcCullo…@discussions.microsoft.com)
Date: Fri, 20 Jul 2007
Just wondering what other people are doing out there to block spiders that
aren't obeying the robots.txt file.
We have seen a number of different spiders that seem to ignore robots.txt
completely and go on to index our entire site anyway. We don't want this!
Anyone have any ideas? Has anyone out there done this type of thing before?
Every time these spiders crawl parts of our site that we don't want them on,
the Application log fills with warnings. So I was thinking of rigging up a
temporary ban on these spiders' IPs whenever the events start showing up in
the Application log.
Not really sure how to do that though! :S
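The bookkeeping side of the idea might look something like the rough Python sketch below. Everything in it is an assumption on my part: the class name TempBanList, the threshold, the time window and the ban length are all made up for illustration. It only decides which IPs should be banned and for how long; pulling the offending IP out of each Application log warning, and actually blocking it (in IIS or at the firewall), are not covered.

```python
import time
from collections import defaultdict, deque


class TempBanList:
    """Hypothetical helper: bans an IP for a while once it causes too many warnings."""

    def __init__(self, threshold=5, window_seconds=60, ban_seconds=3600, clock=time.time):
        # All defaults are illustrative guesses, not recommended values.
        self.threshold = threshold            # warnings needed to trigger a ban
        self.window_seconds = window_seconds  # how far back warnings are counted
        self.ban_seconds = ban_seconds        # how long a ban lasts
        self.clock = clock                    # injectable so the logic can be tested
        self._events = defaultdict(deque)     # ip -> timestamps of recent warnings
        self._banned_until = {}               # ip -> time at which the ban expires

    def record_warning(self, ip):
        """Note one warning caused by `ip`; return True if the IP is now banned."""
        now = self.clock()
        events = self._events[ip]
        events.append(now)
        # Forget warnings that have fallen out of the counting window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        if len(events) >= self.threshold:
            self._banned_until[ip] = now + self.ban_seconds
            events.clear()
        return self.is_banned(ip)

    def is_banned(self, ip):
        """Return True while the ban on `ip` is still in force."""
        until = self._banned_until.get(ip)
        if until is None:
            return False
        if self.clock() >= until:
            # The ban has run out, so lift it.
            del self._banned_until[ip]
            return False
        return True
```

Whatever watches the log would call record_warning(ip) for each warning event, and whatever fronts the site would check is_banned(ip) before serving a request. Because bans expire on their own, a legitimate visitor who gets caught by mistake is only locked out temporarily.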
Any suggestions would be appreciated!
Thanks a lot,