Gorufu, littleman, Air, SugarKane: do you see any errors or better ways to do this? Does anybody have a bot to add before I put this on every site I manage?
Feel free to use this on your own site and start blocking bots too.
(the top part is left out)
<Files .htaccess>
deny from all
</Files>
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^sitecheck\.internetseer\.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^DIIbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^psbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
RewriteRule ^.* - [F]
RewriteCond %{HTTP_REFERER} ^http://www\.iaea\.org$
# RewriteRule tests the requested path, not a full URL, so match any path here
RewriteRule .* - [F]
If there are more bots that should be added, please post them.
Thanks
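One possible tightening of the list above, offered as a sketch rather than a drop-in replacement: several RewriteCond lines can be collapsed into a single pattern with alternation, and the [NC] flag makes the match case-insensitive. Note that [NC] matches more broadly than the original case-sensitive lines, and the six names used here are just a subset of the list picked for illustration.

```apache
RewriteEngine On
# One condition replaces six; the solid pipe "|" means "or" inside the group.
# [NC] ignores case, so "webbandit" and "WebBandit" both match.
RewriteCond %{HTTP_USER_AGENT} ^(EmailSiphon|EmailWolf|ExtractorPro|CherryPicker|WebBandit|EmailCollector) [NC]
# Whatever path was requested, answer 403 Forbidden.
RewriteRule .* - [F]
```

Fewer lines also means fewer [OR] flags to get wrong, at the cost of one long line that is harder to scan.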
The .htaccess file idea seems like a much better stopgap measure: much more versatile and easier to implement. I'll stop in now and then to see if there is anything more to add to it. Kudos to WebmasterWorld for having forums and contributors that can actually teach you something and not waste your time.
RewriteCond %{HTTP_REFERER} q=guestbook [NC,OR]
(just the referer ones with guestbook, to see if that makes a difference). Also cut out the source comment on the first line, just to be on the safe side, then see if you get the same errors.
I've been running it for a few days without any errors, but that's just one server at one web host, so I can't tell you there's nothing wrong with it. Maybe some of the other people who have contributed can take a look at it here and let us know: tech.ratmachines.com/downloads/sample_wbmw.txt
Here are the first and last lines of the script, in case someone can spot an error (the dots represent the cut-out part):
===============================
RewriteEngine On
RewriteCond %{HTTP_REFERER} q=guestbook [NC,OR]
.....................
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
RewriteCond %{HTTP_USER_AGENT} ^ZyBorg
RewriteRule ^.* - [F,L]
===============================
You might want to post exactly what errors you got; then somebody might be able to help you. I'm not very good at this stuff, but some of the people on this forum are.
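When hunting for the line that causes an error, one approach is to cut the file down to a skeleton like the one below and add conditions back a few at a time. This is only a sketch using the two user agents from the excerpt above. The detail most worth checking is that every condition except the last carries [OR]; as far as I know, a trailing [OR] on the final RewriteCond makes the rule apply to every request.

```apache
RewriteEngine On
# Every condition except the last ends in [OR].
RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
# Last condition: no [OR], so the chain ends here.
RewriteCond %{HTTP_USER_AGENT} ^ZyBorg
# Forbid the request; [L] stops further rule processing.
RewriteRule .* - [F,L]
```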
Troubleshooting
Here are some of the most common problems I've seen people have (or have had myself) with .htaccess files. One thing I should stress first, though: the server error log is your friend. You should always consult the error log when things don't seem to be functioning correctly. If it doesn't say anything about your problem, try boosting the message detail by changing your LogLevel directive to debug (or adding a LogLevel debug line if you don't have a LogLevel already).
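For reference, a sketch of what raising the log detail might look like. These lines belong in the main server configuration (for example httpd.conf), since LogLevel is not permitted in .htaccess. The log path is an assumed placeholder, and which rewrite-logging form applies depends on the Apache version, so the version-specific lines are left commented out.

```apache
# httpd.conf (server configuration, not .htaccess)
LogLevel debug

# Apache 2.2 and earlier: separate mod_rewrite log (path is a placeholder)
# RewriteLog "/var/log/apache2/rewrite.log"
# RewriteLogLevel 3

# Apache 2.4 and later: per-module level, written to the error log
# LogLevel warn rewrite:trace3
```

On shared hosting you usually cannot edit these yourself, so the error log your host exposes may be all you get.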
'Internal Server Error' page is displayed when a document is requested
This indicates a problem with your configuration. Check the Apache error log file for a more detailed explanation of what went wrong. You probably have used a directive that isn't allowed in .htaccess files, or have a directive with incorrect syntax.
.htaccess file doesn't seem to change anything
It's possible that the directory is within the scope of an AllowOverride None directive. Try putting a line of gibberish in the .htaccess file and force a reload of the page. If you still get the same page instead of an 'Internal Server Error' display, then this is probably the cause of the problem. Another slight possibility is that the document you're requesting isn't actually controlled by the .htaccess file you're editing; this can sometimes happen if you're accessing a document with a common name, such as index.html. If there's any chance of this, try changing the actual document and requesting it again to make sure you can see the change.
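If AllowOverride None turns out to be the cause, the fix has to be made in the server configuration by whoever controls it. A minimal sketch, assuming a document root of /var/www/html (a placeholder): mod_rewrite directives need the FileInfo override, allow/deny rules need Limit, and password protection needs AuthConfig.

```apache
# httpd.conf (server configuration); the directory path is a placeholder
<Directory "/var/www/html">
    # Let .htaccess files use rewrite (FileInfo), allow/deny (Limit)
    # and authentication (AuthConfig) directives.
    AllowOverride FileInfo Limit AuthConfig
    # mod_rewrite in .htaccess also needs symlink following enabled.
    Options FollowSymLinks
</Directory>
```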
I've added some security directives to my .htaccess file, but I'm not getting challenged for a username and password
The most common cause of this is having the .htaccess directives within the scope of a Satisfy Any directive. Explicitly disable this by adding a Satisfy All to the .htaccess file, and try again.
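A sketch of a password-protected directory with Satisfy All made explicit, in the Apache 2.2-style syntax this thread uses. The realm name and the password file path are placeholders, not values from the original article.

```apache
AuthType Basic
AuthName "Members Only"
# Placeholder path; keep the password file outside the web root.
AuthUserFile /home/example/.htpasswd
Require valid-user
# Require BOTH the host-based rules and a valid login to pass.
Satisfy All
```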
Don't use quotes for mod_rewrite patterns. That's for RedirectMatch syntax.
Comments should be on their own line; otherwise, you will get warnings if you have that log level set.
So,
RewriteCond %{HTTP_USER_AGENT} "Microsoft URL Control" [OR] # spambot
should be
# spambot
RewriteCond %{HTTP_USER_AGENT} Microsoft\ URL\ Control [OR]
RewriteRule !err_¦robots\.txt - [F,L]
The underscore in "err_" needs to be escaped: precede it with a "\".
RewriteRule !(err\_¦robots\.txt) - [F,L]
Also, the broken vertical pipe "¦" character above must be changed to a solid vertical pipe before it can be used in .htaccess.
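Put together with a solid pipe, the corrected rule might look like the sketch below. The RewriteCond line is only a stand-in taken from earlier in the thread so the rule has something to act on. The negated pattern leaves alone any request whose path contains err_ (presumably the custom error documents) or robots.txt, so blocked bots can still fetch those.

```apache
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon
# "!" negates the pattern: forbid everything EXCEPT paths containing
# err_ or robots.txt. The "|" must be a solid pipe.
RewriteRule !(err\_|robots\.txt) - [F,L]
```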
HTH,
Jim
Bad bot script: [webmasterworld.com...]
(See the links at the top of that thread for even more "historical" information on the subject.)
Jim
Did I write this wrong?
Thanks
ladymindy
No problem blocking Teleport Pro. Except then I discovered it can be set to disguise itself as
(compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Hotbar 4.0) as well as a few other things.
Now I'm really confused. I can't very well block that, right?!
Suggestions?
Ty