

Bot/Crawler Blocking Software




Posted by warncke, 04-13-2010, 07:52 PM
I am developing a bot/crawler blocking system and would like some feedback. The program is designed for Apache/mod_perl/MySQL. The blocking techniques I am using are:

1) Request header profiling: hashing the request headers and user agent strings of all requests, then running statistical analysis to see whether the submitted request headers match a common pattern for that user agent. This cuts out low-effort bots that fake the user agent string but don't bother with the rest of the headers.
2) Browsing pattern analysis: running statistical analysis on the frequency of page accesses to look for bot-like patterns.
3) Referer tracking: tracking requests and referers to make sure that they match.

My goal is to create blocking software that actually detects bots, as opposed to just setting access limits. The intention is that a bot will be detected on the first request, or within the first 10 or so requests.

I am interested in general feedback on this subject, and I would like to find some server operators who would be interested in testing this with me.
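For readers who want something concrete, here is a rough sketch of how request header profiling (technique 1) might work. It is not the actual implementation described in this post: it is written in Python rather than mod_perl, keeps counts in memory instead of MySQL, and fingerprints only the order of header names. The names `header_fingerprint` and `HeaderProfiler`, the SHA-1 hash, and the `min_samples` / `min_share` thresholds are all assumptions made for illustration.

```python
import hashlib
from collections import Counter, defaultdict

# Headers that vary per request and would add noise to the fingerprint
# (an assumption for this sketch, not something stated in the post).
VOLATILE = {"cookie", "referer"}


def user_agent(headers):
    """Return the User-Agent value from a list of (name, value) pairs."""
    for name, value in headers:
        if name.lower() == "user-agent":
            return value
    return ""


def header_fingerprint(headers):
    """Hash the ordered list of header names, ignoring volatile headers."""
    names = [name.lower() for name, _ in headers if name.lower() not in VOLATILE]
    return hashlib.sha1(",".join(names).encode("utf-8")).hexdigest()


class HeaderProfiler:
    """Tracks which header fingerprints are common for each user agent."""

    def __init__(self, min_samples=50, min_share=0.05):
        self.min_samples = min_samples  # observations needed before judging
        self.min_share = min_share      # fingerprints rarer than this are flagged
        self.counts = defaultdict(Counter)  # user agent -> fingerprint -> count

    def observe(self, headers):
        """Record one request's fingerprint under its claimed user agent."""
        self.counts[user_agent(headers)][header_fingerprint(headers)] += 1

    def is_suspicious(self, headers):
        """True if this fingerprint is rare for the claimed user agent."""
        seen = self.counts.get(user_agent(headers), Counter())
        total = sum(seen.values())
        if total < self.min_samples:
            return False  # not enough data for this user agent yet
        return seen[header_fingerprint(headers)] / total < self.min_share


# A typical browser request versus a bot that copies only the user agent.
browser = [
    ("Host", "example.com"),
    ("User-Agent", "Mozilla/5.0"),
    ("Accept", "text/html"),
    ("Accept-Language", "en-US"),
    ("Accept-Encoding", "gzip"),
]
fake = [("User-Agent", "Mozilla/5.0"), ("Host", "example.com")]

profiler = HeaderProfiler()
for _ in range(100):
    profiler.observe(browser)

print(profiler.is_suspicious(browser))  # False
print(profiler.is_suspicious(fake))     # True
```

In this sketch a request is only judged once enough traffic has been seen for its user agent, so a brand-new browser version is not flagged right away. A real deployment would presumably persist the counts and tune both thresholds against live traffic.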

Posted by Nortorious, 04-13-2010, 09:15 PM
Wouldn't it just be easier to do this: http://************.org/forums/stopp...ots-t4826.r00t ? It's not the same, but it would probably cause less I/O if you run a lot of sites.

Posted by warncke, 04-14-2010, 10:16 AM
The honeypot technique is aimed at a different kind of bot. The primary focus of what I am doing is preventing content ripping and automation (generally spamming), which running a honeypot would not help with.



Copyright © 2018 DC International LLC. All Rights Reserved.