Mar 30, 2026 · arXiv:2603.28546

Shy Guys: A Light-Weight Approach to Detecting Robots on Websites

Rémi Van Boxem, Tom Barbette, Cristel Pelsser, Ramin Sadre

AI Summary

This paper introduces a lightweight, passive bot detection method that combines user-agent string analysis with favicon-based heuristics. It works entirely from standard web server logs and requires no client-side interaction. Evaluated on over 4.6 million web requests, the approach detects 67.7% of bot traffic at a 3% false-positive rate, which the authors compare with less than 20% for the state of the art. It is positioned as a less intrusive first filter ahead of CAPTCHAs and JavaScript challenges.

Key Contribution

A passive bot detection method that flags roughly two-thirds of bot traffic at a 3% false-positive rate using only server logs and favicon-based heuristics, so that CAPTCHAs and JavaScript challenges can be reserved for genuinely ambiguous requests.

Abstract

Automated bots now account for roughly half of all web requests, and an increasing number deliberately spoof their identity, either to evade detection or to ignore robots.txt. Existing countermeasures are resource-intensive (JavaScript challenges, CAPTCHAs), cost-prohibitive (commercial solutions), or degrade the user experience. This paper proposes a lightweight, passive approach to bot detection that combines user-agent string analysis with favicon-based heuristics and operates entirely on standard web server logs, with no client-side interaction. We evaluate the method on over 4.6 million requests containing 54,945 unique user-agent strings, collected from websites hosted around the world. Our approach detects 67.7% of bot traffic while maintaining a false-positive rate of 3%, outperforming the state of the art (less than 20%). The method can serve as a first line of defence, routing only genuinely ambiguous requests to active challenges and preserving the experience of legitimate users.
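The abstract does not spell out the individual heuristics, so the following is only a minimal sketch of what a passive, log-only classifier in this spirit could look like. It assumes Apache/nginx Combined Log Format, a hypothetical keyword list (BOT_TOKENS), and one assumed favicon rule: a client whose user-agent claims to be a browser but that never requests /favicon.ico over several requests is marked as suspect. The function name classify_clients, the min_requests threshold, and the three labels are illustrative choices, not the authors' method.

```python
import re
from collections import defaultdict

# Assumed input: Combined Log Format
# ip ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"$'
)

# Hypothetical token list for self-declared automation; not taken from the paper.
BOT_TOKENS = ("bot", "crawler", "spider", "curl", "python-requests")


def classify_clients(log_lines, min_requests=3):
    """Label each (ip, user-agent) pair as 'bot', 'suspect', or 'unknown'."""
    stats = defaultdict(lambda: {"requests": 0, "favicon": False})
    for line in log_lines:
        m = LOG_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not match the assumed format
        key = (m["ip"], m["ua"])
        stats[key]["requests"] += 1
        # Ignore any query string when checking for a favicon request.
        if m["path"].split("?")[0].endswith("/favicon.ico"):
            stats[key]["favicon"] = True

    labels = {}
    for (ip, ua), s in stats.items():
        ua_lower = ua.lower()
        if ua in ("", "-") or any(tok in ua_lower for tok in BOT_TOKENS):
            # Missing or self-declared automation user-agent.
            labels[(ip, ua)] = "bot"
        elif (ua_lower.startswith("mozilla/")
              and s["requests"] >= min_requests
              and not s["favicon"]):
            # Assumed rule: claims to be a browser but never fetched a favicon.
            labels[(ip, ua)] = "suspect"
        else:
            labels[(ip, ua)] = "unknown"
    return labels


if __name__ == "__main__":
    sample = [
        '203.0.113.9 - - [30/Mar/2026:10:00:00 +0000] '
        '"GET / HTTP/1.1" 200 512 "-" "curl/8.5.0"',
    ]
    print(classify_clients(sample))
```

In a deployment along the lines the abstract describes, only the clients left as suspect or unknown would be routed to an active challenge. The substring match and the favicon rule are both coarse: a returning visitor with a cached favicon, or a page that declares a different icon path, would need additional handling.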

Constitutional AI & AI Ethics · Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
