This paper introduces SST-Guard, a browser-based system that detects server-side Google Analytics (sGA) by identifying semantic value patterns (identifiers, event metadata) across network requests, cookies, and the window object, rather than relying on known tracker endpoints. SST-Guard uses a value-template approach in which regular expressions match these patterns even when endpoints are customized and URLs/payloads are obfuscated. Validated on the Tranco top-10k websites, SST-Guard detects sGA with over 93% accuracy across all three modalities; deployed on the Tranco top-150k, it finds that 4.21% (6,314) of websites use sGA.
Server-side tracking thought it could hide, but this browser-based system spots Google Analytics even when it's sneakily relaying data through custom endpoints.
As web browsers increasingly restrict client-side tracking, the web tracking ecosystem is shifting from client-side to server-side tracking (SST). In SST, the browser sends tracking requests to an intermediate endpoint, which then forwards them to the tracker's endpoint, eliminating direct client-to-tracker requests. As a result, existing tracking protections that block requests to known tracker endpoints are rendered ineffective. In this paper, we investigate the server-side implementation of Google Analytics, the most widely deployed third-party tracking service on the web today. We also present SST-Guard, a multi-modal, browser-based system for detecting and blocking server-side Google Analytics (sGA). Our key insight is that even when the tracker's endpoints change, sGA must necessarily still collect and share the same semantic information as client-side Google Analytics (e.g., identifiers, event metadata). Therefore, rather than detecting requests to known Google Analytics endpoints, SST-Guard aims to detect the underlying artifacts of the collection and sharing of these semantic values to any arbitrary endpoint. Operationalizing this insight is challenging because real-world sGA deployments commonly customize endpoints and obfuscate URLs/payloads. SST-Guard addresses this challenge using a value-template approach that employs regular expressions to match semantic value patterns across multiple modalities: network requests, cookies, and the window object. We validate SST-Guard on the Tranco top-10k websites, detecting 4.02% (403) sGA domains with over 93% accuracy across three modalities, with the network request classifier demonstrating the highest accuracy (99.8%). By deploying SST-Guard in the wild, we find 4.21% (6,314) of Tranco top-150k websites using sGA.
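The value-template idea can be illustrated with a minimal Python sketch. It is not SST-Guard's implementation: the regular expressions, the `VALUE_TEMPLATES` table, the helper names, and the `min_hits` threshold are all hypothetical, chosen only to resemble commonly seen Google Analytics value shapes (a `G-` measurement ID, a two-part numeric client ID, a `GA1.`-prefixed cookie value). The sketch covers only two of the three modalities (request URLs and cookies) and matches on values alone, ignoring hostnames and parameter names, since those are what a server-side deployment can rename.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Hypothetical value templates (assumptions for illustration, not the paper's patterns).
VALUE_TEMPLATES = {
    "measurement_id": re.compile(r"G-[A-Z0-9]{6,12}"),
    "client_id": re.compile(r"\d{6,10}\.\d{10}"),
    "ga_cookie": re.compile(r"GA1\.\d\.\d{6,10}\.\d{10}"),
}


def match_values(values):
    """Return the sorted names of templates fully matched by any value."""
    hits = set()
    for value in values:
        for name, pattern in VALUE_TEMPLATES.items():
            if pattern.fullmatch(value):
                hits.add(name)
    return sorted(hits)


def request_values(url):
    """Extract query parameter values, ignoring the host and parameter names."""
    return [value for _, value in parse_qsl(urlsplit(url).query)]


def looks_like_sga(url, cookies, min_hits=2):
    """Flag a page when at least min_hits distinct templates match.

    min_hits is an assumed threshold; a real classifier would be tuned on data.
    """
    values = request_values(url) + list(cookies.values())
    hits = match_values(values)
    return len(hits) >= min_hits, hits


if __name__ == "__main__":
    # Custom endpoint and renamed parameters, but GA-shaped values.
    url = ("https://metrics.example.com/x/collect"
           "?a=G-ABCDE12345&b=1234567890.1700000000&c=page_view")
    cookies = {"_foo": "GA1.1.1234567890.1700000000"}
    print(looks_like_sga(url, cookies))
```

In this sketch the example request is flagged even though neither the hostname nor the parameter names reveal Google Analytics, which is the property the value-based approach relies on. Obfuscated or encoded payloads would need decoding before matching, which is omitted here.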