The paper introduces TM-RugPull, a new multimodal dataset of 1,028 token projects designed for early detection of rug-pull attacks. It addresses a key limitation of existing datasets by avoiding temporal data leakage: on-chain behavior, smart contract metadata, and OSINT signals are extracted only from the first half of each project's lifespan. Labels are derived from forensic reports and longevity criteria. The authors position the dataset as a new benchmark for reproducible fraud detection research in the tokenized ecosystem.
A new dataset, TM-RugPull, finally enables rigorous, leakage-free research into early detection of rug-pull scams across diverse token types, moving beyond DeFi-centric analysis.
Rug-pull attacks pose a systemic threat across the blockchain ecosystem, yet research into early detection is hindered by the lack of scientific-grade datasets. Existing resources often suffer from temporal data leakage, narrow modality, and ambiguous labeling, particularly outside DeFi contexts. To address these limitations, we present TM-RugPull, a rigorously curated, leakage-resistant dataset of 1,028 token projects spanning DeFi, meme coins, NFTs, and celebrity-themed tokens. TM-RugPull enforces strict temporal hygiene by extracting all features (on-chain behavior, smart contract metadata, and OSINT signals) strictly from the first half of each project's lifespan. Labels are grounded in forensic reports and longevity criteria, verified through multi-expert consensus. This dataset enables causally valid, multimodal analysis of rug-pull dynamics and establishes a new benchmark for reproducible fraud detection research.
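The first-half-of-lifespan rule can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Event` type, the function names, and the choice to treat the midpoint as inclusive are all assumptions made for illustration. The abstract states only that features come from the first half of each project's lifespan.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical record for one observed signal (on-chain or OSINT)."""
    timestamp: datetime
    kind: str  # illustrative label, e.g. "transfer"


def observation_cutoff(start: datetime, end: datetime) -> datetime:
    """Return the midpoint of a project's lifespan."""
    if end <= start:
        raise ValueError("project end must be after project start")
    return start + (end - start) / 2


def early_events(events: list[Event], start: datetime, end: datetime) -> list[Event]:
    """Keep only events from the first half of the lifespan.

    Assumption: the midpoint itself is treated as inside the window.
    """
    cutoff = observation_cutoff(start, end)
    return [e for e in events if start <= e.timestamp <= cutoff]


# Example: a project that lived 10 days has a 5-day observation window.
start = datetime(2024, 1, 1)
end = datetime(2024, 1, 11)
events = [
    Event(datetime(2024, 1, 2), "transfer"),
    Event(datetime(2024, 1, 6), "liquidity_add"),
    Event(datetime(2024, 1, 9), "liquidity_remove"),
]
kept = early_events(events, start, end)
```

Filtering by timestamp before any feature is computed keeps information from the later half of the lifespan, where a rug pull would typically surface, out of the model inputs.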