This paper introduces a quantitative framework for diagnosing failures in Semantic ID tokenizers, such as codebook underutilization, unstable decision boundaries, and geometric distortion of the embedding space. The framework links semantic boundary confusion to code usage imbalance and Euclidean geometric constraints, and the authors present Decoupled Residual Quantization (DRQ) as a proof of concept. Experiments on a single proprietary industrial dataset indicate that Semantic ID quality is multi-objective: symbolic robustness, reconstruction fidelity, and behavior-aware soft matching each stress different aspects of a tokenizer.
Semantic ID quality depends on balancing symbolic robustness, reconstruction fidelity, and behavior-aware soft matching; the paper's diagnostic framework, with DRQ as a proof of concept, gives a quantitative view of that balance.
Semantic IDs represent items as shared discrete token sequences and have become a practical tool for recommendation and retrieval. Yet it remains difficult to tell why a tokenizer fails: poor quality may come from codebook underutilization, unstable decision boundaries, or geometric distortion of the embedding space. This paper develops a quantitative framework for diagnosing these failures through expected codeword overlap and effective codebook capacity. The former measures expected codeword confusion under retrieval-time perturbation, while the latter converts that confusion into an effective number of usable, well-separated codes. The framework links semantic boundary confusion to both code usage imbalance and Euclidean geometric constraints. As a proof of concept, we present Decoupled Residual Quantization (DRQ), which separates continuous geometry reconstruction from discrete distribution matching. Experiments on a large-scale industrial dataset show that Semantic ID quality is multi-objective: symbolic robustness, reconstruction fidelity, and behavior-aware soft matching each stress different aspects of a tokenizer. These downstream observations are based on one proprietary industrial dataset, so they should be read as a case study rather than a universal benchmark claim.
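The abstract does not give formulas for expected codeword overlap or effective codebook capacity, so the sketch below does not implement them. It shows two simpler, standard stand-ins for the same two questions: how often a nearest-codeword assignment flips under Gaussian perturbation, and how many codes are effectively in use (the exponential of the usage entropy). The function names, the toy codebook, and the noise model are all assumptions made for illustration.

```python
import math
import random

def nearest_code(x, codebook):
    """Index of the codeword closest to x in squared Euclidean distance."""
    best, best_d = 0, float("inf")
    for k, c in enumerate(codebook):
        d = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        if d < best_d:
            best, best_d = k, d
    return best

def flip_rate(items, codebook, sigma, trials=20, seed=0):
    """Fraction of noisy assignments that differ from the clean assignment.

    Stand-in for boundary confusion under perturbation (hypothetical proxy,
    not the paper's expected codeword overlap).
    """
    rng = random.Random(seed)
    flips = total = 0
    for x in items:
        clean = nearest_code(x, codebook)
        for _ in range(trials):
            noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
            flips += nearest_code(noisy, codebook) != clean
            total += 1
    return flips / total

def usage_perplexity(items, codebook):
    """exp(entropy) of code usage; equals K when all K codes are used equally.

    Stand-in for usage imbalance (hypothetical proxy, not the paper's
    effective codebook capacity).
    """
    counts = [0] * len(codebook)
    for x in items:
        counts[nearest_code(x, codebook)] += 1
    n = len(items)
    entropy = -sum((c / n) * math.log(c / n) for c in counts if c)
    return math.exp(entropy)

# Toy setup: four well-separated 2-D codewords, 25 items clustered near each.
codebook = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
rng = random.Random(1)
items = [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
         for cx, cy in codebook for _ in range(25)]

print("usage perplexity:", usage_perplexity(items, codebook))
print("flip rate, small noise:", flip_rate(items, codebook, sigma=0.1))
print("flip rate, large noise:", flip_rate(items, codebook, sigma=5.0))
```

In this toy setting, balanced usage gives a perplexity close to the codebook size, while restricting the items to a single cluster drives it toward 1. The flip rate grows as the noise scale approaches the spacing between codewords, which is the kind of boundary instability the abstract describes in qualitative terms.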