The paper introduces LionGuard 2, a lightweight multilingual content moderation classifier designed for the Singapore context, supporting English, Chinese, Malay, and partial Tamil. It addresses the limitations of large language models in localization and low-resource languages by leveraging pre-trained OpenAI embeddings and a multi-head ordinal classifier. The model achieves state-of-the-art performance across 17 benchmarks, surpassing commercial and open-source systems, while being actively deployed within the Singapore Government.
Forget massive models: a lightweight content moderator using OpenAI embeddings and localized data beats leading commercial systems in multilingual benchmarks.
Modern moderation systems increasingly support multiple languages, but often fail to address localisation and low-resource variants, creating safety gaps in real-world deployments. Small models offer a potential alternative to large LLMs, yet still demand considerable data and compute. We present LionGuard 2, a lightweight, multilingual moderation classifier tailored to the Singapore context, supporting English, Chinese, Malay, and partial Tamil. Built on pre-trained OpenAI embeddings and a multi-head ordinal classifier, LionGuard 2 outperforms several commercial and open-source systems across 17 benchmarks, including both Singapore-specific and public English datasets. The system is actively deployed within the Singapore Government, demonstrating practical efficacy at scale. Our findings show that high-quality local data and robust multilingual embeddings can achieve strong moderation performance without fine-tuning large models. We release our model weights and part of our training data to support future work on LLM safety.
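To make the architecture concrete, the sketch below shows one way a multi-head ordinal classifier over pre-trained text embeddings could be structured. This is a minimal illustration, not the authors' implementation: the embedding dimension, category names, severity levels, and the cumulative-link decoding are all assumptions for demonstration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 64  # stand-in for the real embedding size (assumption)
CATEGORIES = ["hateful", "harassment", "sexual"]  # illustrative head names
NUM_LEVELS = 3  # ordinal severity: 0 (safe) .. 2 (severe); assumed

class OrdinalHead:
    """One harm category: predicts P(severity >= k) for each threshold k."""
    def __init__(self, dim, levels):
        # Random weights stand in for trained parameters.
        self.w = rng.normal(scale=0.1, size=(dim,))
        # Sorted thresholds keep the cumulative probabilities monotone.
        self.thresholds = np.sort(rng.normal(size=levels - 1))

    def predict(self, emb):
        score = emb @ self.w
        # Cumulative-link model: sigmoid(score - threshold_k) for each level.
        probs = 1.0 / (1.0 + np.exp(-(score - self.thresholds)))
        # Decode severity by counting thresholds the score exceeds.
        severity = int((probs > 0.5).sum())
        return severity, probs

# One shared embedding feeds every category head.
heads = {c: OrdinalHead(EMBED_DIM, NUM_LEVELS) for c in CATEGORIES}

def moderate(embedding):
    """Return a severity level per harm category for one embedded text."""
    return {c: head.predict(embedding)[0] for c, head in heads.items()}

emb = rng.normal(size=EMBED_DIM)  # placeholder for a real text embedding
print(moderate(emb))
```

Sharing a single frozen embedding across all heads is what keeps such a classifier lightweight: only the small per-category linear heads need training, not the embedding model itself.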