The paper introduces SIA, a Synthesize-Inject-Align framework, to improve knowledge grounding and security of LLMs in e-commerce search. SIA synthesizes training data from knowledge graphs and behavioral logs, injects knowledge via depth up-scaling pre-training, and aligns the model with instruction tuning and adversarial training. Deployed at JD.com, SIA significantly improved key business metrics across five search scenarios, demonstrating its industrial viability.
E-commerce search LLMs can be made both more knowledgeable and secure via a surprisingly simple three-stage framework of data synthesis, parameter-efficient pre-training, and dual-path alignment.
Large language models offer transformative potential for e-commerce search by enabling intent-aware recommendations. However, their industrial deployment is hindered by two critical challenges: (1) knowledge hallucination due to insufficient encoding of dynamic, fine-grained product knowledge, and (2) security vulnerabilities under jailbreak attacks that threaten compliance. To address these issues, we propose SIA, a Synthesize-Inject-Align framework for building knowledgeable and secure e-commerce search LLMs. Our approach first synthesizes a high-quality natural-language corpus by combining structured knowledge graphs with unstructured behavioral logs, augmented with reasoning chains and safety-aware data. We then introduce a parameter-efficient pre-training strategy based on Depth Up-Scaling to inject domain knowledge while preserving general capabilities. Finally, a dual-path alignment method via multi-task instruction tuning and adversarial training strengthens both task performance and safety robustness. The framework has been deployed at JD.com, China's largest self-operated e-commerce platform, where A/B tests across five core search scenarios demonstrate significant improvements in key business metrics, validating its industrial effectiveness and scalability.
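The abstract names Depth Up-Scaling (DUS) as the parameter-efficient pre-training strategy but does not spell out the recipe. A common DUS construction (popularized by SOLAR 10.7B) duplicates a base model's transformer stack, trims overlapping middle layers, and stacks the remainders into a deeper model that is then continually pre-trained. The sketch below illustrates only that layer-index construction; it is an assumption about the general technique, not SIA's exact procedure:

```python
# Hedged sketch of Depth Up-Scaling (DUS) layer selection.
# ASSUMPTION: the common recipe (as in SOLAR 10.7B) — make two copies of an
# n-layer base model, drop the last k layers of the first copy and the first
# k layers of the second, then stack the remainders into a deeper model.
# The paper's own DUS variant may differ.

def depth_upscale_indices(n_layers: int, k: int) -> list[int]:
    """Return base-model layer indices for the up-scaled stack."""
    top = list(range(0, n_layers - k))   # copy 1 minus its last k layers
    bottom = list(range(k, n_layers))    # copy 2 minus its first k layers
    return top + bottom                  # depth becomes 2 * (n_layers - k)

# Example: a 32-layer base with k=8 yields a 48-layer up-scaled model,
# which is then continually pre-trained on the synthesized domain corpus.
print(len(depth_upscale_indices(32, 8)))  # 48
```

Because every layer is initialized from a trained checkpoint, only the continued pre-training pass is needed to heal the seam between the two copies, which is what makes the approach parameter-efficient relative to training a deeper model from scratch.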