National Center for High-Performance | National Chung Cheng University | NYCU | May 4, 2026 | arXiv:2605.02169

Heterogeneous Model Fusion for Privacy-Aware Multi-Camera Surveillance via Synthetic Domain Adaptation

Peggy Joy Lu, Wei-Yu Chen, Yao-Tsung Huang, Vincent Shin-Mu Tseng

AI Summary

HeroCrystal is a privacy-preserving framework for multi-camera domain-adaptive object detection that addresses data privacy, class imbalance, and heterogeneous architectures. It combines a one-shot, target-aware diffusion model for synthetic data generation, federated learning with a probabilistic Faster R-CNN and a dynamic model contrastive strategy, and an inconsistent categories integration algorithm for label reconciliation. In the reported experiments, HeroCrystal outperforms existing methods, reaching a new state-of-the-art mAP of 33.4% and improving mAP by +2.1% over prior privacy-preserving approaches.

Key Contribution

Achieves state-of-the-art object detection in multi-camera surveillance without compromising data privacy, by fusing models trained on synthetically augmented and federated data.

Abstract

We propose HeroCrystal, a novel privacy-preserving framework for multi-camera domain-adaptive object detection, addressing challenges such as data privacy, class imbalance, and heterogeneous architectures. Our framework consists of three key stages. In the Generated Stage, we introduce a one-shot, target-aware diffusion-based generation module that learns visual style from a single target-domain image while leveraging prompt-based control to synthesize specific object instances. Unlike conventional style transfer-based methods that require large target datasets and ignore semantic-level discrepancies, our approach enables privacy-preserving augmentation to reduce ethical concerns, and introduces controllable rare object generation to mitigate long-tailed category degradation. In the Federated Stage, we employ probabilistic Faster R-CNN on the client side to improve localization accuracy, and a dynamic model contrastive strategy to suppress domain-specific bias. The server side performs model fusion across heterogeneous architectures without accessing raw data. Finally, in the Distilled Stage, we propose an inconsistent categories integration algorithm to resolve label inconsistency and architecture heterogeneity across clients. Extensive experiments on multiple cross-domain detection benchmarks demonstrate that our method outperforms existing multi-source domain adaptation and federated learning baselines under multi-class, privacy-preserving settings. Our method improves mAP by +2.1% over prior privacy-preserving approaches and achieves a new state-of-the-art mAP of 33.4%, highlighting the effectiveness of HeroCrystal in enabling practical multi-camera AI surveillance systems.
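The label-inconsistency problem mentioned for the Distilled Stage can be illustrated with a minimal sketch. This is not the paper's inconsistent categories integration algorithm, whose details are not given in the abstract. It only shows the simpler underlying idea of merging per-client category lists into one shared label space and remapping each client's local class indices. The function name `build_unified_label_space`, the client names, and the case-insensitive name matching are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def build_unified_label_space(
    client_categories: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, Dict[int, int]]]:
    """Merge per-client category lists into one label space.

    Returns the unified category list and, for each client, a map from
    its local class index to the unified class index.
    """
    unified: List[str] = []
    index_of: Dict[str, int] = {}
    remap: Dict[str, Dict[int, int]] = {}
    for client, names in client_categories.items():
        remap[client] = {}
        for local_idx, name in enumerate(names):
            # Assumption: names that differ only in case or padding
            # refer to the same class ("Car" and "car").
            key = name.strip().lower()
            if key not in index_of:
                index_of[key] = len(unified)
                unified.append(key)
            remap[client][local_idx] = index_of[key]
    return unified, remap


# Hypothetical clients with partially overlapping category sets.
clients = {
    "camera_a": ["person", "car", "bicycle"],
    "camera_b": ["Car", "truck", "person"],
}
unified, remap = build_unified_label_space(clients)
```

With a mapping like this, detections from each client model could be translated into the shared label space before any cross-client fusion or distillation step. A real system would likely also need to handle synonyms and classes that one client never annotated, which this sketch ignores.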

Computer Vision | Data Curation & Synthetic Data | Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
