This paper introduces Logic Gate Networks (LGNs) for video copy detection, replacing traditional floating-point feature extractors with compact, logic-based representations. The framework combines frame miniaturization, binary preprocessing, and a trainable LGN embedding model to learn logical operations and interconnections. The resulting discretized Boolean circuit achieves competitive accuracy with significantly smaller descriptors and inference speeds exceeding 11k samples per second.
Achieve competitive video copy detection accuracy with descriptors orders of magnitude smaller and inference speeds exceeding 11k samples per second by replacing floating-point operations with a learned Boolean circuit.
Video copy detection requires robust similarity estimation under diverse visual distortions while operating at very large scale. Although deep neural networks achieve strong performance, their computational cost and descriptor size limit practical deployment in high-throughput systems. In this work, we propose a video copy detection framework based on differentiable Logic Gate Networks (LGNs), which replace conventional floating-point feature extractors with compact, logic-based representations. Our approach combines aggressive frame miniaturization, binary preprocessing, and a trainable LGN embedding model that learns both logical operations and interconnections. After training, the model can be discretized into a purely Boolean circuit, enabling extremely fast and memory-efficient inference. We systematically evaluate different similarity strategies, binarization schemes, and LGN architectures across multiple dataset folds and difficulty levels. Experimental results demonstrate that LGN-based models achieve competitive or superior accuracy and ranking performance compared to prior models, while producing descriptors several orders of magnitude smaller and delivering inference speeds exceeding 11k samples per second. These findings indicate that logic-based models offer a promising alternative for scalable and resource-efficient video copy detection.
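The pipeline described above (binarize a miniaturized frame, pass the bits through a layer of learnable logic gates, then discretize each neuron to a single gate for inference) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the four-gate set, the fixed threshold of 0.5, the helper names (`binarize`, `soft_layer`, `hard_layer`, `hamming_similarity`), and the use of Hamming similarity for comparing descriptors. The wiring is also fixed and random here, whereas the paper states that interconnections are learned.

```python
import numpy as np

# Real-valued relaxations of two-input Boolean gates, valid for a, b in [0, 1].
# Assumption: an illustrative subset, not the gate set used in the paper.
GATES = [
    lambda a, b: a * b,                # AND
    lambda a, b: a + b - a * b,        # OR
    lambda a, b: a + b - 2.0 * a * b,  # XOR
    lambda a, b: 1.0 - a * b,          # NAND
]


def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def binarize(frame, threshold=0.5):
    """Flatten a small grayscale frame (values in [0, 1]) and threshold it to bits."""
    return (frame.reshape(-1) > threshold).astype(np.float64)


def gate_outputs(x, idx_a, idx_b):
    """Evaluate every candidate gate for every neuron: shape (n_neurons, n_gates)."""
    a, b = x[idx_a], x[idx_b]
    return np.stack([g(a, b) for g in GATES], axis=-1)


def soft_layer(x, idx_a, idx_b, logits):
    """Training-time view: each neuron is a softmax-weighted mixture of gates."""
    return (softmax(logits) * gate_outputs(x, idx_a, idx_b)).sum(axis=-1)


def hard_layer(x, idx_a, idx_b, logits):
    """Inference-time view: each neuron keeps only its highest-scoring gate."""
    outs = gate_outputs(x, idx_a, idx_b)
    choice = logits.argmax(axis=-1)
    return outs[np.arange(choice.size), choice]


def hamming_similarity(d1, d2):
    """Fraction of matching bits between two binary descriptors."""
    return float(np.mean(d1 == d2))


# Hypothetical usage: an 8x8 "miniaturized" frame mapped to a 16-bit descriptor.
rng = np.random.default_rng(0)
bits = binarize(rng.random((8, 8)))
n_neurons = 16
idx_a = rng.integers(0, bits.size, n_neurons)         # fixed random wiring (assumption)
idx_b = rng.integers(0, bits.size, n_neurons)
logits = rng.normal(size=(n_neurons, len(GATES)))     # stand-in for trained parameters
descriptor = hard_layer(bits, idx_a, idx_b, logits)   # every entry is exactly 0.0 or 1.0
```

With binary inputs, each relaxed gate returns exactly 0 or 1, so the discretized layer behaves as a Boolean circuit and the descriptor can be stored as packed bits. That is the property behind the small descriptor size and fast inference the abstract reports.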