D) [34] to efficiently sample each visual patch, extracting fine-grained information from RGB and depth images. The extracted features are then processed by the attention mechanism to form the Mamba decoder's output features, as follows:

\begin{split}
\hat{X}_{R/D}^{i+1} &= DwConv(X_{R/D}^{i}),\\
\widetilde{X}_{R/D}^{i+1} &= ESSM(LN(X_{R/D}^{i})),\\
X_{R/D}^{i+1} &= Attention(LN(\widetilde{X}_{R/D}^{i+1})) + \hat{X}_{R/D}^{i+1},
\end{split} \quad (2)

where $R/D$ denotes the RGB and depth modalities, respectively; $X_{R/D}^{i}$ is the output of the $i$-th Mamba decoder, $i \in [0,4]$; when $i=0$, $X_{R/D}^{i}$ denotes the original image features; and $LN$ denotes layer normalization. The output of each Mamba decoder is independently integrated into the MRN to assist feature reconstruction, while the output of the final block is also fed into the classifier for label prediction. This design guides the model to leverage label information to disentangle inter-object features, thereby minimizing the interference of spurious features. The classification loss is therefore

\mathcal{L}_{\mathrm{R/D}} = \min\; \mathcal{L}_{\mathrm{CE}}(Y_{mab}^{\mathrm{RGB/Depth}}, Y), \quad (3)

where $\mathcal{L}_{\mathrm{CE}}(\cdot)$ denotes the cross-entropy loss; $Y_{mab}^{\mathrm{RGB}}$ and $Y_{mab}^{\mathrm{Depth}}$ are the outputs of the RGB and depth image classifiers, respectively; and $Y$ is the ground-truth label.
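The update rule in Eq. (2) is a fixed composition of four operators. The following is a minimal sketch of that composition order only, with scalar stand-ins for $DwConv$, $ESSM$, $LN$, and $Attention$ (all callables here are illustrative placeholders, not the paper's implementation):

```python
# Sketch of the Mamba decoder update in Eq. (2):
#   x_hat   = DwConv(x)
#   x_tilde = ESSM(LN(x))
#   x_next  = Attention(LN(x_tilde)) + x_hat
# The operators are passed in as callables; in the paper they are a
# depthwise convolution, an efficient SSM, layer norm, and attention.

def mamba_decoder_block(x, dwconv, essm, attention, ln):
    x_hat = dwconv(x)                      # local branch (skip path)
    x_tilde = essm(ln(x))                  # state-space branch
    return attention(ln(x_tilde)) + x_hat  # attended features + skip


if __name__ == "__main__":
    # Toy scalar stand-ins, purely to check the composition order.
    out = mamba_decoder_block(
        2.0,
        dwconv=lambda v: 0.5 * v,     # -> 1.0
        essm=lambda v: v + 1.0,       # -> 3.0
        attention=lambda v: 2.0 * v,  # -> 6.0
        ln=lambda v: v,               # identity for the toy check
    )
    print(out)  # 6.0 + 1.0 = 7.0
```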
3.3 Information Bottleneck Fusion Module

Once the MRN reconstructs the abnormal RGB and depth features synthesized by the MFEN into normal features, using the additional features provided by the Mamba encoders, the cascade ($\oplus$) and cross-attention mechanisms are employed to fuse the two modalities, as follows:

\begin{split}
F_{fusion}^{1} &= f_{R}^{rec2} \oplus f_{D}^{rec2}, \quad F_{fusion}^{2} = f_{R}^{rec4} \oplus f_{D}^{rec4},\\
F_{fu} &= Cross\_Att(F_{fusion}^{1}, F_{fusion}^{2}).
\end{split} \quad (4)

We then use an information bottleneck regularization module to filter redundant information from the initial fused feature $F_{fu}$, obtaining a more predictive feature $F_{fu}^{g}$. This module comprises two linear projection layers, dropout, and ReLU activation. Specifically, the intermediate representation $z$ is reprojected back to the dimension of $F_{fu}$, yielding $F_{fu}^{g}$:

F_{fu} \xrightarrow[\text{Linear Projection}]{\text{Dropout + ReLU}} z \xrightarrow[\text{Linear Projection}]{\text{Dropout + ReLU}} F_{fu}^{g}. \quad (5)

To ensure that $F_{fu}^{g}$ sufficiently retains the predictive information of $F_{fu}$ while discarding redundant features, we quantify this relationship using the mutual information between $F_{fu}^{g}$ and $F_{fu}$, defined as:

I(F_{fu}; F_{fu}^{g}) = \mathbb{E}_{p(F_{fu}, F_{fu}^{g})}\left[\log\frac{p(F_{fu}, F_{fu}^{g})}{p(F_{fu})\,p(F_{fu}^{g})}\right].
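For intuition, the mutual information in Eq. (6) can be computed exactly when the two variables are discrete. The following is a minimal sketch for a toy joint distribution (for continuous features, as in the paper, this quantity must instead be estimated, e.g. via variational bounds):

```python
import math

def mutual_information(joint):
    """I(A;B) for a discrete joint distribution given as {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    # I(A;B) = sum_{a,b} p(a,b) * log( p(a,b) / (p(a) p(b)) )
    return sum(p * math.log(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0.0)


if __name__ == "__main__":
    # Perfectly correlated bits share log 2 nats of information.
    print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))  # ~0.6931
    # Independent bits share none.
    print(mutual_information(
        {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}))   # 0.0
```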
(6)

Moreover, $I(F_{fu}; F_{fu}^{g})$ can be further divided into two parts, $I(F_{fu}; F_{fu}^{g} \mid Y)$ and $I(F_{fu}^{g}; Y)$, via the mutual information chain rule (see Corollary 1 in the Theoretical Analysis section for details), where $I(F_{fu}; F_{fu}^{g} \mid Y)$ represents redundant information and $I(F_{fu}^{g}; Y)$ represents prediction-related information for the current object. Thus, to effectively eliminate redundant features, we need only maximize $I(F_{fu}^{g}; Y)$ while minimizing $I(F_{fu}; F_{fu}^{g} \mid Y)$. Since $F_{fu}^{g}$ is derived from $F_{fu}$, the information contained in $F_{fu}^{g}$ cannot exceed that in $F_{fu}$, i.e., $I(F_{fu}^{g}; Y) \leq I(F_{fu}; Y)$. Consequently, the above information bottleneck objective is equivalent to

\min\; I(F_{fu}; Y) - I(F_{fu}^{g}; Y). \quad (7)

Table 1: I-AUROC/AUPRO results of our approach on MVTec.

Jiaqi Liu, Yuanyi Zhang and Fang-Wei Fu are with the Chern Institute of Mathematics and LPMC, Nankai University, Tianjin 300071, P. R. China. Emails: ljqi@mail.nankai.edu.cn, yuanyiz@mail.nankai.edu.cn, fwfu@nankai.edu.cn
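The decomposition underlying the objective — when $F_{fu}^{g}$ is a deterministic function of $F_{fu}$, the chain rule gives $I(F_{fu}; F_{fu}^{g}) = I(F_{fu}^{g}; Y) + I(F_{fu}; F_{fu}^{g} \mid Y)$ — can be checked numerically on a discrete toy model. The sketch below is illustrative only (variable names and the toy distribution are not from the paper): $F$ is uniform on four states, $G = F \bmod 2$, and $Y = \lfloor F/2 \rfloor$.

```python
import math

def _marginal(joint3, axes):
    """Marginalize a joint {(f, g, y): p} onto the given axis indices."""
    out = {}
    for key, p in joint3.items():
        k = tuple(key[i] for i in axes)
        out[k] = out.get(k, 0.0) + p
    return out

def mi(joint3, a, b):
    """I(A;B) from a joint over (f, g, y), by axis index."""
    pab = _marginal(joint3, (a, b))
    pa, pb = _marginal(joint3, (a,)), _marginal(joint3, (b,))
    return sum(p * math.log(p / (pa[(x,)] * pb[(y,)]))
               for (x, y), p in pab.items() if p > 0.0)

def cmi(joint3, a, b, c):
    """I(A;B|C) = sum p(a,b,c) log( p(c) p(a,b,c) / (p(a,c) p(b,c)) )."""
    pac, pbc = _marginal(joint3, (a, c)), _marginal(joint3, (b, c))
    pc = _marginal(joint3, (c,))
    total = 0.0
    for key, p in joint3.items():
        if p > 0.0:
            x, y, z = key[a], key[b], key[c]
            total += p * math.log(p * pc[(z,)] / (pac[(x, z)] * pbc[(y, z)]))
    return total


if __name__ == "__main__":
    # F uniform on {0,1,2,3}; G = F % 2 (deterministic); Y = F // 2.
    joint = {(f, f % 2, f // 2): 0.25 for f in range(4)}
    i_fg = mi(joint, 0, 1)          # I(F;G)   = log 2
    i_gy = mi(joint, 1, 2)          # I(G;Y)   = 0
    i_fg_y = cmi(joint, 0, 1, 2)    # I(F;G|Y) = log 2
    print(i_fg, i_gy + i_fg_y)      # equal: the chain-rule identity holds
```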