Mar 16, 2026

arXiv:2603.14927

Masked BRep Autoencoder via Hierarchical Graph Transformer

Yifei Li, Kang Wu, Wenming Wu, Xiaoming Fu

AI Summary

This paper introduces a self-supervised learning framework for CAD models using a masked graph autoencoder to learn representations from boundary representation (BRep) models. The framework employs a hierarchical graph Transformer architecture with cross-scale mutual attention and graph neural network blocks to capture both global geometric dependencies and local topological information. Experiments on part classification, modeling segmentation, and machining feature recognition demonstrate superior performance, especially with limited labeled data, showcasing the model's generalizability and practicality.

Key Contribution

Unlock CAD model understanding with a self-supervised autoencoder that significantly outperforms existing methods, even when labeled data is scarce.

Abstract

We introduce a novel self-supervised learning framework that automatically learns representations from input computer-aided design (CAD) models for downstream tasks, including part classification, modeling segmentation, and machining feature recognition. To train our network, we construct a large-scale, unlabeled dataset of boundary representation (BRep) models. The success of our algorithm relies on two key components. The first is a masked graph autoencoder that reconstructs randomly masked geometries and attributes of BReps, learning representations that generalize better. The second is a hierarchical graph Transformer architecture that fuses global and local learning: a cross-scale mutual attention block models long-range geometric dependencies, and a graph neural network block aggregates local topological information. After training the autoencoder, we replace its decoder with a task-specific network trained on a small amount of labeled data for each downstream task. We conduct experiments on various tasks and achieve high performance even with a small amount of labeled data, demonstrating the practicality and generalizability of our model. Compared to other methods, our model performs significantly better on downstream tasks with the same amount of training data, particularly when the training data is very limited.
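The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the mask ratio, the feature layout, or the loss, so `mask_node_features`, `reconstruction_loss`, `MASK_TOKEN`, the 50% ratio, and the mean-squared-error objective are all assumptions chosen for illustration. The sketch treats each BRep face as a graph node with a small feature vector, hides a random subset of nodes, and scores a reconstruction only on the hidden ones.

```python
import random

# Placeholder for a learnable mask embedding (assumption; a real model
# would substitute a trainable vector here).
MASK_TOKEN = None


def mask_node_features(features, mask_ratio=0.5, seed=0):
    """Randomly hide a fraction of node feature vectors.

    Returns (masked_features, masked_indices). The hidden entries are the
    targets an autoencoder would be trained to reconstruct.
    """
    rng = random.Random(seed)
    n = len(features)
    num_masked = int(round(n * mask_ratio))
    masked_indices = sorted(rng.sample(range(n), num_masked))
    hidden = set(masked_indices)
    masked = [MASK_TOKEN if i in hidden else list(f)
              for i, f in enumerate(features)]
    return masked, masked_indices


def reconstruction_loss(predicted, original, masked_indices):
    """Mean squared error computed only over the masked nodes."""
    if not masked_indices:
        return 0.0
    total, count = 0.0, 0
    for i in masked_indices:
        for p, t in zip(predicted[i], original[i]):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy per-face features (hypothetical values, one vector per BRep face).
faces = [[1.0, 0.0], [0.5, 0.2], [0.0, 1.0], [0.3, 0.3]]
masked, idx = mask_node_features(faces, mask_ratio=0.5)

# A perfect reconstruction has zero loss on the masked faces.
perfect = reconstruction_loss(faces, faces, idx)
```

Restricting the loss to masked nodes is the usual design choice in masked autoencoders, since the visible nodes are already given to the encoder; whether the paper follows it exactly is not stated in the abstract.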

Architecture Design (Transformers, SSMs, MoE)

Computer Vision Data Curation & Synthetic Data

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
