ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting

Mar 28, 2025·
Ruifeng Luo*, Zhengjie Liu*, Tianxiao Cheng, Jie Wang, Tongjie Wang, Xingguang Wei, Haomin Wang, Yanpeng Li, Fu Chai, Fei Cheng, Shenglong Ye, Wenhai Wang, Yanting Zhang, Yu Qiao, Hongjie Zhang, Xianzhong Zhao
Abstract
Recognizing symbols in architectural CAD drawings is critical for various advanced engineering applications. In this paper, we propose a novel CAD data annotation engine that leverages intrinsic attributes from systematically archived CAD drawings to automatically generate high-quality annotations, thus significantly reducing manual labeling efforts. Utilizing this engine, we construct ArchCAD-400K, a large-scale CAD dataset consisting of 413,062 chunks from 5,538 highly standardized drawings, making it over 26 times larger than the largest existing CAD dataset. ArchCAD-400K boasts an extended drawing diversity and broader categories, offering line-grained annotations. Furthermore, we present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS). It incorporates an adaptive fusion module to enhance primitive features with complementary image features, achieving state-of-the-art performance and enhanced robustness. Extensive experiments validate the effectiveness of DPSS, demonstrating the value of ArchCAD-400K and its potential to drive innovation in architectural design and construction.

Citation

If you find this project useful in your research, please consider citing:

@article{luo2025archcad,
  title={ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting},
  author={Luo, Ruifeng and Liu, Zhengjie and Cheng, Tianxiao and Wang, Jie and Wang, Tongjie and Wei, Xingguang and Wang, Haomin and Li, YanPeng and Chai, Fu and Cheng, Fei and others},
  journal={arXiv preprint arXiv:2503.22346},
  year={2025}
}