ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting
Mar 28, 2025
Ruifeng Luo*
Zhengjie Liu*
Tianxiao Cheng
Jie Wang
Tongjie Wang
Xingguang Wei

Haomin Wang
Yanpeng Li
Fu Chai
Fei Cheng
Shenglong Ye
Wenhai Wang
Yanting Zhang
Yu Qiao
Hongjie Zhang
Xianzhong Zhao
Abstract
Recognizing symbols in architectural CAD drawings is critical for various advanced engineering applications. In this paper, we propose a novel CAD data annotation engine that leverages intrinsic attributes from systematically archived CAD drawings to automatically generate high-quality annotations, thus significantly reducing manual labeling effort. Using this engine, we construct ArchCAD-400K, a large-scale CAD dataset consisting of 413,062 chunks from 5,538 highly standardized drawings, making it over 26 times larger than the largest existing CAD dataset. ArchCAD-400K offers greater drawing diversity, broader categories, and line-grained annotations. Furthermore, we present a new baseline model for panoptic symbol spotting, termed Dual-Pathway Symbol Spotter (DPSS). It incorporates an adaptive fusion module to enhance primitive features with complementary image features, achieving state-of-the-art performance and enhanced robustness. Extensive experiments validate the effectiveness of DPSS, demonstrating the value of ArchCAD-400K and its potential to drive innovation in architectural design and construction.
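The abstract does not spell out how the adaptive fusion module combines the two feature streams. As a rough illustration only, the sketch below shows one generic gated-fusion formulation in which a per-channel gate decides how much of an image feature is added to a primitive feature. It is not the DPSS implementation: the function name `gated_fusion`, the parameters `gate_weights` and `gate_bias`, and the residual form `p + gate * v` are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    primitive_feat: Sequence[float],
    image_feat: Sequence[float],
    gate_weights: Sequence[Sequence[float]],
    gate_bias: Sequence[float],
) -> List[float]:
    """Hypothetical gated fusion of one primitive feature with one image feature.

    For each channel c:
        gate_c  = sigmoid(gate_weights[c] . [p; v] + gate_bias[c])
        fused_c = p_c + gate_c * v_c
    This is an assumed formulation, not the one used by DPSS.
    """
    if len(primitive_feat) != len(image_feat):
        raise ValueError("primitive and image features must have the same length")

    # The gate looks at both streams, so concatenate them once.
    concat = list(primitive_feat) + list(image_feat)

    fused = []
    for c, (p, v) in enumerate(zip(primitive_feat, image_feat)):
        logit = sum(w * x for w, x in zip(gate_weights[c], concat)) + gate_bias[c]
        fused.append(p + sigmoid(logit) * v)
    return fused


if __name__ == "__main__":
    p = [1.0, 2.0]           # toy primitive feature
    v = [0.5, -1.0]          # toy image feature sampled at the primitive
    w = [[0.0] * 4, [0.0] * 4]  # zero weights: every gate equals sigmoid(bias)
    print(gated_fusion(p, v, w, [0.0, 0.0]))
```

With zero weights and zero bias every gate is 0.5, so half of the image feature is added to each channel. A strongly negative bias closes the gate and leaves the primitive feature essentially unchanged, which is the behavior that makes such a fusion "adaptive" in this toy setting.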
Citation
If you find this project useful in your research, please consider citing:
@article{luo2025archcad,
title={ArchCAD-400K: An Open Large-Scale Architectural CAD Dataset and New Baseline for Panoptic Symbol Spotting},
author={Luo, Ruifeng and Liu, Zhengjie and Cheng, Tianxiao and Wang, Jie and Wang, Tongjie and Wei, Xingguang and Wang, Haomin and Li, Yanpeng and Chai, Fu and Cheng, Fei and others},
journal={arXiv preprint arXiv:2503.22346},
year={2025}
}