Hi! My name is Haomin Wang (王昊旻). I am currently a Ph.D. student in the School of Artificial Intelligence at Shanghai Jiao Tong University (SJTU), focusing on Multimodal Large Language Models and World Models.
My recent research primarily focuses on the application of MLLMs to vector graphics, including SVG understanding, generation, and editing. More recently, I have also been exploring world models in the context of embodied intelligence.
I am currently looking for collaborations and new opportunities. Feel free to contact me via email: kiyotakawang@sjtu.edu.cn.
Here's my CV.
🔥 News
- 2026.06: 🎉 CTRL-S has been accepted to ECCV 2026!
- 2026.03: 🎉 We released CTRL-S. Give it a try!
- 2026.01: 🎉 InternSVG and InternSpatial have been accepted to ICLR 2026!
- 2025.10: 🎉 We released InternSVG. Give it a try!
- 2025.09: 🎉 VecFormer and ArchCAD-400K have been accepted to NeurIPS 2025!
- 2025.08: 🎉 Our team released InternVL 3.5. Give it a try!
📝 Selected Publications

Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning
Haomin Wang, Qi Wei, Qianli Ma, Shengyuan Ding, Jinhui Yin, Kai Chen, Hongjie Zhang
CTRL-S is a reinforcement learning framework for reliable SVG generation with explicit chain-of-thought reasoning. It introduces SVG-Sophia, a high-quality dataset covering SVG code refinement, text-to-SVG, and image-to-SVG tasks. By combining structured SVG reasoning, group-level code generation, and multi-reward optimization over visual fidelity, text-image alignment, format validity, and code efficiency, CTRL-S improves both SVG quality and generalization across diverse scenarios.

InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models
Haomin Wang*, Jinhui Yin*, Qi Wei*, Wenguang Zeng, Lixin Gu, Shenglong Ye, Zhangwei Gao, Yaohui Wang, Yanting Zhang, Yuanqi Li, Yanwen Guo, Wenhai Wang, Kai Chen, Yu Qiao, Hongjie Zhang
PDF | Code | Model | Dataset | Benchmark | Homepage
InternSVG is a unified data-benchmark-model suite for SVG understanding, editing, and generation. It introduces SAgoge, a large-scale multi-domain SVG dataset, and SArena, a comprehensive benchmark for evaluating multimodal SVG capabilities. Built on VLMs with SVG-specific tokenization and two-stage training, the InternSVG-8B model achieves strong performance across diverse SVG tasks.

Point or Line? Using Line-based Representation for Panoptic Symbol Spotting in CAD Drawings
Xingguang Wei*, Haomin Wang*, Shenglong Ye, Ruifeng Luo, Yanting Zhang, Lixin Gu, Jifeng Dai, Yu Qiao, Wenhai Wang, Hongjie Zhang
VecFormer uses a line-based representation of primitives for panoptic symbol spotting in floor plan CAD drawings, achieving a new state of the art on the FloorPlanCAD dataset.

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale
Intern-S1-Pro Team, Shanghai AI Laboratory
Intern-S1-Pro is a trillion-parameter scientific multimodal foundation model that strengthens both general reasoning and domain-specific scientific intelligence. With advanced agent capabilities and support for over 100 specialized tasks across fields such as chemistry, materials, life sciences, and earth sciences, it serves as a specializable generalist for scientific discovery and multimodal understanding.

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
InternVL Team, Shanghai AI Laboratory
InternVL 3.5 is a new open-source MLLM family that improves multimodal reasoning, versatility, and inference efficiency. It introduces Cascade RL for stronger coarse-to-fine reasoning, ViR for dynamic visual-token resolution, and DvD for efficient vision-language deployment, achieving major gains in reasoning performance and speed. The model also extends to GUI interaction and embodied agent capabilities, reaching SOTA performance among open-source MLLMs.
🎖 Honors and Awards
- Outstanding Graduate, Nanjing University.
- Outstanding Student, Nanjing University.
- China Merchants Bank Scholarship, Nanjing University.
- Ruli Scholarship, Nanjing University.
📖 Education
- 2025.09 - now, Ph.D. student, School of Artificial Intelligence, Shanghai Jiao Tong University.
- 2021.09 - 2025.06, B.Eng. in Software Engineering, Nanjing University. (GPA 4.59/5.00, Top 3%)
💻 Internships
- 2024.07 - 2026.06, Research Intern, Shanghai AI Laboratory.