About

I am a Ph.D. student in Computer Application Technology at the University of Chinese Academy of Sciences, advised by Prof. Xue Wan. My host institute is the Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences.

My research focuses on embodied agents, visual perception and navigation, space robotics, and simulation-to-real validation.

Education

2023.09 - Present — Ph.D. in Computer Application Technology, University of Chinese Academy of Sciences.
2020.09 - 2023.06 — M.S. in Computer Application Technology, University of Chinese Academy of Sciences.
2016.09 - 2020.06 — B.E. in Detection Guidance and Control Technology, College of Automation, Nanjing University of Aeronautics and Astronautics.

News

2026.04 Journal extension of SpaceMind submitted to Acta Astronautica.
2026.03 Released SpaceSense-Bench on arXiv and HuggingFace; technical paper submitted to IROS 2026.
2025.10 Won 2nd place and the Innovation Solution Award in the IROS 2025 RoboSense Challenge.
2025.07 SpaceMind accepted at IAA-SPAICE 2025 — MCP-based VLM agent fusing large and small models for on-orbit servicing.
2025 Paper on cross-domain spacecraft component segmentation accepted at ICDIP 2025.
2024.03 Won 1st place in the CVPR 2024 SPARK Spacecraft Pose Estimation Challenge (team member); ranked 4th in the Segmentation track (team leader).
2023 Granted an invention patent on intelligent in-orbit exposure and focus control for space cameras.

Featured Demos

SpaceMind

Embodied VLM agent for autonomous on-orbit servicing, closing the loop across UE5 simulation and a physical robot lab with zero code changes.

Paper Code Project Page YouTube Bilibili

SpaceSense-Bench

Large-scale multi-modal benchmark for spacecraft perception: 136 satellites, 70 GB of RGB / depth / LiDAR data, six perception tasks.

Paper Code Project Page Dataset YouTube Bilibili

Selected Work

IAA-SPAICE 2025 / Acta Astronautica (under review)

SpaceMind: A Modular and Self-Evolving Embodied VLM Agent for Autonomous On-orbit Servicing

Aodi Wu, Haodong Han, Xubo Luo, Ruisuo Wang, Shan He, Xue Wan.

Paper Code Project Page YouTube Bilibili

Modular VLM-agent framework that decomposes the agent into skills, MCP tools, and reasoning modes, three dimensions that can each be extended independently.
Three switchable reasoning modes (Standard / ReAct / Prospective) with skill self-evolution that turns failed episodes into reusable skills.
192 closed-loop runs across 5 satellites, 3 task types, and 2 environments; the identical codebase transfers from UE5 simulation to a physical robot lab with 100% rendezvous success.

IROS 2026 (under review)

SpaceSense-Bench

Aodi Wu, Jianhong Zuo, Zeyuan Zhao, Xubo Luo, Ruisuo Wang, Xue Wan.

Paper Code Project Page Dataset YouTube Bilibili

136 satellite models with around 70 GB of time-synchronized RGB (1024x1024), depth, and 256-beam LiDAR data built in Unreal Engine 5.
Dense 7-class part-level semantic labels at both pixel and point level, plus accurate 6-DoF pose ground truth.
Supports six tasks: 2D / 3D detection, 2D / 3D segmentation, depth estimation, 6-DoF pose estimation, and multi-modal fusion. More than 2,700 downloads on HuggingFace.

IROS 2025 / 2nd Place + Innovation Award

Enhancing Vision-Language Models for Autonomous Driving through Dynamic Routing and Spatial Reasoning

Aodi Wu, Xubo Luo. IROS 2025 RoboSense Challenge Technical Report.

Code Report

Dynamic routing module that dispatches each question to a task-specific expert prompt, eliminating cross-task interference.
Explicit multi-view coordinate grounding plus Chain-of-Thought / Tree-of-Thought reasoning to fix BACK-camera and left-right confusion.
70.87% on Phase-1 clean data and 72.85% on Phase-2 corrupted data with Qwen2.5-VL-72B; 2nd place overall and Innovation Solution Award.

CVPR 2024 / Pose 1st + Seg 4th

CVPR 2024 SPARK Challenge — Non-cooperative Spacecraft Perception

Spacecraft pose estimation and part segmentation on synthetic and real satellite imagery.

Paper Challenge Page

Pose estimation track — 1st place (team member).
Part segmentation track — 4th place (team leader).
Integrated multiple segmentation algorithms with depth estimation, and fused absolute and relative localization on the SPARK 2024 dataset.

Master's Thesis / In-orbit Verified

Led camera exposure/focus control, visual perception, and monocular navigation, from algorithm research to software-hardware integration and launch support.

ICDIP 2025 Project ICoSR 2022 Paper Patent CN 2023102948012

Cross-domain spacecraft component segmentation with edge-consistency generative networks, robust to synthetic-to-real domain gap.
Intelligent on-orbit exposure and focus control for space cameras, covered by an invention patent granted in 2023.
Monocular relative navigation for non-cooperative spacecraft at ranges from 200 m down to 10 m, with 5.16% mean error at 10 FPS on NVIDIA TX2, deployed on-orbit.

Honors and Awards

2025.11 1st Place, UCAS Graduate Forum, Aerospace Session.
2025.10 2nd Place + Innovation Solution Award, IROS 2025 RoboSense Challenge (team leader).
2024.03 1st Place, CVPR 2024 SPARK Spacecraft Pose Estimation Challenge (team member).
2024.03 4th Place, CVPR 2024 SPARK Spacecraft Part Segmentation Challenge (team leader).
2023 Invention patent granted on intelligent in-orbit exposure and focus control for space cameras.
2016-2020 National Encouragement Scholarship; GPA 4.0/5.0; 2nd Prize in Jiangsu Undergraduate Electronic Design Contest; 1st Prize in NUAA Electronic Design Contest.