Selected Work
IAA-SPAICE 2025 / Acta (under review)
SpaceMind: A Modular and Self-Evolving Embodied VLM Agent for Autonomous On-orbit Servicing
Aodi Wu, Haodong Han, Xubo Luo, Ruisuo Wang, Shan He, Xue Wan.
- Modular VLM-agent framework that decomposes skills, MCP tools, and reasoning into three independently extensible dimensions.
- Three switchable reasoning modes (Standard / ReAct / Prospective) with skill self-evolution that turns failed episodes into reusable skills.
- 192 closed-loop runs across 5 satellites, 3 task types, 2 environments; the identical codebase transfers from UE5 simulation to a physical robot lab with 100% rendezvous success.
IROS 2026 (under review)
SpaceSense-Bench: A Large-Scale Multi-Modal Benchmark for Spacecraft Perception and Pose Estimation
Aodi Wu, Jianhong Zuo, Zeyuan Zhao, Xubo Luo, Ruisuo Wang, Xue Wan.
- 136 satellite models with around 70 GB of time-synchronized RGB (1024×1024), depth, and 256-beam LiDAR data built in Unreal Engine 5.
- Dense 7-class part-level semantic labels at both pixel and point level, plus accurate 6-DoF pose ground truth.
- Supports six tasks: 2D / 3D detection, 2D / 3D segmentation, depth estimation, 6-DoF pose estimation, and multi-modal fusion. 2,700+ downloads on Hugging Face.
IROS 2025 / 2nd Place + Innovation Award
Enhancing Vision-Language Models for Autonomous Driving through Dynamic Routing and Spatial Reasoning
Aodi Wu, Xubo Luo. IROS 2025 RoboSense Challenge Technical Report.
- Dynamic routing module that dispatches each question to a task-specific expert prompt, eliminating cross-task interference.
- Explicit multi-view coordinate grounding plus Chain-of-Thought / Tree-of-Thought reasoning to fix BACK-camera and left-right confusion.
- 70.87% on Phase-1 clean data and 72.85% on Phase-2 corrupted data with Qwen2.5-VL-72B; 2nd place overall and Innovation Solution Award.
CVPR 2024 / Pose 1st + Seg 4th
CVPR 2024 SPARK Challenge — Non-cooperative Spacecraft Perception
Spacecraft pose estimation and part segmentation on synthetic and real satellite imagery.
- Pose estimation track — 1st place (team member).
- Part segmentation track — 4th place (team leader).
- Integrated multiple segmentation algorithms with depth estimation, and fused absolute and relative localization on the SPARK 2024 dataset.
Master's Thesis / In-orbit Verified
DaVinci On-orbit Servicing Satellite — Camera Control, Visual Perception, and Monocular Navigation
Led camera exposure/focus control, visual perception, and monocular navigation, from algorithm research to software-hardware integration and launch support.
- Cross-domain spacecraft component segmentation with edge-consistency generative networks, robust to the synthetic-to-real domain gap.
- Intelligent on-orbit exposure and focus control for space cameras, covered by a granted invention patent (2023).
- Monocular relative navigation for non-cooperative spacecraft at ranges from 200 m down to 10 m, with 5.16% mean error at 10 FPS on NVIDIA TX2; deployed on-orbit.