Aodi Wu

IAA-SPAICE 2025 / Acta (under review)

SpaceMind: A Modular and Self-Evolving Embodied VLM Agent for Autonomous On-orbit Servicing

Aodi Wu, Haodong Han, Xubo Luo, Ruisuo Wang, Shan He, Xue Wan.

Paper | Code | Project Page | YouTube | Bilibili

Modular VLM-agent framework that decomposes skills, MCP tools, and reasoning into three independently extensible dimensions.
Three switchable reasoning modes (Standard / ReAct / Prospective) with skill self-evolution that turns failed episodes into reusable skills.
192 closed-loop runs across 5 satellites, 3 task types, and 2 environments; the identical codebase transfers from UE5 simulation to a physical robot lab with 100% rendezvous success.
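As a rough illustration of the three independently extensible dimensions described above, the sketch below keeps skills, tools, and reasoning modes in separate registries so that each can be extended without touching the others. Every name here (`Agent`, `react_mode`, `learn_from_failure`, and so on) is hypothetical and does not mirror the SpaceMind codebase; the three mode bodies are placeholders, and the behaviour given to the Prospective mode is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical; they do not mirror the SpaceMind codebase.
Tool = Callable[[str], str]
ReasoningMode = Callable[[str, Dict[str, Tool]], List[str]]


def standard_mode(task: str, tools: Dict[str, Tool]) -> List[str]:
    # Single pass: act directly on the task.
    return [f"act: {task}"]


def react_mode(task: str, tools: Dict[str, Tool]) -> List[str]:
    # Interleave a thought, one call per registered tool, then the action.
    steps = [f"thought: how to {task}"]
    for name, tool in tools.items():
        steps.append(f"observe[{name}]: {tool(task)}")
    steps.append(f"act: {task}")
    return steps


def prospective_mode(task: str, tools: Dict[str, Tool]) -> List[str]:
    # Assumed behaviour for this sketch: outline a plan before acting.
    return [f"plan: outline steps for {task}", f"act: {task}"]


@dataclass
class Agent:
    # Three separate registries: skills, tools, and reasoning modes.
    skills: Dict[str, str] = field(default_factory=dict)
    tools: Dict[str, Tool] = field(default_factory=dict)
    modes: Dict[str, ReasoningMode] = field(default_factory=lambda: {
        "standard": standard_mode,
        "react": react_mode,
        "prospective": prospective_mode,
    })

    def run(self, task: str, mode: str = "standard") -> List[str]:
        trace = self.modes[mode](task, self.tools)
        # Prepend a stored skill hint for this task, if one exists.
        if task in self.skills:
            trace.insert(0, f"skill: {self.skills[task]}")
        return trace

    def learn_from_failure(self, task: str, lesson: str) -> None:
        # Placeholder for self-evolution: keep a lesson from a failed episode.
        self.skills[task] = lesson
```

Switching modes is then a single argument, for example `Agent(tools={"camera": lambda t: "target in view"}).run("rendezvous", mode="react")`, and a lesson stored with `learn_from_failure` is prepended to later traces for the same task.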

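IROS 2026 (under review)

SpaceSense-Bench: A Large-Scale Multi-Modal Benchmark for Spacecraft Perception and Pose Estimation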

Aodi Wu, Jianhong Zuo, Zeyuan Zhao, Xubo Luo, Ruisuo Wang, Xue Wan.

Paper | Code | Project Page | Dataset | YouTube | Bilibili

136 satellite models with around 70 GB of time-synchronized RGB (1024×1024), depth, and 256-beam LiDAR data, built in Unreal Engine 5.
Dense 7-class part-level semantic labels at both pixel and point level, plus accurate 6-DoF pose ground truth.
Supports six tasks: 2D / 3D detection, 2D / 3D segmentation, depth estimation, 6-DoF pose estimation, multi-modal fusion. Over 2,700 downloads on Hugging Face.

IROS 2025 / 2nd Place + Innovation Award

Enhancing Vision-Language Models for Autonomous Driving through Dynamic Routing and Spatial Reasoning

Aodi Wu, Xubo Luo. IROS 2025 RoboSense Challenge Technical Report.

Code | Report

Dynamic routing module that dispatches each question to a task-specific expert prompt, eliminating cross-task interference.
Explicit multi-view coordinate grounding plus Chain-of-Thought / Tree-of-Thought reasoning to fix BACK-camera and left-right confusion.
70.87% on Phase-1 clean data and 72.85% on Phase-2 corrupted data with Qwen2.5-VL-72B; 2nd place overall and Innovation Solution Award.
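The routing idea above can be sketched in a few lines: classify each question into a task, then prepend that task's expert prompt so that instructions for other tasks never reach the model. This is a minimal illustration only. The task labels, keyword rules, and prompt texts are hypothetical and are not taken from the technical report, and the keyword matcher is a stand-in for whatever classifier the actual system uses.

```python
from typing import Dict, List, Tuple

# Hypothetical task labels, keywords, and prompts; not taken from the report.
EXPERT_PROMPTS: Dict[str, str] = {
    "perception": "You are a perception expert. Describe the objects visible in each camera view.",
    "prediction": "You are a prediction expert. Reason about how each agent will move next.",
    "planning": "You are a planning expert. Recommend a safe action for the ego vehicle.",
}

# Rules are checked in order; the first match wins.
KEYWORDS: List[Tuple[str, Tuple[str, ...]]] = [
    ("planning", ("should", "safe action", "ego vehicle do")),
    ("prediction", ("will", "next", "future")),
]


def route(question: str) -> str:
    """Return the task label for a question, defaulting to perception."""
    q = question.lower()
    for task, words in KEYWORDS:
        if any(w in q for w in words):
            return task
    return "perception"


def build_prompt(question: str) -> str:
    """Attach only the matching expert prompt to the question."""
    task = route(question)
    return f"{EXPERT_PROMPTS[task]}\n\nQuestion: {question}"
```

Because each question receives exactly one expert prompt, changing the planning prompt cannot alter how perception questions are answered, which is the sense in which routing isolates the tasks from one another.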

CVPR 2024 / Pose 1st + Seg 4th

CVPR 2024 SPARK Challenge — Non-cooperative Spacecraft Perception

Spacecraft pose estimation and part segmentation on synthetic and real satellite imagery.

Paper | Challenge Page

Pose estimation track — 1st place (team member).
Part segmentation track — 4th place (team leader).
Integrated multiple segmentation algorithms with depth estimation, and fused absolute and relative localization on the SPARK 2024 dataset.

Master's Thesis / In-orbit Verified

DaVinci On-orbit Servicing Satellite: Camera Control, Visual Perception, and Monocular Navigation

Led camera exposure/focus control, visual perception, and monocular navigation, from algorithm research to software-hardware integration and launch support.

ICDIP 2025 Project ICoSR 2022 Paper Patent CN 2023102948012

Cross-domain spacecraft component segmentation with edge-consistency generative networks, robust to synthetic-to-real domain gap.
Intelligent on-orbit exposure and focus control for space cameras, granted as an invention patent (2023).
Monocular relative navigation for non-cooperative spacecraft at ranges from 200 m down to 10 m, with 5.16% mean error at 10 FPS on an NVIDIA TX2, deployed on-orbit.

Selected Work

SpaceMind: A Modular and Self-Evolving Embodied VLM Agent for Autonomous On-orbit Servicing

SpaceSense-Bench: A Large-Scale Multi-Modal Benchmark for Spacecraft Perception and Pose Estimation

Enhancing Vision-Language Models for Autonomous Driving through Dynamic Routing and Spatial Reasoning

CVPR 2024 SPARK Challenge — Non-cooperative Spacecraft Perception

DaVinci On-orbit Servicing Satellite — Camera Control, Visual Perception, and Monocular Navigation