Selected Work

IAA-SPAICE 2025 / Acta (under review)
[Figure: SpaceMind pipeline]

SpaceMind: A Modular and Self-Evolving Embodied VLM Agent for Autonomous On-orbit Servicing

Aodi Wu, Haodong Han, Xubo Luo, Ruisuo Wang, Shan He, Xue Wan.

  • Modular VLM-agent framework that separates skills, MCP tools, and reasoning into three independently extensible dimensions.
  • Three switchable reasoning modes (Standard / ReAct / Prospective) with skill self-evolution that turns failed episodes into reusable skills.
  • 192 closed-loop runs across 5 satellites, 3 task types, and 2 environments; the identical codebase transfers from UE5 simulation to a physical robot lab with 100% rendezvous success.

IROS 2026 (under review)
[Figure: SpaceSense-Bench overview]

SpaceSense-Bench: A Large-Scale Multi-Modal Benchmark for Spacecraft Perception and Pose Estimation

Aodi Wu, Jianhong Zuo, Zeyuan Zhao, Xubo Luo, Ruisuo Wang, Xue Wan.

  • 136 satellite models and around 70 GB of time-synchronized RGB (1024×1024), depth, and 256-beam LiDAR data, built in Unreal Engine 5.
  • Dense 7-class part-level semantic labels at both pixel and point level, plus accurate 6-DoF pose ground truth.
  • Supports six tasks: 2D / 3D detection, 2D / 3D segmentation, depth estimation, 6-DoF pose estimation, and multi-modal fusion. 2,700+ downloads on Hugging Face.

IROS 2025 / 2nd Place + Innovation Award
[Figure: RoboSense method]

Enhancing Vision-Language Models for Autonomous Driving through Dynamic Routing and Spatial Reasoning

Aodi Wu, Xubo Luo. IROS 2025 RoboSense Challenge Technical Report.

  • Dynamic routing module that dispatches each question to a task-specific expert prompt, eliminating cross-task interference.
  • Explicit multi-view coordinate grounding plus Chain-of-Thought / Tree-of-Thought reasoning to fix BACK-camera and left-right confusion.
  • Achieves 70.87% on Phase-1 clean data and 72.85% on Phase-2 corrupted data with Qwen2.5-VL-72B; 2nd place overall plus the Innovation Solution Award.

CVPR 2024 / Pose 1st + Seg 4th
[Figure: CVPR SPARK challenge]

CVPR 2024 SPARK Challenge — Non-cooperative Spacecraft Perception

Spacecraft pose estimation and part segmentation on synthetic and real satellite imagery.

  • Pose estimation track — 1st place (team member).
  • Part segmentation track — 4th place (team leader).
  • Integrated multiple segmentation algorithms with depth estimation, and fused absolute and relative localization on the SPARK 2024 dataset.

Master's Thesis / In-orbit Verified
[Figure: DaVinci satellite work]

DaVinci On-orbit Servicing Satellite — Camera Control, Visual Perception, and Monocular Navigation

Led camera exposure/focus control, visual perception, and monocular navigation, from algorithm research to software-hardware integration and launch support.

  • Cross-domain spacecraft component segmentation with edge-consistency generative networks, robust to the synthetic-to-real domain gap.
  • Intelligent on-orbit exposure and focus control for space cameras, covered by an invention patent granted in 2023.
  • Monocular relative navigation for non-cooperative spacecraft at ranges from 200 m down to 10 m, with 5.16% mean error at 10 FPS on an NVIDIA TX2; deployed on-orbit.