About

I am a Ph.D. student in Computer Application Technology at the University of Chinese Academy of Sciences, advised by Prof. Xue Wan. My host institute is the Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences.

My research focuses on embodied agents, visual perception and navigation, space robotics, and simulation-to-real validation.

Education

  • 2023.09 - Present: Ph.D. in Computer Application Technology, University of Chinese Academy of Sciences.
  • 2020.09 - 2023.06: M.S. in Computer Application Technology, University of Chinese Academy of Sciences.
  • 2016.09 - 2020.06: B.E. in Detection Guidance and Control Technology, College of Automation, Nanjing University of Aeronautics and Astronautics.

News

  • 2026.04 Journal extension of SpaceMind submitted to Acta Astronautica.
  • 2026.03 Released SpaceSense-Bench on arXiv and HuggingFace; technical paper submitted to IROS 2026.
  • 2025.10 Won 2nd place and the Innovation Solution Award in the IROS 2025 RoboSense Challenge.
  • 2025.07 SpaceMind accepted at IAA-SPAICE 2025: an MCP-based VLM agent that fuses large and small models for on-orbit servicing.
  • 2025 Paper on cross-domain spacecraft component segmentation accepted at ICDIP 2025.
  • 2024.03 Won 1st place in the CVPR 2024 SPARK Spacecraft Pose Estimation Challenge (team member); ranked 4th in the Segmentation track (team leader).
  • 2023 Was granted an invention patent on intelligent on-orbit exposure and focus control for space cameras.

Featured Demos

Selected Work

IAA-SPAICE 2025 / Acta Astronautica (under review)
[Figure: SpaceMind pipeline]

SpaceMind: A Modular and Self-Evolving Embodied VLM Agent for Autonomous On-orbit Servicing

Aodi Wu, Haodong Han, Xubo Luo, Ruisuo Wang, Shan He, Xue Wan.

  • Modular VLM-agent framework that decomposes skills, MCP tools, and reasoning into three independently extensible dimensions.
  • Three switchable reasoning modes (Standard / ReAct / Prospective) with skill self-evolution that turns failed episodes into reusable skills.
  • 192 closed-loop runs across 5 satellites, 3 task types, 2 environments; the identical codebase transfers from UE5 simulation to a physical robot lab with 100% rendezvous success.

IROS 2026 (under review)
[Figure: SpaceSense-Bench overview]

SpaceSense-Bench: A Large-Scale Multi-Modal Benchmark for Spacecraft Perception and Pose Estimation

Aodi Wu, Jianhong Zuo, Zeyuan Zhao, Xubo Luo, Ruisuo Wang, Xue Wan.

  • 136 satellite models and about 70 GB of time-synchronized RGB (1024×1024), depth, and 256-beam LiDAR data, built in Unreal Engine 5.
  • Dense 7-class part-level semantic labels at both pixel and point level, plus accurate 6-DoF pose ground truth.
  • Supports six tasks: 2D / 3D detection, 2D / 3D segmentation, depth estimation, 6-DoF pose estimation, and multi-modal fusion. More than 2,700 downloads on HuggingFace.
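
As a rough illustration of how a predicted 6-DoF pose could be compared against ground truth like that described above, the sketch below computes a translation error and a rotation error. It is a generic formulation, not the benchmark's official metric: the function names, the (w, x, y, z) quaternion convention, and the units are all assumptions made for this example.

```python
import math

def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and true positions (same units as input)."""
    return math.dist(t_pred, t_true)

def _normalize(q):
    """Scale a quaternion to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]

def rotation_error_deg(q_pred, q_true):
    """Angle in degrees between two orientations given as (w, x, y, z) quaternions.

    The absolute value of the dot product treats q and -q as the same rotation,
    and the clamp guards acos against rounding slightly above 1.0.
    """
    qp, qt = _normalize(q_pred), _normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(qp, qt)))
    return math.degrees(2.0 * math.acos(min(1.0, dot)))
```

For example, comparing the identity orientation with a quarter turn about the z axis should give a rotation error of about 90 degrees, and positions (0, 0, 10) and (3, 4, 10) are 5 units apart.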

IROS 2025 / 2nd Place + Innovation Award
[Figure: RoboSense method]

Enhancing Vision-Language Models for Autonomous Driving through Dynamic Routing and Spatial Reasoning

Aodi Wu, Xubo Luo. IROS 2025 RoboSense Challenge Technical Report.

  • Dynamic routing module that dispatches each question to a task-specific expert prompt, eliminating cross-task interference.
  • Explicit multi-view coordinate grounding plus Chain-of-Thought / Tree-of-Thought reasoning to correct rear-camera (BACK view) and left-right confusion.
  • 70.87% on Phase-1 clean data and 72.85% on Phase-2 corrupted data with Qwen2.5-VL-72B; 2nd place overall and Innovation Solution Award.
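
The dynamic routing idea in the first bullet above can be sketched roughly as follows. This is a minimal illustration only, not the method from the technical report: the task labels, keyword cues, and prompt texts are hypothetical, and a real router might use a classifier or the VLM itself instead of keyword matching.

```python
# Hypothetical expert prompts, one per task type (invented for illustration).
EXPERT_PROMPTS = {
    "perception": "List the objects visible in each camera view before answering.",
    "prediction": "Reason step by step about how each agent is likely to move.",
    "planning": "State the ego vehicle's options, then choose the safest one.",
}
DEFAULT_TASK = "perception"

# Hypothetical keyword cues used to guess the task type of a question.
KEYWORDS = {
    "prediction": ("will", "future", "next"),
    "planning": ("should", "safe", "action"),
}

def route(question: str) -> str:
    """Return the task label whose cue words appear in the question."""
    words = question.lower().replace("?", " ").split()
    for task, cues in KEYWORDS.items():
        if any(cue in words for cue in cues):
            return task
    return DEFAULT_TASK

def build_prompt(question: str) -> str:
    """Prepend only the selected expert prompt, so other tasks' instructions never mix in."""
    task = route(question)
    return f"{EXPERT_PROMPTS[task]}\nQuestion: {question}"
```

Because each question receives exactly one expert prompt, instructions written for one task type cannot leak into another, which is the kind of cross-task interference the bullet refers to.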

CVPR 2024 / Pose 1st + Seg 4th
[Figure: CVPR SPARK challenge]

CVPR 2024 SPARK Challenge: Non-cooperative Spacecraft Perception

Spacecraft pose estimation and part segmentation on synthetic and real satellite imagery.

  • Pose estimation track: 1st place (team member).
  • Part segmentation track: 4th place (team leader).
  • Integrated multiple segmentation algorithms with depth estimation, and fused absolute and relative localization on the SPARK 2024 dataset.

Master's Thesis / On-orbit Verified
[Figure: DaVinci satellite work]

DaVinci On-orbit Servicing Satellite: Camera Control, Visual Perception, and Monocular Navigation

Led camera exposure/focus control, visual perception, and monocular navigation, from algorithm research to software-hardware integration and launch support.

  • Cross-domain spacecraft component segmentation with edge-consistency generative networks, robust to the synthetic-to-real domain gap.
  • Intelligent on-orbit exposure and focus control for space cameras, for which an invention patent was granted (2023).
  • Monocular relative navigation for non-cooperative spacecraft over ranges from 200 m down to 10 m, with 5.16% mean error at 10 FPS on an NVIDIA TX2; deployed on-orbit.

Honors and Awards

  • 2025.11 1st Place, UCAS Graduate Forum, Aerospace Session.
  • 2025.10 2nd Place + Innovation Solution Award, IROS 2025 RoboSense Challenge (team leader).
  • 2024.03 1st Place, CVPR 2024 SPARK Spacecraft Pose Estimation Challenge (team member).
  • 2024.03 4th Place, CVPR 2024 SPARK Spacecraft Part Segmentation Challenge (team leader).
  • 2023 Invention patent granted for intelligent on-orbit exposure and focus control for space cameras.
  • 2016-2020 National Encouragement Scholarship; GPA 4.0/5.0; 2nd Prize in the Jiangsu Undergraduate Electronic Design Contest (wireless-charging car hill climb); 1st Prize in the NUAA Electronic Design Contest (microcontroller programming).

Research Interests

  • Space AI and autonomous on-orbit servicing
  • Embodied vision-language agents and tool-use reasoning
  • Multimodal spacecraft perception (RGB / depth / LiDAR)
  • Relative navigation and autonomous control
  • Simulation-to-real system validation

Contact
