Xinyu Zhang (张鑫语)

I am a third-year PhD student in Computer Science at Rutgers University, advised by Prof. Abdeslam Boularias.

Before joining Rutgers, I worked at Microsoft and Megvii (Face++) as a software engineer.

Previously, I earned a Master's degree at the University of California San Diego (UCSD), advised by Prof. Ken Kreutz-Delgado, and a Bachelor's degree at the University of Science and Technology of China (USTC).

Google Scholar  •  GitHub
Email: xz653 at rutgers dot edu


Research

During my PhD, I have published four first-author papers at top-tier robotics and AI conferences, including RSS, CoRL, and IROS. These works have received hundreds of stars on GitHub. I have a strong foundation in computer vision, representation learning, vision-language models, and policy learning for intelligent agents.

Autoregressive Action Sequence Learning for Robotic Manipulation

Xinyu Zhang, Yuhan Liu, Haonan Chang, Liam Schramm, Abdeslam Boularias
In review
PDF  •  Code & Video

  • Robot actions can be treated as a language, but they are heterogeneous and often continuous.
  • We propose the Chunking Causal Transformer to adapt autoregressive models to robot actions.
  • A universal architecture that establishes a new state of the art on Push-T, ALOHA, and RLBench.
Detect Everything with Few Examples

Xinyu Zhang, Yuhan Liu, Yuting Wang, Abdeslam Boularias
Conference on Robot Learning (CoRL) 2024
PDF  •  Code  •  Website

  • Existing work mixes representation learning with detection.
  • We don't learn representations; we focus on how to use existing pretrained ones.
  • Detect by propagating ROI regions in the attention map.
Scaling Manipulation Learning with Visual Kinematic Chain Prediction

Xinyu Zhang, Yuhan Liu, Haonan Chang, Abdeslam Boularias
Conference on Robot Learning (CoRL) 2024
PDF  •  Code  •  Website

  • How to learn a single policy for diverse environments?
  • Use a universal, visually grounded, analytically determined action space!
  • That is, the visual projection of the robot kinematic structure
One-Shot Imitation Learning with Invariance Matching for Robotic Manipulation

Xinyu Zhang, Abdeslam Boularias
Robotics: Science and Systems (RSS) 2024
PDF  •  Code  •  Website

  • Bind 3D spatial regions to robot actions, so actions have semantics!
  • Learn to discover these "key regions" and match regions between demonstrations
  • Through region matching, we transfer robot actions to new scenes in one shot.
Diffusion-based Affordance Prediction for Multi-modality Storage

Haonan Chang, Kowndinya Boyalakuntla, Yuhan Liu, Xinyu Zhang, Liam Schramm, Abdeslam Boularias
International Conference on Intelligent Robots and Systems (IROS) 2024
PDF  •  Code

Optical Flow boosts Unsupervised Localization and Segmentation

Xinyu Zhang, Abdeslam Boularias
International Conference on Intelligent Robots and Systems (IROS) 2023
PDF  •  Code

  • Make DINO features more object-aware
  • By using optical flow as regularization, i.e., similar local flow yields similar local features
Learning Low-Precision Structured Subnetworks Using Joint Layerwise Channel Pruning and Uniform Quantization

Xinyu Zhang, Ian Colbert, Srinjoy Das
MDPI Applied Sciences 2022
PDF  •  Code

  • Prune layers in topological order, instead of all at once
  • Because neuron importance depends heavily on the sparsity of previous layers.
Diversity transfer network for few-shot learning

Mengting Chen, Yuxin Fang, Xinggang Wang, Heng Luo, Yifeng Geng, Xinyu Zhang, Chang Huang, Wenyu Liu, Bo Wang
AAAI Conference on Artificial Intelligence 2020
PDF  •  Code


Layout from Baichuan Huang. Last updated in Oct 2024.