ZUBAIR IRSHAD

Research Scientist
Toyota Research Institute

Silicon Valley, CA, USA
Email
LinkedIn
Twitter
GitHub
Google Scholar
YouTube

ABOUT ME!

I am a Machine Learning Research Scientist at Toyota Research Institute working on deep-learning-based 3D perception systems for robotics. I received my PhD from the George W. Woodruff School of Mechanical Engineering at Georgia Institute of Technology, where I was advised by Dr. Zsolt Kira of the Robotics, Perception and Learning (RIPL) Lab. My PhD thesis is titled Learning 3D Robotics Perception using Inductive Priors. My thesis is available here and the dissertation defense video here. My current research focuses on 3D perception, GenAI and Multimodal AI.

I’m a technical reviewer for machine learning and robotics conferences including CVPR, ECCV, ICCV, ICLR, NeurIPS, ICRA, IROS and SIGGRAPH, and the lead organizer of the RoboNeRF Workshop at ICRA’24! Below you will find my projects portfolio. You can find my updated resume here.

Affiliations

TRI
2020 – Present

Georgia Tech
2017 – Present

Fulbright
2017 – 2019

SRI International
Summer 2020

GIKI
2011 – 2015

NEWS

[Oct 2024]

Our paper, RoVi-Aug, has been covered by the press at TechXplore!

[Sep 2024]

RoVi-Aug, diffusion-based data augmentation for robotic manipulation, accepted to CoRL’24!

[Aug 2024]

Gave an invited talk at Habib University on Towards Embodied 3D Foundation Models.

[Jul 2024]

NeRF-MAE, large-scale pretraining using NeRFs, accepted to ECCV’24!

[Jun 2024]

Three IROS’24 papers on Diffusion-based 6D Pose Estimation, Language-embedded Gaussian Splats and Interactive Perception!

[Jun 2024]

Attending CVPR’24 remotely. We will present two posters, NeRF-MAE and ICE-Gaussian!

[May 2024]

NeRF-MAE and ICE-Gaussian accepted to CVPR Neural Rendering Intelligence and AI4CC Workshops.

[May 2024]

Attended ICRA’24 in Yokohama, Japan to help present FSD and co-organize the RoboNeRF Workshop

[Apr 2024]

Gave an invited talk at FAIR’s Embodied AI Reading Group on Towards 3D Foundation Models

[Mar 2024]

Gave an invited talk at Stanford’s Computer Vision: Foundations and Applications class on Neural Fields in Vision and Beyond

[Jan 2024]

Started as a Research Scientist at Toyota Research Institute in the Bay Area, California.

[Jan 2024]

Gave an invited talk at Shuran Song’s Robot Perception Class at Stanford. Topic: Neural Fields in Robotics and beyond.

[Jan 2024]

Our paper, FSD, on fast self-supervised 6D pose and shape reconstruction, accepted to ICRA’24.

[Dec 2023]

Our workshop Neural Fields in Robotics accepted to ICRA’24! Call for papers live here.

[Dec 2023]

Passed my PhD defense and received my doctorate! My thesis is titled “Learning 3D Robotics Perception using Inductive Priors”

[Aug 2023]

Accepted to ICCV’23 Doctoral Consortium!

[Jul 2023]

Our paper, NeO 360, accepted to ICCV’23! Grateful to have a trio of papers accepted to ECCV, CVPR and now ICCV

[Jul 2023]

Awesome Implicit NeRF Robotics reached 800 stars on GitHub

[Jul 2023]

Reviewed 19 papers so far this year, for NeurIPS’23, CVPR’23, ICCV’23 and ICRA’23

[Jun 2023]

Attended CVPR 2023 virtually (Poster presentation of our paper, CARTO)

[Jun 2023]

Gave invited talks on Neural Fields in Robotics (Part 1 and 2) at 3D Deep Learning Reading Group

[Apr 2023]

Started as a mentor at Fatima Fellowship, supported by Hugging Face

[Apr 2023]

Passed my PhD proposal defense titled ‘Inductive biases for object and agent-centric neural 3D scene representations’

[Apr 2023]

Invited talk at Cohere for AI on Learning Object-centric Neural 3D Scene Representations

[Apr 2023]

Guest lecture at Georgia Tech’s Deep Learning class on ‘Learning Object-centric Neural 3D Scene Representations’

[Feb 2023]

Our paper, CARTO, on fast articulated object reconstruction, accepted into CVPR’23

[Oct 2022]

Attended ECCV’22 virtually (Poster presentation of our paper, ShAPO)

[Aug 2022]

Awarded GRA Funding (with Dr. Zsolt Kira) from Toyota Research Institute for my PhD

[Jul 2022]

Our paper, ShAPO, on categorical object reconstruction and 6D pose estimation, accepted into ECCV’22

[May 2022]

Our paper, SASRA, on semantic mapping for Vision-and-Language Navigation, accepted to ICPR’22

[May 2022]

Attended ICRA’22 in person. Gave a talk on our paper, CenterSnap

[Jan 2022]

Started my second internship at Toyota Research Institute, with the Machine Learning team in the Bay Area, California

[May 2021]

Attended ICRA’21 virtually. Gave a talk on our paper, Robo-VLN

[Jul 2021]

Started my first internship at Toyota Research Institute, with the Robotics Perception team in the Bay Area, California.

[Jan 2021]

Our paper, Robo-VLN, accepted to ICRA’21

[May 2020]

Started a summer internship at SRI International, with the CVT team in Princeton, New Jersey

[Nov 2019]

Passed PhD Qualifying Exams at Georgia Tech

[Aug 2019]

The beginning of my PhD program

FEATURED PUBLICATIONS

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus

European Conference on Computer Vision, ECCV 2024
CVPR Neural Rendering Intelligence Workshop, 2024

RoVi-Aug: Robot and Viewpoint Augmentation for Cross-Embodiment Robot Learning

Lawrence Chen*, Chenfeng Xu*, Karthik Dharmarajan, Muhammad Zubair Irshad, Richard Cheng, Kurt Keutzer, Masayoshi Tomizuka, Quan Vuong, Ken Goldberg

Conference on Robot Learning, CoRL 2024 (Oral Presentation)

Language-Embedded Gaussian Splats (LEGS): Incrementally Building Room-Scale Representations with a Mobile Robot

Justin Yu*, Kush Hari*, Kishore Srinivas*, Adam Rashid, Chung Min Kim, Justin Kerr, Richard Cheng, Muhammad Zubair Irshad, Ashwin Balakrishna, Thomas Kollar, Ken Goldberg

International Conference on Intelligent Robots and Systems, IROS 2024

DiffusionNOCS: Managing Symmetry and Uncertainty in Sim2Real Multi-Modal Category-level Pose Estimation

Takuya Ikeda, Sergey Zakharov, Tianyi Ko, Muhammad Zubair Irshad, Robert Lee, Katherine Liu, Rares Ambrus, Koichi Nishiwaki

International Conference on Intelligent Robots and Systems, IROS 2024
ECCV Workshop on Recovering 6D Object Pose, 2024

FSD: Fast Self-Supervised Single RGB-D to Categorical 3D Objects

Mayank Lunayach, Sergey Zakharov, Dian Chen, Rares Ambrus, Zsolt Kira, Muhammad Zubair Irshad

International Conference on Robotics and Automation, ICRA 2024

ICE-G: Image Conditional Editing of 3D Gaussian Splats

Vishnu Jaganathan, Hannah Huang, Muhammad Zubair Irshad, Varun Jampani, Amit Raj, Zsolt Kira

CVPR AI for Content Creation Workshop, 2024

NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes

Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Vitor Guizilini, Thomas Kollar, Adrien Gaidon, Zsolt Kira, Rares Ambrus

International Conference on Computer Vision, ICCV 2023

CARTO: Category and Joint Agnostic Reconstruction of Articulated Objects

Nick Heppert, Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Rares Ambrus, Jeannette Bohg, Abhinav Valada, Thomas Kollar

Computer Vision and Pattern Recognition, CVPR 2023

ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization

Muhammad Zubair Irshad*, Sergey Zakharov*, Rares Ambrus, Thomas Kollar, Zsolt Kira, Adrien Gaidon

European Conference on Computer Vision, ECCV 2022

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation

Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira

IEEE International Conference on Robotics and Automation, ICRA 2022

Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation

Muhammad Zubair Irshad, Chih-Yao Ma, Zsolt Kira

IEEE International Conference on Robotics and Automation, ICRA 2021

SASRA: Semantically-aware Spatio-Temporal Reasoning Agent for Vision-and-Language Navigation

Muhammad Zubair Irshad, Niluthpol Mithun, Zachary Seymour, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

International Conference on Pattern Recognition, ICPR 2022

MACHINE LEARNING WORKSHOPS

RoboNeRF: 1st Workshop on Neural Fields in Robotics

Muhammad Zubair Irshad, Nick Heppert, Jonathan Tremblay, Shreyas Kousik, Zsolt Kira, Abhinav Valada

IEEE International Conference on Robotics and Automation (ICRA), 2024

MACHINE LEARNING SOFTWARE & REPO

Awesome Implicit NeRF Robotics

Deep Reinforcement Learning based control of complex robotic agents

Embodied Visual Navigation in Habitat

Learning inverse dynamics of 7-DOF Robot Arm

Complex robot maze navigation using image classification and ROS

Vehicle Control for Autonomous Driving

Environment perception stack for Self Driving Cars

Visual Odometry for Autonomous Driving

End to end imitation learning of dynamically unstable systems