Embodied Visual Navigation in Habitat

Project Description

PDF Report: Supervised Learning Baselines for PointGoal Navigation in Photo-realistic Indoor Cluttered Environments

The aim of this work is to solve the embodied point-goal navigation task in photo-realistic indoor environments using Habitat. In this task, a virtual agent (robot) starts at a random position in an unknown environment and is given the coordinates of a goal location. The agent's primary aim is to navigate to the goal along an optimal path. This is not a trivial task in realistic, cluttered environments, because the agent has to traverse the environment while avoiding obstacles in the absence of a map.

We present a previously under-explored paradigm for point-goal navigation: supervised learning. We implement a benchmark based on a recurrent neural network, which preserves the temporal information present in the trajectories and predicts the optimal next action, using ground-truth observation-action pairs for supervision. Our experiments reveal that supervised learning shows promise for this task. We evaluated our work against various classical and deep learning baselines and report losses and accuracy, along with the SPL metric, for these baselines. Our imitation learning approach achieved an accuracy of 56%. We have open-sourced our code base, which can be accessed here: https://github.com/zubair-irshad/habitat_imitation_learning/
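
For readers unfamiliar with the SPL metric mentioned above, the sketch below shows how Success weighted by Path Length is commonly computed for point-goal navigation: for each episode, the success flag is multiplied by the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length, and the result is averaged over episodes. This is a minimal illustration, not the code from this repository; the function name `spl` and its argument names are hypothetical, and the handling of zero-length episodes is an assumption.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    agent_lengths: Sequence[float],
) -> float:
    """Average Success weighted by Path Length over a set of episodes.

    successes[i]        -- whether episode i reached the goal
    shortest_lengths[i] -- shortest-path (geodesic) distance from start to goal
    agent_lengths[i]    -- length of the path the agent actually travelled
    """
    if not (len(successes) == len(shortest_lengths) == len(agent_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if len(successes) == 0:
        raise ValueError("at least one episode is required")

    total = 0.0
    for success, shortest, travelled in zip(successes, shortest_lengths, agent_lengths):
        denominator = max(travelled, shortest)
        if denominator == 0:
            # Assumption: an episode that starts on the goal has a path ratio of 1.
            ratio = 1.0
        else:
            ratio = shortest / denominator
        total += float(success) * ratio
    return total / len(successes)


if __name__ == "__main__":
    # Three episodes: an optimal success, a success that took twice the
    # shortest path, and a failure.
    print(spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 5.0]))  # 0.5
```

Because failed episodes contribute zero and detours shrink the ratio, SPL rewards agents that both reach the goal and do so efficiently, which is why it complements plain action-prediction accuracy.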