Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/135905
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kazemi Moghaddam, M. | - |
dc.contributor.author | Abbasnejad, E. | - |
dc.contributor.author | Wu, Q. | - |
dc.contributor.author | Qinfeng Shi, J. | - |
dc.contributor.author | Van Den Hengel, A. | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2022), 2022, pp.3401-3410 | - |
dc.identifier.isbn | 9781665409155 | - |
dc.identifier.issn | 2472-6737 | - |
dc.identifier.uri | https://hdl.handle.net/2440/135905 | - |
dc.description.abstract | In this work, we present a method to improve the efficiency and robustness of previous model-free Reinforcement Learning (RL) algorithms for the task of object-goal visual navigation. Despite achieving state-of-the-art results, one of the major drawbacks of those approaches is the lack of a forward model that informs the agent about the potential consequences of its actions, i.e., being model-free. In this work, we augment model-free RL with such a forward model that can predict a representation of a future state, from the beginning of a navigation episode, if the episode were to be successful. Furthermore, to enable efficient training, we develop an algorithm to integrate a replay buffer into the model-free RL that alternates between training the policy and the forward model. We call our agent ForeSI; ForeSI is trained to imagine a future latent state that leads to success. By explicitly imagining such a state during navigation, our agent is able to take better actions, leading to two main advantages: first, in the absence of an object detector, ForeSI presents a more robust policy, i.e., it leads to about 5% absolute improvement on the Success Rate (SR); second, when combined with an off-the-shelf object detector to help better distinguish the target object, our method leads to about 3% absolute improvement on the SR and about 2% absolute improvement on Success weighted by inverse Path Length (SPL), i.e., presents higher efficiency. | - |
dc.description.statementofresponsibility | Mahdi Kazemi Moghaddam, Ehsan Abbasnejad, Qi Wu, Javen Qinfeng Shi and Anton Van Den Hengel | - |
dc.language.iso | en | - |
dc.publisher | IEEE | - |
dc.relation.ispartofseries | IEEE Winter Conference on Applications of Computer Vision | - |
dc.rights | ©2021 IEEE | - |
dc.source.uri | https://ieeexplore.ieee.org/xpl/conhome/9706406/proceeding | - |
dc.subject | Vision for Robotics Multimedia Applications; Vision and Languages; Vision Systems and Applications; Visual Reasoning; Analysis and Understanding | - |
dc.title | ForeSI: Success-Aware Visual Navigation Agent | - |
dc.type | Conference paper | - |
dc.contributor.conference | IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (4 Jan 2022 - 8 Jan 2022 : Waikoloa, Hawaii) | - |
dc.identifier.doi | 10.1109/WACV51458.2022.00346 | - |
dc.publisher.place | Online | - |
pubs.publication-status | Published | - |
dc.identifier.orcid | Kazemi Moghaddam, M. [0000-0001-6544-1120] | - |
dc.identifier.orcid | Wu, Q. [0000-0003-3631-256X] | - |
dc.identifier.orcid | Van Den Hengel, A. [0000-0003-3027-8364] | - |
Appears in Collections: | Australian Institute for Machine Learning publications |
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
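The abstract above describes an agent whose training alternates between updating the policy and updating a forward model, with both drawing samples from a shared replay buffer. A minimal, hypothetical sketch of that alternating scheme (not the authors' implementation; all class names, the transition layout, and the placeholder update functions are assumptions):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity buffer of navigation transitions (assumed layout)."""

    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        # transition = (state, action, next_state, success_flag)
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def train_policy(batch):
    # Placeholder: a real agent would apply a model-free RL update here
    # (e.g., a policy-gradient step on the sampled transitions).
    return sum(1 for (_, _, _, success) in batch if success) / len(batch)

def train_forward_model(batch):
    # Placeholder: a real forward model would be trained to predict a
    # future latent state, supervised on successful episodes.
    return len([t for t in batch if t[3]])

def alternating_training(buffer, steps=10, batch_size=4):
    """Alternate policy and forward-model updates on replayed batches."""
    for step in range(steps):
        batch = buffer.sample(batch_size)
        if not batch:
            continue
        if step % 2 == 0:
            train_policy(batch)
        else:
            train_forward_model(batch)

# Usage: fill the buffer with dummy transitions, then run the loop.
buf = ReplayBuffer()
for i in range(20):
    buf.add((f"s{i}", "move_ahead", f"s{i + 1}", i % 3 == 0))
alternating_training(buf)
```

The sketch only shows the control flow the abstract names; the real update rules, latent representations, and success criteria are specific to the paper.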