Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/137141
Full metadata record
DC Field: Value
dc.contributor.author: Bian, J.W.
dc.contributor.author: Zhan, H.
dc.contributor.author: Wang, N.
dc.contributor.author: Chin, T.J.
dc.contributor.author: Shen, C.
dc.contributor.author: Reid, I.
dc.date.issued: 2021
dc.identifier.citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021; 44(12)
dc.identifier.issn: 0162-8828
dc.identifier.issn: 1939-3539
dc.identifier.uri: https://hdl.handle.net/2440/137141
dc.description.abstract: Single-view depth estimation using CNNs trained on unlabelled videos has shown significant promise. However, excellent results have mostly been obtained in street-scene driving scenarios, and such methods often fail in other settings, particularly in indoor videos taken by handheld devices. In this work, we establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth. Our fundamental analysis suggests that the rotation behaves as noise during training, as opposed to the translation (baseline), which provides supervision signals. To address the challenge, we propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning. The significantly improved performance validates our motivation. Towards end-to-end learning without requiring pre-processing, we propose an Auto-Rectify Network with novel loss functions, which can automatically learn to rectify images during training. Consequently, our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset. We also demonstrate the generalization of our trained model on ScanNet and Make3D, and the universality of our proposed learning method on the 7-Scenes and KITTI datasets.
dc.description.statementofresponsibility: Jia-Wang Bian, Huangying Zhan, Naiyan Wang, Tat-Jun Chin, Chunhua Shen, and Ian Reid
dc.language.iso: en
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.rights: © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
dc.source.uri: http://dx.doi.org/10.1109/tpami.2021.3136220
dc.subject: Single-view depth estimation; unsupervised learning; image rectification
dc.subject.mesh: Algorithms
dc.title: Auto-Rectify Network for Unsupervised Indoor Depth Estimation
dc.type: Journal article
dc.identifier.doi: 10.1109/TPAMI.2021.3136220
dc.relation.grant: http://purl.org/au-research/grants/arc/FL130100102
pubs.publication-status: Published
dc.identifier.orcid: Bian, J.W. [0000-0003-2046-3363]
dc.identifier.orcid: Shen, C. [0000-0002-8648-8718]
dc.identifier.orcid: Reid, I. [0000-0001-7790-6423]
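
As context for the abstract above: removing the relative rotation between two frames amounts to warping one frame by the inverse of the rotation-induced homography H = K R K^{-1}, leaving a translation-dominant pair for training. Below is a minimal Python/OpenCV sketch of that idea, assuming a pinhole camera with known intrinsics K and a known relative rotation R; the function name and setup are hypothetical illustrations, not the authors' implementation.

    # Minimal sketch of rotation-compensating rectification, assuming a
    # pinhole camera with intrinsics K and a known relative rotation R
    # between two frames. Illustrates the idea in the abstract only; this
    # is not the authors' code, and all names are hypothetical.
    import cv2
    import numpy as np

    def remove_relative_rotation(frame2, K, R_1to2):
        """Warp frame2 so its orientation matches frame1.

        A pure camera rotation R induces the homography H = K R K^{-1}
        on image points; applying its inverse cancels the rotational
        component of the motion between the two frames.
        """
        H = K @ np.linalg.inv(R_1to2) @ np.linalg.inv(K)
        h, w = frame2.shape[:2]
        return cv2.warpPerspective(frame2, H, (w, h))

    # Example (hypothetical values): an identity rotation leaves the
    # frame unchanged.
    # K = np.array([[525., 0., 320.], [0., 525., 240.], [0., 0., 1.]])
    # rectified = remove_relative_rotation(img, K, np.eye(3))

Under general motion the warp cancels only the rotational component; the residual translation (baseline) is exactly what supplies the supervision signal during training.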
Appears in Collections: Computer Science publications

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.