Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/111360
Type: | Conference paper |
Title: | Scaling CNNs for high resolution volumetric reconstruction from a single image |
Author: | Johnston, A.; Garg, R.; Carneiro, G.; Reid, I.; van den Hengel, A. |
Citation: | Proceedings of the IEEE International Conference on Computer Vision Workshop (ICCVW 2017), 2017, vol.2018-January, pp.930-939 |
Publisher: | IEEE |
Publisher Place: | Piscataway, NJ |
Issue Date: | 2017 |
Series/Report no.: | IEEE International Conference on Computer Vision Workshops |
ISBN: | 9781538610350 |
ISSN: | 2473-9936 |
Conference Name: | IEEE International Conference on Computer Vision Workshop (ICCVW 2017) (22 Oct 2017 - 29 Oct 2017 : Venice, ITALY) |
Statement of Responsibility: | Adrian Johnston, Ravi Garg, Gustavo Carneiro, Ian Reid, Anton van den Hengel |
Abstract: | One of the long-standing tasks in computer vision is to use a single 2-D view of an object in order to produce its 3-D shape. Recovering the lost dimension in this process has been the goal of classic shape-from-X methods, but the assumptions made in those works are often too limiting to be useful for general 3-D objects. This problem has recently been addressed with deep learning methods containing a 2-D (convolution) encoder followed by a 3-D (deconvolution) decoder. These methods have been reasonably successful, but memory and run-time constraints impose a strong limitation on the resolution of the reconstructed 3-D shapes. In particular, state-of-the-art methods are able to reconstruct 3-D shapes represented by volumes of at most 32³ voxels using state-of-the-art desktop computers. In this work, we present a scalable 2-D single view to 3-D volume reconstruction deep learning method, where the 3-D (deconvolution) decoder is replaced by a simple inverse discrete cosine transform (IDCT) decoder. Compared to the convolution-deconvolution model, our simpler architecture has an order of magnitude faster inference when reconstructing 3-D volumes, an exponentially smaller memory complexity while training and testing, and a sub-linear run-time training complexity with respect to the output volume size. We show on benchmark datasets that our method can produce high-resolution reconstructions with state-of-the-art accuracy. |
Rights: | © 2017 IEEE |
DOI: | 10.1109/ICCVW.2017.114 |
Grant ID: | http://purl.org/au-research/grants/arc/CE140100016 http://purl.org/au-research/grants/arc/FL130100102 |
Published version: | http://ieeexplore.ieee.org/xpl/mostRecentIssue.jsp?punumber=8234943 |
Appears in Collections: | Aurora harvest 3; Computer Science publications |
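
The abstract describes replacing a learned 3-D deconvolution decoder with a fixed inverse discrete cosine transform (IDCT). The following is a minimal sketch of that idea only, not the authors' implementation. It assumes the decoder receives a small cube of low-frequency DCT coefficients and expands it to a full-resolution voxel grid; the abstract does not state this detail. In the paper a network would predict the coefficients from an image. Here they are computed directly from a known shape as a stand-in, and the function names, grid size (32), and coefficient count (8 per axis) are all illustrative choices.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its transpose is the inverse DCT."""
    k = np.arange(n)[:, None]  # frequency index
    x = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)    # scale the DC row so that all rows have unit norm
    return c


def dct_encode(volume, k):
    """Keep the k^3 lowest-frequency 3-D DCT coefficients of a cubic volume."""
    basis = dct_matrix(volume.shape[0])[:k, :]
    return np.einsum("xyz,ax,by,cz->abc", volume, basis, basis, basis, optimize=True)


def idct_decode(coeffs, out_size):
    """Separable 3-D inverse DCT of a k^3 coefficient cube onto an out_size^3 grid.

    Using only the first k basis rows is equivalent to zero-padding the
    coefficient cube to out_size^3 before a full inverse transform.
    """
    basis = dct_matrix(out_size)[: coeffs.shape[0], :]
    return np.einsum("abc,ax,by,cz->xyz", coeffs, basis, basis, basis, optimize=True)


# Illustrative stand-in for a network prediction: compress a solid sphere.
n, k = 32, 8  # assumed sizes, chosen only for this example
g = np.arange(n) - (n - 1) / 2
xx, yy, zz = np.meshgrid(g, g, g, indexing="ij")
sphere = (xx**2 + yy**2 + zz**2 <= 10**2).astype(float)

coeffs = dct_encode(sphere, k)        # what a network would be trained to predict
recon = idct_decode(coeffs, n) > 0.5  # fixed decoder, no learned 3-D layers

target = sphere > 0.5
iou = np.logical_and(recon, target).sum() / np.logical_or(recon, target).sum()
print(f"kept {k**3} of {n**3} coefficients, IoU = {iou:.3f}")
```

Because the decoder is a fixed linear transform, a larger output grid needs only a larger basis matrix and no additional learned parameters, which is one plausible reading of the scalability claim in the abstract.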