Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/136664
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhuang, B. | - |
dc.contributor.author | Tan, M. | - |
dc.contributor.author | Liu, J. | - |
dc.contributor.author | Liu, L. | - |
dc.contributor.author | Reid, I. | - |
dc.contributor.author | Shen, C. | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021; 44(10):6140-6152 | - |
dc.identifier.issn | 0162-8828 | - |
dc.identifier.issn | 1939-3539 | - |
dc.identifier.uri | https://hdl.handle.net/2440/136664 | - |
dc.description.abstract | This paper tackles the problem of training a deep convolutional neural network with both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging due to the non-differentiability of the quantizer, which may result in substantial accuracy loss. To address this, we propose three practical approaches: (i) progressive quantization; (ii) stochastic precision; and (iii) joint knowledge distillation to improve network training. First, for progressive quantization, we propose two schemes to progressively find good local minima. Specifically, we propose to first optimize a network with quantized weights and subsequently quantize activations. This contrasts with traditional methods, which optimize both simultaneously. Furthermore, we propose a second progressive quantization scheme that gradually decreases the bitwidth from high precision to low precision during training. Second, to alleviate the excessive training burden of the multi-round training stages, we further propose a one-stage stochastic precision strategy that randomly samples and quantizes sub-networks while keeping the other parts in full precision. Finally, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one. By doing so, the full-precision model provides hints that guide the low-precision model's training and significantly improve the performance of the low-precision network. Extensive experiments on various datasets (e.g., CIFAR-100, ImageNet) show the effectiveness of the proposed methods. | - |
dc.description.statementofresponsibility | Bohan Zhuang, Mingkui Tan, Jing Liu, Lingqiao Liu, Ian Reid, and Chunhua Shen | - |
dc.language.iso | en | - |
dc.publisher | IEEE | - |
dc.rights | © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information. | - |
dc.source.uri | http://dx.doi.org/10.1109/tpami.2021.3088904 | - |
dc.subject | Quantized neural network; progressive quantization; stochastic precision; knowledge distillation; image classification | - |
dc.subject.mesh | Algorithms | - |
dc.subject.mesh | Neural Networks, Computer | - |
dc.title | Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations | - |
dc.type | Journal article | - |
dc.identifier.doi | 10.1109/TPAMI.2021.3088904 | - |
dc.relation.grant | http://purl.org/au-research/grants/arc/CE140100016 | - |
dc.relation.grant | http://purl.org/au-research/grants/arc/FL130100102 | - |
pubs.publication-status | Published | - |
dc.identifier.orcid | Reid, I. [0000-0001-7790-6423] | - |
Appears in Collections: | Computer Science publications |
Files in This Item:
There are no files associated with this item.
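The abstract above outlines three training techniques: two-scheme progressive quantization, one-stage stochastic precision, and joint knowledge distillation. The sketch below illustrates how these ideas could fit together, assuming a DoReFa-style uniform quantizer with a straight-through estimator; `QuantConv2d`, `train_step`, the 4-bit target, and the sampling probability `p` are hypothetical names and choices for illustration, not the authors' released implementation.

```python
# Minimal PyTorch sketch of the abstract's three ideas; illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def uniform_quantize(x, bits):
    """Round x in [0, 1] to 2^bits - 1 levels; straight-through gradient."""
    levels = 2 ** bits - 1
    q = torch.round(x * levels) / levels
    return x + (q - x).detach()  # forward: quantized value, backward: identity

class QuantConv2d(nn.Conv2d):
    """Conv layer whose weight/activation bitwidths can be switched per step."""
    def __init__(self, *args, w_bits=32, a_bits=32, **kwargs):
        super().__init__(*args, **kwargs)
        self.w_bits, self.a_bits = w_bits, a_bits

    def forward(self, x):
        if self.a_bits < 32:  # quantize activations (assumed clipped to [0, 1])
            x = uniform_quantize(torch.clamp(x, 0, 1), self.a_bits)
        w = self.weight
        if self.w_bits < 32:  # DoReFa-style weight quantization to [-1, 1]
            w = torch.tanh(w) / (2 * torch.tanh(w).abs().max()) + 0.5
            w = 2 * uniform_quantize(w, self.w_bits) - 1
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

def train_step(fp_model, lp_model, x, y, stage, bits=4, p=0.5):
    """One training step combining the three ideas (hypothetical API).

    Progressive quantization, scheme (i): stage 1 quantizes weights only,
    stage 2 also quantizes activations. Stochastic precision: with
    probability p a layer is kept in full precision this step. Joint KD:
    the full-precision model's logits guide the low-precision model.
    """
    for m in lp_model.modules():
        if isinstance(m, QuantConv2d):
            keep_fp = bool(torch.rand(()) < p)       # sample a sub-network
            m.w_bits = 32 if keep_fp else bits
            m.a_bits = 32 if (keep_fp or stage < 2) else bits

    fp_logits, lp_logits = fp_model(x), lp_model(x)
    kd = F.kl_div(F.log_softmax(lp_logits, dim=-1),
                  F.softmax(fp_logits.detach(), dim=-1),
                  reduction='batchmean')
    return (F.cross_entropy(lp_logits, y)            # low-precision task loss
            + F.cross_entropy(fp_logits, y)          # full-precision task loss
            + kd)                                    # distillation hint
```

The abstract's second progressive scheme (gradually decreasing the bitwidth during training) would correspond to stepping `bits` down across epochs, e.g., 8 → 4 → 2, rather than fixing it in advance.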