Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/121618
Type: Conference paper
Title: Jointly modeling static visual appearance and temporal pattern for unsupervised video hashing
Author: Li, C.
Yang, Y.
Cao, J.
Huang, Z.
Citation: Proceedings of the 2017 ACM Conference on Information and Knowledge Management, 2017, pp.9-17
Publisher: Association for Computing Machinery
Issue Date: 2017
ISBN: 9781450349185
Conference Name: International Conference on Information and Knowledge Management (CIKM) (6 Nov 2017 - 10 Nov 2017 : Singapore, Singapore)
Statement of Responsibility: Chao Li, Yang Yang, Jiewei Cao, Zi Huang
Abstract: Recently, hashing has been shown to be an efficient and effective method for large-scale video retrieval. Most existing hashing methods are based on visual features, which are expected to capture the appearance of videos. The intrinsic temporal pattern embedded in videos has also shown its discriminative power for similarity search and has been explored in some recent studies. However, how to leverage the strengths of both aspects remains an open question. In this paper, we propose to jointly model static visual appearance and temporal pattern for video hash code generation, as both are believed to carry important information for learning an effective hash function. A novel unsupervised video hashing framework is designed accordingly, whose hash function comprises two encoders: a temporal encoder and an appearance encoder. The two encoders are learned by self-supervision and are designed to reconstruct the temporal pattern of videos and the visual appearance of frames, respectively. Finally, for joint learning of the two encoders, we impose three learning criteria: minimal binarization loss, balanced hash codes and independent hash codes. Extensive experiments conducted on two large-scale video datasets (i.e., FCVID and ActivityNet) confirm the superior performance of our method compared to state-of-the-art video hashing methods.
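
To make the framework described in the abstract concrete, the following is a minimal PyTorch sketch of the two-encoder design and the three joint learning criteria. All names, dimensions and loss formulations here are illustrative assumptions, not the authors' implementation; the exact architecture and objective are given in the paper (DOI below).

import torch
import torch.nn as nn

class VideoHashNet(nn.Module):
    """Hypothetical two-encoder hashing network (a sketch, not the
    authors' code). The self-supervised reconstruction decoders
    described in the abstract are omitted for brevity."""
    def __init__(self, feat_dim=4096, code_bits=64):
        super().__init__()
        # Temporal encoder: an LSTM summarising the frame sequence,
        # in line with the paper's "LSTM" keyword.
        self.temporal_enc = nn.LSTM(feat_dim, code_bits, batch_first=True)
        # Appearance encoder: a per-frame projection of static visual features.
        self.appearance_enc = nn.Sequential(
            nn.Linear(feat_dim, code_bits), nn.Tanh())

    def forward(self, frames):                       # frames: (B, T, feat_dim)
        _, (h_n, _) = self.temporal_enc(frames)
        z_temporal = torch.tanh(h_n[-1])             # (B, code_bits)
        z_appearance = self.appearance_enc(frames).mean(dim=1)  # pooled over frames
        return z_temporal, z_appearance

def hashing_criteria(z):
    """The three joint learning criteria named in the abstract; these
    formulations are common choices, not necessarily the paper's."""
    b = torch.sign(z).detach()
    binarization = (z - b).pow(2).mean()             # minimal binarization loss
    balance = z.mean(dim=0).pow(2).mean()            # each bit balanced over the batch
    gram = z.t() @ z / z.size(0)
    independence = (gram - torch.eye(z.size(1))).pow(2).mean()  # decorrelated bits
    return binarization + balance + independence

# Usage: codes from both encoders share the same criteria; at retrieval
# time the binary hash code would be sign(z).
frames = torch.randn(8, 30, 4096)                    # e.g. 30 frames of CNN features
z_t, z_a = VideoHashNet()(frames)
loss = hashing_criteria(z_t) + hashing_criteria(z_a)
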
Keywords: Video hashing; deep learning; LSTM; learning to hash; visual content retrieval
Description: Session 1A: Multimedia
Rights: © 2017 Association for Computing Machinery
DOI: 10.1145/3132847.3133030
Grant ID: http://purl.org/au-research/grants/arc/FT130101530
http://purl.org/au-research/grants/arc/DP170103954
Published version: http://dx.doi.org/10.1145/3132847.3133030
Appears in Collections: Aurora harvest 4
Computer Science publications

Files in This Item:
There are no files associated with this item.

