Visual Question Answering: a tutorial

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/116146

Type:	Journal article
Title:	Visual Question Answering: a tutorial
Author:	Teney, D.; Wu, Q.; Van Den Hengel, A.
Citation:	IEEE Signal Processing Magazine, 2017; 34(6):63-75
Publisher:	IEEE
Issue Date:	2017
ISSN:	1053-5888; 1558-0792
Statement of Responsibility:	Damien Teney, Qi Wu, and Anton van den Hengel
Abstract:	The task of visual question answering (VQA) is receiving increasing interest from researchers in both the computer vision and natural language processing fields. Tremendous advances have been seen in the field of computer vision due to the success of deep learning, in particular on low- and midlevel tasks, such as image segmentation or object recognition. These advances have fueled researchers' confidence for tackling more complex tasks that combine vision with language and high-level reasoning. VQA is a prime example of this trend. This article presents the ongoing work in the field and the current approaches to VQA based on deep learning. VQA constitutes a test for deep visual understanding and a benchmark for general artificial intelligence (AI). While the field of VQA has seen recent successes, it remains a largely unsolved task.
Rights:	© 2017 IEEE
DOI:	10.1109/MSP.2017.2739826
Published version:	http://dx.doi.org/10.1109/msp.2017.2739826
Appears in Collections:	Aurora harvest 8; Australian Institute for Machine Learning publications

Files in This Item:

File	Description	Size	Format
hdl_116146.pdf	Accepted version	4.61 MB	Adobe PDF