AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering

Liao, Z.; Wu, Q.; Shen, C.; Van Den Hengel, A.; Verjans, J.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/132209

Full metadata record

DC Field	Value	Language
dc.contributor.author	Liao, Z.	-
dc.contributor.author	Wu, Q.	-
dc.contributor.author	Shen, C.	-
dc.contributor.author	Van Den Hengel, A.	-
dc.contributor.author	Verjans, J.	-
dc.contributor.editor	Cappellato, L.	-
dc.contributor.editor	Eickhoff, C.	-
dc.contributor.editor	Ferro, N.	-
dc.contributor.editor	Névéol, A.	-
dc.date.issued	2020	-
dc.identifier.citation	CEUR Workshop Proceedings, 2020 / Cappellato, L., Eickhoff, C., Ferro, N., Névéol, A. (ed./s), vol.2696, pp.1-14	-
dc.identifier.issn	1613-0073	-
dc.identifier.uri	https://hdl.handle.net/2440/132209	-
dc.description	Session - ImageCLEF: Multimedia Retrieval in Medicine, Lifelogging, and Internet.	-
dc.description.abstract	In this paper, we describe our contribution to the 2020 ImageCLEF Medical Domain Visual Question Answering (VQA-Med) challenge. Our submissions took first place on the VQA challenge leaderboard and first place on the associated Visual Question Generation (VQG) challenge leaderboard. Our VQA approach was developed using a knowledge inference methodology called Skeleton-based Sentence Mapping (SSM). Using all the questions and answers, we derived a set of classifiable tasks and inferred the corresponding labels. As a result, we were able to transform the VQA task into a multi-task image classification problem, which allowed us to focus on the image modelling aspect. We further propose a class-wise and task-wise normalization that facilitates the optimization of multiple tasks in a single network. This enabled us to apply a multi-scale and multi-architecture ensemble strategy for robust prediction. Lastly, we positioned the VQG task as a transfer learning problem using the models trained on the VQA task. The VQG task was also solved using classification.	-
dc.description.statementofresponsibility	Zhibin Liao, Qi Wu, Chunhua Shen, Anton van den Hengel, and Johan Verjans	-
dc.language.iso	en	-
dc.publisher	CEUR-WS	-
dc.relation.ispartofseries	CEUR Workshop Proceedings; 2696	-
dc.rights	Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).	-
dc.source.uri	http://ceur-ws.org/Vol-2696	-
dc.subject	Visual Question Answering; Visual Question Generation; Knowledge Inference; Deep Neural Networks; Skeleton-based Sentence Mapping; Class-wise and Task-wise Normalization	-
dc.title	AIML at VQA-Med 2020: Knowledge inference via a skeleton-based sentence mapping approach for medical domain visual question answering	-
dc.type	Conference paper	-
dc.contributor.conference	International Conference of the CLEF Initiative (CLEF) (22 Sep 2020 - 25 Sep 2020 : virtual online)	-
dc.publisher.place	online	-
pubs.publication-status	Published	-
dc.identifier.orcid	Liao, Z. [0000-0001-9965-4511]	-
dc.identifier.orcid	Wu, Q. [0000-0003-3631-256X]	-
dc.identifier.orcid	Van Den Hengel, A. [0000-0003-3027-8364]	-
dc.identifier.orcid	Verjans, J. [0000-0002-8336-6774]	-
Appears in Collections:	Australian Institute for Machine Learning publications

Files in This Item:

File	Description	Size	Format
hdl_132209.pdf	Published version	501.4 kB	Adobe PDF

Adelaide Research & Scholarship