Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/132177
Full metadata record
DC Field | Value
dc.contributor.author | Wang, W.
dc.contributor.author | Liu, X.
dc.contributor.author | Ji, X.
dc.contributor.author | Xie, E.
dc.contributor.author | Liang, D.
dc.contributor.author | Yang, Z.B.
dc.contributor.author | Lu, T.
dc.contributor.author | Shen, C.
dc.contributor.author | Luo, P.
dc.date.issued | 2020
dc.identifier.citation | Lecture Notes in Artificial Intelligence, 2020, vol.12359, pp.457-473
dc.identifier.isbn | 3030585670
dc.identifier.isbn | 9783030585679
dc.identifier.issn | 0302-9743
dc.identifier.issn | 1611-3349
dc.identifier.uri | https://hdl.handle.net/2440/132177
dc.description.abstract | Scene text spotting aims to detect and recognize an entire word or sentence with multiple characters in natural images. It remains challenging because ambiguity often occurs when the spacing between characters is large or the characters are evenly spread in multiple rows and columns, making many visually plausible groupings of the characters (e.g. "BERLIN" is incorrectly detected as "BERL" and "IN" in Fig. 1(c)). Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection. The proposed AE TextSpotter has three important benefits. 1) The linguistic representation is learned together with the visual representation in a single framework. To our knowledge, this is the first work to improve text detection by using a language model. 2) A carefully designed language module is utilized to reduce the detection confidence of incorrect text lines, making them easily pruned in the detection stage. 3) Extensive experiments show that AE TextSpotter outperforms other state-of-the-art methods by a large margin. For example, we carefully select a set of extremely ambiguous samples from the IC19-ReCTS dataset, where our approach surpasses other methods by more than 4%.
dc.description.statementofresponsibility | Wenhai Wang, Xuebo Liu, Xiaozhong Ji, Enze Xie, Ding Liang, ZhiBo Yang, Tong Lu, Chunhua Shen, and Ping Luo
dc.language.iso | en
dc.publisher | Springer
dc.relation.ispartofseries | Lecture Notes in Computer Science; 12359
dc.rights | © Springer Nature Switzerland AG 2020
dc.source.uri | https://link.springer.com/book/10.1007/978-3-030-58568-6
dc.subject | Text spotting; Text detection; Text recognition; Text detection ambiguity
dc.title | AE TextSpotter: Learning visual and linguistic representation for ambiguous text spotting
dc.type | Conference paper
dc.contributor.conference | European Conference on Computer Vision (ECCV) (23 Aug 2020 - 28 Aug 2020 : virtual online)
dc.identifier.doi | 10.1007/978-3-030-58568-6_27
dc.publisher.place | Cham, Switzerland
pubs.publication-status | Published
dc.identifier.orcid | Shen, C. [0000-0002-8648-8718]
Appears in Collections: Computer Science publications

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.