Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/109414
Type: | Conference paper |
Title: | CITPM: A cluster-based iterative topical phrase mining framework |
Author: | Li, B.; Wang, B.; Zhou, R.; Yang, X.; Liu, C. |
Citation: | Lecture Notes in Artificial Intelligence, 2016, vol.9642, pp.197-213 |
Publisher: | Springer |
Issue Date: | 2016 |
Series/Report no.: | Lecture Notes in Computer Science |
ISBN: | 9783319320243 |
ISSN: | 0302-9743; 1611-3349 |
Conference Name: | International Conference on Database Systems for Advanced Applications (DASFAA) (16 Apr 2016 - 19 Apr 2016 : Dallas, Texas, USA) |
Statement of Responsibility: | Bing Li, Bin Wang, Rui Zhou, Xiaochun Yang, and Chengfei Liu |
Abstract: | A phrase is a natural, meaningful, essential semantic unit. In topic modeling, visualizing phrases for individual topics is an effective way to explore and understand unstructured text corpora. Unfortunately, existing approaches predominantly rely on the general distributional features between topics and phrases over an entire corpus, while ignoring the impact of domain-level topical distribution. This often leads to the loss of domain-specific terminology and, as a consequence, weakens the coherence of topical phrases. In this paper, we present a novel framework, CITPM, for topical phrase mining. Our framework views a corpus as a mixture of clusters (domains), and each cluster is characterized by documents sharing similar topical distributions. The CITPM framework iteratively performs phrase mining, topical inference, and cluster updating until a satisfactory final result is obtained. The empirical verification demonstrates that our framework outperforms state-of-the-art works in both interpretability and efficiency. |
Keywords: | Topical phrase; Phrase mining; Document clustering |
Rights: | © Springer International Publishing Switzerland 2016 |
DOI: | 10.1007/978-3-319-32025-0_13 |
Grant ID: | http://purl.org/au-research/grants/arc/DP140103499 |
Published version: | http://dx.doi.org/10.1007/978-3-319-32025-0_13 |
Appears in Collections: | Aurora harvest 8; Computer Science publications |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
RA_hdl_109414.pdf | Restricted Access | 984.06 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.