Identify topic relations in scientific literature using topic modeling

Abstract

Over the past five years, topic models have been applied to bibliometrics research as an efficient tool for discovering latent and potentially useful content. Combining topic modeling algorithms with bibliometrics has raised new challenges in interpreting and understanding the outcome of topic modeling. Motivated by these challenges, this paper proposes a systematic methodology for topic analysis in scientific literature corpora that addresses the concerns of post-topic-modeling analysis. By linking the corpus metadata with the discovered topics, we characterize each topic with a number of topic-based analytic indices that capture its significance, developing trend, and received attention. A topic relation identification approach is then presented to quantitatively model the relations among the topics. To demonstrate the feasibility and effectiveness of our methodology, we present two case studies, using big data and dye-sensitized solar cell publications derived from searches in Web of Science. We also explore how the methodology can be used to tell good stories about a target corpus, to facilitate further research management and opportunity discovery.

Publication
IEEE Transactions on Engineering Management
Shirui Pan
Professor | ARC Future Fellow

My research interests include data mining, machine learning, and graph analysis.