Supervised learning for suicidal ideation detection in online user content


Early detection and treatment are regarded as the most effective ways to prevent suicidal ideation and potential suicide attempts-two critical risk factors resulting in successful suicides. Online communication channels are becoming a new way for people to express their suicidal tendencies. This paper presents an approach to understand suicidal ideation through online user-generated content with the goal of early detection via supervised learning. Analysing users’ language preferences and topic descriptions reveals rich knowledge that can be used as an early warning system for detecting suicidal tendencies. Suicidal individuals express strong negative feelings, anxiety, and hopelessness. Suicidal thoughts may involve family and friends. And topics they discuss cover both personal and social issues. To detect suicidal ideation, we extract several informative sets of features, including statistical, syntactic, linguistic, word embedding, and topic features, and we compare six classifiers, including four traditional supervised classifiers and two neural network models. An experimental study demonstrates the feasibility and practicability of the approach and provides benchmarks for the suicidal ideation detection on the active online platforms: Reddit SuicideWatch and Twitter.

Shaoxiong Ji
PhD Student @ Aalto U.

My research interests include data mining, machine learning, and graph analysis.