Conference papers

Harnessing the Power of Text Mining for the Detection of Abusive Content in Social Media

Hao Chen, Technological University Dublin
Susan McKeever, Technological University Dublin
Sarah Jane Delany, Technological University Dublin

Document Type

Conference Paper

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Publication Details

Advances in Computational Intelligence Systems

Volume 513 of the series Advances in Intelligent Systems and Computing, pp. 187-205

Presented at 16th UKCI September 2016 Lancaster University

Abstract

The issues of cyberbullying and online harassment have gained considerable coverage in recent years. Social media providers need to be able to detect abusive content both accurately and efficiently in order to protect their users. Our aim is to investigate the application of core text mining techniques for the automatic detection of abusive content across a range of social media sources, including blogs, forums, media-sharing, Q&A and chat, using datasets from Twitter, YouTube, MySpace, Kongregate, Formspring and Slashdot. Using supervised machine learning, we compare alternative text representations and dimension reduction approaches, including feature selection and feature enhancement, demonstrating the impact of these techniques on detection accuracies. In addition, we investigate the need for sampling on imbalanced datasets. Our conclusions are: (1) Dataset balancing boosts accuracies significantly for social media abusive content detection; (2) Feature reduction, important for the large feature sets that are typical of social media datasets, improves efficiency whilst maintaining detection accuracies; (3) The use of generic structural features common across all our datasets proved to be of limited use in the automatic detection of abusive content. Our findings can support practitioners in selecting appropriate text mining strategies in this area.
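
As an illustration of two of the ideas named in the abstract (dataset balancing and feature reduction), the following is a minimal sketch only. The abstract does not say which sampling or feature selection methods the paper uses, so random undersampling and a document-frequency threshold are assumptions chosen for simplicity, and the function names and toy messages are hypothetical.

```python
import random
from collections import Counter


def undersample(texts, labels, seed=0):
    """Randomly drop examples so every class is as small as the rarest class.

    Assumption: random undersampling stands in for whatever sampling
    strategy the paper actually evaluates.
    """
    rng = random.Random(seed)
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)
    target = min(len(items) for items in by_class.values())
    balanced = []
    for label, items in by_class.items():
        for text in rng.sample(items, target):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return [t for t, _ in balanced], [l for _, l in balanced]


def select_vocabulary(texts, min_doc_freq=2):
    """Keep only tokens that occur in at least `min_doc_freq` documents.

    Assumption: a document-frequency cut-off is used here as a simple
    form of feature reduction; it is not necessarily the paper's method.
    """
    doc_freq = Counter()
    for text in texts:
        doc_freq.update(set(text.lower().split()))
    return sorted(tok for tok, df in doc_freq.items() if df >= min_doc_freq)


def bag_of_words(text, vocabulary):
    """Represent a message as term counts over the reduced vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[tok] for tok in vocabulary]


# Hypothetical toy data: label 1 marks an abusive message, 0 a benign one.
toy_texts = [
    "you are an idiot",
    "have a nice day",
    "nice game today",
    "see you at the game",
    "what a nice day",
]
toy_labels = [1, 0, 0, 0, 0]

balanced_texts, balanced_labels = undersample(toy_texts, toy_labels, seed=1)
vocabulary = select_vocabulary(toy_texts, min_doc_freq=2)
vectors = [bag_of_words(text, vocabulary) for text in balanced_texts]
```

The resulting count vectors could then be passed to any supervised classifier. Undersampling discards data from the majority class, which is a trade-off that matters on small datasets.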

DOI

https://doi.org/10.1007/978-3-319-46562-3_12

Recommended Citation

Chen, H., McKeever, S. & Delany, S.J. (2016) Harnessing the Power of Text Mining for the Detection of Abusive Content in Social Media, 16th UKCI, September 2016, Lancaster University. doi:10.1007/978-3-319-46562-3_12

Included in

Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons
