Conference papers

The Use of Deep Learning Distributed Representations in the Identification of Abusive Text

Susan McKeever, Technological University Dublin
Hao Chen, Technological University Dublin
Sarah Jane Delany, Technological University Dublin

Document Type

Conference Paper

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Disciplines

1.2 COMPUTER AND INFORMATION SCIENCE, Computer Sciences

Publication Details

13th International AAAI Conference on Web and Social Media ICWSM-2019, Munich, Germany, June 2019

Abstract

The selection of optimal feature representations is a critical step in the use of machine learning in text classification. Traditional features (e.g. bag of words and n-grams) have dominated for decades, but in the past five years, the use of learned distributed representations has become increasingly common. In this paper, we summarise and present a categorisation of the state-of-the-art distributed representation techniques, including word and sentence embedding models. We carry out an empirical analysis of the performance of the various feature representations using the scenario of detecting abusive comments. We compare classification accuracies across a range of off-the-shelf embedding models using 10 labelled datasets gathered from different social media platforms. Our results show that multi-task sentence embedding models perform best, with consistently the highest classification results in comparison to other embedding models. We hope our work can be a guideline for practitioners in selecting appropriate features in text classification tasks, particularly in the domain of abuse detection.

Recommended Citation

Chen, H., McKeever, S., & Delany, S. J. (2019). The Use of Deep Learning Distributed Representations in the Identification of Abusive Text. Proceedings of the International AAAI Conference on Web and Social Media, vol. 13, no. 01, pg. 125-133.

Funder

Dublin Institute of Technology

Included in

Artificial Intelligence and Robotics Commons, Music Commons
