Articles

Inclusive Counterfactual Generation: Leveraging LLMs in Identifying Online Hate

M. Atif Qureshi, Technological University Dublin
Arjumand Younus, University College Dublin, Ireland
Simon Caton, University College Dublin, Ireland

Author ORCID Identifier

0000-0003-4413-4476

Document Type

Conference Paper

Disciplines

Computer Sciences, Information Science

Publication Details

Qureshi, M.A., Younus, A., Caton, S. (2024). Inclusive Counterfactual Generation: Leveraging LLMs in Identifying Online Hate. In: Stefanidis, K., Systä, K., Matera, M., Heil, S., Kondylakis, H., Quintarelli, E. (eds) Web Engineering. ICWE 2024. Lecture Notes in Computer Science, vol 14629. Springer, Cham.

https://doi.org/10.1007/978-3-031-62362-2_3

Abstract

Counterfactually augmented data has recently been proposed as a successful solution for socially situated NLP tasks such as hate speech detection. The chief component within the existing counterfactual data augmentation pipeline, however, involves manually flipping labels and making minimal content edits to training data. In a hate speech context, these forms of editing have been shown to still retain offensive hate speech content. Inspired by the recent success of large language models (LLMs), especially the development of ChatGPT, which have demonstrated improved language comprehension abilities, we propose an inclusivity-oriented approach to automatically generate counterfactually augmented data using LLMs. We show that hate speech detection models trained with LLM-produced counterfactually augmented data can outperform both state-of-the-art and human-based methods.

DOI

https://doi.org/10.1007/978-3-031-62362-2_3

Recommended Citation

Qureshi, M. Atif; Younus, Arjumand; and Caton, Simon, "Inclusive Counterfactual Generation: Leveraging LLMs in Identifying Online Hate" (2024). Articles. 242.
https://arrow.tudublin.ie/creaart/242

Funder

Science Foundation Ireland

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Included in

Engineering Commons, Social and Behavioral Sciences Commons
