Author ORCID Identifier

0000-0001-7658-7264

Document Type

Conference Paper

Disciplines

Computer Sciences, Women's and gender studies

Publication Details

https://link.springer.com/chapter/10.1007/978-3-031-26438-2_17

Conference: Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2022), 08/12/2022. Pages: 214-225

doi:10.1007/978-3-031-26438-2_17

Abstract

Writing style and word choice in textual content can vary between men and women, both in terms of who the text is about and who is writing it. The focus of this paper is on author gender prediction, that is, identifying the gender of the person writing the text. We compare closed and open vocabulary approaches on different types of textual content, including more traditional writing styles such as those found in books, and more recent writing styles used in user-generated content on digital platforms such as blogs and social media messaging. As supervised machine learning approaches can reflect human biases in the data they are trained on, we also consider the gender bias of the different approaches across the different types of dataset. We show that open vocabulary approaches perform better, both achieving higher prediction performance and exhibiting less gender bias.

DOI

https://doi.org/10.1007/978-3-031-26438-2_17

Funder

Science Foundation Ireland

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

