Author ORCID Identifier

https://orcid.org/0009-0005-2171-5501

Document Type

Conference Paper

Disciplines

1.2 COMPUTER AND INFORMATION SCIENCE, Computer Sciences, Women's and gender studies, Social sciences

Publication Details

https://ieeexplore.ieee.org/document/10470830

DOI: 10.1109/AICS60730.2023.10470830.

Irish Conference on Artificial Intelligence and Cognitive Science (AICS 2023)

Abstract

Gendered language is the use of words that denote an individual's gender. It can be explicit, where the gender is evident in the word itself (e.g., mother, she, man), or implicit, where social roles or behaviours signal an individual's gender, for example through expectations that women display communal traits (e.g., affectionate, caring, gentle) and men display agentic traits (e.g., assertive, competitive, decisive). The use of gendered language in NLP systems can perpetuate gender stereotypes and bias. This paper proposes an approach to generating gendered language datasets using ChatGPT, providing data for data-driven approaches to gender stereotype detection and gender bias mitigation. The approach focuses on generating implicit gendered language that captures and reflects the stereotypical characteristics or traits of a particular gender. This is done by engineering ChatGPT prompts that use gender-coded words drawn from gender-coded lexicons. Evaluation of the generated datasets shows that they contain good examples of English-language gendered sentences, which can be identified as either consistent with or contradictory to gender stereotypes. The generated data also shows strong gender bias.
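
As a rough illustration of the prompt-engineering idea described in the abstract, the following minimal Python sketch builds one prompt per gender-coded word. It is not the authors' implementation: the lexicon contains only the example traits quoted in the abstract, and the template wording, the `build_prompts` helper, and the `n_sentences` parameter are assumptions made for illustration. No call to ChatGPT is made; the sketch only constructs the prompt text.

```python
# Hypothetical mini-lexicon. The trait words are the examples quoted in the
# abstract; the lexicons actually used in the paper are not reproduced here.
GENDER_CODED_LEXICON = {
    "communal": ["affectionate", "caring", "gentle"],
    "agentic": ["assertive", "competitive", "decisive"],
}

# Assumed prompt wording, for illustration only (not taken from the paper).
PROMPT_TEMPLATE = (
    'Write {n} English sentences about a person, '
    'each using the word "{word}".'
)


def build_prompts(lexicon, n_sentences=5):
    """Return one prompt record per gender-coded word in the lexicon."""
    prompts = []
    for category, words in lexicon.items():
        for word in words:
            prompts.append(
                {
                    "category": category,
                    "word": word,
                    "prompt": PROMPT_TEMPLATE.format(n=n_sentences, word=word),
                }
            )
    return prompts


if __name__ == "__main__":
    # Print each prompt alongside the trait category of its seed word.
    for record in build_prompts(GENDER_CODED_LEXICON, n_sentences=3):
        print(f'{record["category"]}: {record["prompt"]}')
```

Keeping the trait category next to each prompt means that any sentences later generated from it could be traced back to the communal or agentic word that seeded them.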

Dataset-for-Gendered-Language.zip (347 kB)
Dataset for Gendered Language

Funder

Technological University Dublin

Creative Commons License

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

