Author ORCID Identifier

https://orcid.org/0009-0005-2171-5501

Document Type

Conference Paper

Disciplines

1.2 Computer and Information Science, Computer Sciences, Women's and Gender Studies, Ethics

Publication Details

In Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 347–357, Vienna, Austria. Association for Computational Linguistics.

Abstract

Gendered language is the use of words that indicate an individual’s gender. Though useful in certain contexts, it can reinforce gender stereotypes and introduce bias, particularly in machine learning models used for tasks like occupation classification. When textual content such as biographies contains gender cues, those cues can influence model predictions and lead to unfair outcomes, such as reduced hiring opportunities for women. To address this issue, we propose GenWriter, an approach that integrates Case-Based Reasoning (CBR) with Large Language Models (LLMs) to rewrite biographies in a way that obfuscates gender while preserving semantic content. We evaluate GenWriter by measuring gender bias in occupation classification before and after rewriting the biographies used to train the occupation classification model. Our results show that GenWriter significantly reduces gender bias, by 89% in nurse biographies and 62% in surgeon biographies, while maintaining classification accuracy. In comparison, an LLM-only rewriting approach achieves smaller bias reductions (44% and 12% for nurse and surgeon biographies, respectively) and degrades classification performance.
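The abstract reports bias reduction as a percentage change in a bias measure computed on the occupation classifier before and after rewriting. As a minimal, hypothetical sketch of such an evaluation, the Python below measures bias as the per-occupation true-positive-rate (TPR) gap between gender groups and its relative reduction; the TPR-gap metric, data, and all names are illustrative assumptions, since the abstract does not state GenWriter's exact bias measure.

```python
# Hypothetical sketch: bias as the per-occupation TPR gap between gender
# groups, compared before and after biography rewriting. The metric and
# the toy data are assumptions for illustration, not the paper's method.
from collections import defaultdict

def tpr_by_gender(examples, predictions, occupation):
    """Per-gender true-positive rate for one occupation label."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ex, pred in zip(examples, predictions):
        if ex["label"] == occupation:  # gold-positive examples only
            totals[ex["gender"]] += 1
            hits[ex["gender"]] += int(pred == occupation)
    return {g: hits[g] / totals[g] for g in totals if totals[g]}

def tpr_gap(examples, predictions, occupation):
    """Absolute TPR difference between the two gender groups."""
    tpr = tpr_by_gender(examples, predictions, occupation)
    return abs(tpr.get("F", 0.0) - tpr.get("M", 0.0))

# Toy stand-ins for biography test sets and classifier outputs.
examples = [
    {"label": "nurse", "gender": "F"}, {"label": "nurse", "gender": "M"},
    {"label": "surgeon", "gender": "F"}, {"label": "surgeon", "gender": "M"},
]
preds_before = ["nurse", "surgeon", "nurse", "surgeon"]  # biased classifier
preds_after = ["nurse", "nurse", "surgeon", "surgeon"]   # after rewriting

for occ in ("nurse", "surgeon"):
    before = tpr_gap(examples, preds_before, occ)
    after = tpr_gap(examples, preds_after, occ)
    reduction = 100 * (before - after) / before if before else 0.0
    print(f"{occ}: gap before={before:.2f}, after={after:.2f}, "
          f"reduction={reduction:.0f}%")
```

The percentage reductions printed here correspond to the "% bias reduction" framing used in the abstract; the actual figures (89%, 62%, 44%, 12%) come from the paper's own experiments, not from this toy computation.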

DOI

https://doi.org/10.21427/t045-bk02

Funder

Technological University Dublin

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

