Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence
Word embeddings have been considered one of the biggest breakthroughs of deep learning for natural language processing. They are learned numerical vector representations of words in which similar words have similar representations. Contextual word embeddings are the promising second generation of word embeddings, which assign a representation to a word based on its context. This can result in different representations for the same word depending on the context (e.g. "bank" in "river bank" and in "commercial bank"). There is evidence of social bias (human-like implicit biases based on gender, race, and other social constructs) in word embeddings. While detecting bias in static (classical or non-contextual) word embeddings is a well-researched topic, there has been limited work on detecting bias in contextual word embeddings, and that work has mostly focussed on the Word Embedding Association Test (WEAT). This paper explores measuring social bias (gender, ethnicity, and religion) in contextual word embeddings using a number of fairness metrics, including the Relative Norm Distance (RND), the Relative Negative Sentiment Bias (RNSB), and the aforementioned WEAT. It extends the Word Embeddings Fairness Evaluation (WEFE) framework to facilitate measuring social biases in contextual embeddings and compares these with biases in static word embeddings. The results show that, when models are ranked by their performance over a number of fairness metrics, the pre-trained contextual word embedding models BERT and RoBERTa exhibit more social bias than the pre-trained static word embedding models GloVe and Word2Vec.
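As an illustration of the kind of metric referred to above, the following is a minimal sketch of the WEAT effect size as it is commonly defined: the difference between the mean associations of two target word sets with two attribute sets, divided by the standard deviation of those associations over all target words. It is not the WEFE API and not the implementation used in this work. The function names, the toy two-dimensional vectors, and the use of the sample standard deviation (`ddof=1`) are assumptions made for the example.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def association(w, A, B):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])


def weat_effect_size(X, Y, A, B):
    """WEAT effect size for target sets X, Y and attribute sets A, B."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    # Assumption: sample standard deviation over all target words.
    pooled_std = np.std(s_X + s_Y, ddof=1)
    return float((np.mean(s_X) - np.mean(s_Y)) / pooled_std)


# Hypothetical toy embeddings (real embeddings have hundreds of dimensions).
A = [np.array([1.0, 0.0]), np.array([0.9, 0.1])]  # attribute set 1
B = [np.array([0.0, 1.0]), np.array([0.1, 0.9])]  # attribute set 2
X = [np.array([0.8, 0.2]), np.array([0.7, 0.3])]  # target set 1, close to A
Y = [np.array([0.2, 0.8]), np.array([0.3, 0.7])]  # target set 2, close to B

effect = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size: {effect:.3f}")
```

A positive value indicates that the first target set is more strongly associated with the first attribute set than the second target set is. Swapping the two target sets flips the sign.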
Mora, A. C. (2022). Measuring and Comparing Social Bias in Static and Contextual Word Embeddings. Technological University Dublin. DOI: 10.21427/9FEA-6F46