Leveraging Bias in Pre-Trained Word Embeddings for Unsupervised Microaggression Detection

Tolúlope' Ògúnremí, Nazanin Sabri, Valerio Basile, Tommaso Caselli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Microaggressions are subtle manifestations of bias (Breitfeller et al., 2019). These demonstrations of bias can often be classified as a subset of abusive language. However, not as much focus has been placed on the recognition of these instances. As a result, limited data is available on the topic, and only in English. Being able to detect microaggressions without the need for labeled data would be advantageous since it would allow content moderation also for languages lacking annotated data. In this study, we introduce an unsupervised method to detect microaggressions in natural language expressions. The algorithm relies on pre-trained word-embeddings, leveraging the bias encoded in the model in order to detect microaggressions in unseen textual instances. We test the method on a dataset of racial and gender-based microaggressions, reporting promising results. We further run the algorithm on out-of-domain unseen data with the purpose of bootstrapping corpora of microaggressions “in the wild”, and discuss the benefits and drawbacks of our proposed method.

Original language: English
Title of host publication: Proceedings of the Eighth Italian Conference on Computational Linguistics
Editors: Elisabetta Fersini, Marco Passarotti, Viviana Patti
Publisher: CEUR Workshop Proceedings (CEUR-WS.org)
Number of pages: 7
Publication status: Published - 2021
Event: Italian Conference on Computational Linguistics 2021: CLiC-it 2021 - Milan, Italy
Duration: 26-Jan-2022 → 28-Jan-2022
Conference number: 8

Conference

Conference: Italian Conference on Computational Linguistics 2021
Country/Territory: Italy
City: Milan
Period: 26/01/2022 → 28/01/2022

Keywords

  • micro-aggression
  • hate speech
  • NLP

Prizes

  • Best Student Paper Award

    Ògúnremí, T. (Recipient), Sabri, N. (Recipient), Basile, V. (Recipient) & Caselli, T. (Recipient), 2022

    Prize: National/international honour › Academic
