Abstract
Developing tools for automated hate speech and abuse detection has become crucial. Such tools would help curb bullying and harassment and provide a safer online environment in which individuals, especially those from marginalized groups, can express themselves freely. However, recent research shows that machine learning models are biased and may make the right decisions for the wrong reasons. In this thesis, I set out to understand the performance of hate speech and abuse detection models and the different biases that could influence them. I show that hate speech and abuse detection models are subject not only to social bias but also to other types of bias that have not been explored before. Finally, I investigate the causal effect of social and intersectional bias on the performance and unfairness of hate speech detection models.
Original language | English
---|---
Title of host publication | The 60th Annual Meeting of the Association for Computational Linguistics
Subtitle of host publication | Proceedings of the Student Research Workshop, May 22-27, 2022
Editors | Samuel Louvan, Andrea Madotto, Brielen Madureira
Publisher | The Association for Computational Linguistics
Pages | 31-43
Number of pages | 13
ISBN (Print) | 978-1-955917-23-0
Publication status | Published - 22 May 2022