Dynabench: Rethinking benchmarking in NLP. D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger, Z Wu, B Vidgen, G Prasad, et al. arXiv preprint arXiv:2104.14337, 2021. | Cited by 395 | 2021
Directions in abusive language training data, a systematic review: Garbage in, garbage out. B Vidgen, L Derczynski. PLOS ONE 15 (12), e0243300, 2020. | Cited by 326 | 2020
HateCheck: Functional tests for hate speech detection models. P Röttger, B Vidgen, D Nguyen, Z Waseem, H Margetts, JB Pierrehumbert. arXiv preprint arXiv:2012.15606, 2020. | Cited by 252 | 2020
Learning from the worst: Dynamically generated datasets to improve online hate detection. B Vidgen, T Thrush, Z Waseem, D Kiela. arXiv preprint arXiv:2012.15761, 2020. | Cited by 239 | 2020
Challenges and frontiers in abusive content detection. B Vidgen, A Harris, D Nguyen, R Tromble, S Hale, H Margetts. Proceedings of the Third Workshop on Abusive Language Online, 2019. | Cited by 222 | 2019
TrustLLM: Trustworthiness in large language models. Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li, C Gao, Y Huang, W Lyu, et al. arXiv preprint arXiv:2401.05561, 2024. | Cited by 219 | 2024
Detecting weak and strong Islamophobic hate speech on social media. B Vidgen, T Yasseri. Journal of Information Technology & Politics 17 (1), 66-78, 2020. | Cited by 191 | 2020
P-Values: Misunderstood and Misused. B Vidgen, T Yasseri. Frontiers in Physics 4, 6, 2016. | Cited by 151 | 2016
Two contrasting data annotation paradigms for subjective NLP tasks. P Röttger, B Vidgen, D Hovy, JB Pierrehumbert. arXiv preprint arXiv:2112.07475, 2021. | Cited by 147 | 2021
SemEval-2023 Task 10: Explainable detection of online sexism. HR Kirk, W Yin, B Vidgen, P Röttger. arXiv preprint arXiv:2303.04222, 2023. | Cited by 122 | 2023
Detecting East Asian prejudice on social media. B Vidgen, A Botelho, D Broniatowski, E Guest, M Hall, H Margetts, et al. arXiv preprint arXiv:2005.03909, 2020. | Cited by 110 | 2020
An expert annotated dataset for the detection of online misogyny. E Guest, B Vidgen, A Mittos, N Sastry, G Tyson, H Margetts. Proceedings of the 16th Conference of the European Chapter of the …, 2021. | Cited by 109 | 2021
XSTest: A test suite for identifying exaggerated safety behaviours in large language models. P Röttger, HR Kirk, B Vidgen, G Attanasio, F Bianchi, D Hovy. arXiv preprint arXiv:2308.01263, 2023. | Cited by 94 | 2023
Introducing CAD: the contextual abuse dataset. B Vidgen, D Nguyen, H Margetts, P Rossini, R Tromble. | Cited by 93 | 2021
Personalisation within bounds: A risk taxonomy and policy framework for the alignment of large language models with personalised feedback. HR Kirk, B Vidgen, P Röttger, SA Hale. arXiv preprint arXiv:2303.05453, 2023. | Cited by 88 | 2023
The benefits, risks and bounds of personalizing the alignment of large language models to individuals. HR Kirk, B Vidgen, P Röttger, SA Hale. Nature Machine Intelligence, 1-10, 2024. | Cited by 61 | 2024
Hatemoji: A test suite and adversarially-generated dataset for benchmarking and detecting emoji-based hate. HR Kirk, B Vidgen, P Röttger, T Thrush, SA Hale. arXiv preprint arXiv:2108.05921, 2021. | Cited by 58 | 2021
Understanding RT’s audiences: Exposure not endorsement for Twitter followers of Russian state-sponsored media. R Crilley, M Gillespie, B Vidgen, A Willis. The International Journal of Press/Politics 27 (1), 220-242, 2022. | Cited by 56 | 2022
Recruitment and ongoing engagement in a UK smartphone study examining the association between weather and pain: cohort study. KL Druce, J McBeth, SN van der Veer, DA Selby, B Vidgen, K Georgatzis, et al. JMIR mHealth and uHealth 5 (11), e8162, 2017. | Cited by 56 | 2017
Multilingual HateCheck: Functional tests for multilingual hate speech detection models. P Röttger, H Seelawi, D Nozza, Z Talat, B Vidgen. arXiv preprint arXiv:2206.09917, 2022. | Cited by 54 | 2022