Experiment 2: Waseem & Wikitoxic
Basic information
- Task: Predicting whether a given text is abusive or non-abusive
- Dataset: Waseem
- Classes: Not abusive or Abusive
- Train/Dev/Test examples: 10144 / 3381 / 3382
- Problem: Previous work found that this dataset carries a strong negative bias against females: texts related to females are usually classified as abusive even when the texts themselves are not abusive at all.
- Out-of-domain test set: Wikitoxic (18965 examples)
- For more details, please see Section 6 of the paper.
Word Clouds & Annotations
Warning: The word clouds may contain offensive phrases from the training data.
Model 1: Waseem_CNN_20200507192504
Feature 0 | Feature 1 | Feature 2 | Feature 3 | Feature 4 | Feature 5 | Feature 6 | Feature 7 | Feature 8 | Feature 9 | Feature 10 | Feature 11 | Feature 12 | Feature 13 | Feature 14 | Feature 15 | Feature 16 | Feature 17 | Feature 18 | Feature 19 | Feature 20 | Feature 21 | Feature 22 | Feature 23 | Feature 24 | Feature 25 | Feature 26 | Feature 27 | Feature 28 | Feature 29 |
Model weights: - Not abusive = -0.161 - Abusive = 0.184 | Model weights: - Not abusive = 0.416 - Abusive = 0.222 | Model weights: - Not abusive = 0.188 - Abusive = -0.045 | Model weights: - Not abusive = 0.463 - Abusive = -0.383 | Model weights: - Not abusive = -0.373 - Abusive = -0.137 | Model weights: - Not abusive = 0.062 - Abusive = -0.115 | Model weights: - Not abusive = 0.425 - Abusive = -0.324 | Model weights: - Not abusive = -0.416 - Abusive = 0.169 | Model weights: - Not abusive = 0.304 - Abusive = 0.020 | Model weights: - Not abusive = -0.074 - Abusive = 0.155 | Model weights: - Not abusive = -0.146 - Abusive = 0.318 | Model weights: - Not abusive = -0.052 - Abusive = 0.198 | Model weights: - Not abusive = -0.169 - Abusive = 0.209 | Model weights: - Not abusive = 0.119 - Abusive = -0.322 | Model weights: - Not abusive = -0.128 - Abusive = -0.311 | Model weights: - Not abusive = 0.338 - Abusive = 0.481 | Model weights: - Not abusive = -0.379 - Abusive = 0.113 | Model weights: - Not abusive = -0.222 - Abusive = -0.458 | Model weights: - Not abusive = 0.248 - Abusive = 0.091 | Model weights: - Not abusive = -0.122 - Abusive = 0.404 | Model weights: - Not abusive = -0.125 - Abusive = -0.594 | Model weights: - Not abusive = 0.378 - Abusive = 0.027 | Model weights: - Not abusive = -0.385 - Abusive = 0.463 | Model weights: - Not abusive = 0.026 - Abusive = 0.413 | Model weights: - Not abusive = 0.040 - Abusive = 0.429 | Model weights: - Not abusive = 0.269 - Abusive = -0.241 | Model weights: - Not abusive = 0.419 - Abusive = 0.211 | Model weights: - Not abusive = 0.546 - Abusive = -0.147 | Model weights: - Not abusive = -0.338 - Abusive = 0.203 | Model weights: - Not abusive = -0.392 - Abusive = -0.089 |
Human answers: - Not abusive = 2 - Abusive = 3 - It could be either = 5 | Human answers: - Not abusive = 3 - Abusive = 2 - It could be either = 5 | Human answers: - Not abusive = 1 - Abusive = 7 - It could be either = 2 | Human answers: - Not abusive = 5 - Abusive = 1 - It could be either = 4 | Human answers: - Not abusive = 0 - Abusive = 8 - It could be either = 2 | Human answers: - Not abusive = 4 - Abusive = 5 - It could be either = 1 | Human answers: - Not abusive = 2 - Abusive = 1 - It could be either = 7 | Human answers: - Not abusive = 7 - Abusive = 1 - It could be either = 2 | Human answers: - Not abusive = 9 - Abusive = 0 - It could be either = 1 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 1 - Abusive = 2 - It could be either = 7 | Human answers: - Not abusive = 1 - Abusive = 9 - It could be either = 0 | Human answers: - Not abusive = 7 - Abusive = 0 - It could be either = 3 | Human answers: - Not abusive = 5 - Abusive = 0 - It could be either = 5 | Human answers: - Not abusive = 4 - Abusive = 0 - It could be either = 6 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 0 - Abusive = 7 - It could be either = 3 | Human answers: - Not abusive = 4 - Abusive = 0 - It could be either = 6 | Human answers: - Not abusive = 5 - Abusive = 3 - It could be either = 2 | Human answers: - Not abusive = 4 - Abusive = 3 - It could be either = 3 | Human answers: - Not abusive = 3 - Abusive = 1 - It could be either = 6 | Human answers: - Not abusive = 0 - Abusive = 8 - It could be either = 2 | Human answers: - Not abusive = 1 - Abusive = 9 - It could be either = 0 | Human answers: - Not abusive = 2 - Abusive = 5 - It could be either = 3 | Human answers: - Not abusive = 1 - Abusive = 8 - It could be either = 1 | Human answers: - Not abusive = 3 - Abusive = 1 - It could be either = 6 | Human answers: - Not abusive = 1 - Abusive = 4 - It could be either = 5 | Human answers: - Not abusive = 2 - Abusive = 8 - It could be either = 0 | Human answers: - Not abusive = 2 - Abusive = 3 - It could be either = 5 |
Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled |
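The cells in the table above follow a fixed "label = value" pattern, so they can be read programmatically. The following is a minimal Python sketch, not part of the original experiment code: the helper names parse_cell and majority_answer are hypothetical, and the rule that turns human answers into an Enabled/Disabled decision is not stated on this page, so it is deliberately left out.

```python
import re


def parse_cell(cell):
    """Split a cell such as 'Human answers: - Not abusive = 2 - Abusive = 3'
    into its title and a {label: value} dictionary."""
    title, _, body = cell.partition(":")
    pairs = re.findall(r"- ([^=]+?) = (-?\d+(?:\.\d+)?)", body)
    return title.strip(), {label.strip(): float(value) for label, value in pairs}


def majority_answer(votes):
    """Return the label with the most votes, or None when the top count is tied."""
    top = max(votes.values())
    winners = [label for label, count in votes.items() if count == top]
    return winners[0] if len(winners) == 1 else None


# Feature 0 of Model 1, copied from the table above.
_, weights = parse_cell("Model weights: - Not abusive = -0.161 - Abusive = 0.184")
_, votes = parse_cell(
    "Human answers: - Not abusive = 2 - Abusive = 3 - It could be either = 5"
)

print(max(weights, key=weights.get))  # class with the larger weight: Abusive
print(majority_answer(votes))         # most frequent human answer: It could be either
```

For this feature the model weight favours "Abusive" while half of the annotators answered "It could be either", which is the kind of disagreement the Decision row is meant to act on.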
Model 2: Waseem_CNN_20200507194756
Feature 0 | Feature 1 | Feature 2 | Feature 3 | Feature 4 | Feature 5 | Feature 6 | Feature 7 | Feature 8 | Feature 9 | Feature 10 | Feature 11 | Feature 12 | Feature 13 | Feature 14 | Feature 15 | Feature 16 | Feature 17 | Feature 18 | Feature 19 | Feature 20 | Feature 21 | Feature 22 | Feature 23 | Feature 24 | Feature 25 | Feature 26 | Feature 27 | Feature 28 | Feature 29 |
Model weights: - Not abusive = 0.304 - Abusive = 0.046 | Model weights: - Not abusive = 0.095 - Abusive = -0.401 | Model weights: - Not abusive = 0.177 - Abusive = 0.402 | Model weights: - Not abusive = -0.417 - Abusive = -0.014 | Model weights: - Not abusive = -0.201 - Abusive = 0.143 | Model weights: - Not abusive = 0.367 - Abusive = -0.091 | Model weights: - Not abusive = -0.239 - Abusive = -0.091 | Model weights: - Not abusive = 0.293 - Abusive = -0.172 | Model weights: - Not abusive = 0.488 - Abusive = 0.094 | Model weights: - Not abusive = 0.091 - Abusive = -0.247 | Model weights: - Not abusive = -0.336 - Abusive = 0.129 | Model weights: - Not abusive = -0.001 - Abusive = 0.379 | Model weights: - Not abusive = 0.441 - Abusive = 0.037 | Model weights: - Not abusive = 0.486 - Abusive = 0.272 | Model weights: - Not abusive = -0.088 - Abusive = 0.347 | Model weights: - Not abusive = 0.206 - Abusive = 0.447 | Model weights: - Not abusive = -0.359 - Abusive = -0.108 | Model weights: - Not abusive = 0.209 - Abusive = -0.353 | Model weights: - Not abusive = -0.112 - Abusive = 0.339 | Model weights: - Not abusive = 0.409 - Abusive = -0.378 | Model weights: - Not abusive = 0.048 - Abusive = 0.488 | Model weights: - Not abusive = 0.344 - Abusive = -0.335 | Model weights: - Not abusive = -0.052 - Abusive = 0.235 | Model weights: - Not abusive = 0.021 - Abusive = -0.317 | Model weights: - Not abusive = 0.079 - Abusive = 0.329 | Model weights: - Not abusive = -0.113 - Abusive = 0.194 | Model weights: - Not abusive = 0.064 - Abusive = 0.309 | Model weights: - Not abusive = 0.403 - Abusive = -0.283 | Model weights: - Not abusive = -0.139 - Abusive = 0.063 | Model weights: - Not abusive = -0.447 - Abusive = 0.205 |
Human answers: - Not abusive = 9 - Abusive = 1 - It could be either = 0 | Human answers: - Not abusive = 4 - Abusive = 5 - It could be either = 1 | Human answers: - Not abusive = 2 - Abusive = 8 - It could be either = 0 | Human answers: - Not abusive = 3 - Abusive = 6 - It could be either = 1 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 3 - Abusive = 1 - It could be either = 6 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 1 - Abusive = 8 - It could be either = 1 | Human answers: - Not abusive = 6 - Abusive = 3 - It could be either = 1 | Human answers: - Not abusive = 8 - Abusive = 0 - It could be either = 2 | Human answers: - Not abusive = 1 - Abusive = 9 - It could be either = 0 | Human answers: - Not abusive = 5 - Abusive = 4 - It could be either = 1 | Human answers: - Not abusive = 3 - Abusive = 3 - It could be either = 4 | Human answers: - Not abusive = 8 - Abusive = 1 - It could be either = 1 | Human answers: - Not abusive = 9 - Abusive = 0 - It could be either = 1 | Human answers: - Not abusive = 3 - Abusive = 3 - It could be either = 4 | Human answers: - Not abusive = 3 - Abusive = 2 - It could be either = 5 | Human answers: - Not abusive = 6 - Abusive = 3 - It could be either = 1 | Human answers: - Not abusive = 7 - Abusive = 1 - It could be either = 2 | Human answers: - Not abusive = 4 - Abusive = 3 - It could be either = 3 | Human answers: - Not abusive = 4 - Abusive = 5 - It could be either = 1 | Human answers: - Not abusive = 6 - Abusive = 2 - It could be either = 2 | Human answers: - Not abusive = 1 - Abusive = 7 - It could be either = 2 | Human answers: - Not abusive = 7 - Abusive = 0 - It could be either = 3 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 1 - Abusive = 8 - It could be either = 1 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 7 - Abusive = 0 - It could be either = 3 | Human answers: - Not abusive = 4 - Abusive = 6 - It could be either = 0 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 |
Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled |
Model 3: Waseem_CNN_20200507202103
Feature 0 | Feature 1 | Feature 2 | Feature 3 | Feature 4 | Feature 5 | Feature 6 | Feature 7 | Feature 8 | Feature 9 | Feature 10 | Feature 11 | Feature 12 | Feature 13 | Feature 14 | Feature 15 | Feature 16 | Feature 17 | Feature 18 | Feature 19 | Feature 20 | Feature 21 | Feature 22 | Feature 23 | Feature 24 | Feature 25 | Feature 26 | Feature 27 | Feature 28 | Feature 29 |
Model weights: - Not abusive = 0.295 - Abusive = 0.048 | Model weights: - Not abusive = 0.576 - Abusive = 0.214 | Model weights: - Not abusive = 0.381 - Abusive = -0.132 | Model weights: - Not abusive = -0.462 - Abusive = -0.104 | Model weights: - Not abusive = 0.081 - Abusive = -0.171 | Model weights: - Not abusive = -0.263 - Abusive = 0.024 | Model weights: - Not abusive = -0.173 - Abusive = 0.138 | Model weights: - Not abusive = -0.210 - Abusive = 0.140 | Model weights: - Not abusive = 0.121 - Abusive = -0.058 | Model weights: - Not abusive = 0.082 - Abusive = 0.378 | Model weights: - Not abusive = -0.129 - Abusive = 0.497 | Model weights: - Not abusive = 0.478 - Abusive = 0.286 | Model weights: - Not abusive = 0.438 - Abusive = -0.240 | Model weights: - Not abusive = 0.053 - Abusive = -0.270 | Model weights: - Not abusive = 0.367 - Abusive = -0.199 | Model weights: - Not abusive = -0.213 - Abusive = 0.078 | Model weights: - Not abusive = 0.046 - Abusive = -0.310 | Model weights: - Not abusive = 0.267 - Abusive = 0.006 | Model weights: - Not abusive = 0.080 - Abusive = 0.346 | Model weights: - Not abusive = -0.342 - Abusive = -0.101 | Model weights: - Not abusive = 0.236 - Abusive = -0.392 | Model weights: - Not abusive = 0.317 - Abusive = -0.116 | Model weights: - Not abusive = -0.236 - Abusive = 0.302 | Model weights: - Not abusive = -0.345 - Abusive = 0.065 | Model weights: - Not abusive = -0.463 - Abusive = -0.203 | Model weights: - Not abusive = -0.377 - Abusive = 0.429 | Model weights: - Not abusive = 0.014 - Abusive = 0.424 | Model weights: - Not abusive = 0.483 - Abusive = 0.265 | Model weights: - Not abusive = 0.087 - Abusive = 0.375 | Model weights: - Not abusive = 0.209 - Abusive = 0.419 |
Human answers: - Not abusive = 7 - Abusive = 1 - It could be either = 2 | Human answers: - Not abusive = 7 - Abusive = 0 - It could be either = 3 | Human answers: - Not abusive = 2 - Abusive = 6 - It could be either = 2 | Human answers: - Not abusive = 0 - Abusive = 8 - It could be either = 2 | Human answers: - Not abusive = 9 - Abusive = 1 - It could be either = 0 | Human answers: - Not abusive = 3 - Abusive = 2 - It could be either = 5 | Human answers: - Not abusive = 0 - Abusive = 10 - It could be either = 0 | Human answers: - Not abusive = 1 - Abusive = 8 - It could be either = 1 | Human answers: - Not abusive = 2 - Abusive = 4 - It could be either = 4 | Human answers: - Not abusive = 3 - Abusive = 6 - It could be either = 1 | Human answers: - Not abusive = 4 - Abusive = 4 - It could be either = 2 | Human answers: - Not abusive = 4 - Abusive = 2 - It could be either = 4 | Human answers: - Not abusive = 3 - Abusive = 2 - It could be either = 5 | Human answers: - Not abusive = 3 - Abusive = 7 - It could be either = 0 | Human answers: - Not abusive = 5 - Abusive = 1 - It could be either = 4 | Human answers: - Not abusive = 1 - Abusive = 7 - It could be either = 2 | Human answers: - Not abusive = 8 - Abusive = 0 - It could be either = 2 | Human answers: - Not abusive = 5 - Abusive = 3 - It could be either = 2 | Human answers: - Not abusive = 1 - Abusive = 9 - It could be either = 0 | Human answers: - Not abusive = 2 - Abusive = 8 - It could be either = 0 | Human answers: - Not abusive = 6 - Abusive = 0 - It could be either = 4 | Human answers: - Not abusive = 5 - Abusive = 4 - It could be either = 1 | Human answers: - Not abusive = 2 - Abusive = 8 - It could be either = 0 | Human answers: - Not abusive = 6 - Abusive = 4 - It could be either = 0 | Human answers: - Not abusive = 4 - Abusive = 5 - It could be either = 1 | Human answers: - Not abusive = 1 - Abusive = 7 - It could be either = 2 | Human answers: - Not abusive = 1 - Abusive = 9 - It could be either = 0 | Human answers: - Not abusive = 5 - Abusive = 2 - It could be either = 3 | Human answers: - Not abusive = 3 - Abusive = 2 - It could be either = 5 | Human answers: - Not abusive = 5 - Abusive = 1 - It could be either = 4 |
Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Enabled | Decision: - MTurk: Enabled - One: Enabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Enabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled | Decision: - MTurk: Disabled - One: Disabled |
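To illustrate what a "Disabled" decision could mean in practice, here is a minimal Python sketch. It assumes that disabling a feature simply removes its contribution to every class score in the final layer; that mechanism is an assumption made for illustration and is not described on this page. The function name class_scores and the activation values are hypothetical, the bias term is omitted because it is not listed in the tables, and the weights and MTurk decisions are those of Features 0 to 2 of Model 3 above.

```python
def class_scores(activations, weights, enabled):
    """Score each class as the sum of weight * activation over enabled features only."""
    scores = {}
    for label, class_weights in weights.items():
        scores[label] = sum(
            w * a for w, a, on in zip(class_weights, activations, enabled) if on
        )
    return scores


# Model weights of Features 0, 1 and 2 of Model 3 (from the table above).
weights = {
    "Not abusive": [0.295, 0.576, 0.381],
    "Abusive": [0.048, 0.214, -0.132],
}
# MTurk decisions for the same features: Enabled, Enabled, Disabled.
enabled_mturk = [True, True, False]
# Hypothetical feature activations for one input text.
activations = [1.0, 1.0, 1.0]

masked = class_scores(activations, weights, enabled_mturk)
print({label: round(score, 3) for label, score in masked.items()})
```

With these inputs, Feature 2 no longer contributes, so the scores come only from Features 0 and 1 (about 0.871 for "Not abusive" and 0.262 for "Abusive").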
Results
Results (Average ± SD) of Experiment 2: Waseem & Wikitoxic, CNNs. Boldface numbers are the best scores in their columns. They are further underlined if they are significantly better than the scores of all the other models (based on an approximate randomization test with α = 0.05).
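For readers unfamiliar with the significance test named in the caption, the following is a minimal Python sketch of a two-sided paired approximate randomization test. It is not the authors' evaluation code: the function name, the number of trials, and the use of the mean per-example score as the test statistic are assumptions made for illustration (the metric actually compared is not stated on this page).

```python
import random


def approximate_randomization(scores_a, scores_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the difference in mean score between
    two systems evaluated on the same examples."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same examples")
    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n
    at_least_as_extreme = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            # Swap the two systems' scores on this example with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            diff += a - b
        if abs(diff) / n >= observed:
            at_least_as_extreme += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (at_least_as_extreme + 1) / (trials + 1)


# Hypothetical per-example correctness (1 = correct, 0 = wrong) of two models.
model_a = [1] * 90 + [0] * 10
model_b = [1] * 70 + [0] * 30
p_value = approximate_randomization(model_a, model_b, trials=2000)
print(p_value < 0.05)
```

Under this sketch, a difference would be called significant at α = 0.05 when the returned p-value falls below 0.05.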
Downloads
- Wordclouds and annotations
- The dataset of this experiment as well as other experiments can be downloaded here.
- If you want to use the original trained models in the experiments, please contact Piyawat (pl1515 [at] imperial [dot] ac [dot] uk).