---
license: apache-2.0
language:
- en
- fr
- de
- it
- es
- pt
- pl
- zh
- nl
base_model:
- FacebookAI/xlm-roberta-base
datasets:
- PleIAs/ToxicCommons
- yangezheng/SWSR-SexComment
tags:
- toxicity
- data
---
# Multilingual Toxicity Classifiers used in Apertus Pretraining
#### Author: Olivia Simin Fan (@Olivia-umich)

Language-specific toxicity classifiers in English, French, German, Italian, Spanish, Portuguese, Polish, Chinese and Dutch, trained on the [PleIAs/ToxicCommons](https://github.com/Pleias/toxic-commons) and [SWSR-SexComments](https://arxiv.org/pdf/2108.03070) datasets.
We subsample non-toxic examples to create balanced 50%-50% training sets for each language, and set aside 10% of each balanced set as a held-out validation set.

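For illustration (a sketch, not the authors' actual preprocessing script; the `label` field and the exact sampling procedure are assumptions), the balanced subsampling and 10% hold-out split described above could look like:

```python
import random

def balance_and_split(examples, seed=0, val_frac=0.1):
    """Downsample non-toxic examples to a 50%-50% balance with the toxic
    ones, then hold out `val_frac` of the balanced set for validation."""
    rng = random.Random(seed)
    toxic = [ex for ex in examples if ex["label"] == 1]
    non_toxic = [ex for ex in examples if ex["label"] == 0]
    # Subsample the (usually much larger) non-toxic side to match the toxic side.
    non_toxic = rng.sample(non_toxic, k=min(len(toxic), len(non_toxic)))
    balanced = toxic + non_toxic
    rng.shuffle(balanced)
    n_val = int(len(balanced) * val_frac)
    return balanced[n_val:], balanced[:n_val]  # train, held-out validation

# Toy data: 100 toxic and 200 non-toxic documents.
examples = [{"text": f"doc {i}", "label": 1 if i < 100 else 0} for i in range(300)]
train, val = balance_and_split(examples)
```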
## Model Description
Our toxicity classifier employs a two-stage approach: we first extract multilingual document embeddings using [*XLM-RoBERTa*](https://huggingface.co/FacebookAI/xlm-roberta-base),
then train a language-specific 2-layer MLP for 6 epochs on top of these embeddings for binary toxicity classification.
The checkpoints with the best accuracy on the held-out validation set are then used to annotate toxicity scores on FineWeb-2 and FineWeb.

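A minimal sketch of such a 2-layer MLP head (the hidden width and activation are assumptions, not read from the released checkpoints; the `RobertaClassifier` class defined later in this card is the authoritative architecture):

```python
import torch
import torch.nn as nn

class ToxicityHead(nn.Module):
    """2-layer MLP for binary toxicity classification, applied on top of
    pooled XLM-RoBERTa-base document embeddings (hidden size 768)."""
    def __init__(self, embed_dim=768, hidden_dim=256, num_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, embeddings):
        # embeddings: (batch, embed_dim) pooled document vectors
        return self.mlp(embeddings)

head = ToxicityHead()
logits = head(torch.randn(4, 768))  # logits for 4 document embeddings
```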
The validation accuracies on the balanced held-out test set are provided below:

| Language | Accuracy |
|----------|----------|
| English (en) | 80.13% |

```python
LANGUAGE = "english"  # choose from ["english", "chinese", "french", "german", "italian", "spanish", "portuguese", "polish", "dutch"]
MODEL_PATH = f"{MODEL_DIR}/{LANGUAGE}.pth"
DEVICE = "cpu"

model = RobertaClassifier(device=DEVICE, num_classes=2)
```