Olivia-umich committed · Commit 4b6e1f5 (verified) · 1 Parent(s): 158c376

Update README.md

Files changed (1): README.md (+25 -24)
README.md CHANGED
@@ -1,36 +1,36 @@
  ---
  license: apache-2.0
  language:
  - en
  - fr
  - de
  - it
  - es
  - pt
  - pl
  - zh
  - nl
  base_model:
  - FacebookAI/xlm-roberta-base
  datasets:
  - PleIAs/ToxicCommons
  - yangezheng/SWSR-SexComment
  tags:
  - toxicity
  - data
  ---
  # Multilingual Toxicity Classifiers used in Apertus Pretraining
  #### Author: Olivia Simin Fan (@Olivia-umich)

  Language-specific toxicity classifiers in English, French, German, Italian, Spanish, Portuguese, Polish, Chinese and Dutch, trained on the [PleIAs/ToxicCommons](https://github.com/Pleias/toxic-commons) and [SWSR-SexComments](https://arxiv.org/pdf/2108.03070) datasets.
-
+ We subsample non-toxic examples to create balanced 50%-50% training sets for each language. We separate 10% from the balanced dataset to build a balanced held-out validation set.

  ## Model Description
  Our toxicity classifier employs a two-stage approach: we first extract multilingual document embeddings using [*XLM-RoBERTa*](https://huggingface.co/FacebookAI/xlm-roberta-base),
  then train a language-specific 2-layer MLP for binary toxicity classification on top of these embeddings for 6 epochs.
  The classifier checkpoints with the best accuracy on the held-out validation set are further employed to annotate toxicity scores on FineWeb-2 and FineWeb.

- The validation accuracies on the held-out test set is provided as below:
+ The validation accuracies on the balanced held-out test set are provided below:
  | Language | Accuracy |
  |----------|----------|
  | English (en) | 80.13% |
@@ -110,7 +110,8 @@ class RobertaClassifier(nn.Module):


  ```python
- MODEL_PATH = f"{MODEL_DIR}/english.pth"
+ LANGUAGE = "english" # choose from ["english", "chinese", "french", "german", "italian", "spanish", "portuguese", "polish", "dutch"]
+ MODEL_PATH = f"{MODEL_DIR}/{LANGUAGE}.pth"
  DEVICE = "cpu"

  model = RobertaClassifier(device=DEVICE, num_classes=2)
 
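As an illustration of the data-balancing sentence added above (subsample non-toxic examples to a 50%-50% mix, then hold out 10% for validation), here is a minimal sketch. It assumes examples are `(text, label)` pairs with label `1` meaning toxic; the helper name and seed are hypothetical, not from the released code:

```python
import random

def build_balanced_split(examples, val_frac=0.10, seed=0):
    """Downsample the non-toxic class to the toxic count, then hold out
    val_frac of the balanced pool as a validation set (hypothetical helper)."""
    rng = random.Random(seed)
    toxic = [ex for ex in examples if ex[1] == 1]
    nontoxic = [ex for ex in examples if ex[1] == 0]
    # Subsample non-toxic examples so both classes are equally represented.
    nontoxic = rng.sample(nontoxic, k=min(len(toxic), len(nontoxic)))
    balanced = toxic + nontoxic
    rng.shuffle(balanced)
    n_val = int(len(balanced) * val_frac)
    return balanced[n_val:], balanced[:n_val]  # (train, validation)
```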
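The body of `RobertaClassifier` is elided from this diff (only its signature appears in the second hunk header). A minimal sketch consistent with the two-stage description, XLM-RoBERTa document embeddings feeding a language-specific 2-layer MLP head, could look like this; the frozen encoder, masked mean pooling, and hidden width of 256 are assumptions rather than the released implementation:

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class RobertaClassifier(nn.Module):
    def __init__(self, device="cpu", num_classes=2, hidden_dim=256):
        super().__init__()
        # Stage 1: multilingual encoder used as a fixed feature extractor.
        self.encoder = AutoModel.from_pretrained("FacebookAI/xlm-roberta-base")
        # Stage 2: language-specific 2-layer MLP classification head.
        self.mlp = nn.Sequential(
            nn.Linear(self.encoder.config.hidden_size, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )
        self.device = device
        self.to(device)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():  # keep the encoder frozen (assumption)
            out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Mean-pool token states into one document embedding, ignoring padding.
        mask = attention_mask.unsqueeze(-1).float()
        emb = (out.last_hidden_state * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
        return self.mlp(emb)
```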
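The updated snippet stops after constructing the model. A hedged completion of the load-and-score flow, reusing `LANGUAGE`, `MODEL_PATH`, and `DEVICE` from the snippet and assuming the checkpoint stores a plain `state_dict` with class index `1` meaning toxic, could be:

```python
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/xlm-roberta-base")

model = RobertaClassifier(device=DEVICE, num_classes=2)
model.load_state_dict(torch.load(MODEL_PATH, map_location=DEVICE))  # state_dict layout is an assumption
model.eval()

text = "An example document to score."
batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512).to(DEVICE)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
toxicity_score = torch.softmax(logits, dim=-1)[0, 1].item()  # P(toxic), assuming index 1 = toxic
print(f"{LANGUAGE} toxicity score: {toxicity_score:.3f}")
```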