normundsg committed · Commit 390f23f · verified · 1 Parent(s): ad00ffd

Update README.md

Files changed (1)
  1. README.md +99 -97
README.md CHANGED
@@ -1,98 +1,100 @@
- ---
- license: mit
- datasets:
- - SkyWater21/lv_go_emotions
- language:
- - lv
- ---
- Fine-tuned [multilingual BERT](https://huggingface.co/google-bert/bert-base-multilingual-cased) for multi-label emotion classification task.
-
- Model was trained on [lv_go_emotions](https://huggingface.co/datasets/SkyWater21/lv_go_emotions) dataset. This dataset is Latvian translation of [GoEmotions](https://huggingface.co/datasets/go_emotions) dataset. Google Translate was used to generate the machine translation.
-
- Original 26 emotions were mapped to 6 base emotions as per Dr. Ekman theory.
-
- Labels predicted by classifier:
- ```yaml
- 0: anger
- 1: disgust
- 2: fear
- 3: joy
- 4: sadness
- 5: surprise
- 6: neutral
- ```
-
- Label mapping from 27 emotions from GoEmotion to 6 base emotions as per Dr. Ekman theory:
- |GoEmotion|Ekman|
- |---|---|
- | admiration | joy|
- | amusement | joy|
- | anger | anger|
- | annoyance | anger|
- | approval | joy|
- | caring | joy|
- | confusion | surprise|
- | curiosity | surprise|
- | desire | joy|
- | disappointment | sadness|
- | disapproval | anger|
- | disgust | disgust|
- | embarrassment | sadness|
- | excitement | joy|
- | fear | fear|
- | gratitude | joy|
- | grief | sadness|
- | joy | joy|
- | love | joy|
- | nervousness | fear|
- | optimism | joy|
- | pride | joy|
- | realization | surprise|
- | relief | joy|
- | remorse | sadness|
- | sadness | sadness|
- | surprise | surprise|
- | neutral | neutral|
-
- Seed used for random number generator is 42:
- ```python
- def set_seed(seed=42):
-     random.seed(seed)
-     np.random.seed(seed)
-     torch.manual_seed(seed)
-     if torch.cuda.is_available():
-         torch.cuda.manual_seed_all(seed)
- ```
-
- Training parameters:
- ```yaml
- max_length: null
- batch_size: 64
- shuffle: True
- num_workers: 8
- pin_memory: False
- drop_last: False
- optimizer: adam
- lr: 0.00001
- weight_decay: 0
-
- problem_type: multi_label_classification
-
- num_epochs: 4
- ```
-
-
- Evaluation results on test split of [lv_go_emotions](https://huggingface.co/datasets/SkyWater21/lv_go_emotions/viewer/simplified_ekman)
- | |Precision|Recall|F1-Score|AUC-ROC|Support|
- |--------------|---------|------|--------|-------|-------|
- |anger | 0.58| 0.36| 0.45| 0.83| 726|
- |disgust | 0.88| 0.12| 0.21| 0.90| 123|
- |fear | 0.75| 0.48| 0.58| 0.93| 98|
- |joy | 0.82| 0.76| 0.79| 0.90| 2104|
- |sadness | 0.69| 0.46| 0.55| 0.88| 379|
- |surprise | 0.61| 0.51| 0.55| 0.87| 677|
- |neutral | 0.65| 0.62| 0.64| 0.83| 1787|
- |micro avg | 0.71| 0.60| 0.65| 0.92| 5894|
- |macro avg | 0.71| 0.47| 0.54| 0.88| 5894|
- |weighted avg | 0.71| 0.60| 0.64| 0.87| 5894|
+ ---
+ license: mit
+ datasets:
+ - AiLab-IMCS-UL/go_emotions-lv
+ language:
+ - lv
+ base_model:
+ - google-bert/bert-base-multilingual-cased
+ ---
+ Fine-tuned [multilingual BERT](https://huggingface.co/google-bert/bert-base-multilingual-cased) for multi-label emotion classification task.
+
+ Model was trained on [lv_go_emotions](https://huggingface.co/datasets/SkyWater21/lv_go_emotions) dataset. This dataset is Latvian translation of [GoEmotions](https://huggingface.co/datasets/go_emotions) dataset. Google Translate was used to generate the machine translation.
+
+ Original 26 emotions were mapped to 6 base emotions as per Dr. Ekman theory.
+
+ Labels predicted by classifier:
+ ```yaml
+ 0: anger
+ 1: disgust
+ 2: fear
+ 3: joy
+ 4: sadness
+ 5: surprise
+ 6: neutral
+ ```
+
+ Label mapping from 27 emotions from GoEmotion to 6 base emotions as per Dr. Ekman theory:
+ |GoEmotion|Ekman|
+ |---|---|
+ | admiration | joy|
+ | amusement | joy|
+ | anger | anger|
+ | annoyance | anger|
+ | approval | joy|
+ | caring | joy|
+ | confusion | surprise|
+ | curiosity | surprise|
+ | desire | joy|
+ | disappointment | sadness|
+ | disapproval | anger|
+ | disgust | disgust|
+ | embarrassment | sadness|
+ | excitement | joy|
+ | fear | fear|
+ | gratitude | joy|
+ | grief | sadness|
+ | joy | joy|
+ | love | joy|
+ | nervousness | fear|
+ | optimism | joy|
+ | pride | joy|
+ | realization | surprise|
+ | relief | joy|
+ | remorse | sadness|
+ | sadness | sadness|
+ | surprise | surprise|
+ | neutral | neutral|
+
+ Seed used for random number generator is 42:
+ ```python
+ def set_seed(seed=42):
+     random.seed(seed)
+     np.random.seed(seed)
+     torch.manual_seed(seed)
+     if torch.cuda.is_available():
+         torch.cuda.manual_seed_all(seed)
+ ```
+
+ Training parameters:
+ ```yaml
+ max_length: null
+ batch_size: 64
+ shuffle: True
+ num_workers: 8
+ pin_memory: False
+ drop_last: False
+ optimizer: adam
+ lr: 0.00001
+ weight_decay: 0
+
+ problem_type: multi_label_classification
+
+ num_epochs: 4
+ ```
+
+
+ Evaluation results on test split of [lv_go_emotions](https://huggingface.co/datasets/SkyWater21/lv_go_emotions/viewer/simplified_ekman)
+ | |Precision|Recall|F1-Score|AUC-ROC|Support|
+ |--------------|---------|------|--------|-------|-------|
+ |anger | 0.58| 0.36| 0.45| 0.83| 726|
+ |disgust | 0.88| 0.12| 0.21| 0.90| 123|
+ |fear | 0.75| 0.48| 0.58| 0.93| 98|
+ |joy | 0.82| 0.76| 0.79| 0.90| 2104|
+ |sadness | 0.69| 0.46| 0.55| 0.88| 379|
+ |surprise | 0.61| 0.51| 0.55| 0.87| 677|
+ |neutral | 0.65| 0.62| 0.64| 0.83| 1787|
+ |micro avg | 0.71| 0.60| 0.65| 0.92| 5894|
+ |macro avg | 0.71| 0.47| 0.54| 0.88| 5894|
+ |weighted avg | 0.71| 0.60| 0.64| 0.87| 5894|
 |samples avg | 0.63| 0.62| 0.62| nan| 5894|
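
The README in this diff lists seven output labels, a 28-row GoEmotion to Ekman mapping (27 emotions plus neutral), and `problem_type: multi_label_classification`. A minimal sketch of how those pieces might be used follows. The label order and mapping are copied from the README; the helper names `to_ekman` and `decode_logits`, the per-label sigmoid, and the 0.5 threshold are assumptions for illustration, since the README does not state how predictions were thresholded.

```python
import math

# Label order copied from the README's "Labels predicted by classifier" block.
ID2LABEL = {0: "anger", 1: "disgust", 2: "fear", 3: "joy",
            4: "sadness", 5: "surprise", 6: "neutral"}

# GoEmotion -> Ekman mapping copied from the README table.
GOEMOTIONS_TO_EKMAN = {
    "admiration": "joy", "amusement": "joy", "anger": "anger",
    "annoyance": "anger", "approval": "joy", "caring": "joy",
    "confusion": "surprise", "curiosity": "surprise", "desire": "joy",
    "disappointment": "sadness", "disapproval": "anger", "disgust": "disgust",
    "embarrassment": "sadness", "excitement": "joy", "fear": "fear",
    "gratitude": "joy", "grief": "sadness", "joy": "joy", "love": "joy",
    "nervousness": "fear", "optimism": "joy", "pride": "joy",
    "realization": "surprise", "relief": "joy", "remorse": "sadness",
    "sadness": "sadness", "surprise": "surprise", "neutral": "neutral",
}


def to_ekman(labels):
    """Collapse a list of GoEmotion labels into sorted, de-duplicated Ekman labels."""
    return sorted({GOEMOTIONS_TO_EKMAN[label] for label in labels})


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode_logits(logits, threshold=0.5):
    """Score each label independently and keep those at or above the threshold.

    Assumption: a per-label sigmoid with a 0.5 cut-off, which is a common
    choice for multi-label heads but is not confirmed by the README.
    """
    return [ID2LABEL[i] for i, z in enumerate(logits) if _sigmoid(z) >= threshold]
```

For example, `to_ekman(["admiration", "annoyance", "joy"])` collapses three fine-grained labels into `["anger", "joy"]`. Because each label is scored independently, `decode_logits` can return several labels or none at all for a single text.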