feat: update evaluations.
README.md
CHANGED
@@ -338,8 +338,8 @@ The evaluation is based on using [Llama-3.1-8B-Instruct](https://huggingface.co/
 | **Model** | **Average** | **AQuAS** | **RagQuAS** | **CAM** | **CAM_E1** | **CAM_E2** | **CAM_E3** | **Shops** | **Insurance** |
 |----------------------------|-------------|-----------|-------------|----------|------------|------------|------------|-----------|---------------|
-| **RigoChat-7b-v2** | **79.
-| GPT-4o | 78.26 | **85.23** | 77.91 | 78.00 | 74.91 | 73.45 |
+| **RigoChat-7b-v2** | **79.55** | 82.52 | 79.10 | **78.91**| **79.17** | 76.73 | **78.23** | **80.79** | **81.04** |
+| GPT-4o | 78.26 | **85.23** | 77.91 | 78.00 | 74.91 | 73.45 | 77.09 | 78.60 | 80.89 |
 | stablelm-2-12b-chat | 77.74 | 78.88 | 78.21 | 77.82 | 78.73 | **77.27** | 74.73 | 77.03 | 79.26 |
 | Mistral-Small-Instruct-2409| 77.29 | 80.56 | 78.81 | 77.82 | 75.82 | 73.27 | 73.45 | 78.25 | 80.36 |
 | Qwen2.5-7B-Instruct | 77.17 | 80.93 | 77.41 | 77.82 | 75.09 | 75.45 | 72.91 | 78.08 | 79.67 |