Xenova (HF staff) committed
Commit 57aa3c6 · verified · 1 Parent(s): 4db41b7

Transformers.js improvements (#25)

- [WIP] Transformers.js improvements (b86e3eda672bc29427317d7d6753fbf808730d94)
- Update README.md (26cd37002254a59f114c509de0b80ef80aaf90d6)

Files changed (2):
  1. README.md +29 -2
  2. config.json +5 -0
README.md CHANGED
@@ -40,7 +40,7 @@ For more details refer to: https://github.com/huggingface/smollm. You will find
 
 ### How to use
 
-### Transformers
+#### Transformers
 ```bash
 pip install transformers
 ```
@@ -62,13 +62,40 @@ print(tokenizer.decode(outputs[0]))
 ```
 
 
-### Chat in TRL
+#### Chat in TRL
 You can also use the TRL CLI to chat with the model from the terminal:
 ```bash
 pip install trl
 trl chat --model_name_or_path HuggingFaceTB/SmolLM2-1.7B-Instruct --device cpu
 ```
 
+#### Transformers.js
+
+```bash
+npm i @huggingface/transformers
+```
+
+```js
+import { pipeline } from "@huggingface/transformers";
+
+// Create a text generation pipeline
+const generator = await pipeline(
+  "text-generation",
+  "HuggingFaceTB/SmolLM2-1.7B-Instruct",
+);
+
+// Define the list of messages
+const messages = [
+  { role: "system", content: "You are a helpful assistant." },
+  { role: "user", content: "Tell me a joke." },
+];
+
+// Generate a response
+const output = await generator(messages, { max_new_tokens: 128 });
+console.log(output[0].generated_text.at(-1).content);
+// "Why don't scientists trust atoms?\n\nBecause they make up everything!"
+```
+
 ## Evaluation
 
 In this section, we report the evaluation results of SmolLM2. All evaluations are zero-shot unless stated otherwise, and we use [lighteval](https://github.com/huggingface/lighteval) to run them.
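The pipeline example added above relies on the repo-level defaults introduced in the config.json change below. A minimal sketch of overriding those defaults at load time, assuming the `dtype` and `device` pipeline options documented for Transformers.js v3; the specific values chosen here are illustrative, not part of this commit:

```js
import { pipeline } from "@huggingface/transformers";

// Load q4f16 weights on WebGPU instead of the repo default ("q4").
// Per the kv_cache_dtype mapping below, a q4f16 load would keep its
// attention KV cache in float16.
const generator = await pipeline(
  "text-generation",
  "HuggingFaceTB/SmolLM2-1.7B-Instruct",
  { dtype: "q4f16", device: "webgpu" },
);
```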
config.json CHANGED
@@ -25,9 +25,14 @@
   "torch_dtype": "bfloat16",
   "transformers_version": "4.42.3",
   "transformers.js_config": {
+    "dtype": "q4",
     "kv_cache_dtype": {
       "q4f16": "float16",
       "fp16": "float16"
+    },
+    "use_external_data_format": {
+      "model.onnx": true,
+      "model_fp16.onnx": true
     }
   },
   "use_cache": true,