RichardErkhov committed (verified)
Commit: 76325a8
Parent: 9870ff0

uploaded readme

Files changed (1): README.md (+67, -0)

README.md ADDED
@@ -0,0 +1,67 @@
Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

kukulemon-v3-soul_mix-32k-7B - bnb 4bits
- Model creator: https://huggingface.co/grimjim/
- Original model: https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B/

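The files here are the original model quantized to 4-bit with bitsandbytes. Below is a minimal loading sketch with `transformers`; the repository id is a placeholder for this quant's actual repo, and `bitsandbytes` plus `accelerate` are assumed to be installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual repository hosting this 4-bit quant.
repo_id = "RichardErkhov/kukulemon-v3-soul_mix-32k-7B-4bits"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# The bitsandbytes 4-bit quantization config is saved with the checkpoint,
# so a plain from_pretrained call restores the quantized weights.
model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")

prompt = "Write a short poem about lemons."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
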
Original model description:
---
base_model:
- grimjim/kukulemon-32K-7B
- grimjim/rogue-enchantress-32k-7B
library_name: transformers
tags:
- mergekit
- merge
license: cc-by-nc-4.0
pipeline_tag: text-generation
---
# kukulemon-v3-soul_mix-32k-7B

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

We explore merging at an extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5 (i.e., 1e-4), selected to be comparable in effect to a few epochs of training. At such a low weight, the contribution of the additional model is effectively flattened, though technically not sparsified.

- [Full weights](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B)
- [GGUF quants](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B-GGUF)

## Merge Details
### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as the base.

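As a rough numerical illustration (a toy per-tensor sketch, not mergekit's actual implementation), task arithmetic with a single donor model adds a scaled "task vector" (donor minus base) onto the base weights; at a weight of 10e-5 the resulting change is tiny:

```python
import torch

def task_arithmetic_single(base: torch.Tensor, donor: torch.Tensor, weight: float = 10e-5) -> torch.Tensor:
    """Toy per-tensor task arithmetic with one donor model.

    The donor's task vector (donor - base) is scaled by `weight` and added
    back onto the base tensor, mirroring the weight used in the config below.
    """
    task_vector = donor - base
    return base + weight * task_vector

# Random tensors standing in for one weight matrix of each model.
base = torch.randn(4096, 4096)
donor = torch.randn(4096, 4096)
merged = task_arithmetic_single(base, donor)
print((merged - base).abs().max())  # exactly 1e-4 of the largest donor-base delta
```
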
### Models Merged

The following model was included in the merge:
* [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: grimjim/kukulemon-32K-7B
dtype: bfloat16
merge_method: task_arithmetic
slices:
- sources:
  - layer_range: [0, 32]
    model: grimjim/kukulemon-32K-7B
  - layer_range: [0, 32]
    model: grimjim/rogue-enchantress-32k-7B
    parameters:
      weight: 10e-5
```
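To reproduce a merge from a configuration like this, one option is mergekit's Python entry point; this is a sketch assuming the API shown in the mergekit README, with placeholder paths for the config file and output directory (the `mergekit-yaml` command-line tool performs the same merge):

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "merge-config.yaml"  # the YAML above, saved locally (placeholder path)
OUTPUT_PATH = "./merged-model"    # directory to write the merged weights to

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=False,           # set True to run the merge on GPU
        copy_tokenizer=True,  # copy the base model's tokenizer into the output
    ),
)
```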