---
language:
  - en
license: apache-2.0
---

# LLAMA-3_8B_Unaligned_Alpha_RP_Soup

## Model Details

This model is the outcome of multiple merges, starting with the base model SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha. The merging process was conducted in several stages:

- **Merge 1:** LLAMA-3_8B_Unaligned_Alpha was SLERP merged with TheDrummer/Llama-3SOME-8B-v2 (the config below uses the BeaverAI/Llama-3SOME-8B-v2d checkpoint).
- **Merge 2:** LLAMA-3_8B_Unaligned_Alpha was SLERP merged with invisietch/EtherealRainbow-v0.3-8B.
- **Soup 1:** Merge 1 was combined with Merge 2.
- **Final Merge:** Soup 1 was SLERP merged with Nitral-Archive/Hathor_Enigmatica-L3-8B-v0.4.

The final model is surprisingly coherent (although slightly more censored), which is somewhat unexpected given that all of the intermediate merges were fairly incoherent.
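All of the SLERP steps use mergekit's `slerp` merge method, which interpolates each pair of weight tensors along the arc of a hypersphere rather than along a straight line; this tends to preserve the geometry of the weights better than a plain weighted average. Below is a minimal, illustrative sketch of per-tensor SLERP (it ignores mergekit's layer filtering and per-layer `t` schedule, which the configs further down configure):

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t = 0 returns `a`, t = 1 returns `b`; values in between move
    along the arc between the two tensors' directions.
    """
    a_dir = a.ravel() / (np.linalg.norm(a) + eps)
    b_dir = b.ravel() / (np.linalg.norm(b) + eps)
    dot = np.clip(np.dot(a_dir, b_dir), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two tensors
    if omega < eps:
        # Nearly parallel tensors: fall back to linear interpolation.
        return (1.0 - t) * a + t * b
    so = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b
```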

## Mergekit configs

### Merge 1

```yaml
slices:
  - sources:
      - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
        layer_range: [0, 32]
      - model: BeaverAI/Llama-3SOME-8B-v2d
        layer_range: [0, 32]
merge_method: slerp
base_model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
parameters:
  t:
    # interpolation schedule across layer blocks:
    # 0 = keep the base model, 1 = take the merged-in model
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
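To reproduce any of these stages, save the config to a file and run it through mergekit's CLI, e.g. `mergekit-yaml merge1.yml ./merge1` (the filename and output directory here are arbitrary placeholders; the output directory of one stage can then be referenced as a local model path in the next).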

### Merge 2

```yaml
slices:
  - sources:
      - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
        layer_range: [0, 32]
      - model: invisietch/EtherealRainbow-v0.3-8B
        layer_range: [0, 32]
merge_method: slerp
base_model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```

### Soup 1

```yaml
slices:
  - sources:
      - model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
        layer_range: [0, 32]
      - model: Nitral-Archive/Hathor_Enigmatica-L3-8B-v0.4
        layer_range: [0, 32]
merge_method: slerp
base_model: SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```

### Final Merge

```yaml
slices:
  - sources:
      - model: Soup 1 # local path to the Soup 1 intermediate
        layer_range: [0, 32]
      - model: Nitral-Archive/Hathor_Enigmatica-L3-8B-v0.4
        layer_range: [0, 32]
merge_method: slerp
base_model: Soup 1 # local path to the Soup 1 intermediate
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
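
For completeness, here is a minimal sketch of loading and sampling from the final merge with transformers (the repo id below is assumed to be this model's; adjust the prompt and generation settings to taste):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SicariusSicariiStuff/LLAMA-3_8B_Unaligned_Alpha_RP_Soup"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the merge dtype
    device_map="auto",
)

prompt = "Write a short scene set in a rainy cyberpunk city."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```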