QuantFactory/ArliAI-Llama-3-8B-Instruct-ORPO-v0.1-GGUF

This is a quantized version of OwenArli/ArliAI-Llama-3-8B-Instruct-ORPO-v0.1, created using llama.cpp.

Model Description

Based on Meta-Llama-3-8B-Instruct and governed by the Meta Llama 3 License agreement: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

Fine-tuned with the ORPO method on the following datasets:

https://huggingface.co/datasets/Intel/orca_dpo_pairs
https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo
https://huggingface.co/datasets/unalignment/toxic-dpo-v0.2
https://huggingface.co/datasets/M4-ai/prm_dpo_pairs_cleaned
https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1
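
For context, ORPO (odds ratio preference optimization) trains on chosen/rejected pairs like those in the datasets above by adding an odds-ratio penalty to the usual supervised loss. The sketch below is a minimal, framework-free illustration of that objective, not the training code used for this model. The function names, the `lam` weight of 0.1, and the use of length-averaged log-probabilities as inputs are assumptions made for illustration.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) given log p, where avg_logp must be negative."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Supervised loss on the chosen answer plus a weighted odds-ratio penalty.

    Inputs are the length-averaged log-probabilities the model assigns to the
    chosen and rejected responses (hypothetical values in this sketch).
    """
    # Supervised term: negative log-likelihood of the chosen response.
    sft_term = -chosen_avg_logp
    # Log odds ratio between the chosen and the rejected response.
    ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(ratio)), written as log(1 + exp(-ratio)).
    odds_term = math.log1p(math.exp(-ratio))
    return sft_term + lam * odds_term


# The loss is lower when the chosen response is the more likely one.
good = orpo_loss(chosen_avg_logp=-0.5, rejected_avg_logp=-2.0)
bad = orpo_loss(chosen_avg_logp=-2.0, rejected_avg_logp=-0.5)
print(good < bad)
```

In a real training run these log-probabilities would come from the model's forward pass over each pair, and the weight on the odds-ratio term is a tunable hyperparameter.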

Although toxic datasets were included to reduce refusals, this model is still relatively safe; it simply refuses less often than the original Meta model.

As of now, ORPO fine-tuning seems to improve some metrics while substantially reducing others.

Instruct format:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{{ system_prompt }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_1 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{{ model_answer_1 }}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{ user_message_2 }}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
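
The template above can be assembled programmatically. The following is a minimal sketch, assuming chat messages are held as a list of `role`/`content` dictionaries; the helper name `build_llama3_prompt` is hypothetical and not part of any library.

```python
from typing import Dict, List


def build_llama3_prompt(messages: List[Dict[str, str]]) -> str:
    """Render chat messages into the instruct format shown above."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the text, then <|eot_id|>.
        parts.append(f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n")
        parts.append(f"{message['content']}<|eot_id|>")
    # End with an open assistant header so the model writes the next reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example_prompt = build_llama3_prompt(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is ORPO?"},
    ]
)
print(example_prompt)
```

Some runtimes prepend the beginning-of-text token on their own or apply the chat template stored in the GGUF metadata; in that case the leading `<|begin_of_text|>` should be dropped or the built-in template used instead.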

Quants:

QuantFactory/ArliAI-Llama-3-8B-Instruct-ORPO-v0.1-GGUF