meta-llama/Llama-3.2-3B-Instruct, UQFF quantization

Run with mistral.rs. Documentation: UQFF docs.

  1. Flexible 🌀: Multiple quantization formats in one file format with one framework to run them all.
  2. Reliable 🔒: Compatibility ensured with embedded and checked semantic versioning information from day 1.
  3. Easy 🤗: Download UQFF models easily and quickly from the HF China mirror site, or use a local file.
  4. Customizable 🛠️: Make and publish your own UQFF files in minutes.

Examples

| Quantization type(s) | Example |
|--|--|
| FP8 | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-f8e4m3.uqff` |
| HQQ4 | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-hqq4.uqff` |
| HQQ8 | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-hqq8.uqff` |
| Q3K | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-q3k.uqff` |
| Q4K | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-q4k.uqff` |
| Q5K | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-q5k.uqff` |
| Q8_0 | `./mistralrs-server -i plain -m EricB/Llama-3.2-3B-Instruct-UQFF --from-uqff llama3.2-3b-instruct-q8_0.uqff` |
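
The commands above differ only in the UQFF file name, so a script that switches between quantization types can build them from a lookup table. The following is a minimal sketch in Python, not part of mistral.rs: the names `UQFF_FILES` and `build_command` are hypothetical, and the flags and file names are copied verbatim from the examples above. It only assembles the argument list; it does not check that the binary or the files exist.

```python
import shlex

MODEL_ID = "EricB/Llama-3.2-3B-Instruct-UQFF"

# Quantization type -> UQFF file name, copied from the examples above.
UQFF_FILES = {
    "FP8": "llama3.2-3b-instruct-f8e4m3.uqff",
    "HQQ4": "llama3.2-3b-instruct-hqq4.uqff",
    "HQQ8": "llama3.2-3b-instruct-hqq8.uqff",
    "Q3K": "llama3.2-3b-instruct-q3k.uqff",
    "Q4K": "llama3.2-3b-instruct-q4k.uqff",
    "Q5K": "llama3.2-3b-instruct-q5k.uqff",
    "Q8_0": "llama3.2-3b-instruct-q8_0.uqff",
}


def build_command(quant, binary="./mistralrs-server"):
    """Return the argument list for running this model with one quantization.

    `quant` is matched case-insensitively against the keys of UQFF_FILES.
    (Hypothetical helper: it only mirrors the commands listed above.)
    """
    try:
        uqff_file = UQFF_FILES[quant.upper()]
    except KeyError:
        known = ", ".join(sorted(UQFF_FILES))
        raise ValueError(f"unknown quantization {quant!r}; expected one of: {known}") from None
    return [binary, "-i", "plain", "-m", MODEL_ID, "--from-uqff", uqff_file]


if __name__ == "__main__":
    # Print the command as a copy-pasteable shell line.
    print(shlex.join(build_command("Q4K")))
```

The returned list can be passed directly to `subprocess.run` if the script should launch the server itself instead of printing the command.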