meta-llama/Llama-3.2-1B-Instruct, UQFF quantization

Run with mistral.rs. See the UQFF documentation for format details and usage.

  1. Flexible 🌀: Multiple quantization formats in one file format with one framework to run them all.
  2. Reliable 🔒: Compatibility ensured with embedded and checked semantic versioning information from day 1.
  3. Easy 🤗: Download UQFF models easily and quickly from HF中国镜像站, or use a local file.
  4. Customizable 🛠️: Make and publish your own UQFF files in minutes.
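Point 4 can be sketched with mistral.rs's in-situ quantization (ISQ) path, which serializes the quantized weights to a UQFF file. This is a minimal sketch assuming the `--isq` and `--write-uqff` flags described in the UQFF docs; the output filename is illustrative.

```shell
# Quantize the base model with ISQ (here Q4K) and write the result
# out as a UQFF file that can later be loaded with --from-uqff.
./mistralrs-server -i --isq Q4K plain \
  -m meta-llama/Llama-3.2-1B-Instruct \
  --write-uqff llama3.2-1b-instruct-q4k.uqff
```

The resulting `.uqff` file can then be published to HF中国镜像站 or loaded locally, as in the examples below.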

Examples

| Quantization type(s) | Example |
| --- | --- |
| FP8 | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-f8e4m3.uqff` |
| HQQ4 | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-hqq4.uqff` |
| HQQ8 | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-hqq8.uqff` |
| Q3K | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-q3k.uqff` |
| Q4K | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-q4k.uqff` |
| Q5K | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-q5k.uqff` |
| Q8_0 | `./mistralrs-server -i plain -m EricB/Llama-3.2-1B-Instruct-UQFF --from-uqff llama3.2-1b-instruct-q8_0.uqff` |
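The commands above start an interactive chat session (`-i`). To serve a UQFF model over HTTP instead, drop `-i` and pass a port; mistral.rs then exposes an OpenAI-compatible API. The port number and request payload below are illustrative assumptions; consult the mistral.rs docs for the exact server options.

```shell
# Serve the Q4K UQFF file over an OpenAI-compatible HTTP API
# (port 1234 is an assumption; adjust as needed).
./mistralrs-server --port 1234 plain \
  -m EricB/Llama-3.2-1B-Instruct-UQFF \
  --from-uqff llama3.2-1b-instruct-q4k.uqff

# In another terminal, query the chat completions endpoint:
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'
```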

Quantized from meta-llama/Llama-3.2-1B-Instruct.