Convert fine-tuned TinyLlama-1.1B-Chat-v1.0 to ONNX Format
#2
by lakpriya · opened
Hi! I'm interested in using my own fine-tuned version of the TinyLlama-1.1B-Chat-v1.0 model with ONNX, which should also work with Transformers.js. I was wondering how you converted the model to ONNX format, and whether you used any specific tools or steps to quantize it to INT8. Could you share your conversion process or any scripts you used? I'd love to replicate it for local usage. Thanks in advance!
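Not speaking for the original authors, but one common route for this kind of conversion is Hugging Face Optimum, which wraps the ONNX export and ONNX Runtime quantization in a few calls. The sketch below assumes `pip install optimum[onnxruntime]`, and the checkpoint and output paths are placeholders for your own fine-tuned model directory:

```python
# Sketch of one possible conversion path using Hugging Face Optimum.
# Paths are placeholders; ./my-finetuned-tinyllama is assumed to be a
# standard Transformers checkpoint directory for your fine-tuned model.
from optimum.onnxruntime import ORTModelForCausalLM, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# 1. Export the fine-tuned model to ONNX (export=True triggers conversion).
model = ORTModelForCausalLM.from_pretrained(
    "./my-finetuned-tinyllama",  # placeholder: your fine-tuned checkpoint
    export=True,
)
model.save_pretrained("./tinyllama-onnx")

# 2. Dynamically quantize the exported weights to INT8 with ONNX Runtime.
quantizer = ORTQuantizer.from_pretrained("./tinyllama-onnx")
qconfig = AutoQuantizationConfig.avx2(is_static=False, per_channel=False)
quantizer.quantize(save_dir="./tinyllama-onnx-int8", quantization_config=qconfig)
```

The same export can also be done from the command line with `optimum-cli export onnx`. For use with Transformers.js, note that its loaders typically expect the ONNX files under an `onnx/` subfolder of the model repo, so you may need to rearrange the output directory accordingly.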
lakpriya changed discussion title from "Convert oTinyLlama-1.1B-Chat-v1.0 to ONNX Format" to "Convert fine-tuned TinyLlama-1.1B-Chat-v1.0 to ONNX Format"