Any plan for Q4_K_M like quantization?
#10 by xuhenglight - opened
The int4 (4-bit) quantization does not work as well as Q4_K_M quantization. Is there any way to do Q4_K_M-like quantization using MLX?
xuhenglight changed discussion title from "Any plan for Q4_K_M like quantization" to "Any plan for Q4_K_M like quantization?"
I got feedback from an MLX team member: https://github.com/ml-explore/mlx/issues/1934#issuecomment-2702764245
However, I am not sure whether mixed_3_6 will perform better than int4 quantization.
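
For rough intuition about why the bit width matters, here is a minimal pure-Python sketch of group-wise affine quantization that compares round-trip error at 3, 4, and 6 bits. It is not MLX's implementation and not Q4_K_M. The function names, the group size of 64, and the random Gaussian weights are all assumptions made for illustration. It only measures per-tensor reconstruction error, so it cannot say whether mixed_3_6 (which, going by its name, presumably mixes 3-bit and 6-bit layers) gives better model quality than plain int4. That would need an actual evaluation such as perplexity on the converted model.

```python
import random


def quantize_roundtrip(weights, bits, group_size=64):
    """Affine-quantize each group of weights to `bits` bits, then dequantize.

    Hypothetical helper for illustration only, not an MLX or llama.cpp API.
    """
    levels = (1 << bits) - 1  # number of quantization steps per group
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Fall back to a scale of 1.0 when the group is constant (hi == lo).
        scale = (hi - lo) / levels or 1.0
        for w in group:
            q = round((w - lo) / scale)  # integer code in [0, levels]
            out.append(lo + q * scale)   # dequantized value
    return out


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Assumption: synthetic Gaussian weights stand in for a real weight tensor.
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {
    bits: mean_squared_error(weights, quantize_roundtrip(weights, bits))
    for bits in (3, 4, 6)
}
for bits, err in errors.items():
    print(f"{bits}-bit round-trip MSE: {err:.6f}")
```

The error shrinks as the bit width grows, so a mixed scheme trades higher error in its low-bit layers for lower error in its high-bit layers. Whether that trade beats uniform 4-bit depends on which layers get which width, and that can only be settled by evaluating the real model.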