Any plan for Q4_K_M like quantization?

#10
by xuhenglight - opened

The 4-bit integer (int4) quantization does not work as well as Q4_K_M quantization. Is there any way to do Q4_K_M-like quantization using MLX?

xuhenglight changed discussion title from Any plan for Q4_K_M like quantization to Any plan for Q4_K_M like quantization?

I got feedback from an MLX team member: https://github.com/ml-explore/mlx/issues/1934#issuecomment-2702764245
but I'm not sure whether mixed_3_6 will perform better than int4 quantization.
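
One rough way to get a feel for how bit width affects precision is to measure round-trip error on synthetic weights. The sketch below is plain Python, not MLX code: it assumes a simple group-wise affine (min/max) quantizer with a hypothetical group size of 64, and the helper names `quantize_roundtrip` and `mean_squared_error` are made up for illustration. It says nothing about which layers mixed_3_6 assigns to which width, and real model quality should be checked with perplexity or a task benchmark rather than with this toy measurement.

```python
import random


def quantize_roundtrip(weights, bits, group_size=64):
    """Quantize each group to `bits` bits with a min/max affine map, then dequantize."""
    levels = (1 << bits) - 1  # highest integer code, e.g. 15 for 4 bits
    restored = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # Guard against a constant group, where hi == lo would give a zero scale.
        scale = (hi - lo) / levels if hi > lo else 1.0
        for w in group:
            code = round((w - lo) / scale)      # integer code in [0, levels]
            restored.append(lo + code * scale)  # dequantized value
    return restored


def mean_squared_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic Gaussian "weights" (an assumption; real weight distributions differ).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {
    bits: mean_squared_error(weights, quantize_roundtrip(weights, bits))
    for bits in (3, 4, 6)
}
for bits, err in errors.items():
    print(f"{bits}-bit round-trip MSE: {err:.6f}")
```

The error shrinks as the bit width grows, so a mixed scheme trades lower precision in its 3-bit layers against higher precision in its 6-bit layers. Whether that beats uniform int4 depends on which layers get which width, which this sketch does not model.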
