| Yi-Coder-9B-Chat-f16.gguf | f16 | 17.66GB | False | Full F16 weights. |
| Yi-Coder-9B-Chat-Q8_0.gguf | Q8_0 | 9.38GB | False | Extremely high quality, generally unneeded but max available quant. |
| Yi-Coder-9B-Chat-Q6_K_L.gguf | Q6_K_L | 7.37GB | False | Uses Q8_0 for embed and output weights. Very high quality, near perfect, recommended. |
| Yi-Coder-9B-Chat-Q6_K.gguf | Q6_K | 7.25GB | False | Very high quality, near perfect, recommended. |
| Yi-Coder-9B-Chat-Q5_K_L.gguf | Q5_K_L | 6.42GB | False | Uses Q8_0 for embed and output weights. High quality, recommended. |
| Yi-Coder-9B-Chat-Q5_K_M.gguf | Q5_K_M | 6.26GB | False | High quality, recommended. |
| Yi-Coder-9B-Chat-Q5_K_S.gguf | Q5_K_S | 6.11GB | False | High quality, recommended. |
| Yi-Coder-9B-Chat-Q4_K_L.gguf | Q4_K_L | 5.52GB | False | Uses Q8_0 for embed and output weights. Good quality, recommended. |
| Yi-Coder-9B-Chat-Q4_K_M.gguf | Q4_K_M | 5.33GB | False | Good quality, default size for most use cases, recommended. |
| Yi-Coder-9B-Chat-Q4_K_S.gguf | Q4_K_S | 5.07GB | False | Slightly lower quality with more space savings, recommended. |
| Yi-Coder-9B-Chat-Q4_0.gguf | Q4_0 | 5.05GB | False | Legacy format, generally not worth using over similarly sized formats. |
| Yi-Coder-9B-Chat-Q4_0_8_8.gguf | Q4_0_8_8 | 5.04GB | False | Optimized for ARM inference. Requires 'sve' support (see link below). |
| Yi-Coder-9B-Chat-Q4_0_4_8.gguf | Q4_0_4_8 | 5.04GB | False | Optimized for ARM inference. Requires 'i8mm' support (see link below). |
| Yi-Coder-9B-Chat-Q4_0_4_4.gguf | Q4_0_4_4 | 5.04GB | False | Optimized for ARM inference. Should work well on all ARM chips, pick this if you're unsure. |
| Yi-Coder-9B-Chat-Q3_K_XL.gguf | Q3_K_XL | 4.92GB | False | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| Yi-Coder-9B-Chat-IQ4_XS.gguf | IQ4_XS | 4.79GB | False | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Yi-Coder-9B-Chat-Q3_K_L.gguf | Q3_K_L | 4.69GB | False | Lower quality but usable, good for low RAM availability. |
| Yi-Coder-9B-Chat-Q3_K_M.gguf | Q3_K_M | 4.32GB | False | Low quality. |
| Yi-Coder-9B-Chat-IQ3_M.gguf | IQ3_M | 4.06GB | False | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Yi-Coder-9B-Chat-Q3_K_S.gguf | Q3_K_S | 3.90GB | False | Low quality, not recommended. |
| Yi-Coder-9B-Chat-IQ3_XS.gguf | IQ3_XS | 3.72GB | False | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| Yi-Coder-9B-Chat-Q2_K_L.gguf | Q2_K_L | 3.61GB | False | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
| Yi-Coder-9B-Chat-Q2_K.gguf | Q2_K | 3.35GB | False | Very low quality but surprisingly usable. |
| Yi-Coder-9B-Chat-IQ2_M.gguf | IQ2_M | 3.10GB | False | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
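
As a rough illustration of how the sizes listed above can guide a choice, the following sketch picks the largest quant that fits a given memory budget. The function name `pick_quant`, the `QUANT_SIZES_GB` dictionary, and the default 1.5 GB headroom are assumptions made for this example and are not part of any tool. The sizes are copied from the table, leaving out the legacy Q4_0 and the ARM-specific Q4_0_X_X variants.

```python
from typing import Optional

# File sizes in GB, copied from the table above.
# Q4_0 (legacy) and the ARM-only Q4_0_X_X variants are left out on purpose.
QUANT_SIZES_GB = {
    "f16": 17.66, "Q8_0": 9.38, "Q6_K_L": 7.37, "Q6_K": 7.25,
    "Q5_K_L": 6.42, "Q5_K_M": 6.26, "Q5_K_S": 6.11,
    "Q4_K_L": 5.52, "Q4_K_M": 5.33, "Q4_K_S": 5.07,
    "Q3_K_XL": 4.92, "IQ4_XS": 4.79, "Q3_K_L": 4.69, "Q3_K_M": 4.32,
    "IQ3_M": 4.06, "Q3_K_S": 3.90, "IQ3_XS": 3.72,
    "Q2_K_L": 3.61, "Q2_K": 3.35, "IQ2_M": 3.10,
}


def pick_quant(memory_gb: float, headroom_gb: float = 1.5) -> Optional[str]:
    """Return the largest quant whose file fits in memory_gb minus headroom_gb.

    headroom_gb is an assumed allowance for context and runtime overhead.
    Returns None when no listed file fits.
    """
    budget = memory_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for mem in (4.0, 6.0, 8.0, 24.0):
        print(f"{mem:>5.1f} GB -> {pick_quant(mem)}")
```

File size is only a proxy for quality here: the table notes, for example, that IQ4_XS is smaller than Q4_K_S with similar performance, so the descriptions are still worth reading before settling on the largest file that fits.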