| Pantheon-RP-1.6-12b-Nemo-bf16.gguf | bf16 | 24.50GB | ❌ | Full BF16 weights. |
| Pantheon-RP-1.6-12b-Nemo-Q8_0.gguf | Q8_0 | 13.02GB | ❌ | Extremely high quality, generally unneeded but max available quant. |
| Pantheon-RP-1.6-12b-Nemo-Q6_K_L.gguf | Q6_K_L | 10.38GB | ❌ | Uses Q8_0 for embed and output weights. Very high quality, near perfect, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q6_K.gguf | Q6_K | 10.06GB | ❌ | Very high quality, near perfect, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q5_K_L.gguf | Q5_K_L | 9.14GB | ❌ | Uses Q8_0 for embed and output weights. High quality, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q5_K_M.gguf | Q5_K_M | 8.73GB | ❌ | High quality, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q5_K_S.gguf | Q5_K_S | 8.52GB | ❌ | High quality, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q4_K_L.gguf | Q4_K_L | 7.98GB | ❌ | Uses Q8_0 for embed and output weights. Good quality, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q4_K_M.gguf | Q4_K_M | 7.48GB | ❌ | Good quality, default size for most use cases, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q3_K_XL.gguf | Q3_K_XL | 7.15GB | ❌ | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| Pantheon-RP-1.6-12b-Nemo-Q4_K_S.gguf | Q4_K_S | 7.12GB | ❌ | Slightly lower quality with more space savings, recommended. |
| Pantheon-RP-1.6-12b-Nemo-IQ4_XS.gguf | IQ4_XS | 6.74GB | ❌ | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q3_K_L.gguf | Q3_K_L | 6.56GB | ❌ | Lower quality but usable, good for low RAM availability. |
| Pantheon-RP-1.6-12b-Nemo-Q3_K_M.gguf | Q3_K_M | 6.08GB | ❌ | Low quality. |
| Pantheon-RP-1.6-12b-Nemo-IQ3_M.gguf | IQ3_M | 5.72GB | ❌ | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Pantheon-RP-1.6-12b-Nemo-Q3_K_S.gguf | Q3_K_S | 5.53GB | ❌ | Low quality, not recommended. |
| Pantheon-RP-1.6-12b-Nemo-Q2_K_L.gguf | Q2_K_L | 5.45GB | ❌ | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
| Pantheon-RP-1.6-12b-Nemo-IQ3_XS.gguf | IQ3_XS | 5.31GB | ❌ | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| Pantheon-RP-1.6-12b-Nemo-Q2_K.gguf | Q2_K | 4.79GB | ❌ | Very low quality but surprisingly usable. |
| Pantheon-RP-1.6-12b-Nemo-IQ2_M.gguf | IQ2_M | 4.44GB | ❌ | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
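
As a rough illustration of how the sizes listed above might be used, the sketch below picks the largest quant that fits a given memory budget. The function name `pick_quant` and the 1.5 GB default headroom (room for context and runtime overhead) are assumptions made for this example, not something the listing specifies; the sizes are copied from the entries above.

```python
# File sizes in GB, copied from the quant listing above.
QUANT_SIZES_GB = {
    "bf16": 24.50, "Q8_0": 13.02, "Q6_K_L": 10.38, "Q6_K": 10.06,
    "Q5_K_L": 9.14, "Q5_K_M": 8.73, "Q5_K_S": 8.52, "Q4_K_L": 7.98,
    "Q4_K_M": 7.48, "Q3_K_XL": 7.15, "Q4_K_S": 7.12, "IQ4_XS": 6.74,
    "Q3_K_L": 6.56, "Q3_K_M": 6.08, "IQ3_M": 5.72, "Q3_K_S": 5.53,
    "Q2_K_L": 5.45, "IQ3_XS": 5.31, "Q2_K": 4.79, "IQ2_M": 4.44,
}


def pick_quant(memory_gb, headroom_gb=1.5, sizes=QUANT_SIZES_GB):
    """Return the name of the largest quant whose file fits in memory.

    headroom_gb is an assumed allowance for context and runtime overhead;
    adjust it for your own setup. Returns None if nothing fits.
    """
    budget = memory_gb - headroom_gb
    fitting = {name: size for name, size in sizes.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def quant_filename(quant):
    """Build the file name following the naming pattern used in the listing."""
    return f"Pantheon-RP-1.6-12b-Nemo-{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(12)  # e.g. a 12 GB budget
    print(choice, quant_filename(choice) if choice else "nothing fits")
```

File size is only a proxy for quality: for example, Q3_K_XL (7.15GB) is larger than Q4_K_S (7.12GB), yet the descriptions rate Q4_K_S higher, so it is worth checking the description of whichever quant this selects.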