bartowski/Pantheon-RP-1.6-12b-Nemo-GGUF-torrent

Last updated on Aug 20, 2024

Filename	Quant Type	File Size	Split	Description
Pantheon-RP-1.6-12b-Nemo-bf16.gguf	bf16	24.50GB	❌	Full BF16 weights.
Pantheon-RP-1.6-12b-Nemo-Q8_0.gguf	Q8_0	13.02GB	❌	Extremely high quality; generally unneeded, but the largest quant available.
Pantheon-RP-1.6-12b-Nemo-Q6_K_L.gguf	Q6_K_L	10.38GB	❌	Uses Q8_0 for embed and output weights. Very high quality, near perfect, recommended.
Pantheon-RP-1.6-12b-Nemo-Q6_K.gguf	Q6_K	10.06GB	❌	Very high quality, near perfect, recommended.
Pantheon-RP-1.6-12b-Nemo-Q5_K_L.gguf	Q5_K_L	9.14GB	❌	Uses Q8_0 for embed and output weights. High quality, recommended.
Pantheon-RP-1.6-12b-Nemo-Q5_K_M.gguf	Q5_K_M	8.73GB	❌	High quality, recommended.
Pantheon-RP-1.6-12b-Nemo-Q5_K_S.gguf	Q5_K_S	8.52GB	❌	High quality, recommended.
Pantheon-RP-1.6-12b-Nemo-Q4_K_L.gguf	Q4_K_L	7.98GB	❌	Uses Q8_0 for embed and output weights. Good quality, recommended.
Pantheon-RP-1.6-12b-Nemo-Q4_K_M.gguf	Q4_K_M	7.48GB	❌	Good quality, default size for most use cases, recommended.
Pantheon-RP-1.6-12b-Nemo-Q3_K_XL.gguf	Q3_K_XL	7.15GB	❌	Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability.
Pantheon-RP-1.6-12b-Nemo-Q4_K_S.gguf	Q4_K_S	7.12GB	❌	Slightly lower quality with more space savings, recommended.
Pantheon-RP-1.6-12b-Nemo-IQ4_XS.gguf	IQ4_XS	6.74GB	❌	Decent quality, smaller than Q4_K_S with similar performance, recommended.
Pantheon-RP-1.6-12b-Nemo-Q3_K_L.gguf	Q3_K_L	6.56GB	❌	Lower quality but usable, good for low RAM availability.
Pantheon-RP-1.6-12b-Nemo-Q3_K_M.gguf	Q3_K_M	6.08GB	❌	Low quality.
Pantheon-RP-1.6-12b-Nemo-IQ3_M.gguf	IQ3_M	5.72GB	❌	Medium-low quality; new method with decent performance, comparable to Q3_K_M.
Pantheon-RP-1.6-12b-Nemo-Q3_K_S.gguf	Q3_K_S	5.53GB	❌	Low quality, not recommended.
Pantheon-RP-1.6-12b-Nemo-Q2_K_L.gguf	Q2_K_L	5.45GB	❌	Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable.
Pantheon-RP-1.6-12b-Nemo-IQ3_XS.gguf	IQ3_XS	5.31GB	❌	Lower quality, new method with decent performance, slightly better than Q3_K_S.
Pantheon-RP-1.6-12b-Nemo-Q2_K.gguf	Q2_K	4.79GB	❌	Very low quality but surprisingly usable.
Pantheon-RP-1.6-12b-Nemo-IQ2_M.gguf	IQ2_M	4.44GB	❌	Relatively low quality; uses state-of-the-art techniques to remain surprisingly usable.
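
Since several rows are described in terms of RAM availability, a rough way to shortlist a file is to take the largest one that fits in the memory you have. The sketch below is only an illustration: the sizes are copied from the table, while `pick_quant` and its `headroom_gb` parameter are hypothetical names, and the 2 GB default headroom is an assumed allowance for context and runtime overhead, not a figure from this listing.

```python
# File sizes in GB, copied from the table above.
QUANTS = [
    ("bf16", 24.50), ("Q8_0", 13.02), ("Q6_K_L", 10.38), ("Q6_K", 10.06),
    ("Q5_K_L", 9.14), ("Q5_K_M", 8.73), ("Q5_K_S", 8.52), ("Q4_K_L", 7.98),
    ("Q4_K_M", 7.48), ("Q3_K_XL", 7.15), ("Q4_K_S", 7.12), ("IQ4_XS", 6.74),
    ("Q3_K_L", 6.56), ("Q3_K_M", 6.08), ("IQ3_M", 5.72), ("Q3_K_S", 5.53),
    ("Q2_K_L", 5.45), ("IQ3_XS", 5.31), ("Q2_K", 4.79), ("IQ2_M", 4.44),
]


def pick_quant(memory_gb, headroom_gb=2.0):
    """Return the largest (quant, size_gb) pair that fits, or None.

    headroom_gb is an assumed allowance for context and runtime overhead;
    adjust it for your own setup.
    """
    budget = memory_gb - headroom_gb
    fitting = [q for q in QUANTS if q[1] <= budget]
    # Largest file that still fits within the budget.
    return max(fitting, key=lambda q: q[1]) if fitting else None


print(pick_quant(12))  # ('Q5_K_L', 9.14)
print(pick_quant(6))   # None
```

File size is only a proxy for quality: Q3_K_XL (7.15GB) is larger than Q4_K_S (7.12GB) but is described above as lower quality, so check the Description column before settling on a file.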