| Hermes-3-Llama-3.1-70B-Q8_0.gguf | Q8_0 | 74.98GB | ✅ | Extremely high quality, generally unneeded but max available quant. |
| Hermes-3-Llama-3.1-70B-Q6_K.gguf | Q6_K | 57.89GB | ✅ | Very high quality, near perfect, recommended. |
| Hermes-3-Llama-3.1-70B-Q5_K_M.gguf | Q5_K_M | 49.95GB | ✅ | High quality, recommended. |
| Hermes-3-Llama-3.1-70B-Q4_K_L.gguf | Q4_K_L | 43.30GB | ❌ | Uses Q8_0 for embed and output weights. Good quality, recommended. |
| Hermes-3-Llama-3.1-70B-Q4_K_M.gguf | Q4_K_M | 42.52GB | ❌ | Good quality, default size for most use cases, recommended. |
| Hermes-3-Llama-3.1-70B-Q4_K_S.gguf | Q4_K_S | 40.35GB | ❌ | Slightly lower quality with more space savings, recommended. |
| Hermes-3-Llama-3.1-70B-Q3_K_XL.gguf | Q3_K_XL | 38.06GB | ❌ | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| Hermes-3-Llama-3.1-70B-IQ4_XS.gguf | IQ4_XS | 37.90GB | ❌ | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Hermes-3-Llama-3.1-70B-Q3_K_L.gguf | Q3_K_L | 37.14GB | ❌ | Lower quality but usable, good for low RAM availability. |
| Hermes-3-Llama-3.1-70B-Q3_K_M.gguf | Q3_K_M | 34.27GB | ❌ | Low quality. |
| Hermes-3-Llama-3.1-70B-IQ3_M.gguf | IQ3_M | 31.94GB | ❌ | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Hermes-3-Llama-3.1-70B-Q3_K_S.gguf | Q3_K_S | 30.91GB | ❌ | Low quality, not recommended. |
| Hermes-3-Llama-3.1-70B-IQ3_XXS.gguf | IQ3_XXS | 27.47GB | ❌ | Lower quality, new method with decent performance, comparable to Q3 quants. |
| Hermes-3-Llama-3.1-70B-Q2_K_L.gguf | Q2_K_L | 27.40GB | ❌ | Uses Q8_0 for embed and output weights. Very low quality but surprisingly usable. |
| Hermes-3-Llama-3.1-70B-Q2_K.gguf | Q2_K | 26.38GB | ❌ | Very low quality but surprisingly usable. |
| Hermes-3-Llama-3.1-70B-IQ2_M.gguf | IQ2_M | 24.12GB | ❌ | Relatively low quality, uses SOTA techniques to be surprisingly usable. |
| Hermes-3-Llama-3.1-70B-IQ2_XS.gguf | IQ2_XS | 21.14GB | ❌ | Low quality, uses SOTA techniques to be usable. |
| Hermes-3-Llama-3.1-70B-IQ2_XXS.gguf | IQ2_XXS | 19.10GB | ❌ | Very low quality, uses SOTA techniques to be usable. |
| Hermes-3-Llama-3.1-70B-IQ1_M.gguf | IQ1_M | 16.75GB | ❌ | Extremely low quality, not recommended. |
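
As a rough illustration of how the sizes above might be used to choose a file, the following sketch picks the largest quant that fits within a given memory budget. The function name `pick_quant` and the default 2GB of headroom are assumptions made for this example, not part of any tool. File size is only a proxy for quality: per the descriptions above, IQ4_XS may be a better choice than the slightly larger Q3_K_XL, so treat the result as a starting point.

```python
# (quant type, file size in GB) pairs copied from the table above.
QUANTS = [
    ("Q8_0", 74.98), ("Q6_K", 57.89), ("Q5_K_M", 49.95),
    ("Q4_K_L", 43.30), ("Q4_K_M", 42.52), ("Q4_K_S", 40.35),
    ("Q3_K_XL", 38.06), ("IQ4_XS", 37.90), ("Q3_K_L", 37.14),
    ("Q3_K_M", 34.27), ("IQ3_M", 31.94), ("Q3_K_S", 30.91),
    ("IQ3_XXS", 27.47), ("Q2_K_L", 27.40), ("Q2_K", 26.38),
    ("IQ2_M", 24.12), ("IQ2_XS", 21.14), ("IQ2_XXS", 19.10),
    ("IQ1_M", 16.75),
]


def pick_quant(memory_gb, headroom_gb=2.0):
    """Return the largest (quant, size_gb) that fits, or None.

    memory_gb:   total RAM and/or VRAM available for the model.
    headroom_gb: space reserved for context and runtime overhead
                 (hypothetical default; tune for your setup).
    """
    budget = memory_gb - headroom_gb
    fitting = [q for q in QUANTS if q[1] <= budget]
    if not fitting:
        return None
    # Largest file that still fits within the budget.
    return max(fitting, key=lambda q: q[1])


print(pick_quant(48.0))  # ('Q4_K_L', 43.3)
print(pick_quant(24.0))
print(pick_quant(16.0))
```

With 48GB available and 2GB reserved, the budget is 46GB, so the sketch selects Q4_K_L. A budget too small for even IQ1_M returns `None`.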