stabilityai/stable-diffusion-3-medium
Original seeder @aitracker
Stable Diffusion 3 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model with significantly improved image quality, typography, complex prompt understanding, and resource efficiency. It is developed by Stability AI and released under the Stability AI Non-Commercial Research Community License for non-commercial use; commercial use requires a separate license.
Key Features
- Improved Performance: Stable Diffusion 3 Medium outperforms state-of-the-art text-to-image generation systems in typography and prompt adherence, based on human preference evaluations.
- Multimodal Diffusion Transformer (MMDiT): The model uses separate sets of weights for image and language representations, enhancing text understanding and spelling capabilities compared to previous versions of Stable Diffusion.
- Resource Efficiency: The Stable Diffusion 3 family spans parameter sizes from 800M to 8B to accommodate different hardware capabilities; Medium is the 2B-parameter variant.
- Flexible Text Encoders: The model uses three fixed (frozen), pretrained text encoders (OpenCLIP-ViT/G, CLIP-ViT/L, and T5-xxl) to encode the prompt.
- Training Dataset: The model was pre-trained on 1 billion images and fine-tuned on 30M high-quality aesthetic images and 3M preference data images.
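The MMDiT idea of "separate sets of weights for image and language representations" can be illustrated with a toy sketch. It is not Stability AI's implementation. It assumes, following the published MMDiT description, that each modality projects its tokens with its own weights and that the two token sequences are then concatenated for a single attention operation. The names ModalityProjections and joint_attention are hypothetical, the dimensions are tiny, and multi-head attention, normalization, and timestep conditioning are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights or token embeddings."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ModalityProjections:
    """Q/K/V projection weights owned by ONE modality (hypothetical helper)."""

    def __init__(self, dim, rng):
        self.wq = random_matrix(dim, dim, rng)
        self.wk = random_matrix(dim, dim, rng)
        self.wv = random_matrix(dim, dim, rng)

    def project(self, tokens):
        return matmul(tokens, self.wq), matmul(tokens, self.wk), matmul(tokens, self.wv)


def joint_attention(text_tokens, image_tokens, text_proj, image_proj):
    """Project each modality with its own weights, then attend over both jointly."""
    tq, tk, tv = text_proj.project(text_tokens)
    iq, ik, iv = image_proj.project(image_tokens)

    # Concatenate along the sequence axis: text tokens first, then image tokens.
    q, k, v = tq + iq, tk + ik, tv + iv

    scale = 1.0 / math.sqrt(len(q[0]))
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s * scale for s in row]) for row in scores]
    mixed = matmul(weights, v)

    # Split the result back into the two streams.
    n_text = len(text_tokens)
    return mixed[:n_text], mixed[n_text:]


rng = random.Random(0)
text = random_matrix(3, 4, rng)    # 3 text tokens, embedding size 4
image = random_matrix(5, 4, rng)   # 5 image patch tokens, embedding size 4
text_out, image_out = joint_attention(
    text, image, ModalityProjections(4, rng), ModalityProjections(4, rng)
)
print(len(text_out), len(image_out))
```

Because attention runs over the concatenated sequence, every text token can attend to every image token and vice versa, while the projection weights stay modality-specific.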
Availability and Licensing
- Non-Commercial Use: Released under the Stability AI Non-Commercial Research Community License for academic research and other non-commercial purposes.
- Commercial Use: Requires a separate commercial license from Stability AI.
- Platforms: Available on the Stability API Platform, Stable Assistant, and Discord via Stable Artisan.
- Local Use: ComfyUI is the recommended tool for local inference.
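For scripted local inference outside ComfyUI, one alternative is the Hugging Face diffusers library. The sketch below is a hedged example, not the recommended path from this page. It assumes the Diffusers-format weights in the gated repository stabilityai/stable-diffusion-3-medium-diffusers (license acceptance and a Hugging Face login are required), a diffusers release that includes StableDiffusion3Pipeline, and a CUDA GPU. GenerationSettings and generate are hypothetical helper names, and the step count and guidance scale are illustrative defaults rather than tuned values.

```python
from dataclasses import asdict, dataclass

# Assumption: Diffusers-format weights are published under this gated repo id.
MODEL_ID = "stabilityai/stable-diffusion-3-medium-diffusers"


@dataclass
class GenerationSettings:
    """Hypothetical container for the pipeline call arguments."""

    prompt: str
    negative_prompt: str = ""
    num_inference_steps: int = 28  # illustrative default, not a tuned value
    guidance_scale: float = 7.0    # illustrative default, not a tuned value
    height: int = 1024
    width: int = 1024

    def to_kwargs(self):
        """Return the settings as keyword arguments for the pipeline call."""
        return asdict(self)


def generate(settings, device="cuda"):
    """Load the pipeline and render one image for the given settings."""
    # Imported lazily so this module can be loaded without the GPU stack installed.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    pipe = pipe.to(device)
    return pipe(**settings.to_kwargs()).images[0]


# Usage (needs torch, diffusers, a GPU, and access to the gated repo):
# image = generate(GenerationSettings(prompt="a cat holding a sign that says hello world"))
# image.save("sd3_medium_sample.png")
```

Keeping the settings in a small dataclass makes it easy to log or vary them between runs without touching the loading code.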
Safety and Evaluation
- Safety Measures: Stability AI implemented safety measures throughout development to reduce the risk of severe harms.
- Evaluation Methods: Evaluation includes structured evaluations and internal and external red-team testing for specific, severe harms.
- Risks and Mitigations: Identified risks include harmful content, misuse, and privacy violations; mitigations include filtered datasets, safeguards, and content safety guardrails.