stabilityai/stable-diffusion-3-medium

Last updated on Jun 20, 2024


Original seeder @aitracker

Stable Diffusion 3 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model developed by Stability AI. Compared with previous versions of Stable Diffusion, it improves image quality, typography, complex prompt understanding, and resource efficiency. It is released under the Stability AI Non-Commercial Research Community License for non-commercial use; commercial use requires a separate license.

Key Features

Improved Performance: Stable Diffusion 3 Medium outperforms state-of-the-art text-to-image generation systems in typography and prompt adherence, based on human preference evaluations.
Multimodal Diffusion Transformer (MMDiT): The model uses separate sets of weights for image and language representations, enhancing text understanding and spelling capabilities compared to previous versions of Stable Diffusion.
Resource Efficiency: Stable Diffusion 3 Medium is the 2B-parameter member of the Stable Diffusion 3 family, whose models range from 800M to 8B parameters to accommodate different hardware capabilities.
Text Encoders: The model conditions on three fixed, pretrained text encoders: OpenCLIP-ViT/G, CLIP-ViT/L, and T5-xxl.
Training Dataset: The model was pre-trained on 1 billion images and fine-tuned on 30M high-quality aesthetic images and 3M preference data images.
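
As a rough illustration of what the 800M to 8B parameter range means for hardware, the sketch below estimates the memory needed just to hold the diffusion transformer weights at a given numeric precision. The helper name `weight_memory_gib` and the precision table are illustrative assumptions, and the estimate ignores the text encoders, the image autoencoder, and activation memory, so real usage will be higher.

```python
# Back-of-the-envelope estimate: weight memory = parameter count x bytes per parameter.
# This counts only the diffusion transformer weights, not text encoders or activations.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}  # common storage precisions


def weight_memory_gib(num_params: int, dtype: str = "fp16") -> float:
    """Return the approximate size of the weights in GiB for the given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    # 800M and 8B are the ends of the range quoted above; 2B is the Medium variant.
    for label, n in [("800M", 800_000_000), ("2B", 2_000_000_000), ("8B", 8_000_000_000)]:
        print(f"{label}: about {weight_memory_gib(n, 'fp16'):.1f} GiB in fp16")
```

Halving the precision halves the estimate, which is why half-precision weights are the usual starting point on consumer GPUs.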

Availability and Licensing

Non-Commercial Use: Released under the Stability AI Non-Commercial Research Community License for academic research and other non-commercial purposes.
Commercial Use: Requires a separate commercial license from Stability AI.
Platforms: Available on the Stability API Platform, Stable Assistant, and Discord via Stable Artisan.
Local Use: ComfyUI is recommended for local inference.
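
ComfyUI is the recommended route, but for scripted use the model can also be driven from Python. The following is a minimal sketch using the Hugging Face `diffusers` library, which is a swapped-in alternative and not the workflow this page recommends. It assumes `diffusers` 0.29 or later with `torch` and `transformers` installed, a CUDA GPU, and that the license has been accepted for the gated `stabilityai/stable-diffusion-3-medium-diffusers` repository. The step count and guidance scale are illustrative defaults, not tuned values.

```python
# Sketch only: heavy imports are deferred so the helpers can be read and tested
# without downloading the (gated) model weights.

MODEL_ID = "stabilityai/stable-diffusion-3-medium-diffusers"  # assumed Hugging Face repo id


def load_pipeline(model_id: str = MODEL_ID, use_t5: bool = True, device: str = "cuda"):
    """Load the SD3 pipeline in half precision; optionally skip the T5-xxl encoder."""
    import torch
    from diffusers import StableDiffusion3Pipeline

    kwargs = {"torch_dtype": torch.float16}
    if not use_t5:
        # Dropping the third text encoder saves memory at some cost in prompt fidelity.
        kwargs["text_encoder_3"] = None
        kwargs["tokenizer_3"] = None
    pipe = StableDiffusion3Pipeline.from_pretrained(model_id, **kwargs)
    return pipe.to(device)


def generate(pipe, prompt: str, steps: int = 28, guidance_scale: float = 7.0):
    """Run one text-to-image generation and return the first image."""
    result = pipe(
        prompt=prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

A typical call would be `image = generate(load_pipeline(), "a sign that says hello")` followed by `image.save("out.png")`. Passing `use_t5=False` is one way to fit the model on a smaller GPU.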

Safety and Evaluation

Safety Measures: Stability AI implemented safety measures throughout development to reduce the risk of severe harms.
Evaluation Methods: Evaluation included structured evaluations plus internal and external red-teaming for specific, severe harms.
Risks and Mitigations: Identified risks include harmful content, misuse, and privacy violations. Mitigations include filtered datasets, safeguards, and content safety guardrails.