# Diffusion Transformer (DiT)

**Diffusion Transformer (DiT)** replaces the [[U-Net]] denoiser in diffusion models with a pure [[Transformer]] operating on latent image patches. Introduced by Peebles & Xie (2022), it underpins Stable Diffusion 3, [[flux1|FLUX.1]], and Sora-style video architectures, and it is increasingly the default backbone for high-quality image and video generation. ^overview
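
To make "operating on latent image patches" concrete, here is a minimal NumPy sketch of the patchify step that turns a latent into a token sequence. The function names `patchify` and `unpatchify`, the `(C, H, W)` layout, and the 4×32×32 latent with patch size 2 are illustrative assumptions, not code from any DiT release. A real model would follow this step with a learned linear projection and positional embeddings, which are omitted here.

```python
import numpy as np


def patchify(latent: np.ndarray, patch_size: int) -> np.ndarray:
    """Split a (C, H, W) latent into (num_patches, patch_size * patch_size * C) tokens."""
    c, h, w = latent.shape
    if h % patch_size or w % patch_size:
        raise ValueError("H and W must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    # (C, gh, p, gw, p) -> (gh, gw, p, p, C): one row-major grid cell per token
    x = latent.reshape(c, gh, patch_size, gw, patch_size)
    x = x.transpose(1, 3, 2, 4, 0)
    return x.reshape(gh * gw, patch_size * patch_size * c)


def unpatchify(tokens: np.ndarray, channels: int, height: int, width: int,
               patch_size: int) -> np.ndarray:
    """Inverse of patchify: rebuild the (C, H, W) latent from the token sequence."""
    gh, gw = height // patch_size, width // patch_size
    x = tokens.reshape(gh, gw, patch_size, patch_size, channels)
    # (gh, gw, p, p, C) -> (C, gh, p, gw, p)
    x = x.transpose(4, 0, 2, 1, 3)
    return x.reshape(channels, height, width)


# Hypothetical example: a 4-channel 32x32 latent with 2x2 patches
latent = np.random.default_rng(0).standard_normal((4, 32, 32)).astype(np.float32)
tokens = patchify(latent, patch_size=2)
print(tokens.shape)  # (256, 16)
```

The sequence length grows quadratically as the patch size shrinks: halving `patch_size` quadruples the number of tokens the Transformer must attend over.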