02-diffusion_basics - Mora Project

# [[01-ecosystem|Prev]] | Article 02: Diffusion basics | [[03-prompting_fundamentals|Next]]

Most modern AI image generators are built on **[[diffusion_models#^diffusion-definition|diffusion models]]**. Instead of guessing every pixel at once, *they iteratively refine a chaotic signal (noise) into a clear image*. This step-by-step refinement is what gives us fine-grained control over the final output.

![[diffusion_models#^overview]]

That "pure, random noise" isn't arbitrary: it is [[Gaussian noise#^ai-advantage|Gaussian noise]], a specific kind of randomness whose mathematical properties let the model reverse the corruption cleanly, one step at a time. The [[diffusion_models#Forward and Reverse Process|forward and reverse process]] only works because this noise is predictable enough to be undone.

## Latent space

![[latent_space#^overview]]

To solve this, modern generative models like [[stable_diffusion|Stable Diffusion]] operate in **latent space**: instead of working on raw pixels, the system [[latent_space#Semantic Compression vs. Pixel Compression|compresses]] the image into a smaller mathematical representation called a ==latent tensor==. This compact form retains the *meaning* of the image (shapes, textures, composition) rather than storing every pixel value, unlocking [[latent_space#The Benefits of Latent Operations benefits|significant efficiency and learning benefits]].

![[latent_space#^image-decompression]]

### Translating pixels to latent space

![[VAE#^overview]]

![[VAE#^two-opposed-processes]]

![[VAE#^swapping-vae]]

# Diffusion isn't just hallucinating random images; it is guided by your [[03-prompting_fundamentals|prompt (see article 3)]].
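
To make the earlier point about Gaussian noise being "predictable enough to be undone" more concrete, here is a minimal sketch of the forward (noising) step and its exact inversion on a toy four-value "image". It assumes the common DDPM-style closed form `x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε` with a linear beta schedule; the schedule values and the function names (`alpha_bar_schedule`, `add_noise`, `remove_noise`) are illustrative assumptions, not something this article or any specific model prescribes.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed values)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process in closed form: blend the clean signal with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    xt = [signal_scale * x + noise_scale * e for x, e in zip(x0, eps)]
    return xt, eps


def remove_noise(xt, eps, alpha_bar):
    """Invert the blend exactly, given the noise that was added."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - noise_scale * e) / signal_scale for x, e in zip(xt, eps)]


rng = random.Random(0)            # fixed seed so the run is repeatable
x0 = [0.2, -0.5, 0.9, 0.0]        # toy "image" of four pixel values
alpha_bars = alpha_bar_schedule(1000)

t = 500                           # an arbitrary mid-schedule timestep
xt, eps = add_noise(x0, alpha_bars[t], rng)
recovered = remove_noise(xt, eps, alpha_bars[t])
```

Here the inversion is exact only because the true noise `eps` is passed back in. In an actual diffusion model the noise is not known at generation time: a neural network is trained to estimate it, and the image is recovered gradually over many steps rather than in one jump.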
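
As a rough illustration of the efficiency argument in the latent space section, the arithmetic below compares the number of values in a raw RGB image with the number in its latent tensor. The 8x spatial downsampling and 4 latent channels are assumptions that match how the Stable Diffusion 1.x VAE is commonly described; other models use different factors, and `latent_shape` is a hypothetical helper written for this sketch.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape (channels, height, width) of the latent tensor for an image size.

    The defaults are assumptions for illustration, not fixed properties
    of every latent diffusion model.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be multiples of the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def element_count(shape):
    """Total number of values stored in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


pixel_shape = (3, 512, 512)                 # RGB image, channels first
latent = latent_shape(512, 512)             # (4, 64, 64) under the assumed defaults
ratio = element_count(pixel_shape) / element_count(latent)
print(f"pixels: {element_count(pixel_shape)}, latent: {element_count(latent)}, ratio: {ratio:.0f}x")
```

Under these assumptions each denoising step operates on 16,384 values instead of 786,432, which is where much of the speed and memory saving comes from.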