tools_platforms_overview

# Tools and platforms

The ecosystem splits into **hosted platforms** (fast, low setup) and **local stacks** (more control). On the closed side: [[midjourney|Midjourney]], [[dalle|DALL·E]], [[adobe_firefly|Adobe Firefly]]. On the open side: [[stable_diffusion|Stable Diffusion]], [[flux1|FLUX.1]], [[comfyui|ComfyUI]], [[automatic1111|AUTOMATIC1111]], [[invokeai|InvokeAI]]. ^overview

That division is less fixed than it used to be. On the model side, labs that built closed products are releasing open-weight variants: [[flux1|FLUX.1]] from Black Forest Labs ships an **Apache-licensed variant** and an **open-weight version**, reaching benchmark scores similar to [[midjourney|Midjourney]] and [[dalle|DALL·E]]. Beyond models, large closed players (Microsoft, Meta, Apple, and others) are increasingly publishing tools, inference runtimes, coding libraries, and research frameworks under **OSS licenses**. As a result, the open stack adapts and evolves quickly by *pulling in state-of-the-art work from across the industry* rather than relying solely on the open-source community. ^blurring

The following is a short list of tools, platforms, and models related to AI generation. More will be added as needed.

## Closed platforms

Hosted services with subscription pricing: no local setup, strong safety filters, predictable performance. You can't inspect or modify the underlying weights. ^closed-platforms

### Midjourney

**Midjourney** delivers polished results through a web interface or Discord bot, with no local setup and subscription-based pricing.

### DALL·E

**DALL·E** (OpenAI) integrates into ChatGPT and the API. High coherence, strong prompt following, hard to deviate from.

### Adobe Firefly

**Adobe Firefly** is designed for commercial-safe use; Adobe states that it is trained on licensed and public-domain content. Embedded in the Creative Cloud suite.

## Open stacks

Open-weight models run locally on your hardware: no subscription, no rate limits, full control over the generation pipeline.
^open-stacks

### Stable Diffusion

**Stable Diffusion** pioneered the open-weight approach: freely downloadable model files that run on consumer GPU hardware. The foundation most open tools are built on.

### FLUX.1

**FLUX.1** from Black Forest Labs (founded by members of the original Stable Diffusion team) is the current state-of-the-art for open image generation. The `[schnell]` variant ships under Apache 2.0; `[dev]` is open-weight for non-commercial use. Black Forest Labs reports that it surpasses Midjourney v6.0 and DALL·E 3 on visual quality and prompt adherence in its own evaluations.

### ComfyUI

**ComfyUI** is a node-based UI for building custom generation pipelines. Highly extensible, and the standard for complex workflows like ControlNet chains, LoRA stacking, and multi-step refinement.

### AUTOMATIC1111

**AUTOMATIC1111** (A1111) is the traditional WebUI with the broadest extension ecosystem. Easier starting point than ComfyUI, widely documented.

### InvokeAI

**InvokeAI** offers a streamlined UX with strong ControlNet and inpainting support out of the box. Good middle ground between A1111 and ComfyUI.

## Bundled model access platforms

A new category has emerged: developer platforms offering subscription access to multiple paid model APIs through a single interface. Instead of managing separate accounts with OpenAI, Anthropic, and Google, you route requests through a unified gateway, gaining access to dozens of models under one subscription or API key. ^bundled-access

### GitHub Copilot

**GitHub Copilot** tiers grant access to multiple proprietary models: Pro ($10/month) includes Claude, GPT models, and Codex; Pro+ ($39/month) adds Claude Opus and the broadest model selection.

### GitHub Models

**GitHub Models** provides a unified playground and API: "one API key, limitless possibilities." Compare and deploy across 40+ models from OpenAI, Mistral, Meta, Microsoft, DeepSeek, Cohere, and xAI, all within GitHub's interface.

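
The unified-gateway pattern behind this category can be illustrated with a small offline sketch. Everything here is hypothetical: the `Gateway` class, its `register` and `complete` methods, and the `provider/model` id convention are illustrative names, not the API of GitHub Models or any other real service, and the backends are stubs rather than HTTP clients.

```python
from typing import Callable, Dict

# Hypothetical provider backend: takes a prompt, returns text.
# A real gateway would wrap HTTP clients; stubs keep this sketch offline.
Backend = Callable[[str], str]


class Gateway:
    """Toy unified gateway: one key, one call signature, many providers."""

    def __init__(self, api_key: str) -> None:
        self.api_key = api_key
        self._backends: Dict[str, Backend] = {}

    def register(self, provider: str, backend: Backend) -> None:
        self._backends[provider] = backend

    def complete(self, model: str, prompt: str) -> str:
        # Assumption: model ids look like "provider/model-name".
        provider, _, name = model.partition("/")
        if not name or provider not in self._backends:
            raise ValueError(f"unknown model id: {model!r}")
        return self._backends[provider](prompt)


gateway = Gateway(api_key="demo-key")
gateway.register("openai", lambda p: f"[openai] {p}")
gateway.register("mistral", lambda p: f"[mistral] {p}")

# The caller uses the same method regardless of which provider serves it.
reply_a = gateway.complete("openai/some-model", "hello")
reply_b = gateway.complete("mistral/some-model", "hello")
```

The point of the design is that calling code depends only on `complete`, so swapping or adding providers does not change the caller.
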
### OpenRouter

**OpenRouter** abstracts away provider fragmentation with a single API endpoint routing to hundreds of models. It handles fallbacks automatically and optimizes for cost, with the same interface whether you're calling Claude, GPT-5, or open-weight Mistral.

## Foundation models

Influential model families that underpin the platform ecosystem. Many of the tools above run on, fine-tune from, or integrate these architectures. ^foundation-models

### GPT

**GPT** (Generative Pre-trained Transformer) is OpenAI's decoder-only transformer series. GPT-3 and beyond power ChatGPT and the OpenAI API; frontier versions (GPT-4o, GPT-5) are closed-weight.

### BERT

**BERT** (Bidirectional Encoder Representations from Transformers) is Google's encoder-only transformer, pre-trained on masked language modelling. Foundational to text understanding, semantic search, and prompt-encoding stages inside image generation pipelines.

### Claude

**Claude** (Anthropic) is a family of LLMs known for long-context handling and reduced harmful outputs. Available via API and through GitHub Copilot subscriptions.

### Vision Transformer (ViT)

**Vision Transformer (ViT)** adapts the transformer to images by slicing them into fixed-size patches and treating each as a token: the same self-attention mechanism in a different domain. Introduced in "An Image is Worth 16×16 Words" (Dosovitskiy et al., 2020); now the backbone of most multimodal models.

### Diffusion Transformer (DiT)

**Diffusion Transformer (DiT)** replaces the U-Net denoiser in diffusion models with a pure transformer operating on latent image patches. Introduced by Peebles & Xie (2022); underpins Stable Diffusion 3, [[flux1|FLUX.1]], and Sora-style video architectures.
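
The patch tokenization that ViT and DiT both rely on can be sketched in plain Python. This is a minimal illustration, not code from either paper: the `patchify` helper and the single-channel nested-list image are assumptions made for readability. Real models apply this to multi-channel tensors (pixels for ViT, latents for DiT), then add a learned linear projection and position embeddings.

```python
from typing import List

Image = List[List[float]]  # single-channel H x W image as nested lists


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an image into non-overlapping patch x patch tiles and
    flatten each tile row-major into one token vector."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tokens = []
    # Walk tiles left to right, top to bottom, so token order is row-major.
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tokens


# A 4x4 image with pixel values 0..15, cut into 2x2 patches.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = patchify(image, patch=2)
print(len(tokens), tokens[0])  # 4 [0.0, 1.0, 4.0, 5.0]
```

Each resulting token plays the role a word embedding plays in a text transformer, which is why the same self-attention stack can be reused.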