Concept·AI Models & Capabilities·Added 1 month ago

Image generation

Also known as: AI image generation, generative image, AI art generation

The umbrella term for AI systems that create new images, whether from text prompts, reference images, or other inputs. Includes text-to-image, image-to-image, inpainting, and related techniques, all now running in production at scale.

AI image generation went from research curiosity to commercial infrastructure in roughly three years. The category includes several distinct modes: text-to-image (describe and generate), image-to-image (use one image to guide the generation of another), inpainting (fill in or edit specific regions of an existing image), outpainting (extend an image beyond its current borders), and, more recently, style transfer and character-consistent generation across multiple outputs.

Through 2024, the dominant underlying technology was diffusion models, which generate images by learning to reverse a gradual noise-adding process. From 2024 into 2025, rectified flow transformers (used by Flux) and autoregressive approaches (used by GPT Image 2) emerged as serious alternatives, each with different tradeoffs in speed, controllability, and output character.
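To make the "noise-adding process" concrete, here is a minimal sketch of the forward (noising) half of a DDPM-style diffusion model, using only the Python standard library. The linear beta schedule, the step count, and the four-value toy "image" are illustrative assumptions, not details of any particular production model; the learned part of a real system is a network trained to undo this corruption, which is not shown.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule.

    The schedule endpoints are common illustrative defaults, assumed here.
    """
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(pixels, alpha_bar, rng):
    """Closed-form forward step: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * p + noise_scale * rng.gauss(0.0, 1.0) for p in pixels]


rng = random.Random(0)
image = [0.5, -0.25, 1.0, 0.0]  # toy stand-in for normalized pixel values
alpha_bars = alpha_bar_schedule(1000)

slightly_noisy = add_noise(image, alpha_bars[10], rng)  # early step: mostly signal
mostly_noise = add_noise(image, alpha_bars[-1], rng)    # final step: almost pure noise
```

As the step index grows, the cumulative signal weight shrinks toward zero, so the final sample is close to pure Gaussian noise. Generation runs the other way: start from noise and repeatedly apply the learned denoiser.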

For builders, image generation is increasingly a commodity API call rather than a distinct product. The interesting work is in the layer above: which prompts to construct programmatically, how to evaluate and filter outputs, how to maintain brand consistency, and how to serve images at scale without costs spiraling. These are the engineering problems that differentiate image-generation-powered products.
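As a rough illustration of that layer, the sketch below builds prompts from a shared brand style and caches results by a hash of the full request, so identical requests are not paid for twice. Everything named here is hypothetical: `BRAND_STYLE`, `build_prompt`, `cache_key`, `ImageCache`, the `example-model` identifier, and `fake_generate`, which stands in for whichever image API a product actually calls.

```python
import hashlib
import json

# Hypothetical brand style appended to every prompt for visual consistency.
BRAND_STYLE = "flat vector illustration, muted teal palette, no text"


def build_prompt(subject, style=BRAND_STYLE):
    """Normalize whitespace in the subject and append the shared style."""
    subject = " ".join(subject.split())
    return f"{subject}, {style}"


def cache_key(prompt, model, width, height, seed):
    """Stable hash over every parameter that affects the output."""
    payload = json.dumps(
        {"prompt": prompt, "model": model, "width": width,
         "height": height, "seed": seed},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ImageCache:
    """In-memory cache in front of a (hypothetical) generation function."""

    def __init__(self, generate):
        self._generate = generate
        self._store = {}
        self.misses = 0  # each miss corresponds to one paid generation call

    def get(self, subject, model="example-model", width=1024, height=1024, seed=0):
        prompt = build_prompt(subject)
        key = cache_key(prompt, model, width, height, seed)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._generate(prompt, model, width, height, seed)
        return self._store[key]


def fake_generate(prompt, model, width, height, seed):
    """Placeholder for a real image API call; returns dummy bytes."""
    return f"<image for: {prompt}>".encode("utf-8")


cache = ImageCache(fake_generate)
first = cache.get("a  red bicycle")   # extra space is normalized away
second = cache.get("a red bicycle")   # same request, served from cache
```

Fixing the seed is what makes caching meaningful in this sketch; if a product wants fresh variations on each request, the seed would be part of what varies, and the cache would only help for exact repeats.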

This definition is AI-generated and refreshed weekly. It may contain inaccuracies. Use your own judgment, especially for production decisions.
