Concept·Infrastructure·Added 1 month ago

GPU / TPU

Also known as: graphics processing unit, tensor processing unit, AI chip, accelerator, AI accelerator, AI hardware

The specialized chips that power AI. GPUs (Graphics Processing Units) were designed for rendering graphics but turned out to be perfect for the matrix math that runs neural networks. TPUs (Tensor Processing Units) are Google's custom-built AI chips. When someone talks about 'needing more compute,' they mean access to more of these.

Normal computer processors (CPUs) are fast but designed to handle a few complex tasks at a time. AI training and inference (the process of a model generating an answer) require doing billions of simple math operations simultaneously. GPUs were originally built for video games, which happen to need the same kind of parallel processing. Researchers discovered around 2012 that GPUs were also great for training neural networks, and that realization kicked off the modern AI era.

NVIDIA has dominated the GPU market for AI and built much of the underlying software ecosystem. Their H100 chip is the current industry workhorse for training large models. TPUs are Google's alternative: custom silicon optimized specifically for AI math. Google uses them to train and serve Gemini, and offers them through Google Cloud. Other companies like AWS, Microsoft, and startups like Groq and Cerebras are building their own AI chips, often optimized for inference (running a trained model) rather than training.

For most builders, the GPU/TPU question comes up in two places: choosing a cloud provider (AWS, GCP, Azure all offer GPU instances), and reading AI news about the chip supply, since access to GPUs constrains how fast labs can train and how cheaply you can run inference. The short version: GPUs are the oil wells of the AI economy right now.

This definition is AI-generated and refreshed weekly. It may contain inaccuracies. Use your own judgment, especially for production decisions.

Related terms