GPU cloud
Also known as: cloud GPU, GPU-as-a-service, compute cloud
Training and running large AI models requires a specific type of hardware: GPUs, which were originally built for rendering graphics but turned out to be ideal for the matrix math inside neural networks. GPU clouds give builders on-demand access to this hardware without upfront capital investment. You provision a server, the GPU is already there, and you pay for the time you use it.
The market splits roughly into two camps. Hyperscalers (AWS, Google Cloud, Azure) offer GPUs alongside their broader ecosystem of storage, networking, and managed services. That is convenient if you are already on their stack, but it is often the more expensive route for raw GPU hours. Specialist GPU clouds (CoreWeave, Lambda, RunPod, Nebius) focus specifically on AI workloads, often offering better availability and lower prices for raw compute. The tradeoff is fewer non-GPU services bundled in.
For most builders, the choice comes down to workload type. Bursty experimental work fits pay-as-you-go or serverless options, where you pay only while a job is running. Sustained training runs or high-throughput inference often benefit from reserved capacity, which typically trades a time commitment for a lower hourly rate. And for teams that just need to call a model without managing any GPU at all, hosted inference APIs (like Groq or Together AI) abstract the hardware layer away entirely.
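The pay-as-you-go versus reserved decision can be roughed out with simple arithmetic. The Python sketch below is illustrative only: the function names and the hourly prices ($4.00 on-demand, $2.50 reserved) are hypothetical and are not quotes from any provider. It assumes on-demand capacity is billed only for busy hours, while reserved capacity is billed for every hour in the period whether or not the GPU is used.

```python
def break_even_utilization(on_demand_hourly: float, reserved_hourly: float) -> float:
    """Fraction of the period the GPU must be busy before reserved pricing wins.

    Assumes on-demand bills only busy hours and reserved bills every hour.
    """
    if on_demand_hourly <= 0 or reserved_hourly < 0:
        raise ValueError("prices must be positive")
    return reserved_hourly / on_demand_hourly


def cheaper_option(busy_hours: float, total_hours: float,
                   on_demand_hourly: float, reserved_hourly: float) -> str:
    """Compare total cost for one billing period under the same assumptions."""
    on_demand_cost = busy_hours * on_demand_hourly   # pay only while running
    reserved_cost = total_hours * reserved_hourly    # pay for the whole period
    return "on-demand" if on_demand_cost <= reserved_cost else "reserved"


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
ON_DEMAND = 4.00   # USD per GPU-hour, assumed
RESERVED = 2.50    # USD per GPU-hour, assumed
HOURS_IN_MONTH = 720

threshold = break_even_utilization(ON_DEMAND, RESERVED)
light_use = cheaper_option(100, HOURS_IN_MONTH, ON_DEMAND, RESERVED)
heavy_use = cheaper_option(600, HOURS_IN_MONTH, ON_DEMAND, RESERVED)
```

With these assumed prices the break-even point is 62.5% utilization: 100 busy hours in a 720-hour month favors on-demand, while 600 busy hours favors reserved. Real pricing adds factors this sketch ignores, such as commitment length, storage, and data transfer.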