Gemma
Also known as: Gemma models, Gemmaverse
Gemma launched in February 2024 as Google's response to the growing demand for open-weight models, notably Meta's Llama family. Unlike the Gemini models that require API access, Gemma models are freely downloadable: developers can run them locally on their own hardware, fine-tune them on proprietary data, and deploy them without per-token fees or internet connectivity. Since launch, the developer community has created more than 100,000 Gemma variants (the community calls this ecosystem the Gemmaverse), and the models have been downloaded more than 400 million times.
Gemma 4, released April 2, 2026, is the most capable open-weight model Google has shipped. It comes in four sizes: E2B and E4B (edge models with 2-4B effective parameters, designed to run on phones and Raspberry Pi), 26B (a mixture-of-experts model for high throughput), and 31B (a dense model for consumer GPUs with 16GB+ VRAM). All variants support multimodal input (text and images, with audio on edge models), function calling for agentic workflows, a 256K token context window, and configurable thinking mode for step-by-step reasoning. Gemma 4 ships under Apache 2.0, the most permissive license Google has used for any AI model.
Gemma 4 is built from Gemini 3 research but tuned for efficiency at smaller scales. The 31B model scores 89.2% on AIME 2026 math and 80% on LiveCodeBench v6, competitive with models that were considered frontier a year earlier. For builders who want capable AI without cloud dependency, privacy exposure, or API costs, Gemma 4 is the most accessible Google-quality option available.