Concept·Infrastructure·Added 1 month ago

Windows Agent Runtime

Also known as: WAR, Windows agent runtime service, Windows AI agent runtime

A system-level service built into Windows that lets lightweight AI agents run locally on the device, using the NPU (the neural processing unit, a dedicated AI accelerator now shipping in modern PCs) instead of sending every request to the cloud. Announced by Microsoft at Build 2026.

Most AI agent infrastructure lives in the cloud: you send a request, a server runs the model, you get a response. Windows Agent Runtime flips that for a class of lightweight tasks. It is a built-in OS service, arriving in the second half of 2026, that can execute small agents directly on the device using the NPU that ships in Copilot+ PCs. As a result, tasks such as note summarization, document formatting, and scheduling assistance can run offline, with no data leaving the machine.

At Build 2026, Microsoft shipped companion pieces: on-device models like MAI-3B (a 3-billion-parameter model optimized for agentic tasks like reasoning and tool use) that install as optional Windows updates, and a Windows AI Agent API that gives developers standard interfaces for memory, planning, and tool calling. Developers can slot in pre-validated models or bring fine-tuned versions of their own.
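To make the three interfaces concrete, here is a minimal Python sketch of how memory, planning, and tool calling might fit together in a small agent loop. It is not the Windows AI Agent API: every name here (`Memory`, `ListMemory`, `Planner`, `Tool`, `run_agent`, `fixed_planner`) is hypothetical and chosen for illustration, and the planner is a fixed stand-in for what would presumably be a call into an on-device model.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Memory(Protocol):
    """Hypothetical memory interface: store and retrieve short notes."""

    def remember(self, note: str) -> None: ...
    def recall(self) -> list[str]: ...


@dataclass
class ListMemory:
    """Simplest possible memory: an in-process list."""

    notes: list[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.notes.append(note)

    def recall(self) -> list[str]:
        return list(self.notes)


# Hypothetical: a tool is a named function from text to text.
Tool = Callable[[str], str]

# Hypothetical: a planner turns a task into ordered (tool name, input) steps.
Planner = Callable[[str], list[tuple[str, str]]]


def run_agent(
    task: str, planner: Planner, tools: dict[str, Tool], memory: Memory
) -> list[str]:
    """Execute each planned step with its tool and record the result in memory."""
    results = []
    for tool_name, tool_input in planner(task):
        if tool_name not in tools:
            raise KeyError(f"unknown tool: {tool_name}")
        output = tools[tool_name](tool_input)
        memory.remember(f"{tool_name}: {output}")
        results.append(output)
    return results


def fixed_planner(task: str) -> list[tuple[str, str]]:
    # Stand-in for a model-driven planner: trim the text, then title-case it.
    return [("strip", task), ("title", task.strip())]


tools = {"strip": str.strip, "title": str.title}
memory = ListMemory()
outputs = run_agent("  weekly status notes ", fixed_planner, tools, memory)
```

The point of separating the three pieces is that each can be swapped independently: a different model behind the planner, a persistent store behind the memory, or a different tool set, without changing the loop itself.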

For builders, this matters because it creates a new deployment target: agents that run in the background on a user's laptop without needing a cloud subscription or constant connectivity. It also raises new questions about governance, since agents with broad OS access can take real actions on files, apps, and network resources. Microsoft's answer is role-based access controls and an Agent Activity Center that logs every step the agent takes.
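The governance idea (check each action against a role, and log every step whether or not it was allowed) can be sketched in a few lines of Python. This is an illustration of the general pattern only, not Microsoft's implementation: the role names, permission strings, `ActivityLog` class, and `perform` function are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission mapping; real roles and scopes are not
# specified in the source.
ROLE_PERMISSIONS = {
    "reader": {"file.read"},
    "editor": {"file.read", "file.write"},
}


@dataclass
class ActivityLog:
    """Append-only record of every action an agent attempts."""

    entries: list[dict] = field(default_factory=list)

    def record(self, agent: str, action: str, target: str, allowed: bool) -> None:
        self.entries.append(
            {"agent": agent, "action": action, "target": target, "allowed": allowed}
        )


def perform(agent: str, role: str, action: str, target: str, log: ActivityLog) -> bool:
    """Check the action against the role, log the attempt, and return the decision."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    # Denied attempts are logged too, so the log shows what the agent tried.
    log.record(agent, action, target, allowed)
    return allowed


log = ActivityLog()
can_read = perform("summarizer", "reader", "file.read", "notes.txt", log)
can_write = perform("summarizer", "reader", "file.write", "notes.txt", log)
```

Here the `reader` role is allowed to read `notes.txt` but not to write it, and both attempts end up in `log.entries`. Unknown roles get an empty permission set, so the sketch denies by default.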

This definition is AI-generated and refreshed weekly. It may contain inaccuracies. Use your own judgment, especially for production decisions.
