Windows Agent Runtime
Also known as: WAR, Windows agent runtime service, Windows AI agent runtime
Most AI agent infrastructure lives in the cloud: you send a request, a server runs the model, you get a response. Windows Agent Runtime reverses that arrangement for a class of lightweight tasks. It is a built-in OS service, arriving in the second half of 2026, that can execute small agents directly on the device using the NPU that ships in Copilot+ PCs. As a result, note summarization, document formatting, scheduling assistance, and similar tasks can run offline with no data leaving the machine.
At Build 2026, Microsoft shipped companion pieces: on-device models like MAI-3B (a 3-billion-parameter model optimized for agentic tasks like reasoning and tool use) that install as optional Windows updates, and a Windows AI Agent API that gives developers standard interfaces for memory, planning, and tool calling. Developers can slot in pre-validated models or bring fine-tuned versions of their own.
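To make the idea of "standard interfaces for memory, planning, and tool calling" concrete, here is a minimal Python sketch of how those three pieces might fit together. It is not the Windows AI Agent API: every name in it (Memory, Planner, DictMemory, FixedPlanner, Agent, and the two toy tools) is hypothetical, and the planner is a hard-coded stand-in for what would be an on-device model in practice.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional, Protocol


class Memory(Protocol):
    """Hypothetical memory interface: store and look up short strings."""
    def remember(self, key: str, value: str) -> None: ...
    def recall(self, key: str) -> Optional[str]: ...


class Planner(Protocol):
    """Hypothetical planning interface: turn a goal into (tool, argument) steps."""
    def plan(self, goal: str) -> list[tuple[str, str]]: ...


@dataclass
class DictMemory:
    """In-process memory backed by a plain dict (illustration only)."""
    store: dict[str, str] = field(default_factory=dict)

    def remember(self, key: str, value: str) -> None:
        self.store[key] = value

    def recall(self, key: str) -> Optional[str]:
        return self.store.get(key)


class FixedPlanner:
    """Stand-in for a model-backed planner: always returns the same two steps."""
    def plan(self, goal: str) -> list[tuple[str, str]]:
        return [("word_count", goal), ("upper", goal)]


@dataclass
class Agent:
    memory: Memory
    planner: Planner
    tools: dict[str, Callable[[str], str]]  # tool name -> callable

    def run(self, goal: str) -> list[str]:
        results = []
        for tool_name, arg in self.planner.plan(goal):
            output = self.tools[tool_name](arg)       # tool calling
            self.memory.remember(tool_name, output)   # keep the last result per tool
            results.append(output)
        return results


agent = Agent(
    memory=DictMemory(),
    planner=FixedPlanner(),
    tools={"word_count": lambda s: str(len(s.split())), "upper": str.upper},
)
outputs = agent.run("summarize my notes")
```

The point of the separation is that any one piece can be swapped without touching the others, which mirrors the description of slotting in a pre-validated model or a fine-tuned one behind the same interface.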
For builders, this matters because it creates a new deployment target: agents that run in the background on a user's laptop without needing a cloud subscription or constant connectivity. It also raises new questions about governance, since agents with broad OS access can take real actions on files, apps, and network resources. Microsoft's answer is role-based access controls and an Agent Activity Center that logs every step the agent takes.
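The governance pattern described here, role-based permission checks combined with a log of every step, can be sketched in a few lines of Python. This is an illustration of the general pattern only, not Microsoft's implementation: the role names, permission strings, and the ActivityLog and GovernedAgent classes are all assumptions made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical role -> permitted actions mapping.
ROLE_PERMISSIONS = {
    "reader": {"file.read"},
    "editor": {"file.read", "file.write"},
}


@dataclass
class ActivityLog:
    """Append-only record of every action an agent attempts."""
    entries: list = field(default_factory=list)

    def record(self, agent: str, action: str, target: str, allowed: bool) -> None:
        self.entries.append(
            {"agent": agent, "action": action, "target": target, "allowed": allowed}
        )


@dataclass
class GovernedAgent:
    name: str
    role: str
    log: ActivityLog

    def attempt(self, action: str, target: str) -> bool:
        # Check the role first, then log the attempt whether or not it was allowed.
        allowed = action in ROLE_PERMISSIONS.get(self.role, set())
        self.log.record(self.name, action, target, allowed)
        return allowed


log = ActivityLog()
summarizer = GovernedAgent(name="note-summarizer", role="reader", log=log)

can_read = summarizer.attempt("file.read", "notes.txt")    # permitted for "reader"
can_write = summarizer.attempt("file.write", "notes.txt")  # denied for "reader"
```

Logging denied attempts as well as permitted ones is a deliberate choice in this sketch: an audit trail that only shows successes hides exactly the behavior a reviewer would most want to see.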