Red teaming
Also known as: AI red teaming, LLM red teaming, adversarial testing
Red teaming borrows from military and cybersecurity practice: you assign a team to act as adversaries and attack your own system. For AI, that means systematically testing whether the model can be tricked into ignoring its instructions, producing harmful content, leaking sensitive information, or taking actions it shouldn't.
The scope has expanded significantly as AI systems have gained more autonomy. Beyond simple prompt injection, red teams now test tool permissions, agent workflows, inter-agent communication, and memory persistence. The OWASP Top 10 for LLM Applications and agentic-specific extensions give red teamers a recognized taxonomy of attack surfaces to work through.
For builders shipping products, red teaming is as much a design input as a security check. Running adversarial tests early in development often reveals cases where your system prompt is too permissive, your tool permissions are too broad, or your output filters have obvious gaps. Regulators, including the EU AI Act framework, are increasingly expecting documented red teaming for higher-risk AI deployments.