AI Red Teamer course review banner

Hack The Box Academy: AI Red Teamer & COAE

HTB Certified Offensive AI Expert (COAE) | Course & Certification Review

Techniques:Prompt InjectionGradient DescentFGSMI-FGSMDeepFoolJSMAElasticNet Attack (EAD)Adversarial AttacksModel ExtractionData PoisoningMembership InferenceDifferential PrivacyAI Red TeamingLLM Security
Tech:Large Language ModelsMachine LearningPyTorchScikit-learnJupyter NotebookMinicondaGoogle SAIFPythonModel Context ProtocolVector Databases

Overview

HTB COAE certification banner art featuring a neural network brain in a cyberpunk containment unit

Course & Certification at a Glance

Provider: Hack The Box Academy
Course: AI Red Teamer Job Role Path
Certification: HTB Certified Offensive AI Expert (COAE)
Developed With: Google (SAIF Framework)
Format: OnDemand Self-Paced
Structure: 12 modules, 230 sections
Difficulty: Hard (path rated overall)
Exam Format: 7-day practical assessment + report
Course Completed: Mid-January 2026
Certified: April 9, 2026 (exam taken April 2; to my knowledge first blood, among the first COAE holders)
COAE Exam (standalone): $210 (includes 2 exam attempts)
Silver Annual: ~$490 (full Academy access to required course modules + exam voucher)

Hack The Box Academy's AI Red Teamer Job Role Path and its associated HTB Certified Offensive AI Expert (COAE) certification together form the most comprehensive machine-learning-focused offensive AI security training package available today. Developed in collaboration with Google and aligned with Google's Secure AI Framework (SAIF), the complete package takes learners from foundational AI/ML concepts through advanced adversarial techniques and validates those skills with a demanding 7-day practical exam. The course spans 12 modules covering the full spectrum of AI attack surfaces: prompt injection, model privacy attacks, adversarial evasion, supply chain risks, and deployment-level vulnerabilities. The COAE exam then puts all of that knowledge to the test in a simulated corporate AI environment, requiring a commercial-grade report as the final deliverable.

Overall, I thought the complete package was excellent. This is the first certification I have taken that incorporates high-level mathematics, and working through the course and then applying it under exam conditions genuinely transformed the way I think in a systematic way. Concepts like gradient descent, norm constraints, and adversarial perturbation theory are not the kind of material you encounter in traditional offensive security certifications, and engaging with them here gave me a fundamentally deeper understanding of how these systems work. My grasp of machine learning and AI is significantly greater than it was going in, at both the academic level (understanding the math and theory behind models) and the operational level (knowing how to exploit and assess real-world AI systems). If you are a security professional looking to develop genuine competency in AI red teaming rather than surface-level familiarity, this is the training and certification I would recommend.

This review covers both the course content in depth (module by module) and the COAE certification exam experience.

Part 1: The Course (AI Red Teamer Path)

HTB Academy AI Red Teamer job role path completion banner

Overall Course Review

The course is structured as a deliberate progression. The first three modules lay the groundwork: AI/ML fundamentals, hands-on model development in Python, and an introduction to the AI threat landscape through the OWASP and Google SAIF frameworks. From there, it moves into offensive techniques: prompt injection, LLM output exploitation, data poisoning, application and system-layer attacks, and the three-module evasion trilogy that covers increasingly advanced adversarial mathematics. The path closes with two defensive modules on AI privacy and AI defense, rounding out both sides of the equation.

This path fundamentally changed how I view machine learning systems. The early modules build intuition around how AI works, while the later modules, particularly the evasion trilogy and data attacks, dive deep into the mathematics of adversarial machine learning. Expect to hit walls with the math. I found ChatGPT invaluable for decoding unfamiliar mathematical notation and building the intuition needed to really understand these attacks. The struggle is worth it, by the end you'll understand not just how to break AI systems, but why the attacks work at a fundamental level and how to defend against them.

What sets this path apart is its dual focus on offense and defense. Throughout the modules, attack techniques are paired with mitigations, and the final modules on AI Privacy and AI Defense ensure you can secure the systems you've learned to compromise. The integration of Google's SAIF and OWASP Top 10 frameworks early on provides mental scaffolding that pays dividends as you progress through increasingly technical content.

The path drives home that AI systems are vulnerable at every layer, model, data, application, and system, and that traditional web vulnerabilities like XSS, SQLi, and command injection compound with AI-specific risks in ways that are often overlooked. You'll develop mathematical intuition around norms, gradients, and optimization that's essential for understanding adversarial ML at a fundamental level. While jailbreaks and prompt injection are the entry point that most people think of, data poisoning and evasion attacks are where the deep exploitation happens. You'll also learn that defensive measures like adversarial training, DP-SGD, and guardrails come with real tradeoffs against model utility, there's no free lunch in AI security.

If you're serious about AI security, this course is the real deal.

Module Reviews

Module 1: Fundamentals of AI

Created by PandaSt0rm, this foundational module clocks in at around 8 hours across 24 sections of pure theory, no hands-on exercises, no labs, just a deep dive into the concepts that underpin AI systems. It serves as both an approachable introduction and a reference you'll come back to throughout the path. The content spans five major domains:

Supervised Learning: Linear regression, logistic regression, decision trees, naive bayes, and support vector machines
Unsupervised Learning: K-means clustering, principal component analysis (PCA), and anomaly detection
Reinforcement Learning: Q-learning and SARSA algorithms for decision-making agents
Deep Learning: Perceptrons, multi-layer perceptrons, convolutional neural networks (CNNs), and recurrent neural networks (RNNs)
Generative AI: Large language models and diffusion models

The module touches on mathematical foundations but doesn't make them the focus. That said, having some background in statistics, linear algebra, and calculus will make the content click faster. If you're coming in cold on the math, expect to do some supplementary reading.

Rated Medium difficulty at Tier 0, I'd say that's spot on. No multivariable calculus or actual coding here, but the topics are certainly more advanced than your typical intro material. Overall an extremely solid survey of AI and machine learning and a wonderful place to start the learning journey.

Module 2: Applications of AI in InfoSec

Created by PandaSt0rm with vautia, this 25-section module spans roughly 8 hours and is where theory meets practice. You'll build a complete AI development environment from scratch using Miniconda for package management and JupyterLab for interactive experimentation, then work through the entire model development lifecycle from raw data to trained classifiers.

The module covers:

Environment Setup: Miniconda, JupyterLab, and dependency management
Python Libraries: Scikit-learn and PyTorch for model training and evaluation
Datasets: Structure, loading, inspection, and identifying potential issues
Data Preprocessing: Cleaning, imputing missing values, encoding categorical features, handling skewed distributions
Data Transformation: One-hot encoding, data splitting, preparing data for modeling
Spam Classification: Naive Bayes with text-to-numerical feature extraction
Network Anomaly Detection: Random forests on the NSL-KDD dataset
Malware Classification: Converting malware to images and classifying with ResNet50

The module recommends setting up your own environment rather than using the provided VM for better training performance. A machine with at least 4GB RAM and a reasonably modern multi-core CPU will serve you well here. GPU is optional but helpful.

Rated Medium and costed at Tier 0, I'd say that's spot on. The module guides you through each step in detail without leaving you stranded. What I thoroughly enjoyed was how applicable everything felt, by the end I had working proof-of-concept models for real-world cybersecurity use cases: spam detection, network anomaly detection, and malware classification. That's not just theory, that's something you can actually build on.

Module 3: Introduction to Red Teaming AI

Created by vautia, this 11-section module runs about 4 hours and shifts gears from building AI systems to breaking them. It's the bridge between the foundational modules and the attack-focused content ahead, giving you the threat landscape overview before diving into specific exploitation techniques.

The module covers three core frameworks:

ML OWASP Top 10: Security vulnerabilities specific to machine learning systems
LLM OWASP Top 10: Vulnerabilities in large language model deployments
Google's Secure AI Framework (SAIF): Industry-standard approach to AI security

From there, it breaks down attack surfaces by component:

Model Components: Manipulating the model itself
Data Components: Attacking the data pipeline
Application Components: Exploiting the application layer
System Components: Targeting the underlying infrastructure

Prerequisites are the previous two modules (Fundamentals of AI and Applications of AI in InfoSec), which makes sense - you need to understand how these systems work before you can break them.

Rated Medium and costed at Tier I, this is an excellent introduction to secure AI and AI red teaming. What I particularly enjoyed was the early introduction of the Google SAIF and OWASP Top 10 frameworks, having these in the back of your mind as you dive deeper into the technical topics gives you a mental scaffold to hang everything on.

Module 4: Prompt Injection Attacks

Created by vautia, this 12-section module runs about 8 hours and marks your entry into Tier II offensive content. It's a comprehensive deep dive into one of the most prominent attack vectors against large language models: prompt injection.

The module covers four main areas:

Direct Prompt Injection: Manipulating LLM behavior through user-controlled input
Indirect Prompt Injection: Attacks via external data sources the LLM processes
Jailbreaking: Techniques to bypass safety guardrails and content filters
Mitigations: Both traditional and LLM-based defensive approaches

You'll start with prompt engineering fundamentals, move through reconnaissance techniques, then work through increasingly sophisticated injection and jailbreak methods. The "Tools of the Trade" section introduces automation and tooling for prompt injection testing.

Prerequisites build on everything prior: Fundamentals of AI, Applications of AI in InfoSec, and Introduction to Red Teaming AI.

Rated Medium and costed at Tier II, I thoroughly enjoyed this module as it introduces the kind of thinking required to perform adversarial AI testing. The different jailbreaking techniques provide a lot of real-world value, and honestly, when people think of AI pentesting, jailbreaks are what they tend to think of first. This module delivers on expectations.

Module 5: LLM Output Attacks

Created by vautia, this 14-section module runs about 8 hours and flips the script from input manipulation to output exploitation. Where prompt injection focuses on what goes into an LLM, this module examines what comes out and how that output can compromise systems.

The module covers traditional web vulnerabilities in an LLM context:

Cross-Site Scripting (XSS) in LLM applications
SQL Injection through LLM-generated queries
Command Injection via LLM outputs
Function Calling attacks against LLM tool use
Exfiltration attacks to leak prompt contents
Security issues from LLM hallucinations

It also covers abuse attacks, the weaponization side of LLMs: hate speech campaigns, misinformation generation, and the detection and mitigation of these attacks. The module wraps up with safeguard case studies and legislative regulation, grounding the technical content in real-world policy.

Prerequisites are notably heavier here, requiring both the AI path modules (Fundamentals, Applications, Intro to Red Teaming, Prompt Injection) plus traditional web security knowledge (XSS, SQL Injection Fundamentals, Command Injections).

Rated Medium and costed at Tier II, this module is highly relevant to real-world work. It introduces the layer of traditional security vulnerabilities and demonstrates how they compound with AI technologies. While not overly technically deep, it continues introducing learners to the attack surface of AI systems and how it's exploited. The combination of classic web vulns with LLM contexts is exactly what you'll encounter in production AI applications. As with other modules in the path, it pairs offensive techniques with their corresponding defenses, reinforcing that red teaming is ultimately about improving security posture, not just finding holes.

Module 6: AI Data Attacks

Created by PandaSt0rm, this 25-section module is estimated at 3 days and marks the first Hard-rated content in the path. Where previous modules focused on prompt-level and output-level attacks, this one goes deeper into the data pipeline itself, targeting the foundation that AI systems are built on.

The module covers several sophisticated attack categories:

Data Poisoning: Label flipping and targeted label attacks to corrupt training data
Feature Attacks: Clean label attacks that poison data without changing labels
Trojan Attacks: Embedding backdoors into AI models that activate on specific triggers
Tensor Steganography: Hiding malicious payloads within model artifacts
Model Artifact Exploitation: Attacking the saved model files themselves (pickle vulnerabilities)

The hands-on content is substantial, walking through baseline model creation, attack implementation, and evaluation for each technique. You'll work with logistic regression, CNN architectures, and custom attack tooling.

Prerequisites include the core AI path modules plus strong Python skills and Jupyter Notebook familiarity. HTB highly recommends using your own machine for the practicals rather than the provided VM, as the training workloads benefit from local compute.

Rated Hard and costed at Tier II, I ran into the math like it was a brick wall. Luckily I was able to use ChatGPT to decode the mathematical symbols I didn't know and began developing the mathematical intuition required to really understand and exploit AI systems. The Hard rating is truly justified, this will be the first real roadblock in the course pathway for many learners. If you've been coasting on the Medium modules, expect to slow down here.

Module 7: Attacking AI - Application and System

Created by vautia, this 14-section module runs about 8 hours and zooms out from model and data attacks to examine the application and system layers of AI deployments. After the math-heavy data attacks module, this returns to Medium difficulty while covering equally critical attack surfaces.

The module covers application and system component vulnerabilities:

Model Reverse Engineering: Extracting model architecture and parameters
Denial of ML Service: Resource exhaustion and availability attacks
Insecure Integrated Components: Third-party library and dependency risks
Rogue Actions: Unauthorized model behaviors
Excessive Data Handling & Insecure Storage: Data exposure through poor practices
Model Deployment Tampering: Attacking the deployment pipeline
Vulnerable Framework Code: Exploiting ML framework vulnerabilities

A significant portion focuses on the Model Context Protocol (MCP), the AI orchestration protocol introduced in 2024. You'll get a practical introduction to how MCP works, then dive into attacking vulnerable MCP servers and the risks of malicious MCP servers, with mitigations to round it out.

Prerequisites include the AI path modules plus SQL Injection Fundamentals, Command Injections, and Web Attacks.

Rated Medium and costed at Tier II, this module is an excellent blend of classic attack vectors (command injection, SQLi, info disclosure) with modern AI exploitation via MCP. It clearly shows how legacy flaws and AI risks combine, making this a highly valuable and insightful module. A welcome return to Medium and a little less math after the Hard data attacks module.

Module 8: AI Evasion - Foundations

Created by PandaSt0rm, this 12-section module runs about 8 hours and kicks off the three-module evasion series. It introduces inference-time evasion attacks, techniques that manipulate inputs to bypass classifiers or force targeted misclassifications at prediction time.

The module establishes the evasion threat model:

White-box vs Black-box: Understanding attacker knowledge levels and their implications
Transferability: How attacks crafted on one model can transfer to others
Feature-obfuscation: The GoodWords attack methodology

You'll build and attack a spam filter using the UCI SMS dataset, implementing GoodWords attacks in both white-box and black-box settings. The black-box content covers operating under query limits, including candidate vocabulary construction, adaptive selection strategies, and small-combination testing to minimize detection.

Prerequisites build on the full path so far including the AI Data Attacks module. Strong Python skills and Jupyter Notebook familiarity are mandatory, and HTB highly recommends using your own machine for the practicals.

Rated Medium and costed at Tier II, this is when the course starts getting really into the weeds, readying you for deeper mathematical attacks and adversarial machine learning in the modules ahead. A great module for building intuition around machine learning and how to attack the models themselves, not just the applications around them.

Module 9: AI Evasion - First-Order Attacks

Created by PandaSt0rm, this 23-section module is estimated at 2 days and returns to Hard difficulty. It takes gradient-based adversarial techniques from theory to implementation, exploiting the differentiable structure of neural networks to craft perturbations that force misclassifications.

The module covers the core first-order attack methods:

Foundations: Norm constraints (L_inf, L_2), local linearity assumptions, high-dimensional accumulation effects
FGSM (Fast Gradient Sign Method): Single gradient step adversarial examples under L_inf budget
Targeted FGSM: Directing misclassifications toward specific target classes
I-FGSM (Iterative FGSM): Multiple smaller steps with projection for improved success rates
DeepFool: Finding minimal perturbations through iterative linearization and geometric projection onto decision boundaries

Each attack includes full implementation, evaluation metrics, and visualization. The module has two skills assessments, reflecting the depth of content.

Prerequisites include all previous AI modules plus basic understanding of neural networks and gradient computation. Strong Python and Jupyter skills are mandatory, and using your own machine is highly recommended. I also highly recommend using ChatGPT to help you understand the mathematical concepts, it's a powerful tool and will help you greatly.

Rated Hard and costed at Tier II, this was another module where I ran into a brick wall with the mathematics. At first much of the content was hieroglyphics to me, however I was eventually able to understand it and it fundamentally began to change the way I view the world and systems. The mathematical intuition gained here is second to none in regards to how machine learning actually works and how it's exploitable. Worth the struggle.

Module 10: AI Evasion - Sparsity Attacks

Created by PandaSt0rm, this 28-section module is estimated at 3 days and is the most technically dense of the evasion trilogy. Where first-order attacks focus on minimizing perturbation magnitude, sparsity attacks flip the constraint: minimize how many features change, not how much they change.

The module covers the mathematical foundations and implementations:

L0 Budgets: The pseudo-norm counting modified coordinates
L1-Induced Sparsity: Using regularization to promote sparse solutions
Saliency-Based Feature Selection: Identifying the most impactful features to modify
ElasticNet Attack (EAD): Combining L1 and L2 regularization for sparse yet smooth perturbations
FISTA Optimization: Proximal gradient descent with momentum for solving non-smooth objectives
JSMA (Jacobian-based Saliency Map Attack): Explicit L0 budgets by modifying one or two features per step
Single-Pixel and Pairwise Variants: Balancing attack efficiency with modification counts

The hands-on content is extensive, covering environment setup, implementing FISTA components, loss gradients, binary search, attack execution, visualizations, and sparsity analysis for both ElasticNet and JSMA approaches.

Prerequisites include all previous AI modules through First-Order Attacks, plus understanding of neural networks, gradient computation, and optimization methods. Own machine highly recommended.

Rated Hard and costed at Tier II, this one is much like the last, very tough on the mathematics. But the intuition gained is transformative, especially around the different norms (L0, L1, L2, L_inf) and how they interact. It also gave insight into how adversarial algorithms are conceived and developed, not just how to use them. You'll leave understanding why these attacks work, not just that they work.

Module 11: AI Privacy

Created by PandaSt0rm, this 21-section module is estimated at 2 days and is notably the first Defensive-focused module in the path. It explores the privacy dimension of AI security, both the attacks that extract sensitive information from trained models and the defenses that protect against them.

The attack side covers Membership Inference Attacks (MIA):

Shadow Model Methodology: Training attack classifiers to detect membership based on prediction confidence patterns
Exploiting Overfitting: How models behave differently on training data vs unseen data, creating detectable fingerprints
Privacy Implications: If a model trained exclusively on cancer patients reveals someone as a training member, their medical status is exposed

The defense side covers differential privacy approaches:

Differential Privacy Fundamentals: The mathematical framework for quantifying privacy guarantees
DP-SGD (Differentially Private Stochastic Gradient Descent): Per-sample gradient clipping and calibrated noise injection to limit individual influence
PATE (Private Aggregation of Teacher Ensembles): Architectural separation with multiple teachers on disjoint data, noisy vote aggregation for student training
Privacy-Utility Tradeoffs: Understanding what you sacrifice in model performance to gain privacy guarantees

Prerequisites include all previous AI modules, plus solid PyTorch familiarity and understanding of neural network training and optimization. Own machine is highly recommended over Pwnbox.

Rated Medium and costed at Tier II, I found this module informative in showing how membership inference attacks work and how models can be secured through training methods. Honestly, I thought it could have been rated Hard, the math and algorithms pushed the limits of what I'd consider Medium territory. What I found particularly valuable was seeing the ML training side in more depth, understanding how different training approaches (standard vs DP-SGD vs PATE) produce models with fundamentally different security properties, and the concrete tradeoffs between privacy guarantees and model utility.

Module 12: AI Defense

Created by vautia with PandaSt0rm, this 21-section capstone module is estimated at 2 days and brings everything full circle. After spending the path learning to attack AI systems, you now learn to defend them, understanding both sides of the adversarial equation.

The module covers three main defensive approaches:

LLM Guardrails (application-layer defenses):

Character-based Validation: Input filtering and sanitization
Traditional Content-based Validation: Pattern matching and blocklists
AI-based Guardrails: Using models to detect malicious inputs
Guardrail Libraries and Services: Production-ready defensive tooling

Model-level Defenses:

Adversarial Training: Incorporating adversarial examples during training to build robust models
Adversarial Tuning: Fine-tuning existing models for safety through supervised learning on jailbreak/refusal pairs
Understanding and mitigating jailbreak attacks

The module also covers advanced prompt injection tactics like priming attacks and how to evade guardrails, giving you both the offensive and defensive perspective.

Prerequisites include the core AI modules plus the evasion modules. Note that running adversarial training and tuning code requires powerful hardware, but it's optional, you can complete the module without running it yourself.

Rated Medium and costed at Tier II, this was a wonderful capstone to the course. It covers advanced techniques while tying everything together and pushing the path over the line from educational to practically useful. This is the module that enables effective AI red teaming against real systems with safeguards in place. What I particularly appreciated was how it equips you not just to attack AI systems and find vulnerabilities, but to secure them and defend against the very attacks you've learned. That dual perspective is what separates a red teamer from just a hacker.

Part 2: The Certification (HTB COAE)

HTB Certified Offensive AI Expert (COAE) certificate awarded to Jacob Krell on April 9, 2026

Certification Overview

The HTB Certified Offensive AI Expert (COAE) is Hack The Box's professional-grade certification for AI red teaming, launched on April 2, 2026. It serves as the capstone credential for the AI Red Teamer Job Role Path and is designed to validate a candidate's ability to assess complex AI environments in a real-world setting. This is not a theoretical exercise or a multiple-choice quiz. The COAE requires you to demonstrate that you can actually perform offensive AI assessments under pressure.

The certification was created alongside the AI Red Teamer path, which was developed in collaboration with Google and aligned with Google's Secure AI Framework (SAIF). HTB positions the COAE as a credential that separates the curious from the experts, going far beyond basic prompt injection to cover the full spectrum of AI and ML vulnerabilities. As organizations rush to integrate LLMs and AI into their operations, the gap in specialized AI security talent is growing rapidly, and the COAE is designed to fill that gap.

As HTB puts it: "When you hold the HTB COAE, you are telling the world that you don't just understand the theory of AI security, but that you have the hands-on experience to defend the next generation of technology."

Exam Format & Objectives

Exam Details

Format: Practical hands-on assessment
Duration: 7 days
Deliverable: Commercial-grade technical report
Environment: Simulated corporate AI infrastructure
Attempts: 2 per registration (standalone exam; Silver Annual includes an exam voucher)
Prerequisites: Complete AI Red Teamer Job Role Path

The COAE exam places candidates in a simulated corporate environment where they must perform a full-scale AI offensive assessment. The course syllabus spans the following domains, all of which are fair game for the exam:

AI/ML Fundamentals: Supervised and unsupervised learning, deep learning architectures (CNNs, RNNs), reinforcement learning, generative AI, and the model development lifecycle
AI Threat Landscape: OWASP ML Top 10, OWASP LLM Top 10, Google SAIF framework, and AI-specific attack surface mapping
LLM Exploitation: Prompt injection, jailbreaking, output attacks, function calling abuse, exfiltration, and abuse attack vectors
Data Poisoning & Supply Chain: Training data manipulation, backdoor attacks, model artifact exploitation, and tensor-level attacks
AI Application & System Security: Model reverse engineering, denial of service, insecure components, deployment tampering, framework vulnerabilities, and orchestration protocol risks
Adversarial Machine Learning: Evasion attacks, gradient-based and sparsity-based perturbation methods, transferability, and black-box attack methodology
AI Privacy: Membership inference, differential privacy mechanisms, and privacy-utility tradeoffs
AI Defense: Guardrails, adversarial training, adversarial tuning, guardrail evasion, and legislative/regulatory considerations

Successful completion requires submitting a commercial-grade technical report documenting findings, exploitation steps, and recommended mitigations. This mirrors the deliverables expected in professional AI security assessments.

Exam Experience & Tips

I completed the AI Red Teamer course path in mid-January 2026, which gave me a couple of months to review my notes and refine my tooling before the COAE exam launched on April 2nd. That preparation time made a significant difference. When the exam went live on April 2nd, I moved fast with the goal of achieving first blood on the certification. I wrapped the core engagement in about 10 hours of focused work. I left some points on the table, finishing at 85 out of 100 exam points. The certification was officially issued on April 9, 2026. To my knowledge I was first blood on the exam, and among the first people to hold the COAE.

COAE exam dashboard showing 85 of 100 exam points and 6 days 14 hours remaining on the 7-day clock

Exam progress after the main push: 85/100 points with most of the 7-day window still available (6 days, 14 hours remaining on the timer).

HTB Certified Offensive AI Expert badge showing one user has earned this certificationPlatform badge (to my knowledge, first blood; among the first COAE earners on Hack The Box)

The exam maps closely to what is taught in the course. If you have genuinely worked through all 12 modules and understood the material, there should be no surprises in terms of the techniques and concepts tested. The difficulty was about what I expected based on the course content.

Here is the key insight that I want to emphasize: concepts that felt impossibly difficult the first time through the course become recognizable patterns during the exam. Things like gradient descent, norm constraints, saliency maps, and adversarial perturbation techniques were brick walls when I first encountered them in the Hard-rated modules. But during the exam, I was able to quickly recognize what I was looking at and knew exactly what approach to take. The struggle of learning these concepts the first time is what builds the intuition needed to apply them under pressure.

Top Tip: Use Jupyter Notebooks Throughout the Course

My single biggest recommendation for anyone planning to take the COAE: use Jupyter notebooks to solve the course exercises rather than just following along. Build reusable code blocks for every technique you learn.

During the exam, having those notebooks ready was invaluable. When I recognized a pattern, I already had working, tested code on hand to execute quickly. The difference between "I understand this concept" and "I have a working implementation I can adapt in minutes" is the difference between finishing in hours versus days.

Comparisons & Recommendations

HTB COAE vs OffSec OSAI+

The most direct comparison for the HTB COAE is OffSec's AI-300: Advanced AI Red Teaming (OSAI+), which launched around the same time and targets the same emerging discipline. I have not taken the OSAI+ at the time of writing, so this comparison is based on publicly available course descriptions, syllabus details, and exam format information rather than firsthand experience. That said, both certifications validate hands-on offensive AI skills through practical exams rather than multiple-choice tests. Here is how they compare on paper:

 HTB COAEOffSec OSAI+
ProviderHack The Box AcademyOffSec
Course LevelJob Role Path (12 modules, 230 sections)300-level (50-100 hours of content)
Developed WithGoogle (SAIF Framework)OffSec internal
Exam Format7-day practical + report24-hour proctored practical
Exam StyleFull-scale AI offensive assessment in simulated corporate environmentRed team engagement against AI-enabled enterprise environment
Key TopicsAdversarial ML (FGSM, DeepFool, JSMA, EAD), data poisoning, prompt injection, evasion math (norms, gradients), AI privacy (MIA, DP-SGD, PATE), AI defenseLLM attacks, multi-agent AI systems, RAG pipelines, embeddings, AI infrastructure, cloud security for AI
Math DepthHeavy (gradient computation, norm constraints, optimization, saliency maps)Moderate (applied, less theoretical math emphasis)
EmphasisFull AI/ML attack lifecycle: model, data, application, system, and defensePractical red teaming of production AI infrastructure and agent systems
Pricing$210 standalone exam (2 attempts); ~$490 Silver Annual (full Academy access to required modules + exam voucher)Starting at $1,749 (course + cert bundle) or $2,749/year (Learn One)
Cert ValidityLifetime (HTB standard)3 years (OSAI+)

Based on the published syllabi, the two certifications have meaningfully different emphasis. The HTB COAE goes deep into the mathematics of adversarial machine learning: you will spend serious time on gradient computation, norm constraints, FGSM, DeepFool, JSMA, and EAD. The course builds genuine mathematical intuition around how and why these attacks work at a fundamental level. It also covers data poisoning, membership inference, and differential privacy in significant depth, and pairs every offensive module with defensive strategies.

From what OffSec has published, the OSAI+ appears to lean more heavily into practical red teaming of modern AI infrastructure: multi-agent systems, RAG pipelines, embeddings, and cloud environments supporting AI deployments. The 24-hour proctored format is classic OffSec, and the course benefits from OffSec's long track record of building practical offensive certifications. I plan to take this one as well and will update this comparison with firsthand impressions when I do.

On paper, the two look complementary rather than competitive. If you want to deeply understand how adversarial ML works at a mathematical level and cover the full attack lifecycle from model to defense, the HTB path and COAE are a strong choice. If you are more focused on red teaming production AI infrastructure (agents, RAG, cloud) with less emphasis on the underlying math, the OffSec OSAI+ may be a better fit. Ideally, pursuing both would give you the theoretical foundation from one and the applied infrastructure skills from the other.

Who Is This For?

The AI Red Teamer path and COAE certification are designed for:

Penetration testers and red teamers looking to expand into AI/ML attack surfaces
Security engineers and architects responsible for securing AI-integrated systems
AI/ML engineers and data scientists who want to understand the offensive perspective on their systems
Security researchers interested in adversarial machine learning and AI vulnerability research
Security leaders and managers who need to understand AI risk to make informed decisions about securing AI deployments

A strong foundation in Python, basic machine learning concepts, and traditional web application security will make the path significantly more approachable. The Hard-rated modules (AI Data Attacks, First-Order Attacks, Sparsity Attacks) require comfort with mathematical notation and optimization concepts, though the course does build toward these progressively.

Final Verdict

The AI Red Teamer path delivers on its promise. At an estimated 19 days of content, it took me about 2 months to complete while balancing other commitments. That time investment was worth every hour. The path takes you from zero AI security knowledge to being capable of performing meaningful offensive assessments against AI systems, while also equipping you to recommend and implement defenses. The collaboration with Google shows in the quality and real-world relevance of the content.

This is not an easy path. The Medium-rated modules are genuinely Medium, and the Hard-rated modules (AI Data Attacks, First-Order Attacks, Sparsity Attacks) will challenge anyone without a strong mathematical background. But that challenge is precisely what makes the learning valuable. You'll emerge with skills that are genuinely rare in the security industry.

Ready to become an AI Red Teamer? Check out the AI Red Teamer path on HTB Academy and the HTB COAE certification page.

AI red teaming concept art showing robotic arms probing a neural network brain in a cyberpunk environment