INPUT BLOCKED · OUTPUT FLAGGED · PERSONA REJECTED · RATE LIMITED · PII SCRUBBED · INJECTION DETECTED · SCOPE ENFORCED · CLAUSE TRIGGERED · INPUT BLOCKED · OUTPUT FLAGGED · PERSONA REJECTED · RATE LIMITED · PII SCRUBBED · INJECTION DETECTED · SCOPE ENFORCED · CLAUSE TRIGGERED ·
JAILBREAK ATTEMPT · SYSTEM PROMPT INTACT · DIRECT OVERRIDE BLOCKED · MULTI-TURN ESCALATION · FICTIONAL WRAPPER · ENCODING BYPASS · THRESHOLD CALIBRATION · BEHAVIOURAL GUARDRAIL · JAILBREAK ATTEMPT · SYSTEM PROMPT INTACT · DIRECT OVERRIDE BLOCKED · MULTI-TURN ESCALATION · FICTIONAL WRAPPER · ENCODING BYPASS ·
DEFECT LOGGED · SEVERITY HIGH · AUDIT FINDING · REGULATORY GAP · FALSE POSITIVE · FALSE NEGATIVE · HALLUCINATION DETECTED · CLAUSE VAGUE · IMPLEMENTATION BUG · LOGIC ERROR · DEFECT LOGGED · SEVERITY HIGH · AUDIT FINDING · REGULATORY GAP · FALSE POSITIVE · FALSE NEGATIVE ·
AI ACT ARTICLE 52 · GDPR VIOLATION · PII LEAK · OUTPUT SCANNER · RATE LIMITER · TOKEN CAP · PERSONA STABILITY · SCOPE BOUNDARY · AI ACT ARTICLE 52 · GDPR VIOLATION · PII LEAK · OUTPUT SCANNER · RATE LIMITER · TOKEN CAP · PERSONA STABILITY · SCOPE BOUNDARY ·
✓ GUARDRAIL PASSED · ✗ GUARDRAIL FAILED · ⚠ THRESHOLD EXCEEDED · ✓ CLAUSE TESTABLE · ✗ SPEC VAGUE · ⚡ FIRE GUARDRAIL · ✓ GUARDRAIL PASSED · ✗ GUARDRAIL FAILED · ⚠ THRESHOLD EXCEEDED · ✓ CLAUSE TESTABLE · ✗ SPEC VAGUE · ⚡ FIRE GUARDRAIL ·
Can you catch a prompt injection in the wild?
Guardrails Matter
Most teams ship AI without ever testing their guardrails. In 5 rounds, you'll fix that.
5 attack patterns to name
3 real code bugs to find
1 audit report at the end
Pick your round
Created by Rahul Parwal · TestingTitbits.com
🔥 0
Audit Report
0%
across 5 guardrail skill areas
Skill Breakdown — what you tested & why it matters
Severity Grid — your weakest & strongest findings
Why guardrails matter — the engineering view
Guardrails are not UX polish. They are the load-bearing safety layer between an LLM's raw capability and real-world harm. Every skill you practised in this game maps directly to a failure mode that has occurred in production AI systems.
Guardrails are system properties, not model properties
The LLM itself has no inherent knowledge of your product's rules. Guardrails are engineered constraints — input filters, output scanners, behavioural clauses, rate limiters — layered around the model. If those layers are absent or misconfigured, the model behaves as designed: helpfully, and without your constraints.
Calibration is an engineering decision with real stakes
A threshold set too tight produces false positives — users blocked from legitimate requests, trust eroded, support costs rising. Too loose and harmful content passes. Calibration is not a dial to set once; it is a test condition to verify continuously. Every deployment decision moves the threshold.
Adversarial inputs are the regression suite for safety
Jailbreak patterns — persona swaps, fictional wrappers, multi-turn escalation — are known attack vectors with names. A tester who can name them can write test cases for them. A system that hasn't been tested against named patterns has an unknown safety posture, not a safe one.
Vague specifications produce untestable systems
"Be helpful" is not a guardrail. A clause only becomes a guardrail when it names a trigger condition and an expected response — because only then can a tester verify it. Specification thinking converts intent into enforceable, auditable behaviour. Without it, you cannot tell whether a guardrail exists or just appears to.
This is what makes AI sustainable
AI systems that leak PII, give dangerous medical advice, or impersonate humans get shut down — by regulators, by press coverage, or by users. Guardrails are not a constraint on AI capability. They are what allows capable AI to remain deployed. Testers who understand this are not gatekeepers — they are enablers of sustainable AI.
