Guardrails

Test a guardrail before you trust it

The Test dialog runs your profile against any phrase you paste. You see which rule fired, the action, and the matched span — all in under a second.

What you'll learn

Open the Test dialog

1
Open the profile
In Guardrails, click the profile you want to test.
2
Click Test
A dialog opens with a text area, a Run button, and a results panel.
3
Paste a phrase
Paste a sample prompt, tool argument, or response. Realistic strings test better than synthetic ones — pull from past runs if you have them.

Each run returns a verdict and a breakdown.

1
Verdict
Pass, Warn, Redact, or Block. The verdict is the strongest action triggered by any matching rule.
2
Matching rules
Every rule that fired is listed with its category (topic, regex, sensitive-data, plugin), the matched span, and its individual action.
3
Latency
Inspection time in milliseconds, broken down per rule. Use this to spot slow plugins before they hit production.

1
Phrase should block but does not
Add a closer example phrase to the matching topic, or tighten the regex. Re-run. Keep iterating until the verdict is Block.
2
Phrase blocks but should not
You have a false positive. Loosen the regex, or remove the over-broad example phrase from the topic. Test a benign neighboring phrase to confirm.
3
Build a regression list
Keep a list of representative phrases — known-bad, known-good, edge cases. Re-run them every time you edit the profile. Catches regressions instantly.

Does the Test dialog count against my LLM quota?: Inspection runs locally inside the guardrail engine. Topic detection uses the platform classifier; only plugins that call external moderation APIs incur a per-call cost. Plugin cost is reported in the latency breakdown.
Can I batch-test many phrases at once?: The dialog accepts one phrase at a time today. For batch evaluation, use Eval with a guardrail dataset — it scores the profile across an entire set and reports precision and recall.
How is the Test dialog different from agent testing?: Agent testing runs the full agent loop — model, tools, guardrails — against a prompt. The guardrail Test dialog isolates just the guardrail layer, so you can iterate on rules without paying for model calls.
Do I need to re-test after attaching the profile to an agent?: Yes. Run the agent in Test mode with a few representative prompts to confirm the guardrail behaves the same in the full execution path. Tool responses and model outputs sometimes trigger rules that an isolated input did not.