Guardrails
Test a guardrail before you trust it
The Test dialog runs your profile against any phrase you paste. You see which rule fired, the action, and the matched span — all in under a second.
What you'll learn
- Where the Test dialog lives and what it shows
- How to verify a block, redact, or warn
- How to iterate when a phrase should match but does not
- How to test edge cases before promoting to agents
Open the Test dialog
1. Open the profile. In Guardrails, click the profile you want to test.
2. Click Test. A dialog opens with a text area, a Run button, and a results panel.
3. Paste a phrase. Use a sample prompt, tool argument, or response; realistic strings from past runs test better than synthetic ones.
Read the result
Each run returns a verdict and a breakdown.
1. Verdict. Pass, Warn, Redact, or Block. The verdict is the strongest action triggered by any matching rule.
2. Matching rules. Every rule that fired is listed with its category (topic, regex, sensitive-data, plugin), the matched span, and its individual action.
3. Latency. Inspection time in milliseconds, broken down per rule. Use this to spot slow plugins before they hit production.
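The verdict rule above (strongest action wins) can be sketched as a simple precedence ordering. The `SEVERITY` table and `strongest_verdict` helper below are hypothetical illustrations, not a platform API:

```python
# Illustrative sketch of verdict precedence: Pass < Warn < Redact < Block.
# Names here are invented for illustration, not part of the product.
SEVERITY = {"Pass": 0, "Warn": 1, "Redact": 2, "Block": 3}

def strongest_verdict(matched_actions):
    """Return the strongest action among the rules that fired.

    An empty list means no rule triggered, so the phrase passes.
    """
    if not matched_actions:
        return "Pass"
    return max(matched_actions, key=SEVERITY.__getitem__)

print(strongest_verdict(["Warn", "Redact"]))  # Redact
print(strongest_verdict([]))                  # Pass
```

So a profile where one rule warns and another redacts reports Redact overall, while a run with no matches reports Pass.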
Refine until the verdict is right
1. Phrase should block but does not. Add a closer example phrase to the matching topic, or tighten the regex. Re-run, and keep iterating until the verdict is Block.
2. Phrase blocks but should not. You have a false positive: loosen the regex, or remove the over-broad example phrase from the topic. Test a benign neighboring phrase to confirm.
3. Build a regression list. Keep a list of representative phrases (known-bad, known-good, edge cases) and re-run them every time you edit the profile; this catches regressions instantly.
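A regression list is just labeled phrases plus expected verdicts. The sketch below is hypothetical: the regex rules and `run_profile` function stand in for your real profile (which the Test dialog evaluates one phrase at a time); it only illustrates the habit of re-checking expectations after every edit:

```python
import re

# Hypothetical stand-in for a guardrail profile: (name, pattern, action).
RULES = [
    ("ssn-regex", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "Block"),
    ("card-hint", re.compile(r"credit card number", re.I), "Warn"),
]
SEVERITY = {"Pass": 0, "Warn": 1, "Redact": 2, "Block": 3}

def run_profile(phrase):
    """Return the strongest action among rules that match the phrase."""
    actions = [action for _, rx, action in RULES if rx.search(phrase)]
    return max(actions, key=SEVERITY.__getitem__, default="Pass")

# The regression list: representative phrases with expected verdicts.
REGRESSION = [
    ("my SSN is 123-45-6789", "Block"),           # known-bad
    ("what's the weather today", "Pass"),          # known-good
    ("send me your credit card number", "Warn"),   # edge case
]

for phrase, expected in REGRESSION:
    got = run_profile(phrase)
    status = "ok" if got == expected else "REGRESSION"
    print(f"{status}: {phrase!r} -> {got} (expected {expected})")
```

After loosening or tightening a rule, a single pass over the list shows immediately whether a previously correct phrase now gets the wrong verdict.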
Frequently asked questions
- Does the Test dialog count against my LLM quota?
- Inspection runs locally inside the guardrail engine. Topic detection uses the platform classifier; only plugins that call external moderation APIs incur a per-call cost. Plugin cost is reported in the latency breakdown.
- Can I batch-test many phrases at once?
- The dialog accepts one phrase at a time today. For batch evaluation, use Eval with a guardrail dataset — it scores the profile across an entire set and reports precision and recall.
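The precision and recall that a batch evaluation reports reduce to counting hits and misses over a labeled set. The sketch below uses invented labels; "flagged" means the guardrail triggered on the phrase:

```python
# Hand-computed precision/recall over a small labeled set.
# Each pair is (should_flag, was_flagged); the data is invented.
labeled = [
    (True, True),    # bad phrase, caught
    (True, False),   # bad phrase, missed (false negative)
    (False, False),  # good phrase, passed
    (False, True),   # good phrase, wrongly flagged (false positive)
    (True, True),    # bad phrase, caught
]

tp = sum(1 for want, got in labeled if want and got)
fp = sum(1 for want, got in labeled if not want and got)
fn = sum(1 for want, got in labeled if want and not got)

precision = tp / (tp + fp)  # of flagged phrases, how many deserved it
recall = tp / (tp + fn)     # of bad phrases, how many were caught
print(f"precision={precision:.2f} recall={recall:.2f}")
```

High precision with low recall means the profile rarely over-blocks but misses bad phrases; the reverse means it catches everything at the cost of false positives.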
- How is the Test dialog different from agent testing?
- Agent testing runs the full agent loop — model, tools, guardrails — against a prompt. The guardrail Test dialog isolates just the guardrail layer, so you can iterate on rules without paying for model calls.
- Do I need to re-test after attaching the profile to an agent?
- Yes. Run the agent in Test mode with a few representative prompts to confirm the guardrail behaves the same in the full execution path. Tool responses and model outputs sometimes trigger rules that an isolated input did not.