Guardrails

Create a guardrail

Five components compose a guardrail: topics, regex patterns, sensitive-data patterns, plugins, and the actions they trigger. Build them in any order.

What you'll learn
  • How to name and seed a new guardrail profile
  • How to write topics with example phrases
  • How to add regex and sensitive-data patterns
  • How to wire external plugins like OpenAI Moderation

Start a new profile

  1. Open Guardrails

    In the sidebar, click Guardrails. Click + New Profile.
  2. Name and describe

    Give the profile a clear name like "Support agent runtime safety" or "Finance PII v2". Add a one-sentence description of its intent; the description appears in the agent builder when you pick profiles.
  3. Pick a preset or blank

    Choose a preset (Standard, PII Redaction, or Content Filter) or Custom. Presets come with pre-populated rules that you can edit. Custom starts blank and gives you an empty canvas.

Compose the rule set

A guardrail is the union of these four rule types. Add as many of each as you need.
  1. Topics

    A topic is a category label plus 3–10 example phrases that exemplify it. The platform uses the examples to detect semantic matches at runtime — not literal substrings. Use topics for fuzzy categories like "gambling", "competitor names", "off-scope requests".
  2. Regex patterns

    Exact pattern matching for known string formats, such as project codes, internal URLs, and ticket-ID formats. Unlike topics, a regex matches only text that fits the pattern. Test the regex in the inline tester before saving.
  3. Sensitive-data patterns

    Pre-built detectors for SSN, credit card, bank account, phone, email, passport, license, IP address, JWT, and API key. Toggle on the ones that apply. For these formats, the built-in detectors are faster and more accurate than custom regex.
  4. Plugins

    External moderation services that augment your detection. Examples: OpenAI Moderation, custom HTTP webhook, internal classifier. Each plugin runs as part of the inspection chain and contributes its verdict.
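
Before entering topics and regex patterns in the UI, it can help to sanity-check them locally. The following is a minimal sketch, not the platform's schema or API: the `Topic` class, the `check_pattern` helper, and the `TCK-` ticket format are all hypothetical names chosen for illustration. It also assumes Python's `re` syntax, which may differ from the regex engine the platform uses, so the inline tester remains the final check.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical local stand-in for a topic: a label plus example phrases."""
    label: str
    examples: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return reasons this topic falls outside the 3-10 example guidance."""
        issues = []
        # Count distinct phrases, ignoring case and surrounding whitespace.
        unique = {e.strip().lower() for e in self.examples if e.strip()}
        if len(unique) < 3:
            issues.append("fewer than 3 distinct example phrases")
        if len(unique) > 10:
            issues.append("more than 10 example phrases")
        return issues


def check_pattern(pattern: str, should_match: list[str], should_not_match: list[str]) -> list[str]:
    """Return the sample strings on which the pattern behaves unexpectedly."""
    compiled = re.compile(pattern)
    failures = [s for s in should_match if not compiled.search(s)]
    failures += [s for s in should_not_match if compiled.search(s)]
    return failures


gambling = Topic(
    "gambling",
    [
        "where can I bet on football",
        "best online casino bonus",
        "how do I count cards at blackjack",
    ],
)
print(gambling.problems())

# Assumed ticket-id format: "TCK-" followed by exactly six digits.
ticket_pattern = r"\bTCK-\d{6}\b"
print(check_pattern(
    ticket_pattern,
    should_match=["see TCK-004211 for details"],
    should_not_match=["TCK-42", "TCK-0042110"],
))
```

An empty list from either call means no problems were found. Listing strings that must not match is as useful as listing ones that must, because overly broad patterns are a common source of false positives.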

Set actions

For each rule, pick an action: block stops the run, redact masks the match, and warn only records it. Default to block for hard-violation categories and to redact for PII, so agents can continue with masked content.
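
To make the three actions concrete, here is a small sketch of how they could behave on a piece of text. It is an illustration under stated assumptions, not the platform's implementation: the `apply_rules` function, the `Blocked` exception, the `[REDACTED]` mask, and the sample rules are all hypothetical, and plain regex stands in for every rule type.

```python
import re

MASK = "[REDACTED]"  # assumed mask string; the platform's mask may differ


class Blocked(Exception):
    """Stands in for a stopped run when a block rule matches."""


def apply_rules(text, rules):
    """Apply (name, pattern, action) rules to text.

    Returns (text_after_redaction, names_of_warn_rules_that_matched).
    Raises Blocked if any rule with the block action matches.
    """
    # Evaluate every rule against the original text first.
    matched = [(n, p, a) for n, p, a in rules if re.search(p, text)]
    for name, _, action in matched:
        if action == "block":
            raise Blocked(name)
    warnings = [name for name, _, action in matched if action == "warn"]
    for _, pattern, action in matched:
        if action == "redact":
            text = re.sub(pattern, MASK, text)
    return text, warnings


rules = [
    ("us_ssn", r"\b\d{3}-\d{2}-\d{4}\b", "redact"),
    ("internal_url", r"https?://intranet\.example\.com\S*", "warn"),
    ("secret_project", r"\bPRJ-SECRET\b", "block"),
]

masked, warnings = apply_rules(
    "My SSN is 123-45-6789, see http://intranet.example.com/wiki", rules
)
print(masked)    # My SSN is [REDACTED], see http://intranet.example.com/wiki
print(warnings)  # ['internal_url']
```

In this sketch the redacted text is what the agent would continue with, while a block raises immediately and nothing is returned.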

Save and attach

  1. Save the profile

    The profile is immediately available to agents, but it is not enforced until you attach it to an agent.
  2. Attach to an agent

    In the agent builder, go to Guardrails Configuration and add this profile. From the next run forward, it is enforced.
  3. Test before relying on it

    Open the Test dialog and verify a few representative phrases are blocked or redacted as expected. See Testing guardrails.

Frequently asked questions

How many example phrases per topic do I need?
Three is the minimum, ten is plenty. Quality over quantity — pick varied phrasings that capture the semantic range, not paraphrases of one sentence.
Can I share a profile across multiple agents?
Yes. A profile is workspace-level by default. Attach it to as many agents as you want. Edits propagate to every attached agent on the next run.
What happens if two rules match the same span?
The strongest action wins: block beats redact beats warn. The trace records every matching rule for audit, even if only the strongest action is applied.
How do I disable a guardrail temporarily?
Detach the profile from the agent in the builder. The profile itself stays intact, so you can re-attach it later. Avoid editing the profile to neutralize rules; that wipes the audit trail.
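
The precedence rule described above for overlapping matches (block beats redact beats warn, with every matching rule still recorded) can be sketched in a few lines. The `resolve` function and the rule names below are hypothetical and only illustrate the ordering, not how the platform stores or traces matches.

```python
# Higher number means stronger action.
PRECEDENCE = {"warn": 0, "redact": 1, "block": 2}


def resolve(matches):
    """Pick the action to apply for one span.

    matches: list of (rule_name, action) pairs that matched the same span.
    Returns (applied_action, audit), where audit lists every matching rule name.
    """
    if not matches:
        return None, []
    applied = max((action for _, action in matches), key=PRECEDENCE.__getitem__)
    audit = [name for name, _ in matches]
    return applied, audit


applied, audit = resolve([("pii_email", "redact"), ("competitor_names", "warn")])
print(applied)  # redact
print(audit)    # ['pii_email', 'competitor_names']
```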