Guardrails

Every MCP tool call passes through a guardrail evaluation layer before execution. This layer acts as both the Policy Enforcement Point (PEP) and the Policy Decision Point (PDP): it evaluates your configured rules deterministically against the tool call’s arguments and returns one of four decisions: allow, deny, require approval, or mask fields from the result.

Guardrails are the primary safety mechanism between oHallo’s AI agents and your MCP servers. They are not probabilistic or LLM-based. Every rule is a deterministic condition that either matches or does not match.

When an agent calls a tool, MCP Hub evaluates guardrails in this order:

  1. Account isolation — hard deny if the connection does not belong to the requesting account.
  2. Rate limit check — deny if the account has exceeded the rate limit for this tool.
  3. Guardrail rules — evaluate all enabled guardrails that match the tool name. Deny rules take precedence over all other types (forbid-overrides-permit).
  4. Default risk policy — if no guardrail matched and the tool is classified as destructive, require approval automatically.
  5. Default allow — if nothing blocked the call, allow it.

On any evaluation error (malformed condition, missing field, unexpected type), the system denies the call. This is fail-closed by design.
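
The evaluation order above can be sketched in Python. This is a minimal illustration under stated assumptions, not MCP Hub's actual implementation: the `Guardrail` dataclass, the `evaluate` function, and all of its parameters are hypothetical names, and conditions are modeled as plain Python callables rather than filtrex expressions.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Guardrail:
    tool_name: str      # "*" matches every tool on the connection
    policy_type: str    # "deny", "approval_gate", "mask", or "escalate"
    condition: Optional[Callable[[dict], bool]] = None  # None matches every call
    enabled: bool = True


def evaluate(
    requesting_account: str,
    connection_account: str,
    rate_limited: bool,
    tool_name: str,
    args: dict,
    guardrails: List[Guardrail],
    risk: str = "write",
) -> str:
    """Return "allow", "deny", or "require_approval" for one tool call."""
    try:
        # 1. Account isolation: hard deny on a connection owned by another account.
        if requesting_account != connection_account:
            return "deny"
        # 2. Rate limit check.
        if rate_limited:
            return "deny"
        # 3. Guardrail rules: collect the policy types of every matching rule.
        matched = [
            g.policy_type
            for g in guardrails
            if g.enabled
            and g.tool_name in (tool_name, "*")
            and (g.condition is None or g.condition(args))
        ]
        if "deny" in matched:
            return "deny"  # forbid-overrides-permit
        if "approval_gate" in matched:
            return "require_approval"
        # 4. Default risk policy: applies only when no guardrail matched.
        if not matched and risk == "destructive":
            return "require_approval"
        # 5. Default allow (mask and escalate rules let the call execute).
        return "allow"
    except Exception:
        # Fail closed: any evaluation error denies the call.
        return "deny"


gate = Guardrail("submit_return", "approval_gate", lambda a: a["refund_amount"] > 500)
print(evaluate("acct-1", "acct-1", False, "submit_return", {"refund_amount": 600}, [gate]))
# prints "require_approval"
print(evaluate("acct-1", "acct-1", False, "submit_return", {}, [gate]))
# prints "deny" (the missing field raises, and the error is treated as a denial)
```

Wrapping the whole pipeline in a single `try` block is one simple way to get the fail-closed behavior: there is no code path where an exception results in an allowed call.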

Each guardrail has a policyType that determines what happens when the rule’s condition matches.

| Policy type | Execution | Behavior |
| --- | --- | --- |
| deny | Blocked | Tool call is rejected immediately. The agent receives an error and cannot retry. |
| approval_gate | Blocked | Tool call is paused. An approval request is created and the conversation workflow waits for a human to approve or reject. |
| mask | Allowed | Tool call executes normally. Specified fields are redacted from the result before it reaches the agent. |
| escalate | Allowed | Tool call executes normally. After execution, the result is evaluated against the condition. If it matches, an escalation signal is attached for post-execution review. |

When multiple guardrails match the same tool call, the evaluation follows forbid-overrides-permit: a single deny match blocks the call regardless of any other matching rules. An approval_gate match pauses the call even if every other matching rule (mask or escalate) would let it execute.

All guardrail endpoints require Kinde JWT authentication via Authorization: Bearer <token>. Guardrails are scoped to an MCP connection — each guardrail applies to a specific tool (or all tools) on a specific connection.

Returns all guardrails for a connection.

curl -X GET https://api.ohallo.eu/api/mcp-connections/{connectionId}/guardrails \
-H "Authorization: Bearer <token>"

Response:

[
{
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"tenantId": "t1234567-abcd-efgh-ijkl-mnopqrstuvwx",
"mcpConnectionId": "c9876543-dcba-fedc-ba98-765432109876",
"toolName": "submit_return",
"policyType": "approval_gate",
"config": {
"condition": "args.refund_amount > 500",
"description": "Require approval for returns over 500 EUR"
},
"enabled": true,
"createdAt": "2026-04-10T14:30:00.000Z",
"updatedAt": "2026-04-10T14:30:00.000Z"
}
]

Creates a new guardrail on a connection. If a condition is provided, it is validated as a filtrex expression at creation time — invalid expressions return a 422 error.

curl -X POST https://api.ohallo.eu/api/mcp-connections/{connectionId}/guardrails \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"toolName": "submit_return",
"policyType": "approval_gate",
"condition": "args.refund_amount > 500",
"config": {
"description": "Require approval for returns over 500 EUR"
},
"enabled": true
}'

Request body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| toolName | string | Yes | The tool this guardrail applies to. Use * to match all tools on the connection. |
| policyType | string | Yes | One of: approval_gate, deny, mask, escalate. |
| condition | string | No | A filtrex expression evaluated against the tool call arguments. If omitted, the guardrail matches every call to the tool. |
| config | object | No | Additional configuration. Use description to explain the rule, maskFields for mask type, escalationType for escalate type. |
| enabled | boolean | No | Defaults to true. Set to false to disable the guardrail without deleting it. |

Response (201):

{
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"tenantId": "t1234567-abcd-efgh-ijkl-mnopqrstuvwx",
"mcpConnectionId": "c9876543-dcba-fedc-ba98-765432109876",
"toolName": "submit_return",
"policyType": "approval_gate",
"config": {
"condition": "args.refund_amount > 500",
"description": "Require approval for returns over 500 EUR"
},
"enabled": true,
"createdAt": "2026-04-10T14:30:00.000Z",
"updatedAt": "2026-04-10T14:30:00.000Z"
}

Error (422) — invalid condition:

{
"error": "Invalid condition expression",
"details": "Unexpected token at position 12"
}
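
For callers that prefer a script over curl, the create request and its 422 error can be handled roughly as follows. This is a sketch using only the Python standard library; `build_create_guardrail_request` and `create_guardrail` are hypothetical helper names, while the URL, field names, and error body shape are taken from the examples above.

```python
import json
import urllib.error
import urllib.request
from typing import Optional

API_BASE = "https://api.ohallo.eu"


def build_create_guardrail_request(
    connection_id: str,
    token: str,
    tool_name: str,
    policy_type: str,
    condition: Optional[str] = None,
    config: Optional[dict] = None,
    enabled: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a guardrail."""
    body = {"toolName": tool_name, "policyType": policy_type, "enabled": enabled}
    if condition is not None:
        body["condition"] = condition  # leaving it out means "match every call"
    if config is not None:
        body["config"] = config
    return urllib.request.Request(
        f"{API_BASE}/api/mcp-connections/{connection_id}/guardrails",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_guardrail(request: urllib.request.Request) -> dict:
    """Send the request; surface a 422 as a ValueError with the API's message."""
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read())
    except urllib.error.HTTPError as err:
        if err.code == 422:
            payload = json.loads(err.read())
            raise ValueError(f"{payload['error']}: {payload.get('details')}") from err
        raise
```

Separating request construction from sending keeps the payload easy to inspect or log before anything reaches the API.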

Updates an existing guardrail. Only the fields you include in the request body are changed.

curl -X PATCH https://api.ohallo.eu/api/mcp-call-guardrails/{guardrailId} \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"condition": "args.refund_amount > 1000",
"config": {
"description": "Require approval for returns over 1000 EUR"
}
}'

Request body (all fields optional):

| Field | Type | Description |
| --- | --- | --- |
| toolName | string | Change the tool this guardrail applies to. |
| policyType | string | Change the policy type. |
| condition | string | Change the condition expression. Validated at update time. |
| config | object | Replace the configuration object. |
| enabled | boolean | Enable or disable the guardrail. |

Response (200): The updated guardrail object.

Error (404): Guardrail not found or does not belong to your account.

Deletes a guardrail.

curl -X DELETE https://api.ohallo.eu/api/mcp-call-guardrails/{guardrailId} \
-H "Authorization: Bearer <token>"

Response (200):

{
"ok": true
}

Replaces all guardrails for a connection at once. Existing guardrails on the connection are deleted and the provided list is inserted atomically. This is the endpoint the oHallo dashboard uses when saving guardrail configuration.

curl -X PUT https://api.ohallo.eu/api/mcp-connections/{connectionId}/policies \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"guardrails": [
{
"toolName": "submit_return",
"policyType": "approval_gate",
"config": {
"condition": "args.refund_amount > 500",
"description": "Require approval for returns over 500 EUR"
},
"enabled": true
},
{
"toolName": "delete_customer",
"policyType": "deny",
"config": {
"description": "Never allow customer deletion via AI"
}
}
]
}'

Response (200): The full list of created guardrail objects.

Guardrail conditions use filtrex, a sandboxed expression language that evaluates against the tool call arguments. No eval or function constructors are used — expressions are compiled to a safe AST.

The oHallo dashboard provides a visual condition builder that compiles to filtrex automatically. Tenants enter a field name, select an operator, and type a value — the args. prefix and expression syntax are handled by the UI. The API accepts the compiled filtrex expression directly.
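
The field, operator, and value inputs of such a builder map onto an expression string in a straightforward way. The sketch below is an illustration only, not the dashboard's code: `build_condition` is a hypothetical helper, and the validation and string-quoting rules are conservative assumptions rather than documented filtrex behavior.

```python
import re
from typing import Union

COMPARISON_OPERATORS = {"==", "!=", "<", ">", "<=", ">="}
FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def build_condition(field: str, operator: str, value: Union[int, float, str]) -> str:
    """Turn builder inputs into an expression such as: args.refund_amount > 500"""
    if not FIELD_NAME.fullmatch(field):
        raise ValueError(f"unsupported field name: {field!r}")
    if operator not in COMPARISON_OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    if isinstance(value, bool):
        # bool is a subclass of int in Python; this sketch does not guess a mapping.
        raise ValueError("boolean values are not handled in this sketch")
    if isinstance(value, (int, float)):
        literal = repr(value)
    else:
        # Assumption: reject characters that would need escaping inside a string.
        if '"' in value or "\\" in value:
            raise ValueError("quotes and backslashes are not supported")
        literal = f'"{value}"'
    # The builder adds the args. prefix so the user never types it.
    return f"args.{field} {operator} {literal}"


print(build_condition("refund_amount", ">", 500))  # args.refund_amount > 500
print(build_condition("country", "==", "US"))      # args.country == "US"
```

Restricting field names and operators to a whitelist means user input never reaches the expression string unchecked.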

Conditions access tool call arguments through the args namespace using dot notation:

| Expression | Description |
| --- | --- |
| args.refund_amount > 500 | Matches when the refund amount exceeds 500 |
| args.quantity > 100 | Matches when quantity exceeds 100 |
| args.country == "US" | Matches when the country is US |
| args.priority == "critical" | Matches when priority is critical |

You can combine conditions with logical operators:

| Operator | Example |
| --- | --- |
| and | args.refund_amount > 500 and args.currency == "EUR" |
| or | args.priority == "critical" or args.priority == "urgent" |
| not | not (args.is_verified == 1) |

Comparisons: ==, !=, <, >, <=, >=.

Arithmetic: +, -, *, /.

If a guardrail has no condition (the field is omitted), it matches every call to the specified tool. Use this for blanket rules like “never allow deletion” or “always require approval for this tool.”

If a condition references a field that does not exist in the tool call arguments, or if the expression is malformed, the evaluation fails and the tool call is denied. This is fail-closed — a broken condition blocks execution rather than silently allowing it.

Every tool on an MCP connection has a risk classification that determines default behavior when no guardrail matches. The three levels are:

| Classification | Default behavior |
| --- | --- |
| read | Allow. No approval required. |
| write | Allow. No approval required (but guardrails can still block). |
| destructive | Require approval, even if no approval_gate guardrail exists. |

If you do not set a classification for a tool, it defaults to write.

curl -X PATCH https://api.ohallo.eu/api/mcp-connections/{connectionId}/tool-risk-classifications \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"classifications": {
"get_order": "read",
"update_order": "write",
"delete_order": "destructive",
"submit_return": "write",
"cancel_subscription": "destructive"
}
}'

Response (200): The full connection object with the updated classifications.

Risk classifications and guardrails work together. A read tool with a deny guardrail will still be blocked when the condition matches. A destructive tool with an existing approval_gate guardrail uses that guardrail’s condition — the default “require approval” only applies when no guardrail matched at all.

When a tool call triggers an approval_gate guardrail (or the default destructive-tool policy), the following sequence occurs:

The agent receives a response indicating that the tool call needs human approval:

{
"success": false,
"error_type": "approval_required",
"reason": "Require approval for returns over 500 EUR",
"policy_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"approval_request_id": "req-5678-abcd-efgh-ijkl-9012",
"expires_at": "2026-04-11T14:30:00.000Z",
"policy_decision": "require_approval"
}

The conversation workflow creates an attention item visible in the oHallo dashboard and waits for a human decision. The customer is not sent a reply while approval is pending.

A team member reviews the pending tool call in the dashboard. They see the tool name, arguments, the guardrail that triggered, and the reason. They approve or reject via the dashboard UI or the API:

curl -X POST https://api.ohallo.eu/api/conversations/{conversationId}/approve \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"approvalRequestId": "req-5678-abcd-efgh-ijkl-9012",
"approved": true,
"reviewerNote": "Confirmed with warehouse, return is valid"
}'

Request body:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| approvalRequestId | string | Yes | The ID from the approval_required response. |
| approved | boolean | Yes | true to approve, false to reject. |
| reviewerNote | string | No | Optional note explaining the decision. Stored for audit. |

Response (200):

{
"ok": true
}

After approval, the conversation workflow resumes. The agent retries the tool call with the same arguments. MCP Hub finds the approved request record and allows the call through without triggering the guardrail again.

After rejection, the workflow resumes and the agent is informed that the action was denied. It composes a response to the customer accordingly.

Approval requests expire after 24 hours. If no decision is made within that window, the request expires and the workflow treats it as a rejection.
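
A client holding an approval_required response can check the expiry window locally from its expires_at field. The following is a small sketch with hypothetical helper names (`parse_timestamp`, `is_expired`); treating the exact expiry instant as already expired is an assumption, since the boundary behavior is not specified here.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_timestamp(value: str) -> datetime:
    # Rewrite the trailing "Z" as an explicit UTC offset so that
    # datetime.fromisoformat accepts it on older Python versions too.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_expired(approval_response: dict, now: Optional[datetime] = None) -> bool:
    """True once the current time has reached the response's expires_at."""
    now = now or datetime.now(timezone.utc)
    return now >= parse_timestamp(approval_response["expires_at"])


response = {
    "error_type": "approval_required",
    "expires_at": "2026-04-11T14:30:00.000Z",
}
before = datetime(2026, 4, 11, 14, 29, tzinfo=timezone.utc)
after = datetime(2026, 4, 11, 14, 31, tzinfo=timezone.utc)
print(is_expired(response, before))  # False
print(is_expired(response, after))   # True
```

The server remains the source of truth for expiry; a local check like this is only useful for deciding whether it is still worth prompting a reviewer.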

If the agent retries a tool call that already has a pending approval request with the same arguments, MCP Hub returns the existing request instead of creating a duplicate. Arguments are matched by a SHA-256 hash of the sorted argument keys and values.
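
One way to compute an order-independent argument hash of this kind is to serialize the arguments with sorted keys before hashing. The sketch below is an assumption about the general technique, not MCP Hub's exact canonicalization: `arguments_hash`, `find_or_create_request`, and the in-memory `pending` store are hypothetical.

```python
import hashlib
import json


def arguments_hash(args: dict) -> str:
    """SHA-256 over a canonical JSON form, so key order does not matter."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# (tool name, argument hash) -> approval request ID; a stand-in for real storage.
pending = {}


def find_or_create_request(tool_name: str, args: dict, new_request_id: str) -> str:
    """Return the existing pending request for these arguments, or register a new one."""
    key = (tool_name, arguments_hash(args))
    return pending.setdefault(key, new_request_id)


first = find_or_create_request("submit_return", {"order": "A1", "refund_amount": 600}, "req-1")
retry = find_or_create_request("submit_return", {"refund_amount": 600, "order": "A1"}, "req-2")
print(first, retry)  # req-1 req-1
```

Because `json.dumps` with `sort_keys=True` also sorts nested objects, two calls that differ only in key order at any depth produce the same hash.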

Prevent the AI from ever calling delete_customer, regardless of arguments:

curl -X POST https://api.ohallo.eu/api/mcp-connections/{connectionId}/guardrails \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"toolName": "delete_customer",
"policyType": "deny",
"config": {
"description": "Customer deletion is never allowed via AI agents"
}
}'

Require human approval for refunds over 500 EUR:

curl -X POST https://api.ohallo.eu/api/mcp-connections/{connectionId}/guardrails \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"toolName": "submit_return",
"policyType": "approval_gate",
"condition": "args.refund_amount > 500",
"config": {
"description": "Require approval for returns over 500 EUR"
}
}'

Allow the tool call but redact Social Security numbers and bank account details from the result:

curl -X POST https://api.ohallo.eu/api/mcp-connections/{connectionId}/guardrails \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"toolName": "get_customer",
"policyType": "mask",
"config": {
"description": "Redact sensitive financial data",
"maskFields": ["ssn", "bank_account", "credit_card"]
}
}'
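
To make the effect of maskFields concrete, here is a rough sketch of redacting named fields from a tool result before it is handed to the agent. The `mask_result` function, the "[REDACTED]" placeholder, and the choice to match field names at any nesting depth are all assumptions for illustration; the actual redaction format may differ.

```python
from typing import Any, Iterable


def mask_result(result: Any, mask_fields: Iterable[str], placeholder: str = "[REDACTED]") -> Any:
    """Return a copy of result with every listed field replaced by a placeholder."""
    fields = set(mask_fields)
    if isinstance(result, dict):
        return {
            key: placeholder if key in fields else mask_result(value, fields, placeholder)
            for key, value in result.items()
        }
    if isinstance(result, list):
        return [mask_result(item, fields, placeholder) for item in result]
    return result  # scalars pass through unchanged


customer = {
    "name": "Ada",
    "ssn": "123-45-6789",
    "payment": {"bank_account": "DE00 0000", "currency": "EUR"},
}
print(mask_result(customer, ["ssn", "bank_account", "credit_card"]))
# {'name': 'Ada', 'ssn': '[REDACTED]', 'payment': {'bank_account': '[REDACTED]', 'currency': 'EUR'}}
```

The function builds a new structure instead of mutating its input, so the unmasked result stays available for audit logging if needed.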

Allow the tool call but flag orders above 10,000 EUR for post-execution review:

curl -X POST https://api.ohallo.eu/api/mcp-connections/{connectionId}/guardrails \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{
"toolName": "create_order",
"policyType": "escalate",
"condition": "args.total > 10000",
"config": {
"description": "Flag high-value orders for review",
"escalationType": "high_value_order"
}
}'

  • Build an MCP Server: create the server your guardrails will protect
  • Tool Schema: define the input schemas that guardrail conditions evaluate against