Documentation — SafeClaw

Overview

SafeClaw is a neurosymbolic governance layer for autonomous AI agents. It validates every tool call, message, and action against OWL ontologies and SHACL constraints before execution.

The system has two components:

openclaw-safeclaw-plugin — a TypeScript plugin that intercepts OpenClaw events and forwards them to the SafeClaw service via HTTP.
safeclaw-service — a Python FastAPI service that runs the constraint pipeline against a knowledge graph of ontologies, policies, and user preferences.

The plugin is a thin client with no governance logic of its own. All validation happens server-side, so policies can be updated without redeploying the agent.

How SafeClaw Controls OpenClaw

The plugin registers OpenClaw event hooks, each of which intercepts one lifecycle event and forwards it to a SafeClaw service endpoint:

Hook	Endpoint	Purpose	Behavior
`before_tool_call`	`/evaluate/tool-call`	THE GATE — validates every tool call	Blocks execution on violation
`before_prompt_build`	`/context/build`	Context injection into system prompt	Prepends governance context
`message_sending`	`/evaluate/message`	Outbound message governance	Cancels message on violation
`message_received`	`/evaluate/inbound-message`	Inbound prompt-injection assessment	Records risk flags and warnings
`subagent_spawning`	`/evaluate/subagent-spawn`	Subagent spawn governance	Blocks delegation bypass attempts
`subagent_ended`	`/record/subagent-ended`	Subagent lifecycle tracking	Records completion for audit context
`session_start`	`/session/start`	Session lifecycle	Initializes session-scoped governance state
`session_end`	`/session/end`	Session lifecycle	Cleans up per-session state
`after_tool_call`	`/record/tool-result`	Feedback loop for dependency tracking	Fire-and-forget
`llm_input`	`/log/llm-input`	Audit logging of LLM inputs	Fire-and-forget
`llm_output`	`/log/llm-output`	Audit logging of LLM outputs	Fire-and-forget

Enforcement Modes

Mode	Behavior
`enforce`	Block actions that violate constraints (default)
`warn-only`	Log warnings but allow execution
`audit-only`	Log server-side only, no client-side action
`disabled`	Completely disable SafeClaw checks

Fail Modes

Mode	Behavior
`open`	Allow the action when the service call fails (default). Favors availability.
`closed`	Block the action when the service is unavailable. Favors safety.
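
As a rough illustration of how the two settings interact on the client side, the Python sketch below combines an enforcement mode with a fail mode. The function name `resolve_action` and the meaning of its `verdict` argument are assumptions made for this example (the real plugin is written in TypeScript). The sketch also assumes that the fail mode only matters under `enforce`, which this page does not state.

```python
from typing import Optional


def resolve_action(enforcement: str, fail_mode: str, verdict: Optional[bool]) -> str:
    """Return 'allow', 'warn', or 'block' (hypothetical helper).

    verdict: True if the service reported a violation, False if it did not,
             None if the service could not be reached.
    """
    if enforcement == "disabled":
        return "allow"  # checks are skipped entirely
    if verdict is None:  # service failure: the fail mode decides
        # Assumption: fail-closed only blocks when the mode is 'enforce'.
        if enforcement == "enforce" and fail_mode == "closed":
            return "block"
        return "allow"
    if not verdict:
        return "allow"
    if enforcement == "enforce":
        return "block"
    if enforcement == "warn-only":
        return "warn"
    return "allow"  # audit-only: logged server-side, no client-side action


print(resolve_action("enforce", "closed", None))    # block
print(resolve_action("warn-only", "open", True))    # warn
```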

OWL (Web Ontology Language) defines a formal class hierarchy of actions your agent can take. SHACL (Shapes Constraint Language) defines structural constraints that validate the shape of each action's data.

Together, they give SafeClaw a machine-readable understanding of what each action means, how risky it is, and what constraints apply — not just pattern matching, but genuine semantic reasoning.

At startup, SafeClaw pre-computes the full rdfs:subClassOf hierarchy via pure-Python SPARQL traversal (no Java required). This enables hierarchy-aware policy checking — blocking ShellAction automatically blocks all subclasses like GitPush and ForcePush. Custom ontology extensions inherit risk levels and constraints without Python code changes.

Action Class Hierarchy

Action
├── FileAction
│   ├── ReadFile          (Low, reversible, LocalOnly)
│   ├── WriteFile         (Medium, reversible, LocalOnly)
│   ├── EditFile          (Medium, reversible, LocalOnly)
│   ├── DeleteFile        (Critical, irreversible, LocalOnly)
│   ├── ListFiles
│   └── SearchFiles
├── ShellAction
│   ├── ExecuteCommand    (High, reversible, LocalOnly)
│   ├── GitCommit         (Medium, reversible, LocalOnly)
│   ├── GitPush           (High, irreversible, SharedState)
│   ├── ForcePush         (Critical, irreversible, SharedState)
│   ├── GitResetHard      (Critical, irreversible, LocalOnly)
│   ├── RunTests          (Low, reversible, LocalOnly)
│   ├── DockerCleanup     (High, irreversible, LocalOnly)
│   └── PackagePublish    (Critical, irreversible, ExternalWorld)
├── NetworkAction
│   ├── WebFetch          (Medium, reversible, ExternalWorld)
│   ├── WebSearch         (Low, reversible, ExternalWorld)
│   └── NetworkRequest
├── MessageAction
│   └── SendMessage       (High, irreversible, ExternalWorld)
└── BrowserAction         (Medium, reversible, ExternalWorld)
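
The hierarchy-aware blocking described above can be sketched with plain dictionaries. This is an illustration only: SafeClaw is described as computing the closure with SPARQL over rdfs:subClassOf, whereas the sketch hard-codes a few parent links from the tree and walks them in Python. The names `PARENT`, `ancestors`, `CLOSURE`, and `is_blocked` are hypothetical.

```python
# Child -> parent links copied from part of the tree above (illustrative subset).
PARENT = {
    "FileAction": "Action",
    "ShellAction": "Action",
    "ReadFile": "FileAction",
    "DeleteFile": "FileAction",
    "GitPush": "ShellAction",
    "ForcePush": "ShellAction",
}


def ancestors(cls: str) -> list[str]:
    """Return cls followed by all of its superclasses, nearest first."""
    chain = [cls]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


# Pre-compute the closure once, mirroring the startup step.
CLOSURE = {cls: set(ancestors(cls)) for cls in set(PARENT) | set(PARENT.values())}


def is_blocked(cls: str, blocked: set[str]) -> bool:
    """A class is blocked if it or any of its superclasses is in the blocked set."""
    return bool(CLOSURE.get(cls, {cls}) & blocked)


print(is_blocked("ForcePush", {"ShellAction"}))  # True
print(is_blocked("ReadFile", {"ShellAction"}))   # False
```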

Risk Levels

Level	Meaning	Examples
LowRisk	Safe, read-only or easily undone	ReadFile, RunTests, WebSearch
MediumRisk	Modifies local state but reversible	WriteFile, EditFile, GitCommit, WebFetch
HighRisk	Affects shared state or hard to reverse	ExecuteCommand, GitPush, SendMessage
CriticalRisk	Irreversible or affects external systems	DeleteFile, ForcePush, GitResetHard, PackagePublish

Scopes

Scope	Meaning
`LocalOnly`	Affects only the local filesystem
`SharedState`	Affects shared resources (e.g., git remotes)
`ExternalWorld`	Reaches external systems (APIs, messages, network)

The 11-Step Constraint Pipeline

Every tool call passes through these steps in order. Evaluation stops at the first violation, and the call is blocked.

0. Agent Governance

Token authentication, kill switch check, delegation bypass detection.

Example block reasons:

Token invalid
Agent killed
Delegation bypass detected

1. Action Classification

Maps tool name + parameters to an ontology class. Assigns risk level, reversibility, and scope. Covers all common tool variants: read, write, edit, delete, remove, unlink, trash, shell, and more.

Example block reasons:

Unknown tool type
Unclassifiable action

2. Role-Based Access Control

Checks if the agent's role allows the classified action and resource path. Resource paths are extracted from multiple param key variants (file_path, path, filepath, dest, target, source, and more). Uses ontology hierarchy: denying a parent class denies all subclasses. Temporary permissions are checked first.

Example block reasons:

Role 'researcher' does not allow action 'WriteFile'
Access to /secrets/** denied

3. SHACL Validation

Validates the action's RDF graph against shape constraints from the shapes/ directory.

Example block reasons:

ForbiddenCommandShape violated
CriticalPathShape violated

4. Policy Check

Evaluates against policy rules from the knowledge graph (prohibitions, obligations). NemoClaw network and filesystem rules are enforced here when loaded. Hierarchy-aware: blocking a parent class blocks all subclasses.

Example block reasons:

Environment files may contain secrets
Force push can destroy shared history
Not in NemoClaw network allowlist

5. Preference Check

User-specific preferences like 'confirm before delete' or 'never modify paths'.

Example block reasons:

User requires confirmation before delete
Path matches neverModifyPaths

6. Dependency Check

Validates that prerequisites are met. For example, tests must pass before a git push.

Example block reasons:

RunTests must succeed before GitPush

7. Temporal Check

Time-based constraints such as not-before and not-after windows.

Example block reasons:

Action not permitted at this time
Deploy window has closed

8. Rate Limit Check

Per-session rate limiting backed by persistent counters.

Example block reasons:

Rate limit exceeded: 100 actions/hour

9. Derived Rules

Combined rules from multiple constraints. May require user confirmation rather than a hard block.

Example block reasons:

Cumulative risk threshold exceeded
Transitive prohibition applies

10. Hierarchy Rate Limit

Multi-agent hierarchy-wide rate limiting across parent and child agents.

Example block reasons:

Hierarchy rate limit exceeded
Child agents exceeded parent budget
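
The first-violation behavior of the pipeline can be pictured as a loop over ordered checks. The sketch below is a simplified model, not SafeClaw's engine: the `Check` type, the two toy checks, the `rate_limit` step name, and `run_pipeline` are invented for illustration, and the real steps operate on an RDF graph rather than a dict.

```python
from typing import Callable, Optional

# A check returns a block reason, or None when the action passes.
Check = Callable[[dict], Optional[str]]


def role_check(action: dict) -> Optional[str]:
    if action["ontology_class"] in action.get("denied_classes", ()):
        return f"Role does not allow action '{action['ontology_class']}'"
    return None


def rate_limit_check(action: dict) -> Optional[str]:
    if action.get("actions_this_hour", 0) >= 100:
        return "Rate limit exceeded: 100 actions/hour"
    return None


def run_pipeline(action: dict, checks: list[tuple[str, Check]]) -> dict:
    """Run checks in order and stop at the first violation."""
    for step, check in checks:
        reason = check(action)
        if reason is not None:
            return {"block": True, "reason": reason, "constraintStep": step}
    return {"block": False, "reason": "All constraints satisfied", "constraintStep": ""}


PIPELINE = [("role_check", role_check), ("rate_limit", rate_limit_check)]
result = run_pipeline(
    {"ontology_class": "WriteFile", "denied_classes": ["WriteFile"], "actions_this_hour": 500},
    PIPELINE,
)
print(result["constraintStep"])  # role_check (the rate-limit check never runs)
```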

Built-in Policies

Default prohibitions and obligations defined in the ontology:

Prohibitions

Policy	Type	Pattern	Reason
`NoEnvFiles`	Path	`.*\.env.*`	Environment files may contain secrets
`NoCredentialFiles`	Path	`.*(credentials\|secrets\|tokens).*`	Credential files contain sensitive data
`NoForcePush`	Command	`git push.*--force`	Force push can destroy shared history
`NoRootDelete`	Command	`rm\s+-rf\s+/`	Recursive deletion of root paths is prohibited
`NoResetHard`	Command	`git reset --hard`	Hard reset can destroy uncommitted work
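
To see how the command patterns behave, the sketch below applies the three Command prohibitions from the table with Python's `re.search`. Unanchored searching is an assumption of this example, and `first_violation` is a hypothetical helper rather than part of SafeClaw.

```python
import re
from typing import Optional

# Command patterns and reasons copied from the table above.
COMMAND_PROHIBITIONS = [
    ("NoForcePush", r"git push.*--force", "Force push can destroy shared history"),
    ("NoRootDelete", r"rm\s+-rf\s+/", "Recursive deletion of root paths is prohibited"),
    ("NoResetHard", r"git reset --hard", "Hard reset can destroy uncommitted work"),
]


def first_violation(command: str) -> Optional[tuple[str, str]]:
    """Return (policy, reason) for the first matching prohibition, else None."""
    for name, pattern, reason in COMMAND_PROHIBITIONS:
        if re.search(pattern, command):
            return name, reason
    return None


print(first_violation("git push origin main --force"))
print(first_violation("git push origin main"))
```

With an unanchored search, `rm\s+-rf\s+/` also matches commands such as `rm -rf /tmp/build`, because any absolute path starts with `/`. Whether SafeClaw anchors or otherwise narrows its patterns is not stated here.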

Obligations

Policy	Action	Requires	Reason
`TestBeforePush`	GitPush	RunTests (must succeed first)	All pushes must pass tests
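
An obligation such as TestBeforePush can be modeled as a lookup from an action class to its prerequisite. The `DependencyTracker` class below is a hypothetical sketch. It assumes that a later failed test run invalidates an earlier success, which this page does not specify.

```python
from typing import Optional


class DependencyTracker:
    """Tracks which action classes have succeeded in a session (illustrative)."""

    # Obligation from the table above: GitPush requires a successful RunTests.
    REQUIRES = {"GitPush": "RunTests"}

    def __init__(self) -> None:
        self._succeeded: set[str] = set()

    def record_result(self, ontology_class: str, success: bool) -> None:
        """Feed in outcomes, for example from the after_tool_call hook."""
        if success:
            self._succeeded.add(ontology_class)
        else:
            # Assumption: a failure cancels any earlier success.
            self._succeeded.discard(ontology_class)

    def check(self, ontology_class: str) -> Optional[str]:
        """Return a block reason if a prerequisite is missing, else None."""
        needed = self.REQUIRES.get(ontology_class)
        if needed and needed not in self._succeeded:
            return f"{needed} must succeed before {ontology_class}"
        return None


tracker = DependencyTracker()
print(tracker.check("GitPush"))  # RunTests must succeed before GitPush
tracker.record_result("RunTests", True)
print(tracker.check("GitPush"))  # None
```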

NemoClaw Sandbox Support

SafeClaw can ingest NemoClaw YAML sandbox policies and translate them into RDF triples in the knowledge graph. The policy checker then uses those triples alongside ordinary SafeClaw policies.

What Gets Enforced

Network allowlists — outbound requests are checked against NemoClaw host, port, and protocol rules.
Filesystem boundaries — file actions are checked against NemoClaw path prefixes and access modes such as read-only, read-write, and denied.
Sandbox policy shape — /evaluate/sandbox-policy validates sandbox configs for required tool and filesystem sections.

Policy Loading

NemoClaw activates when SAFECLAW_NEMOCLAW_ENABLED=true is set, when SAFECLAW_NEMOCLAW_POLICY_DIR points to a policy directory, when NEMOCLAW_POLICY_PATH points to a policy file or directory, or when OPENSHELL_SANDBOX exposes a policies/ directory. Policies are re-ingested on POST /reload.

Plugin Behavior

The OpenClaw plugin detects NemoClaw sandboxes with OPENSHELL_SANDBOX, rewrites localhost service URLs to the container-to-host bridge when needed, and ships a bundled egress policy for allowing SafeClaw API traffic.

Roles & Permissions

SafeClaw ships with three roles. Custom roles can be defined in Turtle files.

Role	Autonomy	Enforcement	Allowed Actions	Denied Actions	Denied Paths
Admin	full	warn-only	All	None	None
Developer	moderate	enforce	All (except denied)	ForcePush, DeleteFile, GitResetHard	`/secrets/`, `/etc/`
Researcher	supervised	enforce	ReadFile, ListFiles, SearchFiles	WriteFile, EditFile, DeleteFile, GitPush, ForcePush, ShellAction, SendMessage	N/A
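
The role table can be read as data. The sketch below encodes the Developer and Researcher rows and checks an action class and an optional path against them. It is a simplification: it ignores the ontology hierarchy and temporary permissions, it treats denied paths as simple prefixes (an assumption), and `ROLES` and `role_check` are hypothetical names.

```python
from typing import Optional

ROLES = {
    "developer": {
        "allowed": None,  # None stands for "all except denied"
        "denied": {"ForcePush", "DeleteFile", "GitResetHard"},
        "denied_paths": ("/secrets/", "/etc/"),
    },
    "researcher": {
        "allowed": {"ReadFile", "ListFiles", "SearchFiles"},
        "denied": set(),
        "denied_paths": (),
    },
}


def role_check(role: str, ontology_class: str, path: Optional[str] = None) -> Optional[str]:
    """Return a block reason, or None if the role permits the action."""
    spec = ROLES[role]
    not_allowed = spec["allowed"] is not None and ontology_class not in spec["allowed"]
    if ontology_class in spec["denied"] or not_allowed:
        return f"Role '{role}' does not allow action '{ontology_class}'"
    if path and any(path.startswith(prefix) for prefix in spec["denied_paths"]):
        return f"Access to {path} denied"
    return None


print(role_check("developer", "DeleteFile"))
print(role_check("developer", "WriteFile", "/etc/passwd"))
```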

User Preferences

Per-user preferences are stored as RDF triples and loaded from ~/.safeclaw/ or the ontology's users/ directory.

Safety Preferences

Property	Type	Default	Effect
`autonomyLevel`	string	moderate	Controls agent independence: full \| high \| moderate \| cautious \| supervised
`confirmBeforeDelete`	boolean	true	Require confirmation before any delete action
`confirmBeforePush`	boolean	true	Require confirmation before git push
`confirmBeforeSend`	boolean	true	Require confirmation before sending messages
`neverModifyPaths`	string	(none)	Glob patterns for paths the agent must never modify
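
As a small illustration of glob-based path protection, the sketch below matches a path against `neverModifyPaths`-style patterns with Python's `fnmatch` module. Whether SafeClaw uses these exact glob semantics is not stated here, and `violates_never_modify` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def violates_never_modify(path: str, patterns: list[str]) -> bool:
    """True if the path matches any protected glob pattern (case-sensitive)."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


patterns = ["/etc/*", "*.lock"]
print(violates_never_modify("/etc/hosts", patterns))        # True
print(violates_never_modify("/home/user/notes.txt", patterns))  # False
```

Note that in `fnmatch`, `*` also matches across `/`, unlike shell globbing, so `/etc/*` covers nested paths as well.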

Coding Preferences

Property	Type	Effect
`preferredLanguage`	string	Programming language preference for agent suggestions
`preferredTestFramework`	string	Testing framework preference
`maxFilesPerCommit`	integer	Maximum files allowed per commit

Communication Preferences

Property	Type	Effect
`toneOfVoice`	string	Communication tone/style preference
`maxMessageLength`	integer	Maximum message length constraint
`neverContactList`	string	Email/contact patterns to never contact

Optional LLM Assist

SafeClaw includes an optional OpenAI-compatible LLM layer. Set SAFECLAW_LLM_PROVIDER and SAFECLAW_LLM_API_KEY to use providers such as Mistral, OpenAI, Gemini, Groq, xAI, DeepSeek, Kimi, Qwen, Together AI, OpenRouter, or a custom endpoint. Legacy SAFECLAW_MISTRAL_API_KEY still works for backward compatibility.

The LLM layer is purely passive and advisory — it never makes the allow/block decision. Symbolic constraints remain the enforcement path.

Component	When	Purpose
Security Reviewer	After an action is allowed by symbolic checks	Background review for semantic risks: obfuscation, multi-step evasion chains, encoded payloads, script injection patterns
Classification Observer	When action classifier falls back to generic 'Action'	Suggests improved ontology classifications for unknown tools — logged to classification_suggestions.jsonl
Decision Explainer	When a constraint violation occurs	Generates plain-English explanations of why an action was blocked

The Security Reviewer specifically watches for:

Base64/hex-encoded payloads hiding malicious commands
Multi-step evasion chains (e.g., write script then execute)
URL-based payload delivery (curl | sh patterns)
Flag reordering and command aliases
Environment variable manipulation

Context Injection

Before each prompt is built, SafeClaw injects governance context into the agent's system prompt via the before_prompt_build hook. This makes the LLM aware of its constraints before it proposes an action.

Injected Context Includes

Active user preferences — autonomy level, confirmation requirements, forbidden paths
Active domain policies — summarized constraints with reasoning
Recent violations — last 5 blocked actions and reasons, so the agent avoids retrying the same approach
Session history — last 10 actions with outcomes, files modified, and violation summary
Agent role info — role name, autonomy level, denied action classes

This context injection improves agent self-regulation: the LLM learns what's prohibited and adjusts its behavior, reducing the number of blocked actions over the course of a session.

State Persistence

SafeClaw persists critical governance state in SQLite at ~/.safeclaw/governance_state.db through the StateStore class. In-memory structures remain the fast path, but restart-sensitive state is durable.

Agent kills — killed agents stay killed after service restarts.
Rate-limit counters — restart does not reset enforcement budgets.
Temporary permission grants — time-limited or task-scoped grants survive restarts until they expire or are revoked.

Session locks, active session context, and delegation history are intentionally ephemeral session state.
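
To make the durability claim concrete, here is a minimal sketch of a kill switch backed by SQLite. It is not the StateStore class itself: the `KillStore` name, the table name, and the schema are assumptions for illustration.

```python
import sqlite3


class KillStore:
    """Minimal durable kill-switch table (illustrative, not SafeClaw's StateStore)."""

    def __init__(self, db_path: str) -> None:
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS killed_agents (agent_id TEXT PRIMARY KEY)"
        )
        self._conn.commit()

    def kill(self, agent_id: str) -> None:
        self._conn.execute(
            "INSERT OR IGNORE INTO killed_agents (agent_id) VALUES (?)", (agent_id,)
        )
        self._conn.commit()

    def revive(self, agent_id: str) -> None:
        self._conn.execute("DELETE FROM killed_agents WHERE agent_id = ?", (agent_id,))
        self._conn.commit()

    def is_killed(self, agent_id: str) -> bool:
        row = self._conn.execute(
            "SELECT 1 FROM killed_agents WHERE agent_id = ?", (agent_id,)
        ).fetchone()
        return row is not None

    def close(self) -> None:
        self._conn.close()
```

Because every change is committed to the database file, a new `KillStore` opened on the same path after a restart still reports the agent as killed.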

Audit Trail

Every decision — allow or block — is recorded as a DecisionRecord in append-only JSONL files at ~/.safeclaw/audit/.

DecisionRecord Structure

{
  "id": "uuid",
  "timestamp": "ISO-8601",
  "session_id": "...",
  "user_id": "...",
  "agent_id": "...",
  "action": {
    "tool_name": "write_file",
    "params": { ... },
    "ontology_class": "WriteFile",
    "risk_level": "MediumRisk",
    "is_reversible": true,
    "affects_scope": "LocalOnly"
  },
  "decision": "allowed",
  "justification": {
    "constraints_checked": [
      { "constraint_uri": "...",
        "constraint_type": "...",
        "result": "satisfied",
        "reason": "..." }
    ],
    "preferences_applied": [ ... ],
    "elapsed_ms": 12.3
  }
}
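
An append-only JSONL log of this kind can be written and queried with a few lines of Python. The sketch below is hedged: the file name and location are left to the caller because the naming scheme inside the audit directory is not described here, and `append_record` and `read_blocked` are hypothetical helpers.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def append_record(audit_file: Path, record: dict) -> dict:
    """Append one record as a single JSON line; existing lines are never rewritten."""
    full = {
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **record,
    }
    audit_file.parent.mkdir(parents=True, exist_ok=True)
    with audit_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(full) + "\n")
    return full


def read_blocked(audit_file: Path) -> list[dict]:
    """Return all records whose decision is 'blocked'."""
    with audit_file.open(encoding="utf-8") as fh:
        return [rec for rec in map(json.loads, fh) if rec.get("decision") == "blocked"]
```

One JSON object per line keeps appends cheap and lets a reader filter records without loading a single large JSON document.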

The dashboard at /admin provides a web interface for browsing audit logs, viewing decision history, and managing the system.

Message Governance

Outbound messages are governed by the message_sending hook with three checks:

1. Never-Contact List

Recipients on the never-contact list (configured via neverContactList preference) are unconditionally blocked.

2. Sensitive Data Detection

Message content is scanned against credential and secret patterns:

Pattern	Detects
Base64 strings (40+ chars)	Encoded secrets
`api_key\|secret_key\|access_token\|auth_token`	API keys and tokens
`password\|passwd\|pwd`	Passwords
`ghp_\|gho_\|ghu_\|ghs_\|ghr_`	GitHub tokens
`github_pat_`	GitHub fine-grained personal access tokens
`sk-...`	OpenAI / Stripe secret keys
`AKIA...`	AWS Access Key IDs
`-----BEGIN PRIVATE KEY-----`	PEM private keys
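
A scanner along these lines can be approximated with regular expressions. The patterns below are hypothetical stand-ins written for this example, not SafeClaw's actual rules. The `sk-...` entry is left out because its exact format is elided in the table, and the AWS pattern relies on the common 20-character `AKIA` key ID format.

```python
import re

# Hypothetical regexes approximating the table above; not SafeClaw's actual patterns.
SENSITIVE_PATTERNS = {
    "base64_blob": re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),
    "api_key_or_token": re.compile(r"(?i)\b(api_key|secret_key|access_token|auth_token)\b"),
    "password": re.compile(r"(?i)\b(password|passwd|pwd)\b"),
    "github_token": re.compile(r"\b(ghp_|gho_|ghu_|ghs_|ghr_|github_pat_)\w+"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "pem_private_key": re.compile(r"-----BEGIN PRIVATE KEY-----"),
}


def scan_message(text: str) -> list[str]:
    """Return the names of all patterns found in the message, in table order."""
    return [name for name, rx in SENSITIVE_PATTERNS.items() if rx.search(text)]


print(scan_message("password: hunter2"))  # ['password']
print(scan_message("see you at noon"))    # []
```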

3. Rate Limiting

Default: 50 messages per session per hour.
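
A per-session hourly limit can be sketched as a sliding window over send times. This is an in-memory illustration only. Whether SafeClaw uses a sliding or a fixed window is not stated here, and `MessageRateLimiter` is a hypothetical name.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class MessageRateLimiter:
    """Sliding-window limiter: at most `limit` messages per session per window."""

    def __init__(self, limit: int = 50, window_s: float = 3600.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self._sent: dict[str, deque] = defaultdict(deque)

    def allow(self, session_id: str, now: Optional[float] = None) -> bool:
        """Record and allow the message, or return False if the budget is used up."""
        now = time.monotonic() if now is None else now
        sent = self._sent[session_id]
        while sent and now - sent[0] >= self.window_s:
            sent.popleft()  # forget sends that fell out of the window
        if len(sent) >= self.limit:
            return False
        sent.append(now)
        return True


limiter = MessageRateLimiter(limit=2, window_s=10)
print([limiter.allow("s1", now=t) for t in (0, 1, 5, 10)])  # [True, True, False, True]
```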

Inbound Message Risk

Inbound messages are evaluated by /evaluate/inbound-message for prompt-injection risk. SafeClaw combines channel trust levels from safeclaw-channels.ttl with content-pattern detection and sender metadata, returning a risk level plus flags and warnings.

Error Handling

The SafeClaw service returns structured error responses with machine-readable codes and human-readable hints for every failure:

{
  "error": "ENGINE_NOT_READY",
  "detail": "Engine not initialized — the service is still starting up.",
  "hint": "Wait a moment and retry, or check service logs."
}

Error Codes

Code	HTTP	Meaning
`ENGINE_NOT_READY`	503	Service is starting up, engine not yet initialized
`INTERNAL_ERROR`	500	Unhandled exception — check service logs
`INVALID_REQUEST`	400	Malformed request body or missing fields

Plugin Error Handling

The TypeScript plugin (v0.1.3+) parses structured errors and provides context-specific warnings:

Timeout — logs timeout duration and service URL
Connection refused — suggests checking if the service is running
HTTP errors — parses and displays the detail and hint fields from the response
Fail-closed blocks — include the service URL in the block reason for easier debugging
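
The handling of HTTP errors can be sketched as a function that turns a structured error body into one warning line. The plugin itself is TypeScript, so this Python version is only an illustration: `describe_error` and the wording of its messages are assumptions.

```python
import json


def describe_error(status: int, body: str, service_url: str) -> str:
    """Turn an HTTP error response into a single warning line."""
    try:
        payload = json.loads(body)
    except ValueError:  # body was not JSON at all
        payload = None
    if not isinstance(payload, dict):
        return f"SafeClaw service at {service_url} returned HTTP {status}"
    code = payload.get("error", "UNKNOWN")
    detail = payload.get("detail", "")
    hint = payload.get("hint", "")
    message = f"[{code}] HTTP {status} from {service_url}: {detail}"
    return f"{message} Hint: {hint}" if hint else message


body = '{"error": "ENGINE_NOT_READY", "detail": "Engine not initialized.", "hint": "Retry."}'
print(describe_error(503, body, "http://localhost:8420/api/v1"))
```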

CLI & TUI

SafeClaw provides CLI commands and an interactive terminal UI for managing the service, diagnosing issues, and controlling OpenClaw.

safeclaw tui

Opens the interactive settings TUI. The Status tab shows live health for both SafeClaw and the OpenClaw daemon:

  Service      ● Connected (localhost:8420)
  OpenClaw     ● Running
  Enforcement  enforce
  Fail Mode    closed
  Enabled      ON

  Service v0.1.0
  Last check: 14:32:05

  Press r to restart OpenClaw daemon

The status auto-refreshes every 10 seconds. Press r to restart the OpenClaw daemon directly from the TUI.

safeclaw connect

Links your local agent to the SafeClaw service by writing your API key to ~/.safeclaw/config.json:

$ safeclaw connect sc_abc123...
Connected! Your API key has been saved to ~/.safeclaw/config.json

The config file is created with 0600 permissions (owner read/write only). If a config file already exists, only the remote.apiKey and remote.serviceUrl fields are updated — other settings are preserved.
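
The merge-and-restrict behavior described above might look roughly like the following sketch. It is not the CLI's implementation: `save_connection` is a hypothetical helper, and the permission handling assumes a POSIX filesystem.

```python
import json
import os
from pathlib import Path


def save_connection(config_path: Path, api_key: str, service_url: str) -> dict:
    """Update only remote.apiKey and remote.serviceUrl, keeping other settings."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    remote = config.setdefault("remote", {})
    remote["apiKey"] = api_key
    remote["serviceUrl"] = service_url

    config_path.parent.mkdir(parents=True, exist_ok=True)
    # Creating the file with mode 0o600 makes it owner read/write only.
    fd = os.open(config_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)
    os.chmod(config_path, 0o600)  # also tighten a file that already existed
    return config
```

Opening the file with the restrictive mode from the start avoids a window in which a freshly created file holding the key is readable by other users.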

By default the service URL is set to https://api.safeclaw.eu/api/v1. To connect to a self-hosted instance:

$ safeclaw connect sc_abc123... --service-url http://localhost:8420/api/v1

safeclaw restart-openclaw

Restarts the OpenClaw daemon from the command line without opening the TUI:

$ safeclaw restart-openclaw
OpenClaw daemon restarted successfully.

safeclaw status check

Pings the running service and displays component-level health:

$ safeclaw status check
Service: ok
Version: 0.1.0
Engine ready: True
Uptime: 1234s

┌─────────────────┬────────────────────────┐
│ Component       │ Detail                 │
├─────────────────┼────────────────────────┤
│ Knowledge Graph │ 847 triples            │
│ LLM             │ not configured         │
│ Sessions        │ 3 active               │
│ Agents          │ 2 registered, 1 active │
└─────────────────┴────────────────────────┘

If the service is not running, the command prints an error with a suggested fix.

safeclaw status diagnose

Runs offline checks without requiring the service to be running:

Config file at ~/.safeclaw/config.json
Ontology .ttl files present
Audit directory exists
LLM provider key set (optional)
NemoClaw policy path or sandbox environment detected when present

Each check prints OK or ISSUE with remediation hints.

safeclaw serve

Starts the SafeClaw service:

$ safeclaw serve
$ safeclaw serve --host 0.0.0.0 --port 8420 --reload

safeclaw init

Generates a default ~/.safeclaw/config.json with all sections:

$ safeclaw init --user-id myname
$ safeclaw init --user-id myname --mode remote --service-url http://localhost:8420/api/v1

safeclaw audit

View and query audit records:

$ safeclaw audit show --last 20
$ safeclaw audit show --blocked
$ safeclaw audit report <session-id> --format markdown
$ safeclaw audit stats --last 100
$ safeclaw audit compliance
$ safeclaw audit explain <audit-id>

safeclaw policy

Manage governance policies:

$ safeclaw policy list
$ safeclaw policy add NoSecrets --type prohibition --reason "Secrets must not be committed" --path-pattern ".*\.secret.*"
$ safeclaw policy remove NoSecrets
$ safeclaw policy add-nl "Never allow deploys on weekends"

safeclaw pref

View or set user preferences:

$ safeclaw pref show --user-id myuser
$ safeclaw pref set autonomyLevel cautious --user-id myuser
$ safeclaw pref set confirmBeforeDelete true

safeclaw llm

View LLM security findings and classification suggestions (requires a configured LLM provider):

$ safeclaw llm suggestions
$ safeclaw llm findings

Real-Time Events (SSE)

SafeClaw provides a Server-Sent Events (SSE) endpoint for real-time visibility into governance decisions. The endpoint requires admin authentication.

$ curl -N -H 'X-Admin-Password: yourpass' http://localhost:8420/api/v1/events

event: safeclaw
data: {"event_type":"blocked","severity":"warning","title":"Blocked: shell","detail":"[SafeClaw] Force push can destroy shared history","metadata":{"tool_name":"shell","ontology_class":"ForcePush"}}
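
A client can consume this stream by splitting it into events at blank lines. The parser below is a minimal sketch of the SSE framing shown above. It handles only `event:` and `data:` fields and comment lines, and `parse_sse` is a hypothetical helper that takes any iterable of text lines, so the HTTP layer is left out.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, data) pairs from an SSE line stream with JSON payloads."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line ends one event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):  # comment or keep-alive
            continue
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))


stream = [
    "event: safeclaw\n",
    'data: {"event_type":"blocked","severity":"warning"}\n',
    "\n",
]
for name, payload in parse_sse(stream):
    print(name, payload["event_type"])  # safeclaw blocked
```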

Event Types

Type	Severity	When
`blocked`	warning	A tool call was blocked by the constraint pipeline
`security_finding`	warning / critical	The LLM security reviewer flagged a concern

Event Bus Limits

Max 100 concurrent SSE subscribers
Max 100 queued events per subscriber (oldest dropped on overflow)
No external dependencies — uses asyncio.Queue internally
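
The limits above map naturally onto bounded `asyncio.Queue` objects. The sketch below shows one way to drop the oldest event on overflow. It is an assumption-laden illustration, not the service's event bus: the `EventBus` class and its method names are invented for this example.

```python
import asyncio


class EventBus:
    """Fan-out bus with a subscriber cap and bounded per-subscriber queues."""

    def __init__(self, max_subscribers: int = 100, max_queue: int = 100) -> None:
        self.max_subscribers = max_subscribers
        self.max_queue = max_queue
        self._queues: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        if len(self._queues) >= self.max_subscribers:
            raise RuntimeError("too many SSE subscribers")
        queue: asyncio.Queue = asyncio.Queue(maxsize=self.max_queue)
        self._queues.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._queues.discard(queue)

    def publish(self, event: dict) -> None:
        for queue in self._queues:
            if queue.full():
                queue.get_nowait()  # drop the oldest queued event
            queue.put_nowait(event)
```

Dropping the oldest entry keeps a slow subscriber from blocking `publish` or growing memory without bound, at the cost of that subscriber missing events.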

Admin Dashboard

The admin dashboard is available at /admin and provides a web interface for monitoring and managing SafeClaw.

Live Features

Toast notifications — real-time pop-up alerts when actions are blocked or security findings are detected, powered by the SSE event stream
Auto-refresh — home page stats update every 5 seconds via HTMX
Agent monitoring — cards showing registered and active agent counts

Audit Log Detail

Clicking 'Details' on any audit record expands to show:

Action parameters (truncated to 500 chars)
Constraint checks with type, result, and reason
Preferences applied
Session action history (last 5 entries)
Block reason for early-exit blocks (agent governance, role checks)

Dashboard Pages

Page	Purpose
Home	System health, decision stats, recent activity
Audit	Filterable audit log with detail expansion
Agents	Agent management — register, kill, revive
Settings	Configuration overview, ontology reload

User Dashboard

The user dashboard at /dashboard provides a self-service web interface for managing your SafeClaw integration. Sign in with GitHub OAuth to access it.

Authentication Flow

Click Sign In in the nav bar to start GitHub OAuth
Authorize the SafeClaw app on GitHub
First-time users are redirected to the onboarding wizard to set preferences and get their API key
Returning users go straight to /dashboard
All /dashboard/* routes are protected by Beforeware

Dashboard Pages

Page	Path	Purpose
Overview	`/dashboard`	Service health check, API key count, getting started guide
API Keys	`/dashboard/keys`	Generate and revoke API keys for service authentication
Agents	`/dashboard/agents`	View registered agents, kill/revive via service API
Preferences	`/dashboard/prefs`	Set autonomy level, confirmation rules, file limits, and AI provider keys

API Keys

API keys authenticate your agent's plugin against the SafeClaw service. Each key has a sc_ prefix and is shown only once at creation time.

Label — a human-readable name for the key
Scope — full (all endpoints) or evaluate (evaluation endpoints only)
Keys are stored hashed; the raw key cannot be recovered
Revoked keys are immediately invalidated
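
Storing only a hash of each key is what makes the raw value unrecoverable. The sketch below illustrates the idea with SHA-256. The hash algorithm, the key length, and the helper names `generate_api_key` and `verify_api_key` are assumptions, since this page specifies only the `sc_` prefix.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> tuple[str, str]:
    """Return (raw_key, stored_hash). Only the hash should be persisted."""
    raw = "sc_" + secrets.token_urlsafe(32)
    return raw, hashlib.sha256(raw.encode()).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare the presented key's hash with the stored one in constant time."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_hash)


raw_key, stored = generate_api_key()
print(raw_key.startswith("sc_"), verify_api_key(raw_key, stored))  # True True
```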

Governance Preferences

Setting	Options	Effect
Autonomy Level	`cautious / moderate / autonomous`	Controls how strictly SafeClaw enforces constraints
Confirm before delete	on / off	Require user confirmation before file deletions
Confirm before push	on / off	Require user confirmation before git push
Confirm before send	on / off	Require confirmation before sending messages
Max files per commit	number	Limit files changed in a single commit
AI provider keys	password	Provider keys for optional LLM-powered features

Self-Hosting

To run the landing site with user management locally:

cd safeclaw-landing
pip install -r requirements.txt
export GITHUB_CLIENT_ID=your_id
export GITHUB_CLIENT_SECRET=your_secret
python main.py  # starts on port 5002

The SQLite database is created automatically in ~/.safeclaw-landing/safeclaw.db.

Demonstration Flows

These walkthroughs show what happens inside SafeClaw when an agent attempts different actions. Each flow traces the request through the constraint pipeline from prompt to final decision.

Flow 1: Blocking a File Deletion

Prompt: "Delete /tmp/safeclaw-test.txt"

Tool Call sent by the agent:

{
  "tool": "delete",
  "params": { "path": "/tmp/safeclaw-test.txt" }
}

Classification: DeleteFile / CriticalRisk — the action classifier maps the delete tool to the sc:DeleteFile ontology class.

Decision: The developer role explicitly denies DeleteFile. Pipeline step 2 (Role-Based Access) blocks the call.

Response:

{
  "block": true,
  "decision": "blocked",
  "reason": "Role developer does not permit DeleteFile actions",
  "constraintStep": "role_check",
  "riskLevel": "CriticalRisk"
}

Flow 2: Blocking a Force Push

Prompt: "Push my changes with --force"

Tool Call sent by the agent:

{
  "tool": "exec",
  "params": { "command": "git push --force" }
}

Classification: ForcePush / CriticalRisk — the shell-command pattern matcher recognises git push --force and maps it to sc:ForcePush.

Decision: The developer role explicitly denies ForcePush. Pipeline step 2 (Role-Based Access) blocks the call.

Response:

{
  "block": true,
  "decision": "blocked",
  "reason": "Role developer does not permit ForcePush actions",
  "constraintStep": "role_check",
  "riskLevel": "CriticalRisk"
}

Flow 3: Allowing a Safe Read

Prompt: "Read the config file"

Tool Call sent by the agent:

{
  "tool": "read",
  "params": { "path": "./config.json" }
}

Classification: ReadFile / LowRisk — a simple read operation classified as sc:ReadFile.

Decision: All 11 pipeline steps pass. The developer role permits ReadFile, SHACL shapes validate, no policies or preferences restrict reading, and rate limits are within bounds.

Response:

{
  "block": false,
  "decision": "allowed",
  "reason": "All constraints satisfied",
  "constraintStep": "",
  "riskLevel": "LowRisk"
}

API Reference

All endpoints are under /api/v1. Admin endpoints require the X-Admin-Password header or API key authentication.

Evaluation & Context

Method	Path	Purpose
POST	`/evaluate/tool-call`	Main constraint gate — validates tool calls; supports dryRun
POST	`/evaluate/message`	Message governance (content, recipients)
POST	`/evaluate/inbound-message`	Inbound prompt-injection risk assessment
POST	`/evaluate/subagent-spawn`	Subagent spawn governance and delegation bypass detection
POST	`/evaluate/sandbox-policy`	Sandbox policy validation for tool and filesystem sections
POST	`/context/build`	Build governance context for agent system prompt
POST	`/session/start`	Initialize session-scoped governance state
POST	`/session/end`	Clean up per-session state

Recording & Logging

Method	Path	Purpose
POST	`/record/tool-result`	Record action outcomes for dependency tracking
POST	`/record/subagent-ended`	Record subagent completion for audit context
POST	`/log/llm-input`	Audit log LLM input
POST	`/log/llm-output`	Audit log LLM output

Audit & Reporting (admin)

Method	Path	Purpose
GET	`/audit`	Query audit records (filters: sessionId, blocked, limit)
GET	`/audit/statistics`	Aggregate audit statistics
GET	`/audit/report/{session_id}`	Generate session report (markdown/JSON/CSV)
GET	`/audit/compliance`	Generate compliance report
GET	`/audit/{audit_id}/explain`	LLM-powered decision explanation (requires configured provider)

Ontology (admin)

Method	Path	Purpose
POST	`/reload`	Hot-reload ontologies and reinitialize checkers
GET	`/ontology/graph`	D3-compatible knowledge graph visualization data
GET	`/ontology/search`	Fuzzy search for ontology nodes

Preferences (admin)

Method	Path	Purpose
GET	`/preferences/{user_id}`	Get user preferences
POST	`/preferences/{user_id}`	Update user preferences (writes Turtle file)

Agent Management (admin)

Method	Path	Purpose
POST	`/agents/register`	Register a new agent with role and token
GET	`/agents`	List all registered agents with metadata
POST	`/agents/{agent_id}/kill`	Kill switch — block all actions from agent
POST	`/agents/{agent_id}/revive`	Revive a killed agent
POST	`/agents/{agent_id}/temp-grant`	Grant time-limited or task-scoped permission
DELETE	`/agents/{agent_id}/temp-grant/{grant_id}`	Revoke a temporary permission grant
POST	`/tasks/{task_id}/complete`	Mark task complete, revoke associated grants

Health & Connection

Method	Path	Purpose
GET	`/health`	Service health (version, uptime, component status)
POST	`/heartbeat`	Plugin heartbeat with agent-token verification and config drift detection
POST	`/handshake`	Validate API key and log connection event

LLM Features

Method	Path	Purpose
POST	`/policies/compile`	Compile natural language policy to Turtle (admin, requires configured provider)
GET	`/llm/findings`	Query LLM security findings
GET	`/llm/suggestions`	Get classification suggestions from observation log

Real-Time

Method	Path	Purpose
GET	`/events`	SSE stream of governance events (admin)

Configuration Reference

Plugin Environment Variables

Variable	Default	Description
`SAFECLAW_URL`	`https://api.safeclaw.eu/api/v1`	SafeClaw service URL
`SAFECLAW_API_KEY`	(none)	Bearer token for service authentication
`SAFECLAW_TIMEOUT_MS`	`5000`	HTTP timeout for service calls (ms)
`SAFECLAW_ENABLED`	`true`	Enable/disable the plugin
`SAFECLAW_ENFORCEMENT`	`enforce`	Enforcement mode
`SAFECLAW_FAIL_MODE`	`open`	Behavior on service failure
`SAFECLAW_AGENT_ID`	(none)	Agent identifier
`SAFECLAW_AGENT_TOKEN`	(none)	Agent authentication token

Service Environment Variables

Variable	Default	Description
`SAFECLAW_HOST`	`127.0.0.1`	Bind address
`SAFECLAW_PORT`	`8420`	Service port
`SAFECLAW_DATA_DIR`	`~/.safeclaw`	Data directory
`SAFECLAW_AUDIT_DIR`	`~/.safeclaw/audit`	Audit log directory
`SAFECLAW_REQUIRE_AUTH`	`false`	Require API key authentication
`SAFECLAW_LOG_LEVEL`	`INFO`	Log level
`SAFECLAW_ADMIN_PASSWORD`	(none)	Dashboard admin password
`SAFECLAW_NEMOCLAW_ENABLED`	`false`	Explicitly enable NemoClaw policy loading
`SAFECLAW_NEMOCLAW_POLICY_DIR`	(none)	Directory containing NemoClaw YAML policy files
`SAFECLAW_LLM_PROVIDER`	(none)	OpenAI-compatible provider ID, e.g. mistral, openai, gemini, groq, qwen, custom
`SAFECLAW_LLM_API_KEY`	(none)	API key for the selected provider
`SAFECLAW_LLM_MODEL`	(provider default)	Optional model override for lightweight LLM tasks
`SAFECLAW_LLM_MODEL_LARGE`	(provider default)	Optional model override for complex LLM tasks
`SAFECLAW_LLM_BASE_URL`	(provider default)	Custom OpenAI-compatible base URL when provider is custom
`SAFECLAW_LLM_TIMEOUT_MS`	`3000`	LLM call timeout
`SAFECLAW_MISTRAL_API_KEY`	(none)	Legacy Mistral key fallback
`SAFECLAW_MISTRAL_MODEL`	`mistral-small-latest`	Legacy Mistral model for fast tasks
`SAFECLAW_MISTRAL_MODEL_LARGE`	`mistral-large-latest`	Legacy Mistral model for complex tasks
`SAFECLAW_MISTRAL_TIMEOUT_MS`	`3000`	Legacy Mistral timeout
`SAFECLAW_CORS_ORIGIN_REGEX`	`https?://localhost:\d+$`	CORS allowed origin regex
`SAFECLAW_DB_PATH`	(none)	SQLite path for multi-tenant API key storage
`SAFECLAW_LLM_SECURITY_REVIEW_ENABLED`	`true`	Enable LLM security review observer
`SAFECLAW_LLM_CLASSIFICATION_OBSERVE`	`true`	Enable LLM classification observer

Config File (~/.safeclaw/config.json)

The safeclaw connect command writes a JSON config file that the plugin reads at startup. The safeclaw init command generates a full config with all sections:

{
  "enabled": true,
  "userId": "",
  "mode": "embedded | remote | hybrid",
  "remote": {
    "serviceUrl": "https://api.safeclaw.eu/api/v1",
    "apiKey": "sc_abc123...",
    "timeoutMs": 500
  },
  "hybrid": {
    "circuitBreaker": {
      "failureThreshold": 3,
      "resetTimeoutSec": 30,
      "fallbackMode": "local-only"
    }
  },
  "enforcement": {
    "mode": "enforce",
    "blockMessage": "[SafeClaw] Action blocked: {reason}",
    "maxReasonerTimeMs": 200
  },
  "contextInjection": {
    "enabled": true,
    "includePreferences": true,
    "includePolicies": true,
    "includeSessionFacts": true,
    "includeRecentViolations": true,
    "maxContextChars": 2000
  },
  "audit": {
    "enabled": true,
    "logLlmIO": true,
    "logAllowedActions": true,
    "logBlockedActions": true,
    "retentionDays": 90,
    "format": "jsonl"
  },
  "roles": {
    "defaultRole": "developer"
  },
  "agents": {
    "delegationPolicy": "configurable",
    "requireTokenAuth": true
  },
  "llm": {
    "provider": "",
    "model": "",
    "timeoutMs": 3000
  }
}

The remote section takes precedence over environment variables for SAFECLAW_API_KEY and SAFECLAW_URL. The file is created with 0600 permissions.
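
The precedence rule can be expressed as a short lookup chain. In the sketch below, `resolve_connection` is a hypothetical helper. It treats the config file's remote section as the first source, then the environment, then the documented default URL.

```python
import os
from typing import Mapping, Optional

DEFAULT_URL = "https://api.safeclaw.eu/api/v1"


def resolve_connection(
    config: dict, env: Mapping[str, str] = os.environ
) -> tuple[str, Optional[str]]:
    """Config file's remote section wins, then environment, then the default URL."""
    remote = config.get("remote", {})
    url = remote.get("serviceUrl") or env.get("SAFECLAW_URL") or DEFAULT_URL
    api_key = remote.get("apiKey") or env.get("SAFECLAW_API_KEY")
    return url, api_key


env = {"SAFECLAW_URL": "http://localhost:8420/api/v1", "SAFECLAW_API_KEY": "sc_env"}
config = {"remote": {"apiKey": "sc_cfg"}}
print(resolve_connection(config, env))  # ('http://localhost:8420/api/v1', 'sc_cfg')
```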

SaaS Onboarding

SafeClaw is available as a hosted service at safeclaw.eu. No server setup required.

1. Create an account

Click Get Started on the landing page. Sign in with your GitHub account.

2. Onboarding wizard

First-time users are guided through a three-step wizard:

Step 1 — Autonomy level — choose how much control SafeClaw has (cautious, moderate, or autonomous)
Step 2 — AI provider key (optional) — provide a key for Mistral, Gemini, Groq, Qwen, or another supported OpenAI-compatible provider to enable LLM-powered features such as semantic action classification. You can skip this step and add it later from the dashboard.
Step 3 — API key & connection — a SafeClaw API key is generated automatically. The wizard shows the safeclaw connect command to run in your terminal.

3. Connect your agent

Install the plugin and connect using the command shown in the wizard:

$ npm install -g openclaw-safeclaw-plugin
$ safeclaw connect sc_your_key_here

safeclaw connect writes the key to ~/.safeclaw/config.json and verifies the connection to https://api.safeclaw.eu/api/v1. No manual environment variables needed.

4. Manage from the dashboard

After onboarding, the dashboard at safeclaw.eu/dashboard lets you:

Create and revoke API keys
Set preferences (confirm before delete, max files per commit)
View connected agents
Add or update AI provider keys
