Overview

SafeClaw is a neurosymbolic governance layer for autonomous AI agents. It validates every tool call, message, and action against OWL ontologies and SHACL constraints before execution.

The system has two components:

  • openclaw-safeclaw-plugin: a TypeScript plugin that intercepts OpenClaw events and forwards them to the SafeClaw service via HTTP.
  • safeclaw-service: a Python FastAPI service that runs the constraint pipeline against a knowledge graph of ontologies, policies, and user preferences.

The plugin is a thin client (~275 lines) with no governance logic of its own. All validation happens server-side, making it easy to update policies without redeploying the agent.

How SafeClaw Controls OpenClaw

The plugin registers 6 event hooks with the OpenClaw runtime. Each hook intercepts a specific lifecycle event:

Hook | Endpoint | Purpose | Behavior
before_tool_call | /evaluate/tool-call | THE GATE: validates every tool call | Blocks execution on violation
before_agent_start | /context/build | Context injection into system prompt | Prepends governance context
message_sending | /evaluate/message | Outbound message governance | Cancels message on violation
after_tool_call | /record/tool-result | Feedback loop for dependency tracking | Fire-and-forget
llm_input | /log/llm-input | Audit logging of LLM inputs | Fire-and-forget
llm_output | /log/llm-output | Audit logging of LLM outputs | Fire-and-forget

Enforcement Modes

Mode Behavior
enforce Block actions that violate constraints (default)
warn-only Log warnings but allow execution
audit-only Log server-side only, no client-side action
disabled Completely disable SafeClaw checks

Fail Modes

Mode Behavior
open Allow actions when the service is unreachable (default). Favors availability.
closed Block on service unavailability. Safer.
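
How the enforcement mode and the fail mode combine on the client can be sketched as follows. This is only an illustration in Python (the real plugin is TypeScript); the function name should_block and its arguments are hypothetical, and the sketch assumes the fail mode only matters while the enforcement mode is enforce.

```python
from typing import Optional


def should_block(enforcement: str, fail_mode: str, violation: Optional[bool]) -> bool:
    """Decide whether the client blocks an action.

    violation is True or False when the service answered,
    and None when the service could not be reached.
    """
    if enforcement == "disabled":
        return False  # checks are skipped entirely
    if violation is None:
        # Service failure: fall back to the configured fail mode.
        # Assumption: only the enforce mode ever blocks on failure.
        return enforcement == "enforce" and fail_mode == "closed"
    # warn-only and audit-only never block on the client side.
    return enforcement == "enforce" and violation
```

With the defaults (enforce, open), a violation blocks the action, while an unreachable service lets it through.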

The Ontology

OWL (Web Ontology Language) defines a formal class hierarchy of actions your agent can take. SHACL (Shapes Constraint Language) defines structural constraints that validate the shape of each action's data.

Together, they give SafeClaw a machine-readable understanding of what each action means, how risky it is, and what constraints apply, so decisions rest on semantic reasoning rather than pattern matching alone.

At startup, SafeClaw pre-computes the full rdfs:subClassOf hierarchy via pure-Python SPARQL traversal (no Java required). This enables hierarchy-aware policy checking: blocking ShellAction automatically blocks all subclasses like GitPush and ForcePush. Custom ontology extensions inherit risk levels and constraints without Python code changes.

Action Class Hierarchy

Action
├── FileAction
│   ├── ReadFile          (Low, reversible, LocalOnly)
│   ├── WriteFile         (Medium, reversible, LocalOnly)
│   ├── EditFile          (Medium, reversible, LocalOnly)
│   ├── DeleteFile        (Critical, irreversible, LocalOnly)
│   ├── ListFiles
│   └── SearchFiles
├── ShellAction
│   ├── ExecuteCommand    (High, reversible, LocalOnly)
│   ├── GitCommit         (Medium, reversible, LocalOnly)
│   ├── GitPush           (High, irreversible, SharedState)
│   ├── ForcePush         (Critical, irreversible, SharedState)
│   ├── GitResetHard      (Critical, irreversible, LocalOnly)
│   ├── RunTests          (Low, reversible, LocalOnly)
│   ├── DockerCleanup     (High, irreversible, LocalOnly)
│   └── PackagePublish    (Critical, irreversible, ExternalWorld)
├── NetworkAction
│   ├── WebFetch          (Medium, reversible, ExternalWorld)
│   ├── WebSearch         (Low, reversible, ExternalWorld)
│   └── NetworkRequest
├── MessageAction
│   └── SendMessage       (High, irreversible, ExternalWorld)
└── BrowserAction         (Medium, reversible, ExternalWorld)
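
The hierarchy-aware blocking described above can be illustrated with a small transitive-closure sketch. It is not SafeClaw's implementation: the service traverses rdfs:subClassOf with SPARQL, whereas this example uses a hand-written edge list and plain dictionaries. The names SUBCLASS_OF, ancestors_by_class, and is_blocked are hypothetical.

```python
from collections import defaultdict

# (child, parent) pairs copied from part of the tree above; a real deployment
# would read rdfs:subClassOf triples from the ontology instead.
SUBCLASS_OF = [
    ("FileAction", "Action"),
    ("ShellAction", "Action"),
    ("DeleteFile", "FileAction"),
    ("GitPush", "ShellAction"),
    ("ForcePush", "ShellAction"),
]


def ancestors_by_class(edges):
    """Pre-compute the full set of ancestors for every class."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    closure = {}

    def visit(cls):
        if cls not in closure:
            closure[cls] = set()  # set before recursing so cycles terminate
            for parent in parents.get(cls, ()):
                closure[cls] |= {parent} | visit(parent)
        return closure[cls]

    for child, _ in edges:
        visit(child)
    return closure


ANCESTORS = ancestors_by_class(SUBCLASS_OF)


def is_blocked(action_class, blocked_classes):
    """An action is blocked if it, or any of its ancestors, is blocked."""
    lineage = {action_class} | ANCESTORS.get(action_class, set())
    return bool(lineage & set(blocked_classes))
```

Because the closure is computed once, each later check is a set intersection: is_blocked("ForcePush", {"ShellAction"}) is true without walking the tree again.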

Risk Levels

Level Meaning Examples
LowRisk Safe, read-only or easily undone ReadFile, RunTests, WebSearch
MediumRisk Modifies local state but reversible WriteFile, EditFile, GitCommit, WebFetch
HighRisk Affects shared state or hard to reverse ExecuteCommand, GitPush, SendMessage
CriticalRisk Irreversible or affects external systems DeleteFile, ForcePush, GitResetHard, PackagePublish

Scopes

Scope Meaning
LocalOnly Affects only the local filesystem
SharedState Affects shared resources (e.g., git remotes)
ExternalWorld Reaches external systems (APIs, messages, network)

The 9-Step Constraint Pipeline

Every tool call passes through these gates in order. Execution blocks at the first violation.

1. Agent Governance

Token authentication, kill switch check, delegation bypass detection.

Example block reasons:

  • Token invalid
  • Agent killed
  • Delegation bypass detected

2. Action Classification

Maps tool name + parameters to an ontology class. Assigns risk level, reversibility, and scope. Covers all common tool variants: read, write, edit, delete, remove, unlink, trash, shell, and more.

Example block reasons:

  • Unknown tool type
  • Unclassifiable action

3. Role-Based Access Control

Checks if the agent's role allows the classified action and resource path. Resource paths are extracted from multiple param key variants (file_path, path, filepath, dest, target, source, and more). Uses ontology hierarchy: denying a parent class denies all subclasses. Temporary permissions are checked first.

Example block reasons:

  • Role 'researcher' does not allow action 'WriteFile'
  • Access to /secrets/** denied

4. SHACL Validation

Validates the action's RDF graph against shape constraints from the shapes/ directory.

Example block reasons:

  • ForbiddenCommandShape violated
  • CriticalPathShape violated

5. Policy Check

Evaluates against policy rules from the knowledge graph (prohibitions, obligations). Hierarchy-aware: blocking a parent class blocks all subclasses.

Example block reasons:

  • Environment files may contain secrets
  • Force push can destroy shared history

6. Preference Check

User-specific preferences like 'confirm before delete' or 'never modify paths'.

Example block reasons:

  • User requires confirmation before delete
  • Path matches neverModifyPaths

7. Dependency Check

Validates that prerequisites are met, e.g., tests must pass before git push.

Example block reasons:

  • RunTests must succeed before GitPush

8. Temporal + Rate Limits

Time-based constraints and per-session rate limiting.

Example block reasons:

  • Action not permitted at this time
  • Rate limit exceeded: 100 actions/hour

9. Derived Rules

Combined rules from multiple constraints. May require user confirmation rather than a hard block.

Example block reasons:

  • Cumulative risk threshold exceeded
  • Transitive prohibition applies
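
The "stop at the first violation" behavior of the pipeline can be sketched as an ordered list of checks. This is a simplified illustration, not the service's code: run_pipeline, role_check, and rate_limit_check are hypothetical names, only two toy steps are shown, and the response fields (decision, reason, pipelineStep) are borrowed from the demonstration flows later in this document.

```python
from typing import Callable, Optional

# Each check returns a block reason, or None when the action passes.
Check = Callable[[dict], Optional[str]]


def run_pipeline(action: dict, steps: list) -> dict:
    """Run the checks in order and return at the first violation."""
    for number, check in enumerate(steps, start=1):
        reason = check(action)
        if reason is not None:
            return {"decision": "block", "reason": reason, "pipelineStep": number}
    return {
        "decision": "allow",
        "reason": "All constraints satisfied",
        "pipelineStep": len(steps),
    }


# Two toy checks standing in for steps of the real pipeline.
def role_check(action: dict) -> Optional[str]:
    denied = {"ForcePush", "DeleteFile", "GitResetHard"}
    if action["ontology_class"] in denied:
        return f"Role 'developer' does not allow action '{action['ontology_class']}'"
    return None


def rate_limit_check(action: dict) -> Optional[str]:
    return None  # always within limits in this sketch


STEPS = [role_check, rate_limit_check]
```

Later checks never run once an earlier one returns a reason, which is why inexpensive, decisive gates such as agent governance sit at the front of the order.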

Built-in Policies

Default prohibitions and obligations defined in the ontology:

Prohibitions

Policy Type Pattern Reason
NoEnvFiles Path .*\.env.* Environment files may contain secrets
NoCredentialFiles Path .*(credentials|secrets|tokens).* Credential files contain sensitive data
NoForcePush Command git push.*--force Force push can destroy shared history
NoRootDelete Command rm\s+-rf\s+/ Recursive deletion of root paths is prohibited
NoResetHard Command git reset --hard Hard reset can destroy uncommitted work

Obligations

Policy Action Requires Reason
TestBeforePush GitPush RunTests (must succeed first) All pushes must pass tests
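
To make the built-in policies concrete, the sketch below applies the prohibition patterns from the table with Python's re module and models the TestBeforePush obligation as a lookup against actions that have already succeeded. The function names and the use of an unanchored re.search are assumptions of this example; the service's actual matching semantics are not documented here.

```python
import re

# Patterns and reasons copied from the prohibitions table above.
PATH_PROHIBITIONS = {
    "NoEnvFiles": (r".*\.env.*", "Environment files may contain secrets"),
    "NoCredentialFiles": (
        r".*(credentials|secrets|tokens).*",
        "Credential files contain sensitive data",
    ),
}
COMMAND_PROHIBITIONS = {
    "NoForcePush": (r"git push.*--force", "Force push can destroy shared history"),
    "NoRootDelete": (r"rm\s+-rf\s+/", "Recursive deletion of root paths is prohibited"),
    "NoResetHard": (r"git reset --hard", "Hard reset can destroy uncommitted work"),
}

# Obligation from the table above: GitPush requires a prior successful RunTests.
OBLIGATIONS = {"GitPush": "RunTests"}


def first_violation(value, prohibitions):
    """Return (policy_name, reason) for the first matching pattern, else None."""
    for name, (pattern, reason) in prohibitions.items():
        if re.search(pattern, value):  # assumption: unanchored search
            return name, reason
    return None


def unmet_obligation(action_class, succeeded_actions):
    """Return a block reason if a required earlier action has not succeeded."""
    required = OBLIGATIONS.get(action_class)
    if required and required not in succeeded_actions:
        return f"{required} must succeed before {action_class}"
    return None
```

Under this unanchored reading, NoRootDelete also matches commands such as rm -rf /tmp/build, because the pattern only requires a path that starts with a slash.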

Roles & Permissions

SafeClaw ships with three roles. Custom roles can be defined in Turtle files.

Role | Autonomy | Enforcement | Allowed Actions | Denied Actions | Denied Paths
Admin | full | warn-only | All | None | None
Developer | moderate | enforce | All (except denied) | ForcePush, DeleteFile, GitResetHard | /secrets/**, /etc/**
Researcher | supervised | enforce | ReadFile, ListFiles, SearchFiles | WriteFile, EditFile, DeleteFile, GitPush, ForcePush, ShellAction, SendMessage | N/A

User Preferences

Per-user preferences are stored as RDF triples and loaded from ~/.safeclaw/ or the ontology's users/ directory.

Safety Preferences

Property Type Default Effect
autonomyLevel string moderate Controls agent independence: full | high | moderate | cautious | supervised
confirmBeforeDelete boolean true Require confirmation before any delete action
confirmBeforePush boolean true Require confirmation before git push
confirmBeforeSend boolean true Require confirmation before sending messages
neverModifyPaths string (none) Glob patterns for paths the agent must never modify

Coding Preferences

Property Type Effect
preferredLanguage string Programming language preference for agent suggestions
preferredTestFramework string Testing framework preference
maxFilesPerCommit integer Maximum files allowed per commit

Communication Preferences

Property Type Effect
toneOfVoice string Communication tone/style preference
maxMessageLength integer Maximum message length constraint
neverContactList string Email/contact patterns to never contact

How Mistral AI Enhances SafeClaw

SafeClaw includes an optional LLM layer powered by Mistral AI. It is activated by setting SAFECLAW_MISTRAL_API_KEY. This layer is purely passive and advisory: it never blocks the constraint pipeline.

Component When Purpose
Security Reviewer After an action is allowed by symbolic checks Background review for semantic risks: obfuscation, multi-step evasion chains, encoded payloads, script injection patterns
Classification Observer When action classifier falls back to generic 'Action' Suggests improved ontology classifications for unknown tools; suggestions are logged to classification_suggestions.jsonl
Decision Explainer When a constraint violation occurs Generates plain-English explanations of why an action was blocked

The Security Reviewer specifically watches for:

  • Base64/hex-encoded payloads hiding malicious commands
  • Multi-step evasion chains (e.g., write script then execute)
  • URL-based payload delivery (curl | sh patterns)
  • Flag reordering and command aliases
  • Environment variable manipulation

Context Injection

Before each agent session starts, SafeClaw injects governance context into the agent's system prompt via the before_agent_start hook. This gives the LLM awareness of its constraints before it even proposes an action.

Injected Context Includes

  • Active user preferences: autonomy level, confirmation requirements, forbidden paths
  • Active domain policies: summarized constraints with reasoning
  • Recent violations: last 5 blocked actions and reasons, so the agent avoids retrying the same approach
  • Session history: last 10 actions with outcomes, files modified, and violation summary
  • Agent role info: role name, autonomy level, denied action classes

This context injection improves agent self-regulation: the LLM learns what's prohibited and adjusts its behavior, reducing the number of blocked actions over the course of a session.
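
A rough sketch of how such a context block might be assembled is shown below. The function name build_context and the line format are hypothetical; the five-violation window comes from the list above, and the 2000-character cap mirrors the maxContextChars setting shown in the configuration reference.

```python
def build_context(role, preferences, policies, violations, max_chars=2000):
    """Assemble a plain-text governance summary for the system prompt."""
    lines = [f"Role: {role}"]
    lines += [f"Preference: {key} = {value}" for key, value in preferences.items()]
    lines += [f"Policy: {policy}" for policy in policies]
    # Only the five most recent violations are included.
    lines += [f"Recently blocked: {violation}" for violation in violations[-5:]]
    # Hard cap so the injected context cannot crowd out the rest of the prompt.
    return "\n".join(lines)[:max_chars]
```

The returned string would then be prepended to the agent's system prompt by the before_agent_start hook.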

Audit Trail

Every decision, allow or block, is recorded as a DecisionRecord in append-only JSONL files at ~/.safeclaw/audit/.

DecisionRecord Structure

{
  "id": "uuid",
  "timestamp": "ISO-8601",
  "session_id": "...",
  "user_id": "...",
  "agent_id": "...",
  "action": {
    "tool_name": "write_file",
    "params": { ... },
    "ontology_class": "WriteFile",
    "risk_level": "MediumRisk",
    "is_reversible": true,
    "affects_scope": "LocalOnly"
  },
  "decision": "allowed",
  "justification": {
    "constraints_checked": [
      { "constraint_uri": "...",
        "constraint_type": "...",
        "result": "satisfied",
        "reason": "..." }
    ],
    "preferences_applied": [ ... ],
    "elapsed_ms": 12.3
  }
}
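
Appending records in this shape to a JSONL file can be sketched as follows. The helper append_decision is hypothetical, it writes only a subset of the fields above for brevity, and the one-file-per-day naming is an assumption of this example rather than a documented layout.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def append_decision(audit_dir, action, decision, constraints_checked, elapsed_ms):
    """Append one DecisionRecord as a single JSON line and return it."""
    record = {
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "decision": decision,
        "justification": {
            "constraints_checked": constraints_checked,
            "elapsed_ms": elapsed_ms,
        },
    }
    audit_dir = Path(audit_dir)
    audit_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: one file per UTC day, named after the date.
    path = audit_dir / f"{record['timestamp'][:10]}.jsonl"
    # Mode "a" only ever adds lines; existing records are never rewritten.
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record
```

Because each record is one line, the log can be tailed, filtered line by line, or loaded lazily without parsing the whole file.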

The dashboard at /admin provides a web interface for browsing audit logs, viewing decision history, and managing the system.

Message Governance

Outbound messages are governed by the message_sending hook with three checks:

1. Never-Contact List

Recipients on the never-contact list (configured via neverContactList preference) are unconditionally blocked.

2. Sensitive Data Detection

Message content is scanned against 7 regex patterns:

Pattern Detects
Base64 strings (40+ chars) Encoded secrets
api_key|secret_key|access_token|auth_token API keys and tokens
password|passwd|pwd Passwords
ghp_|gho_|ghu_|ghs_|ghr_ GitHub tokens
sk-... OpenAI / Stripe secret keys
AKIA... AWS Access Key IDs
-----BEGIN PRIVATE KEY----- PEM private keys
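
A minimal scanner built from this table might look like the sketch below. Only the four patterns spelled out in full are reproduced; the Base64, sk-... and AKIA... rows are abbreviated in the table, so they are left out rather than guessed. Case-insensitive matching and the name detect_sensitive are assumptions of this example.

```python
import re

# Patterns copied verbatim from the table above (abbreviated rows omitted).
SENSITIVE_PATTERNS = {
    "API keys and tokens": r"api_key|secret_key|access_token|auth_token",
    "Passwords": r"password|passwd|pwd",
    "GitHub tokens": r"ghp_|gho_|ghu_|ghs_|ghr_",
    "PEM private keys": r"-----BEGIN PRIVATE KEY-----",
}


def detect_sensitive(message):
    """Return the labels of every pattern that matches the message text."""
    return [
        label
        for label, pattern in SENSITIVE_PATTERNS.items()
        if re.search(pattern, message, re.IGNORECASE)  # assumption: ignore case
    ]
```

A non-empty result would cause the message_sending hook to cancel the message when enforcement is active.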

3. Rate Limiting

Default: 50 messages per session per hour.
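
One common way to implement a per-session hourly limit is a sliding window of timestamps, sketched below. Whether SafeClaw uses a sliding or a fixed window is not stated, so treat the class MessageRateLimiter as an illustrative assumption.

```python
import time
from collections import defaultdict, deque


class MessageRateLimiter:
    """Sliding-window limiter: at most `limit` messages per window per session."""

    def __init__(self, limit=50, window_seconds=3600):
        self.limit = limit
        self.window = window_seconds
        self.sent = defaultdict(deque)  # session_id -> timestamps of sent messages

    def allow(self, session_id, now=None):
        now = time.monotonic() if now is None else now
        timestamps = self.sent[session_id]
        # Forget messages that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True
```

The optional now argument exists only to make the limiter easy to test with fixed timestamps.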

Error Handling

The SafeClaw service returns structured error responses with machine-readable codes and human-readable hints for every failure:

{
  "error": "ENGINE_NOT_READY",
  "detail": "Engine not initialized - the service is still starting up.",
  "hint": "Wait a moment and retry, or check service logs."
}

Error Codes

Code HTTP Meaning
ENGINE_NOT_READY 503 Service is starting up, engine not yet initialized
INTERNAL_ERROR 500 Unhandled exception; check service logs
INVALID_REQUEST 400 Malformed request body or missing fields
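
A client can use the structured fields to build a log message and decide whether a retry is worthwhile. The sketch below is illustrative only: describe_error is a hypothetical helper, and treating ENGINE_NOT_READY as the only retryable code is an assumption based on the table above.

```python
import json

RETRYABLE = {"ENGINE_NOT_READY"}  # 503: the service is still starting up


def describe_error(status, body):
    """Turn an error response body into (log_message, should_retry)."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        payload = None
    if not isinstance(payload, dict):
        return f"HTTP {status}: unparseable error body", False
    code = payload.get("error", "UNKNOWN")
    message = f"HTTP {status} {code}: {payload.get('detail', '')}"
    if payload.get("hint"):
        message += f" (hint: {payload['hint']})"
    return message, code in RETRYABLE
```

Falling back to a generic message keeps the client robust when a proxy or load balancer returns a non-JSON error page.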

Plugin Error Handling

The TypeScript plugin (v0.1.3+) parses structured errors and provides context-specific warnings:

  • Timeout: logs timeout duration and service URL
  • Connection refused: suggests checking if the service is running
  • HTTP errors: parses and displays the detail and hint fields from the response
  • Fail-closed blocks: include the service URL in the block reason for easier debugging

CLI & TUI

SafeClaw provides CLI commands and an interactive terminal UI for managing the service, diagnosing issues, and controlling OpenClaw.

safeclaw tui

Opens the interactive settings TUI. The Status tab shows live health for both SafeClaw and the OpenClaw daemon:

  Service      ● Connected (localhost:8420)
  OpenClaw     ● Running
  Enforcement  enforce
  Fail Mode    closed
  Enabled      ON

  Service v0.1.0
  Last check: 14:32:05

  Press r to restart OpenClaw daemon

The status auto-refreshes every 10 seconds. Press r to restart the OpenClaw daemon directly from the TUI.

safeclaw connect

Links your local agent to the SafeClaw service by writing your API key to ~/.safeclaw/config.json:

$ safeclaw connect sc_abc123...
Connected! Your API key has been saved to ~/.safeclaw/config.json

The config file is created with 0600 permissions (owner read/write only). If a config file already exists, only the remote.apiKey and remote.serviceUrl fields are updated; other settings are preserved.

By default the service URL is set to https://api.safeclaw.eu/api/v1. To connect to a self-hosted instance:

$ safeclaw connect sc_abc123... --service-url http://localhost:8420/api/v1
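
The merge-and-restrict behavior of safeclaw connect can be pictured with the sketch below. It is an illustration of the documented behavior, not the CLI's source; save_connection is a hypothetical name.

```python
import json
import os
from pathlib import Path

DEFAULT_URL = "https://api.safeclaw.eu/api/v1"


def save_connection(config_path, api_key, service_url=DEFAULT_URL):
    """Update remote.apiKey and remote.serviceUrl, keeping all other settings."""
    path = Path(config_path).expanduser()
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    remote = config.setdefault("remote", {})
    remote["apiKey"] = api_key
    remote["serviceUrl"] = service_url
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with owner-only permissions (0600) from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
    os.chmod(path, 0o600)  # also tighten a file that already existed
    return config
```

Passing the mode to os.open avoids a window in which a freshly created file holding the key is readable by other users.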

safeclaw restart-openclaw

Restarts the OpenClaw daemon from the command line without opening the TUI:

$ safeclaw restart-openclaw
OpenClaw daemon restarted successfully.

safeclaw status check

Pings the running service and displays component-level health:

$ safeclaw status check
Service: ok
Version: 0.1.0
Engine ready: True
Uptime: 1234s

┌─────────────────┬────────────────────────┐
│ Component       │ Detail                 │
├─────────────────┼────────────────────────┤
│ Knowledge Graph │ 847 triples            │
│ LLM             │ not configured         │
│ Sessions        │ 3 active               │
│ Agents          │ 2 registered, 1 active │
└─────────────────┴────────────────────────┘

If the service is not running, it shows a clear error with the suggested fix.

safeclaw status diagnose

Runs offline checks without requiring the service to be running:

  • Config file at ~/.safeclaw/config.json
  • Ontology .ttl files present
  • Audit directory exists
  • Mistral API key set (optional)

Each check prints OK or ISSUE with remediation hints.

safeclaw serve

Starts the SafeClaw service:

$ safeclaw serve
$ safeclaw serve --host 0.0.0.0 --port 8420 --reload

safeclaw init

Generates a default ~/.safeclaw/config.json with all sections:

$ safeclaw init --user-id myname
$ safeclaw init --user-id myname --mode remote --service-url http://localhost:8420/api/v1

safeclaw audit

View and query audit records:

$ safeclaw audit show --last 20
$ safeclaw audit show --blocked
$ safeclaw audit report <session-id> --format markdown
$ safeclaw audit stats --last 100
$ safeclaw audit compliance
$ safeclaw audit explain <audit-id>

safeclaw policy

Manage governance policies:

$ safeclaw policy list
$ safeclaw policy add NoSecrets --type prohibition --reason "Secrets must not be committed" --path-pattern ".*\.secret.*"
$ safeclaw policy remove NoSecrets
$ safeclaw policy add-nl "Never allow deploys on weekends"

safeclaw pref

View or set user preferences:

$ safeclaw pref show --user-id myuser
$ safeclaw pref set autonomyLevel cautious --user-id myuser
$ safeclaw pref set confirmBeforeDelete true

safeclaw llm

Manage LLM features (requires Mistral API key):

$ safeclaw llm suggestions
$ safeclaw llm findings

Real-Time Events (SSE)

SafeClaw provides a Server-Sent Events (SSE) endpoint for real-time visibility into governance decisions. The endpoint requires admin authentication.

$ curl -N -H 'X-Admin-Password: yourpass' http://localhost:8420/api/v1/events

event: safeclaw
data: {"event_type":"blocked","severity":"warning","title":"Blocked: shell","detail":"[SafeClaw] Force push can destroy shared history","metadata":{"tool_name":"shell","ontology_class":"ForcePush"}}
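
For consumers that prefer code over curl, the stream format above can be parsed with a few lines of Python. This is a minimal sketch of the standard event/data framing; parse_sse is a hypothetical helper that ignores comments and the id: and retry: fields, and it assumes every data payload is JSON, as in the sample above.

```python
import json


def parse_sse(lines):
    """Yield (event_name, payload) for each event in an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
```

Feeding it the lines of a streaming HTTP response yields one tuple per governance event, for example ("safeclaw", {...}) for the blocked force push shown above.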

Event Types

Type Severity When
blocked warning A tool call was blocked by the constraint pipeline
security_finding warning / critical The LLM security reviewer flagged a concern

Event Bus Limits

  • Max 100 concurrent SSE subscribers
  • Max 100 queued events per subscriber (oldest dropped on overflow)
  • No external dependencies: uses asyncio.Queue internally
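
These limits map naturally onto bounded asyncio.Queue instances, one per subscriber. The following is a sketch under that assumption; the EventBus class and its method names are hypothetical and do not reflect the service's internal API.

```python
import asyncio

MAX_SUBSCRIBERS = 100
MAX_QUEUED = 100


class EventBus:
    """In-process fan-out: one bounded queue per SSE subscriber."""

    def __init__(self):
        self.subscribers = set()

    def subscribe(self):
        if len(self.subscribers) >= MAX_SUBSCRIBERS:
            raise RuntimeError("too many SSE subscribers")
        queue = asyncio.Queue(maxsize=MAX_QUEUED)
        self.subscribers.add(queue)
        return queue

    def unsubscribe(self, queue):
        self.subscribers.discard(queue)

    def publish(self, event):
        for queue in self.subscribers:
            if queue.full():
                queue.get_nowait()  # drop the oldest queued event
            queue.put_nowait(event)
```

Dropping the oldest event keeps a slow subscriber from stalling publishers, at the cost of that subscriber missing some history.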

Admin Dashboard

The admin dashboard is available at /admin and provides a web interface for monitoring and managing SafeClaw.

Live Features

  • Toast notifications: real-time pop-up alerts when actions are blocked or security findings are detected, powered by the SSE event stream
  • Auto-refresh: home page stats update every 5 seconds via HTMX
  • Agent monitoring: cards showing registered and active agent counts

Audit Log Detail

Clicking 'Details' on any audit record expands to show:

  • Action parameters (truncated to 500 chars)
  • Constraint checks with type, result, and reason
  • Preferences applied
  • Session action history (last 5 entries)
  • Block reason for early-exit blocks (agent governance, role checks)

Dashboard Pages

Page Purpose
Home System health, decision stats, recent activity
Audit Filterable audit log with detail expansion
Agents Agent management: register, kill, revive
Settings Configuration overview, ontology reload

User Dashboard

The user dashboard at /dashboard provides a self-service web interface for managing your SafeClaw integration. Sign in with GitHub OAuth to access it.

Authentication Flow

  • Click Sign In in the nav bar to start GitHub OAuth
  • Authorize the SafeClaw app on GitHub
  • First-time users are redirected to the onboarding wizard to set preferences and get their API key
  • Returning users go straight to /dashboard
  • All /dashboard/* routes are protected by Beforeware

Dashboard Pages

Page Path Purpose
Overview /dashboard Service health check, API key count, getting started guide
API Keys /dashboard/keys Generate and revoke API keys for service authentication
Agents /dashboard/agents View registered agents, kill/revive via service API
Preferences /dashboard/prefs Set autonomy level, confirmation rules, file limits, Mistral API key

API Keys

API keys authenticate your agent's plugin against the SafeClaw service. Each key has a sc_ prefix and is shown only once at creation time.

  • Label: a human-readable name for the key
  • Scope: full (all endpoints) or evaluate (evaluation endpoints only)
  • Keys are stored as SHA-256 hashes; the raw key cannot be recovered
  • Revoked keys are immediately invalidated
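
Storing only a hash is what makes the raw key unrecoverable. A minimal sketch of that scheme follows; the key length, the use of secrets.token_urlsafe, and the function names are assumptions of this example, while the sc_ prefix and SHA-256 come from the description above.

```python
import hashlib
import hmac
import secrets


def generate_api_key():
    """Return (raw_key, stored_hash). Only the hash should be persisted."""
    raw_key = "sc_" + secrets.token_urlsafe(32)
    return raw_key, hashlib.sha256(raw_key.encode()).hexdigest()


def verify_api_key(presented, stored_hash):
    """Hash the presented key and compare it to the stored digest."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)  # constant-time compare
```

The raw key is returned exactly once, at creation; afterwards the service can verify a key but cannot display it again.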

Governance Preferences

Setting Options Effect
Autonomy Level conservative / moderate / autonomous Controls how strictly SafeClaw enforces constraints
Confirm before delete on / off Require user confirmation before file deletions
Confirm before push on / off Require user confirmation before git push
Confirm before send on / off Require confirmation before sending messages
Max files per commit number Limit files changed in a single commit
Mistral API key password Your personal Mistral key for LLM-powered features

Self-Hosting

To run the landing site with user management locally:

cd safeclaw-landing
pip install -r requirements.txt
export GITHUB_CLIENT_ID=your_id
export GITHUB_CLIENT_SECRET=your_secret
python main.py  # starts on port 5002

The SQLite database is created automatically in ~/.safeclaw-landing/safeclaw.db.

Demonstration Flows

These walkthroughs show what happens inside SafeClaw when an agent attempts different actions. Each flow traces the request through the constraint pipeline from prompt to final decision.

Flow 1: Blocking a File Deletion

Prompt: "Delete /tmp/safeclaw-test.txt"

Tool Call sent by the agent:

{
  "tool": "delete",
  "params": { "path": "/tmp/safeclaw-test.txt" }
}

Classification: DeleteFile / CriticalRisk. The action classifier maps the delete tool to the sc:DeleteFile ontology class.

Decision: The developer role lists DeleteFile among its denied actions. Pipeline step 3 (Role-Based Access Control) blocks the call.

Response:

{
  "decision": "block",
  "reason": "Role developer does not permit DeleteFile actions",
  "constraintId": "role-access-check",
  "riskLevel": "critical",
  "pipelineStep": 3
}

Flow 2: Blocking a Force Push

Prompt: "Push my changes with --force"

Tool Call sent by the agent:

{
  "tool": "exec",
  "params": { "command": "git push --force" }
}

Classification: ForcePush / CriticalRisk. The shell-command pattern matcher recognises git push --force and maps it to sc:ForcePush.

Decision: The developer role explicitly denies ForcePush. Pipeline step 3 (Role-Based Access Control) blocks the call.

Response:

{
  "decision": "block",
  "reason": "Role developer does not permit ForcePush actions",
  "constraintId": "role-access-check",
  "riskLevel": "critical",
  "pipelineStep": 3
}

Flow 3: Allowing a Safe Read

Prompt: "Read the config file"

Tool Call sent by the agent:

{
  "tool": "read",
  "params": { "path": "./config.json" }
}

Classification: ReadFile / LowRisk. A simple read operation classified as sc:ReadFile.

Decision: All 9 pipeline steps pass. The developer role permits ReadFile, SHACL shapes validate, no policies or preferences restrict reading, and rate limits are within bounds.

Response:

{
  "decision": "allow",
  "reason": "All constraints satisfied",
  "riskLevel": "low",
  "pipelineStep": 9
}

API Reference

All endpoints are under /api/v1. Admin endpoints require the X-Admin-Password header or API key authentication.

Evaluation & Context

Method Path Purpose
POST /evaluate/tool-call Main constraint gate: validates tool calls
POST /evaluate/message Message governance (content, recipients)
POST /context/build Build governance context for agent system prompt
POST /session/end Clean up per-session state

Recording & Logging

Method Path Purpose
POST /record/tool-result Record action outcomes for dependency tracking
POST /log/llm-input Audit log LLM input
POST /log/llm-output Audit log LLM output

Audit & Reporting (admin)

Method Path Purpose
GET /audit Query audit records (filters: sessionId, blocked, limit)
GET /audit/statistics Aggregate audit statistics
GET /audit/report/{session_id} Generate session report (markdown/JSON/CSV)
GET /audit/compliance Generate compliance report
GET /audit/{audit_id}/explain LLM-powered decision explanation (requires Mistral)

Ontology (admin)

Method Path Purpose
POST /reload Hot-reload ontologies and reinitialize checkers
GET /ontology/graph D3-compatible knowledge graph visualization data
GET /ontology/search Fuzzy search for ontology nodes

Preferences (admin)

Method Path Purpose
GET /preferences/{user_id} Get user preferences
POST /preferences/{user_id} Update user preferences (writes Turtle file)

Agent Management (admin)

Method Path Purpose
POST /agents/register Register a new agent with role and token
GET /agents List all registered agents with metadata
POST /agents/{agent_id}/kill Kill switch: block all actions from agent
POST /agents/{agent_id}/revive Revive a killed agent
POST /agents/{agent_id}/temp-grant Grant time-limited or task-scoped permission
DELETE /agents/{agent_id}/temp-grant/{grant_id} Revoke a temporary permission grant
POST /tasks/{task_id}/complete Mark task complete, revoke associated grants

Health & Connection

Method Path Purpose
GET /health Service health (version, uptime, component status)
POST /heartbeat Plugin heartbeat with config drift detection
POST /handshake Validate API key and log connection event

LLM Features

Method Path Purpose
POST /policies/compile Compile natural language policy to Turtle (admin, requires Mistral)
GET /llm/findings Query LLM security findings
GET /llm/suggestions Get classification suggestions from observation log

Real-Time

Method Path Purpose
GET /events SSE stream of governance events (admin)

Configuration Reference

Plugin Environment Variables

Variable Default Description
SAFECLAW_URL https://api.safeclaw.eu/api/v1 SafeClaw service URL
SAFECLAW_API_KEY (none) Bearer token for service authentication
SAFECLAW_TIMEOUT_MS 5000 HTTP timeout for service calls (ms)
SAFECLAW_ENABLED true Enable/disable the plugin
SAFECLAW_ENFORCEMENT enforce Enforcement mode
SAFECLAW_FAIL_MODE open Behavior on service failure
SAFECLAW_AGENT_ID (none) Agent identifier
SAFECLAW_AGENT_TOKEN (none) Agent authentication token

Service Environment Variables

Variable Default Description
SAFECLAW_HOST 127.0.0.1 Bind address
SAFECLAW_PORT 8420 Service port
SAFECLAW_DATA_DIR ~/.safeclaw Data directory
SAFECLAW_AUDIT_DIR ~/.safeclaw/audit Audit log directory
SAFECLAW_REQUIRE_AUTH false Require API key authentication
SAFECLAW_LOG_LEVEL INFO Log level
SAFECLAW_ADMIN_PASSWORD (none) Dashboard admin password
SAFECLAW_MISTRAL_API_KEY (none) Enable LLM layer (Mistral)
SAFECLAW_MISTRAL_MODEL mistral-small-latest Mistral model for fast tasks
SAFECLAW_MISTRAL_MODEL_LARGE mistral-large-latest Mistral model for complex tasks
SAFECLAW_MISTRAL_TIMEOUT_MS 3000 LLM call timeout
SAFECLAW_CORS_ORIGIN_REGEX https?://localhost:\d+$ CORS allowed origin regex
SAFECLAW_DB_PATH (none) SQLite path for multi-tenant API key storage
SAFECLAW_LLM_SECURITY_REVIEW_ENABLED true Enable LLM security review observer
SAFECLAW_LLM_CLASSIFICATION_OBSERVE true Enable LLM classification observer

Config File (~/.safeclaw/config.json)

The safeclaw connect command writes a JSON config file that the plugin reads at startup. The safeclaw init command generates a full config with all sections:

{
  "enabled": true,
  "userId": "",
  "mode": "embedded | remote | hybrid",
  "remote": {
    "serviceUrl": "https://api.safeclaw.eu/api/v1",
    "apiKey": "sc_abc123...",
    "timeoutMs": 500
  },
  "hybrid": {
    "circuitBreaker": {
      "failureThreshold": 3,
      "resetTimeoutSec": 30,
      "fallbackMode": "local-only"
    }
  },
  "enforcement": {
    "mode": "enforce",
    "blockMessage": "[SafeClaw] Action blocked: {reason}",
    "maxReasonerTimeMs": 200
  },
  "contextInjection": {
    "enabled": true,
    "includePreferences": true,
    "includePolicies": true,
    "includeSessionFacts": true,
    "includeRecentViolations": true,
    "maxContextChars": 2000
  },
  "audit": {
    "enabled": true,
    "logLlmIO": true,
    "logAllowedActions": true,
    "logBlockedActions": true,
    "retentionDays": 90,
    "format": "jsonl"
  },
  "roles": {
    "defaultRole": "developer"
  }
}

The remote section takes precedence over environment variables for SAFECLAW_API_KEY and SAFECLAW_URL. The file is created with 0600 permissions.
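
The precedence rule can be expressed as a short lookup chain: config file first, then environment variable, then the built-in default. The sketch below is illustrative; resolve_remote_settings is a hypothetical name, and treating an empty string in the config as "not set" is an assumption.

```python
import os

DEFAULT_URL = "https://api.safeclaw.eu/api/v1"


def resolve_remote_settings(config, env=None):
    """Config file's remote section wins, then env vars, then defaults."""
    env = os.environ if env is None else env
    remote = config.get("remote", {})
    return {
        "serviceUrl": remote.get("serviceUrl") or env.get("SAFECLAW_URL") or DEFAULT_URL,
        "apiKey": remote.get("apiKey") or env.get("SAFECLAW_API_KEY"),
    }
```

In practice this means a stale key in ~/.safeclaw/config.json silently overrides a fresh SAFECLAW_API_KEY, which is worth checking first when authentication fails unexpectedly.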

SaaS Onboarding

SafeClaw is available as a hosted service at safeclaw.eu. No server setup required.

1. Create an account

Click Get Started on the landing page. Sign in with your GitHub account.

2. Onboarding wizard

First-time users are guided through a three-step wizard:

  • Step 1, Autonomy level: choose how much control SafeClaw has (cautious, moderate, or autonomous)
  • Step 2, Mistral API key (optional): provide your own Mistral API key to enable LLM-powered features such as semantic action classification. You can skip this step and add it later from the dashboard.
  • Step 3, API key & connection: a SafeClaw API key is generated automatically. The wizard shows the safeclaw connect command to run in your terminal.

3. Connect your agent

Install the plugin and connect using the command shown in the wizard:

$ npm install -g openclaw-safeclaw-plugin
$ safeclaw connect sc_your_key_here

safeclaw connect writes the key to ~/.safeclaw/config.json and verifies the connection to https://api.safeclaw.eu/api/v1. No manual environment variables needed.

4. Manage from the dashboard

After onboarding, the dashboard at safeclaw.eu/dashboard lets you:

  • Create and revoke API keys
  • Set preferences (confirm before delete, max files per commit)
  • View connected agents
  • Add or update your Mistral API key