Tripwire profile 01

A control map for agentic work that can bite back.

This opener is reserved for Samuel's own failure story. The page structure below is already built around the method: make every risky moment explicit as a trigger, an action, and an actor.

Concept

A tripwire is not topology. It is a control-transfer rule.

A topology map says what exists. A tripwire map says when normal continuation must change.

This page does not map every tool in the HAI stack. A topology view asks what exists, where it lives, and how components connect. The tripwire map asks a narrower control question: when must work change state? A tripwire has three parts: Trigger + Action + Actor. The trigger is the observable condition that makes normal continuation unsafe, unclear, or wasteful. The action is the required shift, such as stopping, escalating, switching mode, requesting review, or handing work back to the owner. The actor is the responsible party that must perform or authorize that shift. Without all three parts, a tripwire is only advice. With all three, it becomes an operating boundary: this happened, therefore this change is required, and this person or system owns the next move.

Tripwire = Trigger + Action + Actor

If one of those three is missing, the rule is still only advice. It becomes operational when the signal, response, and owner are all explicit.
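The three-part contract can be made checkable by modelling a tripwire as a record and refusing to call it operational while any part is empty. This is a minimal sketch, not part of Samuel's setup: the `Tripwire` class, its field names, and the `advice_only` example are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tripwire:
    """Hypothetical record for one rule: trigger + action + actor."""
    name: str
    trigger: Optional[str] = None  # observable condition
    action: Optional[str] = None   # required shift (stop, escalate, ask, ...)
    actor: Optional[str] = None    # who performs or authorizes the shift

    def missing_parts(self) -> list[str]:
        """Return the names of the parts that are absent or blank."""
        parts = {"trigger": self.trigger, "action": self.action, "actor": self.actor}
        return [label for label, value in parts.items()
                if not (value and value.strip())]

    def is_operational(self) -> bool:
        """A rule is operational only when all three parts are explicit."""
        return not self.missing_parts()


# Row taken from the map on this page.
ambiguity_gate = Tripwire(
    name="Ambiguity gate",
    trigger="Goal, project, component, or approach is ambiguous.",
    action="Ask Samuel instead of guessing.",
    actor="Main agent",
)

# Hypothetical rule with a trigger but no action or actor: still only advice.
advice_only = Tripwire(name="Be careful with configs",
                       trigger="Changing a config file.")

print(ambiguity_gate.is_operational())
print(advice_only.missing_parts())
```

Running the check on a draft rule shows which of the three parts still has to be written down before the rule counts as a tripwire.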

Taxonomy

Five categories keep the map readable.

The categories are not abstract governance labels. Each one names a concrete failure mode in Samuel's current profile and becomes an audit question, not a universal claim.

Permission Tripwires

Stops an agent from silently becoming an uncontrolled actor; background work must be explicit before it leaves the main session.

Example: Background-Agent-Gate

Resource Tripwires

Protects the expensive thinking layer from bulk file reading; Opus stays on strategy, orchestration, and short prompts.

Example: Opus-Token-Regel (Opus token rule)

Scope Tripwires

Blocks automatic fanout by making Conserve the default and requiring one explicit mode when deeper work is justified.

Example: Conserve-Default

Verification Tripwires

Keeps completion claims tied to evidence while preserving Samuel's separate authority over commits.

Example: no commit without approval

Communication Tripwires

Converts ambiguity about target, component, or approach into a direct question before files are touched.

Example: ask when ambiguous

The map

Source-backed tripwires from Samuel's current setup.

This is profile 01, not the final universe. Additional .claude systems can become additional profiles using the same trigger-action-actor contract.

Profile 01

Samuel's current Claude, Sidecar-NG, and Codex control surface. Draft sources include user-level Claude rules, hooks, skills, commands, Sidecar modes, Sidecar knowledge, Sidecar evolution notes, and Codex memory rules.

| Category | Tripwire | Trigger | Action | Actor |
| --- | --- | --- | --- | --- |
| Permission | Read-before-edit gate | Editing or overwriting an existing file. | Read the exact target file first; block edits when no prior read exists in recent history. | Claude/Codex plus Sidecar safety gate |
| Permission | Config-schema gate | Changing a config file or adding a config key. | Inspect the existing config format and do not invent keys that are not already represented or documented. | Implementer |
| Communication | Ambiguity gate | Goal, project, component, or approach is ambiguous. | Ask Samuel instead of guessing; do not continue on a hidden assumption. | Main agent |
| Verification | Done-means-evidence gate | Reporting that a change is complete. | Verify with tests, smoke checks, live evidence, or concrete proof before saying done. | Builder or verifier |
| Permission | Commit-approval gate | Staging, committing, or pushing changes. | Scan for sensitive artifacts, group atomic changes, run tests, show staged files and message, then wait for Samuel's explicit approval. | Commit skill/main agent |
| Permission | Destructive-command gate | rm -r, force-push, apt install/remove, or secret-like content in sensitive files. | Block or require explicit confirmation; do not expose or write secrets. | Safety gate and main agent |
| Resource | Conserve default gate | Broad research, fanout, or deeper mode is not explicitly required. | Stay narrow; activate deeper modes only explicitly and keep only one active mode. | Main agent |
| Resource | Opus-token gate | Large code context, bulk exploration, or implementation is needed while Opus is orchestrating. | Use targeted reads or a Sonnet scout/builder instead of making Opus bulk-read or bulk-code. | Main agent, Sonnet scout, Sonnet builder |
| Resource | Background-agent gate | An Agent/Subagent is spawned. | Require run_in_background: true; block non-background agent calls so the main session stays responsive. | PreToolUse hook and main agent |
| Scope | BuildMode scope-lock gate | buildmode is active. | Perform only actions directly serving the task/DoD; no research, meta-analysis, or side tasks; stop drift and return to task. | Claude in BuildMode |
| Scope | ReviewMode no-edit gate | reviewmode is active. | Read, analyze, and report findings only; no code edits, refactors, or fixes. | Claude in ReviewMode |
| Verification | VerifyMode requirement gate | A build is complete or a commit proposal is near. | Check every requirement one by one; produce evidence or explicit FAIL; do not write new code or fix during verification. | Claude in VerifyMode |
| Verification | Independent-verifier gate | The same agent that built something is also declaring it correct. | Route to a separate verifier/mode because the builder is structurally biased toward completion. | VerifyMode or sonnet-pruefer |
| Verification | Green-tests-are-not-proof gate | Tests pass, commits exist, or todos are checked, but the core requirement has not been proven. | Ask for requirement-level evidence such as a real API call, live input, integration run, or direct grep target. | Verifier |
| Communication | ResearchMode source gate | researchmode is active or a research claim is being made. | Use research tools, document sources, summarize structure, and do not edit code. | Claude in ResearchMode |
| Scope | HAI GoalBinding owner gate | HAI workflow, UI, or control surface risks mutating goals or becoming a dashboard. | Keep the surface human-owned, narrow, repo-native, and decision-focused; keep L0/L1 read-only in normal sessions and route drift to requirement review. | HAI/Codex owner surface |
| Verification | Runtime-source-of-truth gate | Docs and runtime text disagree, or exact mode wording is requested. | Check runtime sources such as hook_inject.py and mode files before trusting stale command docs. | Main agent |
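As one worked row, the Background-agent gate can be sketched as a pure decision function. This is an illustration under stated assumptions, not the actual hook: the payload shape (`tool_name`, `tool_input`), the set of agent tool names, and the function name `check_agent_call` are hypothetical. How a real PreToolUse hook receives its input and signals a block depends on the hook runtime and is not described in the sources above.

```python
# Assumption: agent spawns arrive under one of these tool names.
AGENT_TOOL_NAMES = {"Agent", "Subagent"}


def check_agent_call(payload: dict) -> tuple[bool, str]:
    """Decide whether a tool call may proceed.

    Trigger: an Agent/Subagent is spawned.
    Action:  allow only when run_in_background is exactly True.
    Actor:   the hook that runs this check (plus the main agent, which
             must re-issue the call correctly after a block).
    """
    tool_name = payload.get("tool_name", "")
    if tool_name not in AGENT_TOOL_NAMES:
        return True, "not an agent call"

    tool_input = payload.get("tool_input") or {}
    if tool_input.get("run_in_background") is True:
        return True, "background agent call"

    return False, "Background-agent gate: set run_in_background: true"


# Hypothetical payloads for illustration.
blocked = check_agent_call({"tool_name": "Agent", "tool_input": {"prompt": "scan repo"}})
allowed = check_agent_call({"tool_name": "Agent",
                            "tool_input": {"prompt": "scan repo",
                                           "run_in_background": True}})
print(blocked)
print(allowed)
```

Keeping the decision in a pure function separates the rule (trigger and action) from the wiring that enforces it, so the rule can be tested without a live session.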

The onboarding test

Could a new hire operate the control surface in a day?

This is the completeness test for the map: it measures whether control transfers to someone new, not how confident the current owner feels.

Could a new hire onboard onto this system in a day? Use that as a hard completeness check, not a morale check. "Under control" can mean the current owner has habits, private context, and recovery moves that nobody else can see. A one-day onboarding test exposes whether the control surface is legible outside the builder's head. The map passes only if someone new can answer three questions without folklore: what trigger matters, what action follows, and which actor owns it. That makes completeness observable instead of reassuring.
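The mechanical half of this test can be sketched as an audit over the map rows: every row must answer all three questions in writing. The code below is an assumption-laden illustration; the row format, the `audit_map` function, and the "Hypothetical folklore rule" entry are invented for the example. Whether the written answers are legible to a new hire still needs a human reader.

```python
REQUIRED_FIELDS = ("trigger", "action", "actor")


def audit_map(rows: list[dict]) -> dict[str, list[str]]:
    """Return {tripwire name: missing fields} for every incomplete row."""
    gaps: dict[str, list[str]] = {}
    for row in rows:
        missing = [field for field in REQUIRED_FIELDS
                   if not str(row.get(field) or "").strip()]
        if missing:
            gaps[row.get("tripwire", "<unnamed>")] = missing
    return gaps


rows = [
    # Two rows copied from the map on this page.
    {"tripwire": "ReviewMode no-edit gate",
     "trigger": "reviewmode is active.",
     "action": "Read, analyze, and report findings only.",
     "actor": "Claude in ReviewMode"},
    {"tripwire": "Config-schema gate",
     "trigger": "Changing a config file or adding a config key.",
     "action": "Inspect the existing config format.",
     "actor": "Implementer"},
    # Hypothetical row: the owner lives only in someone's head.
    {"tripwire": "Hypothetical folklore rule",
     "trigger": "Deploy on a Friday.",
     "action": "Double-check first.",
     "actor": ""},
]

gaps = audit_map(rows)
print(gaps)
```

An empty result means no row relies on an unwritten trigger, action, or actor; a non-empty result lists exactly where folklore is still doing the work.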

Soft next step

Make weak points discussable before they compound.

The map is meant to make weak points discussable, not to sell a tool. If the pattern feels familiar, /product/ describes how we close gaps together: a practical sequence for surfacing failure modes, protecting human decisions, and installing lightweight controls around the moments where agents usually drift.