Proof Gate Patterns

A pattern library for skill authors who want to add agent-authored proof gates to their LLM agents.

RRM Academy · Code on GitHub · MIT License

What this is

A proof gate is deterministic verification code that an LLM agent writes per task and runs against its own work product, quoting the observable output as proof, with hard fix-before-pass-through semantics at a workflow boundary. The agent does not just run a pre-existing test, lint, or hook (those are operator-authored sibling patterns); it writes the specific verification snippet for the work in front of it.

This repository documents the discipline through annotated prompt fragments, three canonical pattern types (grep, quote, read), worked examples drawn from production deployment, and operator-facing background docs covering design principles, the four-step gate-design process, observed failure modes, and the necessary-vs-sufficient framing for incomplete rule sets.

It is the reproducibility artifact for the paper Proof Gates: Sycophancy-Resistant Self-Verification via Agent-Authored Postconditions (Whittaker 2026, preprint on GitHub). The paper documents what the discipline is and why it works; this repository documents how to apply it.

What you'll find here

Three pattern types. Grep gates (pattern presence or absence), quote gates (verbatim sentence as proof of a content rule), read gates (named excerpt as proof a prerequisite was consulted). Each pattern is documented with a skill-prompt specification, a proof-artifact format, what the gate catches, and what it does not catch.
Three worked examples. Verbatim agent-output transcripts showing each pattern firing in production: a code-review sibling-scan grep, a clinical-voice IVF-contrast-context quote, and a voice-profile-loaded read gate.
Four background docs. Five design principles every proof-gated agent must satisfy, the four-step operator design discipline for building a gate-set, observed failure modes (too-narrow regex, paraphrased quote, unjustified N/A, gate-of-the-gate regress), and the necessary-vs-sufficient framing for gate-set coverage when the rule set is not closeable.
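As a rough illustration of the three pattern types, here is a minimal Python sketch. It is not taken from the repository: the `Proof` record and the function names `grep_gate`, `quote_gate`, and `read_gate` are hypothetical, and the sketch assumes each gate reduces to a deterministic string check whose observable output is quoted back as evidence.

```python
import re
from dataclasses import dataclass


@dataclass
class Proof:
    """Hypothetical proof artifact that travels with the agent's response."""
    gate: str      # "grep", "quote", or "read"
    status: str    # "PASS" or "FAIL"
    evidence: str  # observable output the agent quotes as proof


def grep_gate(text: str, pattern: str, expect_present: bool) -> Proof:
    """Grep gate: check that a regex pattern is present in (or absent from) the text."""
    matches = [line for line in text.splitlines() if re.search(pattern, line)]
    ok = bool(matches) == expect_present
    evidence = "\n".join(matches) if matches else "(no matches)"
    return Proof("grep", "PASS" if ok else "FAIL", evidence)


def quote_gate(text: str, quoted: str) -> Proof:
    """Quote gate: the quoted sentence must appear verbatim in the work product."""
    ok = quoted in text
    evidence = quoted if ok else "(quote not found verbatim)"
    return Proof("quote", "PASS" if ok else "FAIL", evidence)


def read_gate(source_name: str, source_text: str, excerpt: str) -> Proof:
    """Read gate: the named excerpt must appear verbatim in the prerequisite."""
    ok = excerpt in source_text
    evidence = f"{source_name}: {excerpt}" if ok else f"{source_name}: (excerpt not found)"
    return Proof("read", "PASS" if ok else "FAIL", evidence)
```

For example, `grep_gate(diff_text, r"\beval\(", expect_present=False)` would FAIL and quote the offending lines if the work product still contains an `eval(` call. The verbatim substring check in `quote_gate` is one way to guard against the paraphrased-quote failure mode mentioned above; a real gate-set would be specified in the skill prompt rather than in a shared helper like this.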

Quick-start for skill authors

If you maintain an agent's skill or system prompt and want to add proof gates, three operator decisions get you started:

Enumerate the work-product classes. A fix, a paragraph, a database write, a multi-step plan. Each work-product class is a candidate gate trigger.
List the domain rules that bind on each class. A rule with no gate goes unverified, and a generic passing test or lint run does not count as a gate for a domain rule.
Choose a gate-class per rule. Grep for pattern presence or absence, quote for sentence-level content rules, read for prerequisite-loading rules.

Then encode three things in the skill: when the gate fires, what proof artifact must travel with the response, and what happens on FAIL or N/A. The full process is documented in the operator-discipline.md background doc.
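The decisions above can be sketched as a small data structure plus a boundary check. This is an illustrative Python sketch under stated assumptions, not the repository's format: `GateSpec`, `boundary_action`, and the action strings are hypothetical names, and letting a justified N/A pass through is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GateSpec:
    """Hypothetical record of one gate an operator encodes in a skill."""
    work_product_class: str  # e.g. "fix", "paragraph", "database write"
    rule: str                # the domain rule this gate verifies
    gate_class: str          # "grep", "quote", or "read"
    proof_artifact: str      # what must travel with the response


def boundary_action(status: str, na_justification: Optional[str] = None) -> str:
    """Decide what happens at the workflow boundary for a gate result.

    Assumption: FAIL always blocks (fix before pass-through), and N/A
    is accepted only when it carries a non-empty justification.
    """
    if status == "PASS":
        return "pass-through"
    if status == "FAIL":
        return "fix-and-rerun"
    if status == "N/A":
        justified = bool(na_justification and na_justification.strip())
        return "pass-through" if justified else "fix-and-rerun"
    raise ValueError(f"unknown gate status: {status!r}")


# Hypothetical example: one gate for the "fix" work-product class.
spec = GateSpec(
    work_product_class="fix",
    rule="no sibling call sites retain the old pattern",
    gate_class="grep",
    proof_artifact="grep command plus its quoted output",
)
```

Treating an unjustified N/A the same as a FAIL keeps the gate from being skipped silently, which is the failure mode the background docs call unjustified N/A.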

Cite the paper

If this library or the discipline it documents is useful in your work, please cite the paper:

Whittaker, B. (2026). Proof Gates: Sycophancy-Resistant
Self-Verification via Agent-Authored Postconditions.
Preprint forthcoming.

The paper develops the dual-failure-mode argument (sycophancy + overtraining/undertraining distribution skew), the fabrication-vs-gaming distinction that scopes the discipline, and the six-property positioning matrix locating proof gates among adjacent verification regimes.

Last updated June 11, 2026
