HARNESS Starter Kit

Turn repeated chat instructions into durable rules, checks, and memory in your repo, so coding agents stop guessing from scratch every session.

WHAT IS HARNESS ENGINEERING

Design the repository for the agent.

Harness engineering shapes a repository so coding agents work inside durable project context instead of one-off chat prompts. The target repo is always the source of truth: a harness is useful only when it reflects the repo's real architecture, commands, risks, and review habits.

Harness = Instructions + Constraints + Feedback + Memory + Evaluation + Governance

THE WORKING MODEL · 6 PARTS

Six parts of a harness

01

Instructions

Tell agents what the project is, how to work, and what to avoid.

AGENTS.md, CLAUDE.md, contribution docs

02

Constraints

Make important rules enforceable where practical.

lint, type checks, import rules, CI gates

03

Feedback

Give fast, concrete signals after a change.

tests, check_*.py, CI

04

Memory

Preserve decisions, failures, conventions, and domain context across sessions.

docs/decisions, docs/failures, docs/conventions

05

Evaluation

Measure how ready the harness is and how agents actually perform.

Harness Doctor, effectiveness reports, task outcomes

06

Governance

Keep the harness maintained without making it an automatic rewrite system.

/harness update, /harness refresh, /harness review, /harness review sub-agent, source.json
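
To make the Feedback part concrete, here is a minimal sketch of what a `check_*.py` drift check could look like. It assumes a hypothetical rule that every backticked relative path mentioned in AGENTS.md must exist in the repo; the rule, the regular expression, and the names `find_missing_paths` and `main` are illustrative and do not come from the kit.

```python
import re
from pathlib import Path

# Assumption: "paths" are backticked tokens that contain at least one slash
# and no spaces, e.g. `docs/decisions`. Commands such as `make test` are ignored.
PATH_PATTERN = re.compile(r"`([\w./-]+/[\w./-]+)`")


def find_missing_paths(agents_text: str, repo_root: Path) -> list[str]:
    """Return backticked paths from the text that do not exist under repo_root."""
    mentioned = sorted(set(PATH_PATTERN.findall(agents_text)))
    return [p for p in mentioned if not (repo_root / p).exists()]


def main(repo_root: Path = Path(".")) -> int:
    """Return 0 when AGENTS.md matches the repo, 1 otherwise (hypothetical check)."""
    agents_file = repo_root / "AGENTS.md"
    if not agents_file.exists():
        print("check_agents_paths: AGENTS.md not found")
        return 1
    text = agents_file.read_text(encoding="utf-8")
    missing = find_missing_paths(text, repo_root)
    for path in missing:
        # One line per problem keeps the failure output easy to act on.
        print(f"check_agents_paths: AGENTS.md mentions missing path: {path}")
    return 1 if missing else 0
```

Saved as, say, `check_agents_paths.py` and ended with `sys.exit(main())`, a script like this gives a local pass/fail signal that CI can reuse. A real check should encode rules taken from the target repo rather than this placeholder rule.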

USE IT AS A DECISION TABLE

When a gap appears, add just one more thing

When several rows apply, prefer enforceable constraints and feedback over more written rules. Keep instructions minimal.

Situation | Add / update | Review question
Agent doesn't know the project shape, commands, or forbidden actions | Instructions | Is it short, durable, and specific to this repo?
Agents can break a rule that tooling could catch | Constraints | Can lint, types, imports, drift, or CI enforce it?
A fast pass/fail signal is needed after a change | Feedback | Is there a local or CI check with clear failure output?
The same decision, convention, term, or failure keeps reappearing | Memory | Should it become a decision, failure, convention, or domain record?
You want evidence that agent work improved | Evaluation | Is there a comparable task outcome, not just a health score?
Guidance is stale, duplicated, untrusted, or source changed | Governance | Is this /harness refresh, /harness update, or manual review?
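
The tie-breaking advice for this table can be sketched as a small lookup. The only ordering taken from the text is that Constraints and Feedback come before Instructions; where the other three parts sit in `PREFERENCE` is an assumption made for illustration, and `pick_part` is a hypothetical helper name.

```python
from enum import Enum
from typing import Optional


class Part(Enum):
    INSTRUCTIONS = "Instructions"
    CONSTRAINTS = "Constraints"
    FEEDBACK = "Feedback"
    MEMORY = "Memory"
    EVALUATION = "Evaluation"
    GOVERNANCE = "Governance"


# Enforceable parts first, written rules last. The relative order of
# MEMORY, EVALUATION, and GOVERNANCE is an assumption, not from the kit.
PREFERENCE = [
    Part.CONSTRAINTS,
    Part.FEEDBACK,
    Part.MEMORY,
    Part.EVALUATION,
    Part.GOVERNANCE,
    Part.INSTRUCTIONS,
]


def pick_part(applicable: set) -> Optional[Part]:
    """Pick the single part to add or update when several table rows apply."""
    for part in PREFERENCE:
        if part in applicable:
            return part
    return None  # no row applies, so add nothing
```

For example, `pick_part({Part.INSTRUCTIONS, Part.CONSTRAINTS})` returns `Part.CONSTRAINTS`, which matches the advice to enforce a rule with tooling before writing it down again.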

THE IMPROVEMENT LOOP

Five lightweight steps

  1. Map the work

     Identify the task type, expected file boundary, local commands, and likely failure modes.

  2. Add or refine

     Add or refine instructions, constraints, feedback, memory, evaluation, or governance.

  3. Measure health

     Run local checks and, when useful, Harness Doctor.

  4. Record outcomes

     Capture wrong-file edits, repeated mistakes, verification results, drift detections, and rework.

  5. Manage the next change

     Update rules, checks, and records only where the evidence justifies it.
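
The "Record outcomes" step can be sketched as a small record type plus a summary, so later changes rest on comparable numbers. This is a minimal sketch: the `TaskOutcome` fields mirror the items listed in that step, but the field names, the `summarize` helper, and the chosen metrics are assumptions, not a format defined by the kit.

```python
from dataclasses import dataclass


@dataclass
class TaskOutcome:
    """One agent task, recorded after the fact (hypothetical schema)."""
    task_id: str
    wrong_file_edits: int = 0      # edits outside the expected file boundary
    repeated_mistakes: int = 0     # mistakes already seen in earlier tasks
    verification_passed: bool = False
    drift_detections: int = 0      # drift-check failures during the task
    rework_rounds: int = 0         # extra review or fix rounds needed


def summarize(outcomes: list) -> dict:
    """Aggregate outcomes into per-task rates that can be compared over time."""
    total = len(outcomes)
    if total == 0:
        return {"tasks": 0}
    return {
        "tasks": total,
        "verification_pass_rate": sum(o.verification_passed for o in outcomes) / total,
        "wrong_file_edits_per_task": sum(o.wrong_file_edits for o in outcomes) / total,
        "repeated_mistakes_per_task": sum(o.repeated_mistakes for o in outcomes) / total,
        "drift_detections_per_task": sum(o.drift_detections for o in outcomes) / total,
        "rework_rounds_per_task": sum(o.rework_rounds for o in outcomes) / total,
    }
```

Comparing two summaries taken before and after a harness change gives the kind of task-outcome evidence the Evaluation part asks for, as opposed to a health score alone.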

HOW TO USE

Open your repo, then paste.

Open the target repository with a coding agent and paste the prompt below as-is. The agent clones and reads the kit, inspects your repo before editing anything, then applies only the smallest useful harness.

Adoption prompt
Use this kit to apply harness engineering to this repository:

https://github.com/harnessworks/harness-starter-kit

Clone the kit into ./harness-starter-kit if it is not already present, read it,
then apply its prompt-first harness engineering workflow to this repository.

Requirements:
- Treat the current working directory as the target repository.
- Treat ./harness-starter-kit as read-only reference material after cloning.
- Inspect this repository before editing.
- Preserve existing architecture, tools, package manager, commands, docs, and
  conventions.
- Do not blindly copy templates.
- Add only the minimum useful harness pieces.
- Prefer updating existing docs/configs over duplicating them.
- Do not overwrite or delete existing files without explaining why.
- If I ask for /harness doctor, use
  ./harness-starter-kit/commands/harness-doctor.md.
- If I ask for /harness update after adoption, use
  ./harness-starter-kit/commands/harness-update.md to refresh the kit reference,
  record .harness/source.json, and selectively update target harness files
  without blindly overwriting existing files.
- If I ask for /harness refresh after adoption, use
  ./harness-starter-kit/commands/harness-refresh.md to review existing harness
  docs, rules, knowledge records, and checks for stale or duplicated guidance.
  Do not delete, archive, move, or rename files without my explicit approval for
  the specific files.
- If I ask for /harness review sub-agent, use
  ./harness-starter-kit/commands/harness-review.md and treat the request as
  explicit permission to use a read-only reviewer subagent when available and
  permitted by the active runtime and tool instructions. If unavailable,
  blocked, not permitted, or failed, report the fallback reason.
- If I ask for /harness review, use
  ./harness-starter-kit/commands/harness-review.md to review the current change
  set from an opposing harness-engineering perspective. Report findings,
  missing checks, overreach, durable memory gaps, and follow-up recommendations
  without modifying files unless I explicitly ask you to apply fixes after the
  review.

Expected result:
- project-specific AGENTS.md or updated existing agent instructions
- knowledge store if no equivalent exists
- lightweight drift checks based on this repo's real rules
- local verification commands using existing tools
- adoption report with files changed, checks to run, assumptions, remaining
  manual steps, failure memory, effectiveness measurement plan,
  normal/focused/manual gate placement, and whether
  ./harness-starter-kit should be removed, ignored, or kept before commit