pr-review

Standalone repo, extracted from the ~/agents monorepo 2026-05-30.

claude-review <pr> — a local CLI that fetches a GitHub PR into an ephemeral git worktree, runs a headless review on the pr-review poll session with Read/Grep/Glob scoped to the worktree, and emits a structured markdown review. Optionally posts the review back to the PR or commits guarded fix-ups for nit comments.

Every review request goes through src/claude_review/claude_cli.py: run_cli drops a poll event at ~/.claude/poll/pr-review/<req_id>.md and blocks for the JSON reply, so the poll seam is exercised from exactly one file.

Status

Built. All 15 PLAN.md tasks implemented (MVP tasks 0–11 plus ambitious tasks 12–14). Tests green, ruff clean.

Runs on the pr-review poll session; no SDK dep, no ANTHROPIC_API_KEY.

Scope

In scope:

  • Parse PR spec in three forms: owner/repo#N, owner/repo/N, https://github.com/owner/repo/pull/N[...].
  • gh pr view --json + gh pr diff to fetch metadata and unified diff.
  • Ephemeral git worktree at /tmp/prs/<owner>-<repo>-<pr>-<pid>/ (or $CLAUDE_REVIEW_WORKTREE_ROOT), cleaned up on exit including SIGINT/SIGTERM/atexit.
  • Redact secrets at two seams (pre-agent-prompt and pre-GitHub-post): AWS access keys, GitHub tokens, Anthropic/OpenAI/Slack tokens, JWTs, URL userinfo, PEM private-key blocks.
  • Pre-filter the diff against the blocklist (.env*, *.pem, *.key, **/secrets.*, **/credentials*, id_rsa*, id_ed25519*, **/.aws/**, **/.ssh/**) before the prompt is built, so blocked files never reach the agent. The CLI's sandbox (tools rooted at --add-dir <worktree>) bounds reads to that dir regardless.
  • Drop a poll event scoped to the worktree (tools Read Grep Glob, rooted via add_dir); the JSON schema rides in the event frontmatter as a hint, and parse_review_payload validates the reply caller-side into Review { summary, comments: [{file, line_start, line_end, severity, body}] }.
  • Render review as markdown to stdout; optionally post as a PR review via gh api (default event COMMENT, never APPROVE).
  • Honor config-driven owner allowlist/denylist; refuse denylisted orgs; require --yes for non-allowlisted orgs.
  • --fan-out: one CLI invocation per aspect (security / logic / tests / style) dispatched in parallel via a thread pool, results merged.
  • --resume: persist the poll session's claude session id + head SHA per PR; on re-run against a force-pushed PR, replay that session (best-effort, since the poll session is one long-running conversation) so only the new diff is re-processed.
  • --apply-fixups: guarded sub-agent with Read+Edit+Write scoped to the worktree applies only nit-severity comments. Hard 100-line diff cap; over-the-cap runs are reset. Commits locally with fixup: address review nits. Never pushes.
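The diff pre-filter described above could be sketched as follows. This is a minimal illustration, not the repo's code: the name is_blocked_path comes from this document, but its body, the filter_diff helper, and the exact matching rules (basename matching for plain patterns, any-depth matching for `**/` patterns) are assumptions.

```python
import re
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Blocklist patterns as listed in this document.
BLOCKLIST = (
    ".env*", "*.pem", "*.key", "**/secrets.*", "**/credentials*",
    "id_rsa*", "id_ed25519*", "**/.aws/**", "**/.ssh/**",
)

# Header line that starts each per-file chunk of a git unified diff.
_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def is_blocked_path(path: str) -> bool:
    """Assumed semantics: plain patterns match the basename,
    '**/x' patterns match x at any depth (including the repo root)."""
    posix = PurePosixPath(path)
    anchored = "/" + posix.as_posix()  # leading slash lets '**/x' match top-level x
    for pattern in BLOCKLIST:
        if pattern.startswith("**/"):
            # fnmatch's '*' also crosses '/', so '*/x' covers any depth.
            if fnmatchcase(anchored, "*/" + pattern[3:]):
                return True
        elif fnmatchcase(posix.name, pattern):
            return True
    return False


def filter_diff(diff: str) -> str:
    """Drop every per-file chunk whose old or new path is blocklisted."""
    kept: list[str] = []
    keep = True
    for line in diff.splitlines(keepends=True):
        match = _HEADER.match(line.rstrip("\n"))
        if match:
            keep = not (is_blocked_path(match.group(1)) or is_blocked_path(match.group(2)))
        if keep:
            kept.append(line)
    return "".join(kept)
```

Because fnmatch's `*` crosses directory separators, this sketch errs toward over-blocking (for example, anything under a directory named secrets.d), which seems the safer failure mode for a secrets filter.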

Out of scope:

  • Not a GitHub App / webhook service. No server, no long-running process.
  • Not a GitHub Action. Local-first so the redactor stays in the critical path and no hosted secrets are needed.
  • Not a Slack bot. Output channels are stdout and gh api only.
  • No review history database. Output is ephemeral per invocation (session persistence is for CLI replay, not audit log).

Interface

Installed as claude-review via uv tool install . (or pipx). No subcommands; mode is chosen by flags.

Commands

| Invocation | Behavior |
| --- | --- |
| claude-review <spec> | Review, print markdown to stdout. |
| claude-review <spec> --post | Also post as a PR review via gh api. |
| claude-review <spec> --apply-fixups | Also commit guarded nit fix-ups in the worktree (never pushes). |
| claude-review <spec> --fan-out | Dispatch one CLI invocation per configured aspect in parallel, merge. |
| claude-review <spec> --resume | Reuse the persisted poll session if the PR head SHA has moved. |
| claude-review --all-open --repo <owner>/<name> | Iterate every open PR in the repo. |
| claude-review --gc | Remove /tmp/prs/* older than 1 hour and exit. |

Spec forms:

  • mark/newsfeed#42
  • mark/newsfeed/42
  • https://github.com/mark/newsfeed/pull/42
  • https://github.com/mark/newsfeed/pull/42/files
  • https://github.com/mark/newsfeed/pull/42/commits/<sha>
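One way to accept all of these spec forms is a small ordered list of regular expressions. The sketch below is illustrative only: PRSpec is named in this document, but its fields, the parse_spec helper, and the allowed owner/repo character set are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PRSpec:
    # Field names are assumed; the document only names the dataclass.
    owner: str
    repo: str
    number: int


_NAME = r"[\w.-]+"  # assumed character set for owner and repo names
_PATTERNS = (
    # owner/repo#N
    re.compile(rf"^(?P<owner>{_NAME})/(?P<repo>{_NAME})#(?P<n>\d+)$"),
    # owner/repo/N
    re.compile(rf"^(?P<owner>{_NAME})/(?P<repo>{_NAME})/(?P<n>\d+)$"),
    # https://github.com/owner/repo/pull/N plus any trailing path (/files, /commits/<sha>)
    re.compile(
        rf"^https://github\.com/(?P<owner>{_NAME})/(?P<repo>{_NAME})/pull/(?P<n>\d+)(?:/.*)?$"
    ),
)


def parse_spec(spec: str) -> PRSpec:
    """Return the PR coordinates for any supported spec form, else raise ValueError."""
    candidate = spec.strip()
    for pattern in _PATTERNS:
        match = pattern.match(candidate)
        if match:
            return PRSpec(match["owner"], match["repo"], int(match["n"]))
    raise ValueError(f"unrecognised PR spec: {spec!r}")
```

All five example forms above resolve to the same PRSpec, so the rest of a pipeline built this way never needs to know which form the user typed.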

Flags

  • --post — post the review via gh api.
  • --apply-fixups — open a guarded fix-up commit for nit comments.
  • --fan-out — parallel CLI invocations, one per aspect.
  • --resume — reuse persisted poll session on force-push.
  • --yes — bypass the interactive confirm for non-allowlisted orgs, or for --all-open on more than 5 PRs.
  • --max-prs N — cap iteration count under --all-open.
  • --gc — remove worktree residue older than 1 h and exit.
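The --fan-out flag dispatches one review per aspect in parallel and merges the results. A rough sketch of that shape is below; the fan_out helper, the dict-shaped comments, and the de-duplication step are assumptions for illustration, and review_aspect stands in for whatever actually runs a single-aspect review.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical comment shape: a dict with file, line_start, line_end, severity, body.
Comment = dict


def fan_out(
    aspects: list[str],
    review_aspect: Callable[[str], list[Comment]],
) -> list[Comment]:
    """Run one review per aspect in a thread pool and merge the comments.

    pool.map returns results in input order, so the merged list is
    deterministic even though the reviews run concurrently.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(aspects))) as pool:
        per_aspect = list(pool.map(review_aspect, aspects))

    merged: list[Comment] = []
    seen: set[tuple] = set()
    for comments in per_aspect:
        for comment in comments:
            # Assumed merge rule: drop exact duplicates raised by two aspects.
            key = (comment["file"], comment["line_start"], comment["line_end"], comment["body"])
            if key not in seen:
                seen.add(key)
                merged.append(comment)
    return merged
```

Threads fit here because each review spends its time waiting on the poll reply, not on CPU-bound work.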

Environment variables

  • CLAUDE_REVIEW_CONFIG — override config path (default ~/.config/claude-review/config.toml).
  • CLAUDE_REVIEW_WORKTREE_ROOT — override /tmp/prs worktree root.
  • CLAUDE_REVIEW_STATE_DIR — override session-record directory (default ~/.local/state/claude-review/sessions/).

The poll session uses the claude CLI's own auth (Keychain / OAuth); no ANTHROPIC_API_KEY is required. The session is launched by poll-bringup with ANTHROPIC_API_KEY unset (env -u), so the key never reaches Claude; claude_cli.py no longer spawns a subprocess and manages no env itself.

Exit codes

  • 0 — review printed, or denylisted owner skipped (not an error).
  • 2 — bad CLI usage (missing spec, missing --repo with --all-open, --all-open on >5 PRs without --yes).
  • Non-zero otherwise — poll-reply timeout, malformed reply, or the session reporting is_error.

Output contract (stdout markdown)

# Review: <owner>/<repo>#<N>

**Summary:** <one-paragraph summary>

**Counts:** blocker=X suggestion=Y nit=Z

## Comments

### `<file>`:<line_start>–<line_end> [<severity>]
<body>

...

Architecture

Single-process Python 3.11 CLI. One command, one PR per invocation (or N per --all-open).

Seams

| Seam | File | Why it matters |
| --- | --- | --- |
| Poll shim | src/claude_review/claude_cli.py | Every review poll event-drop lives here (run_cli). Swap this one file if the poll seam drifts. |
| Pre-prompt redaction | src/claude_review/redaction.py called from agent.run_review | Last chance before any PR text enters a Claude prompt. |
| Pre-post redaction | src/claude_review/redaction.py called from publisher.post_review | Defense in depth before any text lands in a public review body. |
| Read-tool path gate | redaction.is_blocked_path + CLI sandbox (--add-dir <worktree>) | The blocklist is applied to filenames extracted from the diff before the prompt is built, and the CLI's own sandbox bounds Read/Grep/Glob to the worktree. |
| Worktree teardown | src/claude_review/worktree.py via atexit + SIGINT/SIGTERM handlers | Crash residue never outlives one session. |
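The worktree teardown seam combines a context manager, an atexit hook, and signal handlers. The sketch below shows one way those pieces can fit together, using a plain directory in place of a real git worktree; ephemeral_dir and install_signal_handlers are hypothetical names, and the repo's worktree_for may be structured differently.

```python
import atexit
import os
import shutil
import signal
import sys
from contextlib import contextmanager
from functools import partial
from pathlib import Path


def install_signal_handlers() -> None:
    """Turn SIGINT/SIGTERM into SystemExit so finally blocks and atexit hooks run.

    By default SIGTERM ends the process without running either.
    Must be called from the main thread.
    """
    def _exit(signum, _frame):
        sys.exit(128 + signum)

    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, _exit)


@contextmanager
def ephemeral_dir(root: Path, name: str):
    """Create a pid-scoped directory under root and remove it on every exit path."""
    path = root / f"{name}-{os.getpid()}"
    path.mkdir(parents=True)
    # Keep a reference to the hook so this registration alone can be removed later.
    hook = partial(shutil.rmtree, path, ignore_errors=True)
    atexit.register(hook)  # backstop if the process exits without unwinding the stack
    try:
        yield path
    finally:
        hook()
        atexit.unregister(hook)
```

A real implementation would also need to run git worktree remove (or prune) against the mirror clone so git's own bookkeeping stays consistent; that part is left out here.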

Module layout

| File | Purpose |
| --- | --- |
| pr.py | PRSpec + PRDetails dataclasses; gh pr view / gh pr diff wrappers. |
| worktree.py | worktree_for context manager, mirror clone, signal-safe cleanup, gc_stale. |
| redaction.py | Secret patterns (DEFAULT_PATTERNS) + path blocklist (DEFAULT_BLOCKLIST), scrub_text, is_blocked_path. |
| config.py | TOML loader + validation for [review] / [posting] / [scopes]. |
| models.py | Severity (StrEnum), Comment, Review (frozen dataclasses). |
| prompts.py | System + sub-agent prompts; splices in docs/review-criteria.md. |
| claude_cli.py | The poll shim. Owns every review/fixup poll event-drop, the advisory JSON schemas, and the caller-side reply parse. |
| agent.py | Review / fan-out / fix-up. Thin wrapper over claude_cli. |
| sessions.py | Persist {session_id, head_sha} per PR for --resume. |
| renderer.py | Review → markdown. |
| publisher.py | gh api POST for reviews; guarded apply_fixups commit flow. |
| cli.py | click entry point; wires everything. |
| docs/review-criteria.md | Human-editable review philosophy; read at import and injected into the system prompt. |

Dependencies

External tools (runtime)

| Tool | Min version | Why |
| --- | --- | --- |
| gh CLI | 2.40 | gh pr view --json, gh pr diff, gh api for posting. |
| git | 2.38 | git worktree add semantics. |
| pr-review poll session | | Headless reviews run on the poll session (started by poll-bringup); it uses the claude CLI's auth (Keychain / OAuth). |
| Python | 3.11 | tomllib + StrEnum are stdlib in 3.11. |
| uv | any | Env / deps. |

gh auth login must be complete before first run. The claude CLI must be signed in (run claude once interactively to establish auth).

Python packages

  • click>=8.1 — CLI framework.
  • pytest>=8, pytest-mock>=3, ruff>=0.4 — dev.
  • Stdlib: tomllib, subprocess, re, fnmatch, dataclasses, pathlib, contextlib, atexit, signal, json, concurrent.futures.

No Anthropic SDK — per the monorepo-wide rule in ~/CLAUDE.md, every Claude call routes through the pr-review poll session.

Secrets

  • gh auth token — managed by gh auth login; stored in macOS Keychain or ~/.config/gh/.
  • claude CLI auth — managed by the CLI itself, established once for the poll session. ANTHROPIC_API_KEY is unset at the poll session's launch (poll-bringup runs env -u), so it never reaches Claude.

Config

File: ~/.config/claude-review/config.toml (override with CLAUDE_REVIEW_CONFIG). All keys optional; missing file → all defaults.

[review]
voice = "thorough"                                    # "thorough" | "terse"
aspects = ["security", "logic", "tests", "style"]     # subset of {security, logic, tests, style, performance}
model = "claude-opus-4-7"
max_turns = 20

[posting]
post_by_default = false                               # if true, --post is implicit
apply_fixups = false                                  # if true, --apply-fixups is implicit

[scopes]
allowlist_owners = ["mark", "my-github-handle"]       # skip interactive confirm for these
denylist_owners = ["my-employer-org"]                 # refuse to run, exit 0 with message

Validation at load: invalid voice or unknown aspect raises ValueError. Merge order: defaults → file overrides. No env-var overrides per key (only CLAUDE_REVIEW_CONFIG for the whole file).

max_turns is mapped to the CLI's --max-budget-usd via a rough heuristic (max_turns * 0.05) — the CLI exposes a cost cap rather than a turn-count cap.

Storage

| Path | Purpose | Lifecycle |
| --- | --- | --- |
| /tmp/prs/.mirror-<owner>-<repo>/ | Shared --filter=blob:none mirror clone. | Reused across runs; --gc leaves mirrors alone. |
| /tmp/prs/<owner>-<repo>-<pr>-<pid>/ | Ephemeral worktree. | Removed on exit (SIGINT/SIGTERM/atexit). --gc reaps residue >1 h old. macOS clears /tmp on reboot. |
| ~/.local/state/claude-review/sessions/<slug>.json | {session_id, head_sha, updated_at} per PR. Enables --resume. | Persists across runs. Safe to delete. Dir is 0700; file is 0600. |
| ~/.config/claude-review/config.toml | User config. | User-managed. |

No review history / audit log. No diff / PR-metadata cache — every run re-fetches via gh.

Deployment

Local CLI only. Not a server, not a GitHub Action, not a cron job.

Install in place:

cd ~/pr-review
uv tool install .

This repo's install.sh runs uv tool install --reinstall .; the binary lands in ~/.local/bin/claude-review.

Updates:

cd ~/pr-review
git pull && uv tool install --reinstall .

Why not a GitHub Action: a hosted action would need a long-lived Claude credential in every repo that uses it, and would surrender control over redaction to CI. Local-first keeps the redactor in the critical path.

Why not a GitHub App: out of scope — would require hosting, persistence, and a review queue. Local CLI is enough for Mark's use case.

Security posture

Three failure modes are treated as load-bearing (PLAN.md §Risks):

  1. Credential leakage. Two-seam redaction (pre-prompt + pre-post) plus a diff-level blocklist that drops blocked paths before they enter the prompt. Patterns cover AWS keys, GitHub / Anthropic / OpenAI / Slack tokens, JWTs, URL userinfo, and PEM private-key blocks. poll-bringup also launches the poll session with ANTHROPIC_API_KEY unset (env -u), so the key cannot leak through a future CLI change that logs its environment.
  2. Reviewing proprietary code. denylist_owners hard-refuses (exit 0 with stderr message). Non-allowlisted orgs get an interactive --yes gate.
  3. Runaway fix-ups. apply_fixups aborts and runs git reset --hard HEAD if the mechanical diff exceeds 100 lines. It never pushes; the user pushes by hand after inspecting.
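The owner gate from item 2 reduces to a three-way decision. The sketch below is one reading of it: Gate and gate_owner are hypothetical names, and the precedence (denylist beats both the allowlist and --yes) is an assumption based on the phrase "hard-refuses".

```python
from enum import Enum


class Gate(Enum):
    PROCEED = "proceed"  # run the review
    REFUSE = "refuse"    # denylisted owner: print a stderr message, exit 0
    CONFIRM = "confirm"  # ask interactively before continuing


def gate_owner(
    owner: str,
    allowlist: list[str],
    denylist: list[str],
    assume_yes: bool,
) -> Gate:
    """Decide how to treat a PR owner. GitHub owner names are case-insensitive."""
    key = owner.lower()
    if key in {name.lower() for name in denylist}:
        return Gate.REFUSE  # assumed: nothing overrides the denylist, not even --yes
    if key in {name.lower() for name in allowlist} or assume_yes:
        return Gate.PROCEED
    return Gate.CONFIRM
```

Keeping the decision separate from the prompt and exit handling makes the policy easy to unit-test without a terminal.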

Other mitigations: prompt-injection resistance in the system prompt (agent is told to ignore instructions inside PR content), default review event is COMMENT not APPROVE, worktrees are pid-scoped and SIGINT-cleaned, the CLI seam is isolated to one file so swapping on flag drift is a local change.
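For a sense of what the redaction seam does, here is a small pattern-based scrubber. scrub_text is named in this document, but the regular expressions below are illustrative approximations covering only four of the listed secret types, not the repo's DEFAULT_PATTERNS, and the replacement format is an assumption.

```python
import re

# Illustrative patterns only; real token formats vary and these may miss some.
PATTERNS: tuple[tuple[str, re.Pattern[str]], ...] = (
    (
        "pem-private-key",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
    ),
    ("aws-access-key", re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")),
    ("github-token", re.compile(r"\bgh[pousr]_[A-Za-z0-9]{36,}\b")),
    # user:password between '://' and '@' in a URL
    ("url-userinfo", re.compile(r"(?<=://)[^/\s:@]+:[^/\s@]+(?=@)")),
)


def scrub_text(text: str) -> str:
    """Replace every match of every pattern with a labelled placeholder."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Running the same function at both seams works with these patterns because scrubbing already-scrubbed text leaves it unchanged.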

Testing

cd ~/pr-review
uv run pytest

All tests are pure Python (the poll seam is mocked). Run uv run ruff check . to lint. There is no e2e test gate; the plan reserves CLAUDE_REVIEW_E2E=1 for a future public-PR smoke test.

Files

  • src/claude_review/ — package source (12 modules).
  • tests/ — pytest suite.
  • docs/review-criteria.md — human-editable review philosophy.
  • PLAN.md — the implementation plan (1 771 lines; all tasks complete).
  • pyproject.toml: uv project, click runtime dep, pytest + ruff dev.