pr-review

Standalone repo, extracted from the ~/agents monorepo 2026-05-30.

claude-review <spec> is a local CLI that fetches a GitHub PR into an ephemeral git worktree, runs a headless review on the pr-review poll session with Read/Grep/Glob scoped to the worktree, and emits a structured markdown review. It can optionally post the review back to the PR or commit guarded fix-ups for nit comments.

Every review request lives in src/claude_review/claude_cli.py — run_cli drops a poll event at ~/.claude/poll/pr-review/<req_id>.md and blocks for the JSON reply, so the poll seam is exercised from exactly one file.

Status

Built. All PLAN.md tasks implemented (MVP tasks 0–11 plus ambitious tasks 12–14). Tests green, ruff clean.

Runs on the pr-review poll session; no SDK dep, no ANTHROPIC_API_KEY.

Scope

In scope:
- Parse PR spec in three forms: owner/repo#N, owner/repo/N, https://github.com/owner/repo/pull/N[...].
- gh pr view --json + gh pr diff to fetch metadata and unified diff.
- Ephemeral git worktree at /tmp/prs/<owner>-<repo>-<pr>-<pid>/ (or $CLAUDE_REVIEW_WORKTREE_ROOT), cleaned up on exit including SIGINT/SIGTERM/atexit.
- Redact secrets at two seams (pre-agent-prompt and pre-GitHub-post): AWS access keys, GitHub tokens, Anthropic/OpenAI/Slack tokens, JWTs, URL userinfo, PEM private-key blocks.
- Pre-filter the diff against the blocklist (.env*, *.pem, *.key, **/secrets.*, **/credentials*, id_rsa*, id_ed25519*, **/.aws/**, **/.ssh/**) before the prompt is built, so blocked files never reach the agent. The CLI's sandbox (tools rooted at --add-dir <worktree>) bounds reads to that dir regardless.
- Drop a poll event scoped to the worktree (tools Read Grep Glob, rooted via add_dir); the JSON schema rides in the event frontmatter as a hint, and parse_review_payload validates the reply caller-side into Review { summary, comments: [{file, line_start, line_end, severity, body}] }.
- Render review as markdown to stdout; optionally post as a PR review via gh api (default event COMMENT, never APPROVE).
- Honor config-driven owner allowlist/denylist; refuse denylisted orgs; require --yes for non-allowlisted orgs.
- --fan-out: one CLI invocation per aspect (security / logic / tests / style) dispatched in parallel via a thread pool, results merged.
- --resume: persist the poll session's claude session id + head SHA per PR; on re-run against a force-pushed PR, replay that session (best-effort, since the poll session is one long-running conversation) so only the new diff is re-processed.
- --apply-fixups: guarded sub-agent with Read+Edit+Write scoped to the worktree applies only nit-severity comments. Hard 100-line diff cap; over-the-cap runs are reset. Commits locally with fixup: address review nits. Never pushes.
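The diff pre-filter in the list above can be sketched as follows. This is a minimal illustration, not the code in redaction.py: the names `BLOCKLIST`, `is_blocked` and `filter_diff` are hypothetical, and it assumes the unified diff is split on `diff --git` headers and that globs are matched with the stdlib `fnmatch` module, whose `*` also matches `/`.

```python
import fnmatch
import re
from pathlib import PurePosixPath

# Globs copied from the scope list above; the matching rules below are an assumption.
BLOCKLIST = [
    ".env*", "*.pem", "*.key", "**/secrets.*", "**/credentials*",
    "id_rsa*", "id_ed25519*", "**/.aws/**", "**/.ssh/**",
]

_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def is_blocked(path: str) -> bool:
    """True if the full path or its basename matches any blocklist glob."""
    name = PurePosixPath(path).name
    for pattern in BLOCKLIST:
        # Also try the pattern without its "**/" prefix so top-level files match.
        bare = pattern.removeprefix("**/")
        if (fnmatch.fnmatchcase(path, pattern)
                or fnmatch.fnmatchcase(path, bare)
                or fnmatch.fnmatchcase(name, bare)):
            return True
    return False


def filter_diff(diff: str) -> tuple[str, list[str]]:
    """Drop every per-file section whose path is blocked.

    Returns (kept_diff, dropped_paths).
    """
    kept: list[str] = []
    dropped: list[str] = []
    keep_section = True
    for line in diff.splitlines(keepends=True):
        match = _HEADER.match(line.rstrip("\n"))
        if match:
            old_path, new_path = match.groups()
            keep_section = not (is_blocked(old_path) or is_blocked(new_path))
            if not keep_section:
                dropped.append(new_path)
        if keep_section:
            kept.append(line)
    return "".join(kept), dropped
```

Filtering whole per-file sections (rather than individual hunks) keeps the rule simple: if either side of a rename is blocked, nothing from that file reaches the prompt.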

Out of scope:
- Not a GitHub App / webhook service. No server, no long-running process.
- Not a GitHub Action. Local-first so the redactor stays in the critical path and no hosted secrets are needed.
- Not a Slack bot. Output channels are stdout and gh api only.
- No review history database. Output is ephemeral per invocation (session persistence is for CLI replay, not audit log).

Interface

Installed as claude-review via uv tool install . (or pipx). No subcommands; mode is chosen by flags.

Commands

Invocation	Behavior
`claude-review <spec>`	Review, print markdown to stdout.
`claude-review <spec> --post`	Also post as a PR review via `gh api`.
`claude-review <spec> --apply-fixups`	Also commit guarded nit fix-ups in the worktree (never pushes).
`claude-review <spec> --fan-out`	Dispatch one CLI invocation per configured aspect in parallel, merge.
`claude-review <spec> --resume`	Reuse the persisted poll session if the PR head SHA has moved.
`claude-review --all-open --repo <owner>/<name>`	Iterate every open PR in the repo.
`claude-review --gc`	Remove `/tmp/prs/*` older than 1 hour and exit.

Spec forms:
- mark/newsfeed#42
- mark/newsfeed/42
- https://github.com/mark/newsfeed/pull/42
- https://github.com/mark/newsfeed/pull/42/files
- https://github.com/mark/newsfeed/pull/42/commits/<sha>
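One way to accept all of the spec forms above is a small list of anchored regular expressions. The sketch below is an assumption about the approach, not the parser in pr.py: the field names on `PRSpec` and the function name `parse_spec` are illustrative.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PRSpec:
    # Field names are assumed for this sketch.
    owner: str
    repo: str
    number: int


_NAME = r"[A-Za-z0-9_.-]+"
_PATTERNS = [
    # owner/repo#N
    re.compile(rf"^(?P<owner>{_NAME})/(?P<repo>{_NAME})#(?P<number>\d+)$"),
    # owner/repo/N
    re.compile(rf"^(?P<owner>{_NAME})/(?P<repo>{_NAME})/(?P<number>\d+)$"),
    # https://github.com/owner/repo/pull/N, with any trailing path, query or fragment
    re.compile(
        rf"^https://github\.com/(?P<owner>{_NAME})/(?P<repo>{_NAME})"
        rf"/pull/(?P<number>\d+)(?:[/?#].*)?$"
    ),
]


def parse_spec(text: str) -> PRSpec:
    """Return the PRSpec for any supported form, or raise ValueError."""
    candidate = text.strip()
    for pattern in _PATTERNS:
        match = pattern.match(candidate)
        if match:
            return PRSpec(match["owner"], match["repo"], int(match["number"]))
    raise ValueError(f"unrecognised PR spec: {text!r}")
```

Trailing URL segments such as /files or /commits/<sha> are ignored, so every URL form resolves to the same owner, repo and PR number.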

Flags

--post — post the review via gh api.
--apply-fixups — open a guarded fix-up commit for nit comments.
--fan-out — parallel CLI invocations, one per aspect.
--resume — reuse persisted poll session on force-push.
--yes — bypass the interactive confirm for non-allowlisted orgs, or for --all-open on more than 5 PRs.
--max-prs N — cap iteration count under --all-open.
--gc — remove worktree residue older than 1 h and exit.

Environment variables

CLAUDE_REVIEW_CONFIG — override config path (default ~/.config/claude-review/config.toml).
CLAUDE_REVIEW_WORKTREE_ROOT — override /tmp/prs worktree root.
CLAUDE_REVIEW_STATE_DIR — override session-record directory (default ~/.local/state/claude-review/sessions/).

The poll session uses the claude CLI's own auth (Keychain / OAuth); no ANTHROPIC_API_KEY is required. The session is launched by poll-bringup with ANTHROPIC_API_KEY unset (env -u), so the key never reaches Claude; claude_cli.py no longer spawns a subprocess and manages no env itself.

Exit codes

0 — review printed, or denylisted owner skipped (not an error).
2 — bad CLI usage (missing spec, missing --repo with --all-open, --all-open on >5 PRs without --yes).
Non-zero otherwise — poll-reply timeout, malformed reply, or the session reporting is_error.

Output contract (stdout markdown)

# Review: <owner>/<repo>#<N>

**Summary:** <one-paragraph summary>

**Counts:** blocker=X suggestion=Y nit=Z

## Comments

### `<file>`:<line_start>–<line_end> [<severity>]
<body>

...

Architecture

Single-process Python 3.11 CLI. One command, one PR per invocation (or N per --all-open).

Seams

Seam	File	Why it matters
poll shim	`src/claude_review/claude_cli.py`	Every review poll event-drop lives here (`run_cli`). Swap this one file if the poll seam drifts.
Pre-prompt redaction	`src/claude_review/redaction.py` called from `agent.run_review`	Last chance before any PR text enters a Claude prompt.
Pre-post redaction	`src/claude_review/redaction.py` called from `publisher.post_review`	Defense in depth before any text lands in a public review body.
Read-tool path gate	`redaction.is_blocked_path` + CLI sandbox (`--add-dir <worktree>`)	The blocklist is applied to filenames extracted from the diff before the prompt is built, and the CLI's own sandbox bounds Read/Grep/Glob to the worktree.
Worktree teardown	`src/claude_review/worktree.py` via `atexit` + SIGINT/SIGTERM handlers	Crash residue never outlives one session.
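Both redaction seams in the table above call the same scrubber. The sketch below shows the general shape only: the regular expressions are approximations of well-known token formats, not the contents of `DEFAULT_PATTERNS`, and the names `SKETCH_PATTERNS` and `scrub` are hypothetical.

```python
import re

# Approximate patterns for illustration; the real DEFAULT_PATTERNS may differ.
# Order matters: the Anthropic pattern runs before the broader OpenAI one.
SKETCH_PATTERNS: list[tuple[str, re.Pattern[str]]] = [
    ("pem-private-key", re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL)),
    ("aws-access-key", re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")),
    ("github-token", re.compile(
        r"\b(?:gh[pousr]_[A-Za-z0-9]{36,}|github_pat_[A-Za-z0-9_]{22,})")),
    ("anthropic-key", re.compile(r"\bsk-ant-[A-Za-z0-9_-]{20,}")),
    ("openai-key", re.compile(r"\bsk-[A-Za-z0-9_-]{20,}")),
    ("slack-token", re.compile(r"\bxox[abprs]-[A-Za-z0-9-]{10,}")),
    ("jwt", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
    # user:password between "://" and "@" in a URL
    ("url-userinfo", re.compile(r"(?<=://)[^\s/:@]+:[^\s/@]+(?=@)")),
]


def scrub(text: str) -> str:
    """Replace every match of every pattern with a labelled placeholder."""
    for label, pattern in SKETCH_PATTERNS:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Labelled placeholders keep the review readable: the agent (and later the PR reader) can still see that a credential was present and of what kind, without seeing its value.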

Module layout

File	Purpose
`pr.py`	`PRSpec` + `PRDetails` dataclasses; `gh pr view` / `gh pr diff` wrappers.
`worktree.py`	`worktree_for` context manager, mirror clone, signal-safe cleanup, `gc_stale`.
`redaction.py`	Secret patterns (`DEFAULT_PATTERNS`) + path blocklist (`DEFAULT_BLOCKLIST`), `scrub_text`, `is_blocked_path`.
`config.py`	TOML loader + validation for `[review] / [posting] / [scopes]`.
`models.py`	`Severity` (StrEnum), `Comment`, `Review` (frozen dataclasses).
`prompts.py`	System + sub-agent prompts; splices in `docs/review-criteria.md`.
`claude_cli.py`	The poll shim. Owns every review/fixup poll event-drop, the advisory JSON schemas, and the caller-side reply parse.
`agent.py`	Review / fan-out / fix-up. Thin wrapper over `claude_cli`.
`sessions.py`	Persist `{session_id, head_sha}` per PR for `--resume`.
`renderer.py`	`Review` → markdown.
`publisher.py`	`gh api` POST for reviews; guarded `apply_fixups` commit flow.
`cli.py`	`click` entry point; wires everything.
`docs/review-criteria.md`	Human-editable review philosophy; read at import and injected into the system prompt.
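Because the JSON schema is only advisory, the reply has to be validated on the caller's side before it becomes a `Review`. The following is a minimal sketch of that step, assuming the three severities named in the output contract (blocker, suggestion, nit); the real `parse_review_payload` and the dataclasses in models.py may be stricter or shaped differently. It uses `str` plus `Enum` rather than `StrEnum` only so the sketch also runs on interpreters older than 3.11.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Severity(str, Enum):
    BLOCKER = "blocker"
    SUGGESTION = "suggestion"
    NIT = "nit"


@dataclass(frozen=True)
class Comment:
    file: str
    line_start: int
    line_end: int
    severity: Severity
    body: str


@dataclass(frozen=True)
class Review:
    summary: str
    comments: tuple[Comment, ...]


def parse_review_payload(raw: str) -> Review:
    """Validate a JSON reply and build a Review, raising ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("summary"), str):
        raise ValueError("reply must be an object with a string 'summary'")
    if not isinstance(data.get("comments"), list):
        raise ValueError("'comments' must be a list")

    comments: list[Comment] = []
    for index, item in enumerate(data["comments"]):
        try:
            comment = Comment(
                file=str(item["file"]),
                line_start=int(item["line_start"]),
                line_end=int(item["line_end"]),
                severity=Severity(item["severity"]),  # rejects unknown severities
                body=str(item["body"]),
            )
        except (KeyError, TypeError, ValueError) as exc:
            raise ValueError(f"comment {index} is malformed: {exc!r}") from exc
        if comment.line_end < comment.line_start:
            raise ValueError(f"comment {index} has line_end before line_start")
        comments.append(comment)
    return Review(summary=data["summary"], comments=tuple(comments))
```

Raising a single exception type keeps the caller simple: any ValueError here maps to the "malformed reply" non-zero exit.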

Dependencies

External tools (runtime)

Tool	Min version	Why
`gh` CLI	2.40	`gh pr view --json`, `gh pr diff`, `gh api` for posting.
`git`	2.38	`git worktree add` semantics.
`pr-review` poll session	—	Headless reviews run on the poll session (started by `poll-bringup`); it uses the `claude` CLI's auth (Keychain / OAuth).
Python	3.11	`tomllib` + `StrEnum` are stdlib in 3.11.
`uv`	any	Env / deps.

gh auth login must be complete before first run. The claude CLI must be signed in (run claude once interactively to establish auth).

Python packages

click>=8.1 — CLI framework.
pytest>=8, pytest-mock>=3, ruff>=0.4 — dev.
Stdlib: tomllib, subprocess, re, fnmatch, dataclasses, pathlib, contextlib, atexit, signal, json, concurrent.futures.

No Anthropic SDK — per the monorepo-wide rule in ~/CLAUDE.md, every Claude call routes through the pr-review poll session.

Secrets

gh auth token — managed by gh auth login; stored in macOS Keychain or ~/.config/gh/.
claude CLI auth — managed by the CLI itself, established once for the poll session. ANTHROPIC_API_KEY is unset at the poll session's launch (poll-bringup runs env -u), so it never reaches Claude.

Config

File: ~/.config/claude-review/config.toml (override with CLAUDE_REVIEW_CONFIG). All keys optional; missing file → all defaults.

[review]
voice = "thorough"                                    # "thorough" | "terse"
aspects = ["security", "logic", "tests", "style"]     # subset of {security, logic, tests, style, performance}
model = "claude-opus-4-7"
max_turns = 20

[posting]
post_by_default = false                               # if true, --post is implicit
apply_fixups = false                                  # if true, --apply-fixups is implicit

[scopes]
allowlist_owners = ["mark", "my-github-handle"]       # skip interactive confirm for these
denylist_owners = ["my-employer-org"]                 # refuse to run, exit 0 with message

Validation at load: invalid voice or unknown aspect raises ValueError. Merge order: defaults → file overrides. No env-var overrides per key (only CLAUDE_REVIEW_CONFIG for the whole file).

max_turns is mapped to the CLI's --max-budget-usd via a rough heuristic (max_turns * 0.05), because the CLI exposes a cost cap rather than a turn-count cap. With max_turns = 20 as in the example above, the cap is 20 * 0.05 = 1.00 USD.

Storage

Path	Purpose	Lifecycle
`/tmp/prs/.mirror-<owner>-<repo>/`	Shared `--filter=blob:none` mirror clone.	Reused across runs; `--gc` leaves mirrors alone.
`/tmp/prs/<owner>-<repo>-<pr>-<pid>/`	Ephemeral worktree.	Removed on exit (SIGINT/SIGTERM/atexit). `--gc` reaps residue >1 h old. macOS clears `/tmp` on reboot.
`~/.local/state/claude-review/sessions/<slug>.json`	`{session_id, head_sha, updated_at}` per PR. Enables `--resume`.	Persists across runs. Safe to delete. Dir is 0700; file is 0600.
`~/.config/claude-review/config.toml`	User config.	User-managed.

No review history / audit log. No diff / PR-metadata cache — every run re-fetches via gh.

Deployment

Local CLI only. Not a server, not a GitHub Action, not a cron job.

Install in place:

cd ~/pr-review
uv tool install .

This repo's install.sh runs uv tool install --reinstall .; the binary lands in ~/.local/bin/claude-review.

Updates:

cd ~/pr-review
git pull && uv tool install --reinstall .

Why not a GitHub Action: a hosted action would need a long-lived Claude credential in every repo that uses it, and would surrender control over redaction to CI. Local-first keeps the redactor in the critical path.

Why not a GitHub App: out of scope — would require hosting, persistence, and a review queue. Local CLI is enough for Mark's use case.

Security posture

Three failure modes are treated as load-bearing (PLAN.md §Risks):

Credential leakage. Two-seam redaction (pre-prompt + pre-post) plus a diff-level blocklist that drops blocked paths before they enter the prompt. Patterns cover AWS keys, GitHub / Anthropic / OpenAI / Slack tokens, JWTs, URL userinfo, and PEM private-key blocks. ANTHROPIC_API_KEY is also kept away from Claude: poll-bringup launches the poll session with the variable unset (env -u), and claude_cli.py spawns no subprocess of its own, so it has no environment to leak.
Reviewing proprietary code. denylist_owners hard-refuses (exit 0 with stderr message). Non-allowlisted orgs get an interactive --yes gate.
Runaway fix-ups. apply_fixups aborts and git reset --hard HEAD if the mechanical diff exceeds 100 lines. Never pushes; user pushes by hand after inspecting.
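The 100-line guard on fix-ups could be implemented along these lines. It is a sketch under assumptions: the function names are hypothetical, and it measures the diff as added plus deleted lines from `git diff --numstat HEAD`, which may not be how publisher.py counts. Note that untracked files do not appear in that diff and are not removed by `git reset --hard`.

```python
import subprocess
from pathlib import Path

MAX_FIXUP_LINES = 100


def changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for row in numstat.splitlines():
        if not row.strip():
            continue
        added, deleted, _path = row.split("\t", 2)
        # Binary files report "-" for both counts; they contribute 0 here.
        total += int(added) if added != "-" else 0
        total += int(deleted) if deleted != "-" else 0
    return total


def enforce_cap(worktree: Path) -> bool:
    """Return True if the uncommitted diff is within the cap; otherwise reset and return False."""
    numstat = subprocess.run(
        ["git", "diff", "--numstat", "HEAD"],
        cwd=worktree, check=True, capture_output=True, text=True,
    ).stdout
    if changed_lines(numstat) > MAX_FIXUP_LINES:
        subprocess.run(["git", "reset", "--hard", "HEAD"], cwd=worktree, check=True)
        return False
    return True
```

Checking the cap before committing means an over-the-cap run leaves no commit behind, only a clean worktree.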

Other mitigations: prompt-injection resistance in the system prompt (the agent is told to ignore instructions inside PR content); the default review event is COMMENT, not APPROVE; worktrees are pid-scoped and cleaned up on SIGINT/SIGTERM; and the poll seam is isolated to one file (claude_cli.py), so adapting to seam drift is a local change.

Testing

cd ~/pr-review
uv run pytest

All tests are pure Python (the poll seam is mocked). Run uv run ruff check . for lint. There is no e2e test gate; the plan reserves CLAUDE_REVIEW_E2E=1 for a future public-PR smoke test.

Files

src/claude_review/ — package source (12 modules).
tests/ — pytest suite.
docs/review-criteria.md — human-editable review philosophy.
PLAN.md — the implementation plan (1 771 lines; all tasks complete).
pyproject.toml — uv project, click runtime dep, pytest + ruff dev.