Build an AI Code Reviewer

Overview

Code review is one of the highest-leverage activities in software engineering — and one of the most time-consuming. An AI code reviewer does not replace human judgment, but it catches the mechanical issues (bugs, security flaws, style violations) so human reviewers can focus on architecture, design, and intent.

In this tutorial, you will build a complete AI-powered code review tool. You will parse git diffs into structured data, enrich them with file context, send them to an LLM with carefully crafted prompts, parse the model's response into inline comments with severity levels, and generate a summary report. The tool works as a CLI that integrates into any git workflow.

What you will build:

A git diff parser that extracts structured change data
A context enricher that provides surrounding code for each change
An LLM prompt system with structured output parsing
Severity classification (critical, warning, suggestion, praise)
A summary report generator
A CLI interface for local and CI usage

Step 1: Git Diff Parser

Raw git diffs are text blobs. Parse them into structured objects that represent each file's changes with line numbers, hunks, and change types.

Step 2: Context Enrichment

The LLM needs more than just the changed lines — it needs the surrounding code to understand what the changes mean. Read the full file and extract context windows around each hunk.
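A minimal sketch of the context-window idea, assuming the file has already been read into a string. The names HunkRange, ContextWindow, extractContext, and the default padding of 10 lines are illustrative assumptions, not part of the original tool.

```typescript
// Hypothetical shapes for illustration; only the fields needed here.
interface HunkRange {
  newStart: number; // 1-based first line of the hunk in the new file
  newCount: number; // number of lines the hunk spans in the new file
}

interface ContextWindow {
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

// Return the hunk plus `padding` lines on each side, clamped to the file bounds.
function extractContext(
  fileContent: string,
  hunk: HunkRange,
  padding: number = 10 // assumed default; tune to the model's context budget
): ContextWindow {
  const lines = fileContent.split("\n");
  const startLine = Math.max(1, hunk.newStart - padding);
  // Treat an empty hunk as covering a single line so the window is never inverted.
  const hunkEnd = hunk.newStart + Math.max(hunk.newCount, 1) - 1;
  const endLine = Math.min(lines.length, hunkEnd + padding);
  const text = lines.slice(startLine - 1, endLine).join("\n");
  return { startLine, endLine, text };
}
```

Keeping the function pure (string in, window out) leaves file reading to the caller, which also makes it easy to test without touching the disk.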

Step 3: Language Detection and Rule Loading

Different languages have different review concerns. TypeScript reviews should flag uses of the any type, Python reviews should check for missing type hints, and SQL reviews should flag injection risks. Detect the language and load the appropriate review rules.
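One way to sketch this step is a lookup from file extension to language, plus a rule list per language. Detecting by extension is an assumption here, and the names EXTENSION_MAP, REVIEW_RULES, detectLanguage, and loadRules are hypothetical; the rule texts simply restate the examples above.

```typescript
type Language = "typescript" | "python" | "sql" | "unknown";

// Assumed mapping; extend it with whatever languages your repository uses.
const EXTENSION_MAP = new Map<string, Language>([
  [".ts", "typescript"],
  [".tsx", "typescript"],
  [".py", "python"],
  [".sql", "sql"],
]);

// Rules are plain sentences that can later be pasted into the prompt.
const REVIEW_RULES: Record<Language, string[]> = {
  typescript: ["Flag uses of the any type."],
  python: ["Check that functions have type hints."],
  sql: ["Flag queries that are open to injection."],
  unknown: [],
};

// Detect the language from the file extension (case-insensitive).
function detectLanguage(path: string): Language {
  const dot = path.lastIndexOf(".");
  if (dot === -1) return "unknown";
  const ext = path.slice(dot).toLowerCase();
  return EXTENSION_MAP.get(ext) ?? "unknown";
}

function loadRules(path: string): string[] {
  return REVIEW_RULES[detectLanguage(path)];
}
```

Files with no recognized extension fall back to an empty rule list, so the reviewer can still run with only its general instructions.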

Step 4: Prompt Engineering

The prompt is the most critical piece. Structure it so the LLM produces consistent, parseable output with clear severity levels.
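As a rough illustration, a prompt builder might assemble the rules, the surrounding code, and the diff, then spell out the exact JSON shape it expects back. The PromptInput type, the buildReviewPrompt name, and the prompt wording are assumptions for this sketch; only the comment fields (line, severity, message, suggestion) come from the tutorial's own ReviewComment type.

```typescript
// Hypothetical input shape for this sketch.
interface PromptInput {
  path: string;
  language: string;
  rules: string[];
  context: string; // surrounding code from the context enricher
  diff: string; // hunk text with +/- prefixes
}

// Build a prompt that asks for a JSON array and nothing else,
// so the response can be parsed mechanically.
function buildReviewPrompt(input: PromptInput): string {
  const rules =
    input.rules.length > 0
      ? input.rules.map((r) => `- ${r}`).join("\n")
      : "- (no language-specific rules)";

  return [
    `You are reviewing a change to ${input.path} (${input.language}).`,
    "",
    "Review rules:",
    rules,
    "",
    "Surrounding code:",
    input.context,
    "",
    "Diff under review:",
    input.diff,
    "",
    "Respond with ONLY a JSON array. Each element must have:",
    '  "line": number (line number in the new file),',
    '  "severity": "critical" | "warning" | "suggestion" | "praise",',
    '  "message": string,',
    '  "suggestion": string or null.',
    "Return [] if you have no comments.",
  ].join("\n");
}
```

Stating the allowed severity values verbatim in the prompt is what lets the response parser reject anything outside that set.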

Step 5: LLM Client with Retry Logic

Call the LLM API with retry logic, timeout handling, and response validation. The client must handle rate limits gracefully since large PRs may require many API calls.

Step 6: Review Orchestrator

Coordinate the full review pipeline: parse the diff, enrich each file, send to the LLM, and collect all comments.
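The coordination step could look roughly like the sketch below. To keep it self-contained, the per-file work (enrich, build the prompt, call the LLM) is passed in as a function; FileToReview, FileReview, Reviewer, and reviewFiles are hypothetical names, and processing files sequentially is an assumption made to stay gentle on rate limits.

```typescript
interface ReviewComment {
  line: number;
  severity: "critical" | "warning" | "suggestion" | "praise";
  message: string;
  suggestion: string | null;
}

// Hypothetical shapes for this sketch.
interface FileToReview {
  path: string;
  diffText: string;
}

interface FileReview {
  path: string;
  comments: ReviewComment[];
  error: string | null; // set when the review of this file failed
}

// The per-file pipeline (enrich, build prompt, call LLM) is injected.
type Reviewer = (file: FileToReview) => Promise<ReviewComment[]>;

// Review files one at a time; a failure on one file does not abort the rest.
async function reviewFiles(
  files: FileToReview[],
  reviewFile: Reviewer
): Promise<FileReview[]> {
  const results: FileReview[] = [];
  for (const file of files) {
    try {
      const comments = await reviewFile(file);
      results.push({ path: file.path, comments, error: null });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      results.push({ path: file.path, comments: [], error: message });
    }
  }
  return results;
}
```

Recording the error per file, instead of throwing, lets the report show which files were skipped while still surfacing comments for the others.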

Step 7: Severity Classification and Scoring

Aggregate comments into a review score. Critical issues block the review, warnings accumulate, and suggestions are informational.
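A small sketch of one possible scoring scheme follows. The penalty weights (40 per critical, 10 per warning) and the 0 to 100 scale are assumptions chosen for illustration; the only behavior taken from the text is that any critical issue fails the review, warnings accumulate, and suggestions do not affect the score.

```typescript
type Severity = "critical" | "warning" | "suggestion" | "praise";

interface ReviewScore {
  counts: Record<Severity, number>;
  score: number; // 0 to 100, higher is better (assumed scale)
  passed: boolean; // false when at least one critical issue exists
}

// Assumed weights: suggestions and praise are informational only.
const PENALTY: Record<Severity, number> = {
  critical: 40,
  warning: 10,
  suggestion: 0,
  praise: 0,
};

function scoreReview(severities: Severity[]): ReviewScore {
  const counts: Record<Severity, number> = {
    critical: 0,
    warning: 0,
    suggestion: 0,
    praise: 0,
  };
  let penalty = 0;
  for (const s of severities) {
    counts[s] += 1;
    penalty += PENALTY[s];
  }
  return {
    counts,
    score: Math.max(0, 100 - penalty), // never drop below zero
    passed: counts.critical === 0,
  };
}
```

Separating `passed` from `score` keeps the blocking decision simple while still giving a number that can be tracked over time.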

Step 8: Report Generation

Format the review into a readable Markdown report suitable for PR comments or terminal output.
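The report step might be sketched as a function that groups comments by severity and emits Markdown. The ReportComment shape, the renderReport name, and the exact headings are assumptions; adapt the layout to whatever your PR platform renders well.

```typescript
type Severity = "critical" | "warning" | "suggestion" | "praise";

// Hypothetical flattened comment shape: one entry per comment, with its file path.
interface ReportComment {
  path: string;
  line: number;
  severity: Severity;
  message: string;
}

const SEVERITY_ORDER: Severity[] = ["critical", "warning", "suggestion", "praise"];

// Render a Markdown report: a one-line summary, then one section per severity.
function renderReport(comments: ReportComment[]): string {
  const out: string[] = ["# AI Code Review", ""];
  if (comments.length === 0) {
    out.push("No comments.");
    return out.join("\n");
  }

  const summary = SEVERITY_ORDER.map(
    (s) => `${comments.filter((c) => c.severity === s).length} ${s}`
  ).join(", ");
  out.push(`Summary: ${summary}`, "");

  for (const severity of SEVERITY_ORDER) {
    const group = comments
      .filter((c) => c.severity === severity)
      .sort((a, b) => a.path.localeCompare(b.path) || a.line - b.line);
    if (group.length === 0) continue; // skip empty sections
    const title = severity.charAt(0).toUpperCase() + severity.slice(1);
    out.push(`## ${title} (${group.length})`, "");
    for (const c of group) {
      out.push(`- ${c.path}:${c.line}: ${c.message}`);
    }
    out.push("");
  }
  return out.join("\n").trimEnd();
}
```

Listing critical issues first means a reader who stops after the first section has still seen everything that blocks the review.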

Step 9: CLI Interface

Wrap the tool in a CLI that reads diffs from stdin or directly from git. Support flags for controlling behavior.
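A hand-rolled argument parser is enough to sketch the CLI surface. The flag names here (--stdin, --base, --fail-on) and the CliOptions type are hypothetical, invented for this example; the original tool's actual flags are not shown in this section.

```typescript
// Hypothetical options for this sketch.
interface CliOptions {
  source: "stdin" | "git"; // where the diff comes from
  base: string; // git ref to diff against when source is "git"
  failOn: "critical" | "warning"; // lowest severity that fails the run
}

// Parse flags from an argv-style array (without the node and script entries).
function parseCliArgs(argv: string[]): CliOptions {
  const opts: CliOptions = { source: "git", base: "HEAD", failOn: "critical" };

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--stdin") {
      opts.source = "stdin";
    } else if (arg === "--base") {
      const value = argv[++i];
      if (value === undefined) throw new Error("--base requires a git ref");
      opts.base = value;
    } else if (arg === "--fail-on") {
      const value = argv[++i];
      if (value !== "critical" && value !== "warning") {
        throw new Error("--fail-on must be 'critical' or 'warning'");
      }
      opts.failOn = value;
    } else {
      throw new Error(`Unknown flag: ${arg}`);
    }
  }
  return opts;
}
```

In a real entry point you would pass process.argv.slice(2) to this function, then either read the diff from standard input or run git diff against the chosen ref.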

Step 10: CI Integration and Git Hooks

Deploy the reviewer as a pre-push git hook or a CI pipeline step. The exit code (0 for pass, 1 for critical issues) integrates naturally with CI systems.

For the git hook, add the reviewer command to .git/hooks/pre-push (and make the file executable) so the review runs automatically before every push. Set a minimum change-size threshold so that trivially small changes (under 5 changed lines) skip the LLM call, which keeps API costs predictable. Combining automated CI reviews with optional local hooks gives teams fast feedback without blocking their workflow.
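The two decisions a hook or CI step needs can be sketched as small pure functions: whether the change is large enough to review, and which exit code to return. The names ChangeStats, shouldRunReview, and exitCodeFor are assumptions, and counting additions plus deletions as "changed lines" is one possible reading of the threshold.

```typescript
// Hypothetical per-file stats; the diff parser's additions and deletions fit this shape.
interface ChangeStats {
  additions: number;
  deletions: number;
}

// Skip the LLM call when the total change is below the threshold.
function shouldRunReview(
  files: ChangeStats[],
  minChangedLines: number = 5 // changes under 5 lines are skipped
): boolean {
  const changed = files.reduce((sum, f) => sum + f.additions + f.deletions, 0);
  return changed >= minChangedLines;
}

// Exit code convention from the text: 0 for pass, 1 when critical issues exist.
function exitCodeFor(criticalCount: number): 0 | 1 {
  return criticalCount > 0 ? 1 : 0;
}
```

A skipped review should exit with 0 so that small pushes are never blocked by the hook.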

// src/parser/diff.ts (Step 1: Git Diff Parser)

interface DiffHunk {
  oldStart: number;
  oldCount: number;
  newStart: number;
  newCount: number;
  lines: DiffLine[];
}

interface DiffLine {
  type: "add" | "remove" | "context";
  content: string;
  oldLineNumber: number | null;
  newLineNumber: number | null;
}

interface FileDiff {
  path: string;
  oldPath: string | null;
  status: "added" | "modified" | "deleted" | "renamed";
  hunks: DiffHunk[];
  additions: number;
  deletions: number;
}

function parseDiff(rawDiff: string): FileDiff[] {
  const files: FileDiff[] = [];
  // Each file's section starts with a "diff --git" header line.
  const fileChunks = rawDiff.split(/^diff --git/m).filter(Boolean);

  for (const chunk of fileChunks) {
    const lines = chunk.split("\n");
    // Match " b/<path>" with the leading space so that a directory name
    // ending in "b" (for example "lib/") is not mistaken for the prefix.
    const pathMatch = lines[0].match(/ b\/(.+)$/);
    if (!pathMatch) continue;
    const path = pathMatch[1];

    // Inspect only the header (everything before the first hunk) so that
    // changed lines containing text like "new file" cannot affect the status.
    const firstHunk = lines.findIndex((l) => l.startsWith("@@"));
    const header = (firstHunk === -1 ? lines : lines.slice(0, firstHunk)).join("\n");
    const renameMatch = header.match(/^rename from (.+)$/m);

    const status: FileDiff["status"] = /^new file mode/m.test(header)
      ? "added"
      : /^deleted file mode/m.test(header)
        ? "deleted"
        : renameMatch
          ? "renamed"
          : "modified";

    const hunks = parseHunks(lines);
    const additions = hunks.reduce(
      (sum, h) => sum + h.lines.filter((l) => l.type === "add").length,
      0
    );
    const deletions = hunks.reduce(
      (sum, h) => sum + h.lines.filter((l) => l.type === "remove").length,
      0
    );

    files.push({
      path,
      oldPath: renameMatch ? renameMatch[1] : null, // previous path for renames
      status,
      hunks,
      additions,
      deletions,
    });
  }

  return files;
}

function parseHunks(lines: string[]): DiffHunk[] {
  const hunks: DiffHunk[] = [];
  let currentHunk: DiffHunk | null = null;
  let oldLine = 0;
  let newLine = 0;

  for (const line of lines) {
    // Hunk header: "@@ -oldStart,oldCount +newStart,newCount @@".
    // A missing count means 1.
    const hunkMatch = line.match(/^@@ -(\d+),?(\d*) \+(\d+),?(\d*) @@/);
    if (hunkMatch) {
      currentHunk = {
        oldStart: parseInt(hunkMatch[1], 10),
        oldCount: parseInt(hunkMatch[2] || "1", 10),
        newStart: parseInt(hunkMatch[3], 10),
        newCount: parseInt(hunkMatch[4] || "1", 10),
        lines: [],
      };
      hunks.push(currentHunk);
      oldLine = currentHunk.oldStart;
      newLine = currentHunk.newStart;
      continue;
    }

    // Skip file header lines ("---", "+++", "index ...") before the first hunk.
    if (!currentHunk) continue;

    if (line.startsWith("+")) {
      currentHunk.lines.push({
        type: "add",
        content: line.slice(1),
        oldLineNumber: null,
        newLineNumber: newLine++,
      });
    } else if (line.startsWith("-")) {
      currentHunk.lines.push({
        type: "remove",
        content: line.slice(1),
        oldLineNumber: oldLine++,
        newLineNumber: null,
      });
    } else if (line.startsWith(" ")) {
      currentHunk.lines.push({
        type: "context",
        content: line.slice(1),
        oldLineNumber: oldLine++,
        newLineNumber: newLine++,
      });
    }
  }

  return hunks;
}

// src/llm/client.ts (Step 5: LLM Client with Retry Logic)

interface ReviewComment {
  line: number;
  severity: "critical" | "warning" | "suggestion" | "praise";
  message: string;
  suggestion: string | null;
}

const SEVERITIES = ["critical", "warning", "suggestion", "praise"];

async function callLLM(
  prompt: string,
  apiKey: string,
  maxRetries: number = 3
): Promise<ReviewComment[]> {
  let lastError: Error | null = null;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await fetch("https://api.anthropic.com/v1/messages", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "x-api-key": apiKey,
          "anthropic-version": "2023-06-01",
        },
        body: JSON.stringify({
          model: "claude-sonnet-4-20250514",
          max_tokens: 4096,
          messages: [{ role: "user", content: prompt }],
        }),
        // Abort the request if it takes longer than 30 seconds.
        signal: AbortSignal.timeout(30000),
      });

      if (response.status === 429) {
        // Rate limited: wait for the retry-after value (seconds), falling
        // back to 5 seconds if the header is missing or not a number.
        const retryAfter = parseInt(response.headers.get("retry-after") ?? "5", 10);
        lastError = new Error("API returned 429: rate limited");
        await sleep((Number.isNaN(retryAfter) ? 5 : retryAfter) * 1000);
        continue;
      }

      if (!response.ok) {
        throw new Error(`API returned ${response.status}: ${await response.text()}`);
      }

      const data = await response.json();
      const text = data.content[0].text;
      // A malformed JSON response throws here and is retried like any other error.
      return parseReviewComments(text);
    } catch (error) {
      lastError = error as Error;
      // Exponential backoff between attempts: 1s, 2s, 4s, ...
      if (attempt < maxRetries - 1) {
        await sleep(Math.pow(2, attempt) * 1000);
      }
    }
  }

  throw lastError ?? new Error("LLM call failed after retries");
}

function parseReviewComments(text: string): ReviewComment[] {
  // Extract the JSON array from the response, handling potential markdown wrapping.
  const jsonMatch = text.match(/\[[\s\S]*\]/);
  if (!jsonMatch) return [];

  const parsed: unknown = JSON.parse(jsonMatch[0]);
  if (!Array.isArray(parsed)) return [];

  // Keep only well-formed comments and normalize a missing suggestion to null.
  const comments: ReviewComment[] = [];
  for (const c of parsed) {
    if (typeof c !== "object" || c === null) continue;
    const obj = c as Record<string, unknown>;
    if (
      typeof obj.line === "number" &&
      typeof obj.severity === "string" &&
      SEVERITIES.includes(obj.severity) &&
      typeof obj.message === "string"
    ) {
      comments.push({
        line: obj.line,
        severity: obj.severity as ReviewComment["severity"],
        message: obj.message,
        suggestion: typeof obj.suggestion === "string" ? obj.suggestion : null,
      });
    }
  }
  return comments;
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}