Open Source · MIT License · Vision OCR starter

Receipt Scanner

A working AI receipt OCR starter. Drop a photo, get structured JSON: vendor, line items, totals, tax, currency, payment method. Resize-aware, Zod-validated, ready to drop into Postgres, Xero, or n8n. The hard bit is solved.

~£0.013 per scan
~2s end-to-end
1568px max image dimension
4× token saving
MIT license

Why this exists

Vision-capable language models are now better at reading receipts than every dedicated OCR product I have benchmarked. Tesseract gets 60 percent of the line items right. Cloud Vision gets the totals but misses the line item structure. Claude 3.5 Sonnet gets all of it — vendor, date, items, totals, payment method — in one call.

The catch is that most "Claude-powered receipt scanner" tutorials skip the parts that actually matter for production: image resizing for token cost, EXIF rotation, schema validation, error handling on malformed model output. They show a working demo and leave you to discover the corner cases yourself.

Receipt Scanner is the production-shape version. Resize before upload. Strict JSON contract. Zod-validated output. Persistence stub. Drop-in for Xero, QuickBooks, n8n. Fork it, point it at your storage and your accounting backend, ship.

Built-in features

Everything below works out of the box. Clone, add an Anthropic key, deploy.

Claude 3.5 Sonnet vision OCR

The strongest receipt OCR I have benchmarked end-to-end. Reads printed receipts, supermarket dot-matrix, mobile screenshots, faded thermal prints. Override the model with one env var.
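
A minimal sketch of what a one-variable model override could look like. The variable name VISION_MODEL and the helper resolveModel are assumptions made for illustration, so check .env.example for the name the repo actually reads. The default is the model alias shown in the architecture sketch.

```typescript
// VISION_MODEL is a hypothetical variable name, used here for illustration only.
const DEFAULT_MODEL = "claude-3-5-sonnet-latest";

// Return the override when it is set and non-blank, otherwise the default model.
function resolveModel(env: Record<string, string | undefined>): string {
  const override = env.VISION_MODEL?.trim();
  return override ? override : DEFAULT_MODEL;
}

console.log(resolveModel({})); // claude-3-5-sonnet-latest
```

In a route handler this would be called as resolveModel(process.env), so a blank or missing variable falls back to the default rather than sending an empty model name to the API.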

sharp resize and JPEG re-encode

Vision APIs charge per token, and image tokens scale with resolution. Resizing a 4000×3000 phone photo so its longest edge is 1568px keeps quality high and tokens low: roughly a 4× cost saving with no measurable accuracy loss.
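
To make the resize arithmetic concrete, here is a small sketch that computes the output dimensions when a photo is scaled to fit a 1568px box with its aspect ratio preserved. fitWithin is a hypothetical helper, not a function from the repo or from sharp. In the real pipeline sharp does the equivalent calculation itself when asked to resize.

```typescript
// Hypothetical helper for illustration; not part of the repo or of sharp.
interface Size {
  width: number;
  height: number;
}

// Scale a size down so its longest edge is at most maxDim, keeping the
// aspect ratio. Sizes that already fit are returned unchanged (no upscaling).
function fitWithin(size: Size, maxDim: number): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxDim) return { ...size };
  const scale = maxDim / longest;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

const resized = fitWithin({ width: 4000, height: 3000 }, 1568);
console.log(resized); // { width: 1568, height: 1176 }
```

The 4000×3000 phone photo comes out at 1568×1176, and an image that is already smaller than the limit passes through untouched.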

EXIF auto-rotate

Phones often save photos with the rotation recorded in EXIF metadata rather than applied to the pixels. Without an explicit rotate, the image arrives sideways and the model gets confused. sharp.rotate(), called with no angle, reads the EXIF orientation and applies it before re-encoding.

Strict JSON contract

The system prompt demands "JSON only, no backticks, no commentary". The model can still disobey, so the output is parsed and validated against a Zod schema. Malformed model output is rejected before it reaches downstream code.
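
The parse half of that contract could look roughly like the sketch below. parseModelJson and its result type are hypothetical names rather than the repo's own, and the rule "anything that is not a bare JSON object is a contract violation" is my assumption about how strict to be. Schema validation would run on the parsed value afterwards.

```typescript
// Hypothetical helper illustrating the "reject malformed output" step.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

function parseModelJson(raw: string): ParseResult {
  const text = raw.trim();
  // A receipt is a JSON object, so commentary or fencing around it is
  // treated as a violation rather than something to repair.
  if (!text.startsWith("{") || !text.endsWith("}")) {
    return { ok: false, error: "output is not a bare JSON object" };
  }
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `invalid JSON: ${message}` };
  }
}

console.log(parseModelJson('Here you go: {"total": 1}').ok); // false
```

Returning a result object instead of throwing lets the route handler turn a bad model response into a clean error response without a try/catch at every call site.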

Zod-validated output

Vendor, address, date, time, currency, line items, subtotal, tax, tip, total, payment method. Every field nullable. The model sees what it sees — do not pretend you got more than you did.
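
For orientation, here is a plain TypeScript rendering of that shape. The field names and types are my assumptions, and the Zod schema in the repo remains the source of truth. The helper unreadFields is also hypothetical: it shows one way to surface what the model could not read instead of papering over it. The sample receipt is made-up data.

```typescript
// Field names are illustrative assumptions; the repo's Zod schema is authoritative.
interface LineItem {
  description: string | null;
  quantity: number | null;
  unitPrice: number | null;
  total: number | null;
}

interface Receipt {
  vendor: string | null;
  address: string | null;
  date: string | null;
  time: string | null;
  currency: string | null;
  lineItems: LineItem[];
  subtotal: number | null;
  tax: number | null;
  tip: number | null;
  total: number | null;
  paymentMethod: string | null;
}

// List the top-level fields that came back null, so callers can show gaps
// to the user instead of guessing values.
function unreadFields(receipt: Receipt): string[] {
  return Object.entries(receipt)
    .filter(([, value]) => value === null)
    .map(([key]) => key);
}

// Made-up sample data for illustration.
const partial: Receipt = {
  vendor: "Tesco",
  address: null,
  date: "2024-05-01",
  time: null,
  currency: "GBP",
  lineItems: [{ description: "Milk", quantity: 1, unitPrice: 1.45, total: 1.45 }],
  subtotal: null,
  tax: null,
  tip: null,
  total: 1.45,
  paymentMethod: "card",
};

console.log(unreadFields(partial)); // [ 'address', 'time', 'subtotal', 'tax', 'tip' ]
```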

Itemised line items

Each line item carries description, quantity, unit price, and total. Captures vendor-specific formatting from Tesco, Sainsbury's, Pret, Costco and most printed receipts in the wild.

Persistence stub included

lib/persist.ts is a no-op save() function with the schema documented inline. Replace with one Supabase or Postgres insert. Schema lives in docs/schema.sql.

Swap to OpenAI Vision in one file

lib/vision.ts is a single function. Replace its body to call gpt-4o instead of Claude. Same JSON contract, same UI. A reference for comparison is left in the file.

Vercel-native deployment

Vercel ships sharp on the Linux runtime out of the box. No build configuration needed. Push to GitHub, set ANTHROPIC_API_KEY, you are live.

n8n / Zapier ready

After scanning, the structured JSON is yours. Wire it into Xero, QuickBooks, Notion, Airtable, or fan it out via webhooks. The hard bit is solved.

Tech stack

Next.js 14 · TypeScript · Anthropic Claude · sharp · Zod · Tailwind CSS · Vercel · Supabase (optional)

Architecture sketch

One API route. One vision call. One Zod schema enforcing the contract.

┌─────────────────────────────────────────────────────────────┐
│  POST /api/scan                                             │
│    Browser ──FormData(image)──▶ Route handler               │
│       │                                                     │
│       ▼ sharp.rotate().resize(1568).jpeg(85)                │
│       ▼ buffer.toString('base64')                           │
│       ▼ anthropic.messages.create({                         │
│           model: "claude-3-5-sonnet-latest",                │
│           messages: [{ role:'user', content:[               │
│             { type:'image', source:base64 },                │
│             { type:'text', text: SYSTEM_PROMPT }            │
│           ]}]                                               │
│         })                                                  │
│       ▼ JSON.parse(response)                                │
│       ▼ ReceiptSchema.parse(json)   // zod validation       │
│       ▼ persist.save(receipt)       // optional             │
│       ▼ 200 OK { ok: true, id, receipt }                    │
└─────────────────────────────────────────────────────────────┘
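
On the calling side, the last step of the diagram suggests a simple check before trusting a response. The sketch below is a type guard for the success shape { ok: true, id, receipt }. ScanSuccess and isScanSuccess are hypothetical names, and treating id as a string is an assumption, since the diagram does not say what type it is.

```typescript
// Success shape taken from the diagram; the type of `id` is an assumption.
interface ScanSuccess {
  ok: true;
  id: string;
  receipt: unknown;
}

// Narrow an arbitrary response body to the success shape.
function isScanSuccess(body: unknown): body is ScanSuccess {
  if (typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    candidate.ok === true &&
    typeof candidate.id === "string" &&
    "receipt" in candidate
  );
}

console.log(isScanSuccess({ ok: true, id: "abc", receipt: {} })); // true
console.log(isScanSuccess({ ok: false })); // false
```

A client would POST the image as FormData to /api/scan, read the JSON body, and pass it through the guard. Anything that fails the check is handled as an error rather than rendered as a receipt.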

Quick start

From clone to running locally in five commands.

git clone https://github.com/sarmakska/receipt-scanner.git
cd receipt-scanner
pnpm install
cp .env.example .env.local
# Add ANTHROPIC_API_KEY to .env.local
pnpm dev

Open http://localhost:3000, drag a receipt photo in, see structured data. That is the loop.

Use cases

What people actually build with this.

Internal expense workflow

"My staff hate typing every receipt in." Snap, upload, save. Replace Expensify or Dext for your finance flow.

AI bookkeeping prototype

A working OCR baseline you can fork. Add Xero, QuickBooks, or HMRC export on top. Ship in days, not months.

Personal expense tracker

Self-host on Vercel. Receipt to Notion or Google Sheets via the persistence stub. Replace SaaS subscriptions you do not need.

Learning vision OCR end-to-end

Read 400 lines of TypeScript, understand image tokens, system prompts, schema-enforced output, and per-call cost control.

Open source · MIT

Use it. Fork it. Ship it.

MIT licensed. No strings attached. Pull requests welcome — multi-page PDFs, email-to-receipt ingestion, bulk batch upload, HMRC export are all open issues.