Receipt Scanner
A working AI receipt OCR starter. Drop a photo, get structured JSON. Vendor, line items, totals, tax, currency, payment method. Resize-aware, Zod-validated, ready to drop into Postgres, Xero, or n8n. The hard bit is solved.
Why this exists
Vision-capable language models are now better at reading receipts than every dedicated OCR product I have benchmarked. Tesseract gets 60 percent of the line items right. Cloud Vision gets the totals but misses the line item structure. Claude 3.5 Sonnet gets all of it — vendor, date, items, totals, payment method — in one call.
The catch is that most "Claude-powered receipt scanner" tutorials skip the parts that actually matter for production: image resizing for token cost, EXIF rotation, schema validation, error handling on malformed model output. They show a working demo and leave you to discover the corner cases yourself.
Receipt Scanner is the production-shape version. Resize before upload. Strict JSON contract. Zod-validated output. Persistence stub. Drop-in for Xero, QuickBooks, n8n. Fork it, point it at your storage and your accounting backend, ship.
Built-in features
Everything below works out of the box. Clone, add an Anthropic key, deploy.
Claude 3.5 Sonnet vision OCR
The strongest receipt OCR I have benchmarked end-to-end. Reads printed receipts, supermarket dot-matrix, mobile screenshots, faded thermal prints. Override the model with one env var.
sharp resize and JPEG re-encode
Vision APIs charge per token, and image tokens scale with resolution. Downscaling a 4000×3000 phone photo to a 1568px maximum keeps quality high and tokens low: roughly a 4× cost saving with no measurable accuracy loss.
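A rough sketch of the arithmetic behind the downscale step. The helper name fitWithin is hypothetical, and it assumes "1568px max" means the longest edge; in the real pipeline sharp does this work, so treat it as an illustration only.

```typescript
// Hypothetical helper: dimensions of an image after it is scaled to fit
// inside a maxEdge x maxEdge box. Images already small enough are not enlarged.
interface Size {
  width: number;
  height: number;
}

function fitWithin(size: Size, maxEdge: number): Size {
  const longest = Math.max(size.width, size.height);
  // Leave small images untouched rather than upscaling them.
  if (longest <= maxEdge) return { ...size };
  const scale = maxEdge / longest;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

const phonePhoto: Size = { width: 4000, height: 3000 };
console.log(fitWithin(phonePhoto, 1568)); // { width: 1568, height: 1176 }
```

In sharp, the closest equivalent is a resize with fit: 'inside' and withoutEnlargement: true.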
EXIF auto-rotate
Phones often record orientation in EXIF metadata instead of rotating the pixels. Without an explicit rotate() call, the image can arrive sideways and confuse the model. sharp's rotate(), called with no arguments, reads the EXIF orientation and applies it before re-encoding.
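For context, the EXIF Orientation tag takes values 1 to 8. The sketch below maps the four non-mirrored values to the clockwise rotation needed to show the image upright. Function names are hypothetical; sharp handles all of this internally, so this only shows what the auto-rotate step corrects.

```typescript
type Clockwise = 0 | 90 | 180 | 270;

// Clockwise rotation needed to display the stored pixels upright.
// Returns null for the mirrored variants (2, 4, 5, 7), which also need a
// flip, and for values outside the 1-8 range.
function clockwiseRotationFor(orientation: number): Clockwise | null {
  switch (orientation) {
    case 1: return 0;
    case 3: return 180;
    case 6: return 90;
    case 8: return 270;
    default: return null;
  }
}

// Orientations 5-8 involve a quarter turn, so width and height swap.
function uprightSize(
  stored: { width: number; height: number },
  orientation: number,
): { width: number; height: number } {
  const quarterTurn = orientation >= 5 && orientation <= 8;
  return quarterTurn
    ? { width: stored.height, height: stored.width }
    : { ...stored };
}

// A portrait shot stored as landscape pixels with orientation 6.
console.log(clockwiseRotationFor(6)); // 90
console.log(uprightSize({ width: 4000, height: 3000 }, 6)); // { width: 3000, height: 4000 }
```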
Strict JSON contract
System prompt forces "JSON only, no backticks, no commentary". Output is parsed and validated against a Zod schema. Malformed model output is rejected before it reaches downstream code.
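A minimal sketch of the "reject before it reaches downstream code" step, assuming the model reply arrives as a plain string. The names parseModelJson and ParseResult are hypothetical, and schema validation is a separate step not shown here.

```typescript
type ParseResult =
  | { ok: true; value: Record<string, unknown> }
  | { ok: false; error: string };

// Strict parse: the reply must be a single JSON object and nothing else.
// Commentary or code fences around the JSON make JSON.parse throw.
function parseModelJson(raw: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.trim());
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Model output is not valid JSON: ${reason}` };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: "Model output is JSON but not an object" };
  }
  return { ok: true, value: parsed as Record<string, unknown> };
}

console.log(parseModelJson('{"vendor":"Tesco","total":12.5}').ok); // true
console.log(parseModelJson('Here you go: {"vendor":"Tesco"}').ok); // false
```

Returning a result object instead of throwing lets the route handler turn a bad reply into a clean error response.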
Zod-validated output
Vendor, address, date, time, currency, line items, subtotal, tax, tip, total, payment method. Every field nullable. The model sees what it sees — do not pretend you got more than you did.
Itemised line items
Each line item carries description, quantity, unit price, and total. Captures vendor-specific formatting from Tesco, Sainsbury's, Pret, Costco and most printed receipts in the wild.
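To make the shape concrete, here is a dependency-free stand-in for the receipt contract, written as plain TypeScript types and a type guard instead of the Zod schema the project uses. Field names are my assumptions based on the field list above, and treating line items as a possibly empty array is also an assumption.

```typescript
interface LineItem {
  description: string | null;
  quantity: number | null;
  unitPrice: number | null;
  total: number | null;
}

interface Receipt {
  vendor: string | null;
  address: string | null;
  date: string | null;
  time: string | null;
  currency: string | null;
  lineItems: LineItem[];
  subtotal: number | null;
  tax: number | null;
  tip: number | null;
  total: number | null;
  paymentMethod: string | null;
}

const isStringOrNull = (v: unknown): v is string | null =>
  v === null || typeof v === "string";
const isNumberOrNull = (v: unknown): v is number | null =>
  v === null || (typeof v === "number" && Number.isFinite(v));

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isLineItem(v: unknown): v is LineItem {
  return (
    isRecord(v) &&
    isStringOrNull(v.description) &&
    isNumberOrNull(v.quantity) &&
    isNumberOrNull(v.unitPrice) &&
    isNumberOrNull(v.total)
  );
}

// Nullable is not optional: a missing key is undefined and fails the check.
function isReceipt(v: unknown): v is Receipt {
  if (!isRecord(v)) return false;
  const obj: Record<string, unknown> = v;
  const strings = ["vendor", "address", "date", "time", "currency", "paymentMethod"];
  const numbers = ["subtotal", "tax", "tip", "total"];
  const items = obj.lineItems;
  return (
    strings.every((k) => isStringOrNull(obj[k])) &&
    numbers.every((k) => isNumberOrNull(obj[k])) &&
    Array.isArray(items) &&
    items.every(isLineItem)
  );
}
```

A field the model could not read comes back as null and still validates; a field the model forgot to emit does not.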
Persistence stub included
lib/persist.ts is a no-op save() function with the schema documented inline. Replace with one Supabase or Postgres insert. Schema lives in docs/schema.sql.
Swap to OpenAI Vision in one file
lib/vision.ts is a single function. Replace its body to call gpt-4o instead of Claude. Same JSON contract, same UI. Useful comparison reference left in the file.
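The swap works because callers depend on one function signature, not on a provider. A sketch of that idea, with the signature and all names assumed by me; both providers are stubs here, and no real API is called.

```typescript
// Hypothetical contract: take a base64 JPEG, resolve to the model's raw
// text reply. Any provider that honours this can be dropped in.
type VisionFn = (imageBase64: string) => Promise<string>;

// Stub standing in for a Claude-backed implementation.
const claudeStub: VisionFn = async () => '{"vendor":"Pret","total":4.25}';

// Stub standing in for a gpt-4o-backed implementation.
const openAiStub: VisionFn = async () => '{"vendor":"Pret","total":4.25}';

// The caller only knows the contract, so the provider is one argument.
async function scan(imageBase64: string, vision: VisionFn): Promise<unknown> {
  const raw = await vision(imageBase64);
  return JSON.parse(raw);
}

scan("aGVsbG8=", claudeStub).then((receipt) => console.log(receipt));
```

Running the same image through both implementations and diffing the JSON is a cheap way to compare providers.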
Vercel-native deployment
Vercel ships sharp on the Linux runtime out of the box. No build configuration needed. Push to GitHub, set ANTHROPIC_API_KEY, you are live.
n8n / Zapier ready
After scanning, the structured JSON is yours. Wire it into Xero, QuickBooks, Notion, Airtable, or fan it out via webhooks. The hard bit is solved.
Architecture sketch
One API route. One vision call. One Zod schema enforcing the contract.
┌─────────────────────────────────────────────────────────────────────────────┐
│ POST /api/scan                                                              │
│ Browser ──FormData(image)──▶ Route handler                                  │
│                              │                                              │
│                              ▼ sharp.rotate().resize(1568).jpeg(85)         │
│                              ▼ buffer.toString('base64')                    │
│                              ▼ anthropic.messages.create({                  │
│                                  model: "claude-3-5-sonnet-latest",         │
│                                  messages: [{ role:'user', content:[        │
│                                    { type:'image', source:base64 },         │
│                                    { type:'text', text: SYSTEM_PROMPT }     │
│                                  ]}]                                        │
│                                })                                           │
│                              ▼ JSON.parse(response)                         │
│                              ▼ ReceiptSchema.parse(json) // zod validation  │
│                              ▼ persist.save(receipt) // optional            │
│                              ▼ 200 OK { ok: true, id, receipt }             │
└─────────────────────────────────────────────────────────────────────────────┘
Quick start
From clone to running locally in five commands.
git clone https://github.com/sarmakska/receipt-scanner.git
cd receipt-scanner
pnpm install
cp .env.example .env.local
# Add ANTHROPIC_API_KEY to .env.local
pnpm dev
Open http://localhost:3000, drag a receipt photo in, see structured data. That is the loop.
Use cases
What people actually build with this.
Internal expense workflow
"My staff hate typing every receipt in." Snap, upload, save. Replace Expensify or Dext for your finance flow.
AI bookkeeping prototype
A working OCR baseline you can fork. Add Xero, QuickBooks, or HMRC export on top. Ship in days, not months.
Personal expense tracker
Self-host on Vercel. Receipt to Notion or Google Sheets via the persistence stub. Replace SaaS subscriptions you do not need.
Learning vision OCR end-to-end
Read 400 lines of TypeScript, understand image tokens, system prompts, schema-enforced output, and per-call cost control.
Use it. Fork it. Ship it.
MIT licensed. No strings attached. Pull requests welcome — multi-page PDFs, email-to-receipt ingestion, bulk batch upload, HMRC export are all open issues.