Products I've Built
AI-powered platforms and tools that I build for businesses, from intelligent automation systems to custom AI copilots.
Nineteen open-source products.
Two flagships, seventeen tools. Coding agents, AI gateways, voice loops, evaluation harnesses, RAG starters, OCR pipelines, MCP servers, multi-tenant SaaS, storage engines, consensus, sandboxes, and the infrastructure glue underneath. All MIT. All free to deploy yourself.
SarmaLink-AI
Multi-provider AI backend with automatic failover
Routes every message through up to 14 engines across 7 providers. Sub-50ms failover. Persistent memory, image generation, live tools, document analysis. Powered by DeepSeek V3.2 (685B) and 35 more engines.
- ›36 AI engines across 7 providers
- ›Failover handoff in under 50ms
- ›DeepSeek V3.2 · GPT-OSS 120B · Gemini 3
- ›Persistent memory and live web search
- ›FLUX.2 image generation and editing
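As a rough illustration of the failover idea, here is a minimal sketch that tries engines in order and moves to the next one when a call fails or times out. It is not SarmaLink-AI's actual code: the `Engine` type, `withTimeout`, `callWithFailover`, and the timeout parameter are hypothetical names chosen for this example.

```typescript
// Hypothetical engine shape; the real project's interfaces may differ.
type Engine = {
  name: string;
  call: (prompt: string) => Promise<string>;
};

// Reject if the work does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timeout after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Try each engine in order and return the first successful answer.
async function callWithFailover(
  chain: Engine[],
  prompt: string,
  timeoutMs: number,
): Promise<{ engine: string; text: string }> {
  const errors: string[] = [];
  for (const engine of chain) {
    try {
      const text = await withTimeout(engine.call(prompt), timeoutMs);
      return { engine: engine.name, text };
    } catch (err) {
      // Record the failure and hand off to the next engine in the chain.
      errors.push(`${engine.name}: ${(err as Error).message}`);
    }
  }
  throw new Error(`all engines failed: ${errors.join("; ")}`);
}
```

The caller gets back the name of the engine that answered, which makes it easy to log how often a request had to fall through the chain.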
StaffPortal
A complete open-source staff management platform
Internal HR platform built and open-sourced by Sarma. Attendance, timesheets, leave, expenses with AI receipt scanning, kiosk sign-in, visitor management, announcements. Self-host or deploy to Vercel in minutes.
- ›Attendance, timesheets, automated overtime
- ›Leave requests, balances, team calendar
- ›AI-OCR expense receipts
- ›Tablet kiosk sign-in for on-site staff
- ›Visitor management with host alerts
RAG-over-PDF
A minimal, production-shaped RAG starter
Upload a PDF, ask questions, get cited streaming answers. Cleanest end-to-end RAG you can clone, run, and ship in 10 minutes. No vector DB to provision, no Pinecone account, no LangChain weight.
- ›Fixed-size chunking with 200-char overlap
- ›text-embedding-3-small at 1536 dims
- ›In-memory cosine retrieval, swap to pgvector
- ›gpt-4o-mini streaming answers
- ›Under £0.001 per question end-to-end
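The chunking and in-memory retrieval steps can be sketched in a few lines. This is an illustrative sketch rather than the repository's code: `chunkText`, `cosineSimilarity`, `topK`, and the `Indexed` type are hypothetical names, and the embeddings are assumed to have been computed elsewhere.

```typescript
// Split text into fixed-size chunks; consecutive chunks share `overlap` characters.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("require size > 0 and 0 <= overlap < size");
  }
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // this chunk already reached the end
  }
  return chunks;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// A chunk plus its embedding (assumed to be computed by an embedding model).
type Indexed = { text: string; embedding: number[] };

// Return the k chunks most similar to the query embedding.
function topK(query: number[], index: Indexed[], k: number): Indexed[] {
  return index
    .map((item) => ({ item, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.item);
}
```

A call such as `chunkText(pdfText, 1000, 200)` would give the 200-character overlap listed above; the chunk size of 1000 is an assumption for the example, not a documented setting.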
Receipt Scanner
A working AI receipt OCR starter
Drop a photo of a receipt, get structured JSON. Vendor, line items, totals, tax, currency, payment method. Resize-aware, Zod-validated, ready to drop into Postgres, Xero, n8n. The hard bit is solved.
- ›Vision LLM OCR pipeline
- ›sharp resize and JPEG re-encode
- ›Zod-enforced JSON contract
- ›Itemised totals, tax, currency, payment
- ›Around £0.013 per scan
MCP Server Toolkit
Production-grade Model Context Protocol starter
A plugin-first MCP server with OAuth, OpenTelemetry, stdio + streamable HTTP transports, and four production plugins (filesystem, Postgres, GitHub, SarmaLink) in the box. The boring work of MCP, already done.
- ›Stdio + streamable HTTP transports
- ›OAuth 2.1 with PKCE, scope-gated tools
- ›OpenTelemetry across every tool call
- ›4 production plugins in the repo
- ›Cold start under 60 seconds
Voice Agent Starter
Sub-second real-time voice loop
Browser captures over WebRTC, mediasoup SFU forwards to a Fastify worker, pluggable STT/LLM/TTS adapters answer in under a second. Barge-in handled correctly so the agent feels alive, not hostage.
- ›< 700ms target round-trip
- ›WebRTC + mediasoup, one PC, two tracks
- ›Pluggable STT, LLM, TTS adapters
- ›Explicit turn state machine
- ›Barge-in cancellation tested for edge cases
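To show what an explicit turn state machine with barge-in might look like, here is a small sketch. The state and event names (`TurnState`, `TurnEvent`, `stepTurn`) are hypothetical and chosen for this example; the starter's real states may differ.

```typescript
// Hypothetical states and events for one conversational turn.
type TurnState = "idle" | "listening" | "thinking" | "speaking";
type TurnEvent = "user_speech_start" | "user_speech_end" | "reply_ready" | "playback_done";

// `cancelReply` tells the caller to abort any in-flight LLM/TTS work.
type Transition = { next: TurnState; cancelReply: boolean };

function stepTurn(state: TurnState, event: TurnEvent): Transition {
  const stay: Transition = { next: state, cancelReply: false };
  switch (event) {
    case "user_speech_start":
      // Barge-in: the user talks over a pending or playing reply, so cancel it.
      return { next: "listening", cancelReply: state === "thinking" || state === "speaking" };
    case "user_speech_end":
      return state === "listening" ? { next: "thinking", cancelReply: false } : stay;
    case "reply_ready":
      // A reply that arrives after a barge-in is stale and is ignored.
      return state === "thinking" ? { next: "speaking", cancelReply: false } : stay;
    case "playback_done":
      return state === "speaking" ? { next: "idle", cancelReply: false } : stay;
  }
}
```

Keeping the transition function pure makes the edge cases (for example a stale `reply_ready` arriving after a barge-in) straightforward to unit test.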
Agent Orchestrator
Multi-agent workflows with deterministic replay
Workflows are TypeScript functions; runs are journaled events in Postgres; replay re-derives a run without invoking the model. Tool budgets, BullMQ queue, Inspector UI. Run agents like payment systems run payments.
- ›Durable state, no run lost on restart
- ›Deterministic replay from any checkpoint
- ›Tool budgets per workflow / agent / tool
- ›Inspector UI with live graph and replay
- ›OpenTelemetry across every workflow step
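The replay idea can be illustrated with a toy journal. This is a simplified, synchronous, in-memory sketch under stated assumptions, not the orchestrator's implementation: `JournalEntry`, `RunContext`, and `summarise` are hypothetical names, and an array stands in for the Postgres journal.

```typescript
// One journaled tool call. In the real system this would be a Postgres row.
type JournalEntry = { step: number; tool: string; input: string; output: string };
type Tool = (input: string) => string;

class RunContext {
  private cursor = 0;
  private readonly journal: JournalEntry[];
  private readonly tools: Record<string, Tool>;
  private readonly replaying: boolean;

  constructor(journal: JournalEntry[], tools: Record<string, Tool>, replaying: boolean) {
    this.journal = journal;
    this.tools = tools;
    this.replaying = replaying;
  }

  callTool(tool: string, input: string): string {
    const step = this.cursor++;
    if (this.replaying) {
      // Replay: return the recorded output; never invoke the tool or model.
      const entry = this.journal[step];
      if (!entry || entry.tool !== tool || entry.input !== input) {
        throw new Error(`replay diverged at step ${step}`);
      }
      return entry.output;
    }
    const impl = this.tools[tool];
    if (!impl) throw new Error(`unknown tool ${tool}`);
    const output = impl(input);
    this.journal.push({ step, tool, input, output });
    return output;
  }
}

// A workflow is an ordinary function that only touches the world via the context.
function summarise(ctx: RunContext, topic: string): string {
  const draft = ctx.callTool("model", `draft:${topic}`);
  return ctx.callTool("model", `polish:${draft}`);
}
```

Because the workflow reaches the outside world only through `callTool`, running it again against the same journal reproduces the original result, and any change in the call sequence surfaces as a divergence error.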
AI Eval Runner
Evals as code, with traces and regressions
Datasets as files, scorers as functions, traces in DuckDB, viewer in HTMX. Six built-in scorers including LLM-as-judge. Regression mode fails CI when a release loses performance against the baseline.
- ›6+ built-in scorers
- ›DuckDB trace storage on disk
- ›Slice-aware aggregates
- ›Regression mode for CI release gates
- ›Sub-100KB JS viewer
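A regression gate of this kind can be reduced to a comparison of per-slice scores against a baseline. The sketch below is illustrative only: `SliceScores`, `findRegressions`, and the `tolerance` parameter are hypothetical names, not the runner's actual API.

```typescript
// Mean score per slice, e.g. { "overall": 0.91, "long-inputs": 0.84 }.
type SliceScores = Record<string, number>;

// Return the slices where the candidate drops more than `tolerance` below the baseline.
// A slice present in the baseline but missing from the candidate counts as a regression.
function findRegressions(
  baseline: SliceScores,
  candidate: SliceScores,
  tolerance: number,
): string[] {
  const regressed: string[] = [];
  for (const slice of Object.keys(baseline)) {
    const before = baseline[slice];
    const after = candidate[slice];
    if (after === undefined || after < before - tolerance) {
      regressed.push(slice);
    }
  }
  return regressed;
}

// CI convention: a non-zero exit code fails the pipeline.
function exitCodeFor(regressed: string[]): number {
  return regressed.length > 0 ? 1 : 0;
}
```

Comparing per slice rather than on one global average is what keeps a regression in a small but important subset from being hidden by gains elsewhere.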
Local LLM Router
OpenAI-compatible proxy for Ollama and cloud LLMs
A YAML-policy router that decides per-request whether to use local Ollama or a cloud LLM. Privacy pinning fails closed; A/B rollout is rampable; every decision is logged in better-sqlite3. Sub-millisecond router overhead.
- ›OpenAI-compatible base URL
- ›YAML policy reviewable in PRs
- ›Privacy pinning fails closed
- ›Rolling A/B routing
- ›Per-request audit log
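"Fails closed" means a privacy-pinned request is rejected when the local model is unavailable, rather than quietly falling back to the cloud. The sketch below illustrates that decision logic under assumptions: the `Policy` shape, `RouteRequest`, `bucketOf`, and `route` are hypothetical names, and the real router reads its policy from YAML rather than from an object literal.

```typescript
type RouteRequest = { id: string; tags: string[] };
// Hypothetical policy shape: tags that must stay local, and the cloud rollout percentage.
type Policy = { privateTags: string[]; cloudPercent: number };
type Decision = { target: "local" | "cloud" | "reject"; reason: string };

// Stable 0..99 bucket derived from the request id, so a decision is reproducible.
function bucketOf(id: string): number {
  let hash = 0;
  for (let i = 0; i < id.length; i++) {
    hash = (hash * 31 + id.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

function route(
  req: RouteRequest,
  policy: Policy,
  localHealthy: boolean,
  bucket: number = bucketOf(req.id),
): Decision {
  const pinned = req.tags.some((tag) => policy.privateTags.includes(tag));
  if (pinned) {
    // Fail closed: a pinned request never goes to the cloud, even if local is down.
    return localHealthy
      ? { target: "local", reason: "privacy pin" }
      : { target: "reject", reason: "privacy pin and local unavailable" };
  }
  if (bucket < policy.cloudPercent) {
    return { target: "cloud", reason: "rollout bucket" };
  }
  return localHealthy
    ? { target: "local", reason: "default" }
    : { target: "cloud", reason: "local unavailable" };
}
```

Raising `cloudPercent` ramps more traffic to the cloud, and the returned `reason` string is the kind of value an audit log row would record.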
Webhook-to-Email
A tiny, production-grade webhook receiver
POST anything, get an email. Optional Slack fan-out, HMAC verification, single-attempt retry on 5xx. Stripe, GitHub, Cal.com, Typeform, internal cron jobs: one notification destination for the lot.
- ›Per-source templates in one folder
- ›HMAC SHA-256 verification
- ›Slack fan-out on the same call
- ›Single retry on 5xx, 80MB image
- ›Around 1,000 req/sec on one core
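For readers unfamiliar with webhook signing, here is a minimal sketch of HMAC SHA-256 verification using Node's built-in `crypto` module. The function names `sign` and `verifySignature` are hypothetical, and the hex-encoded signature format is an assumption; individual providers such as Stripe and GitHub each define their own header names and signing schemes.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the hex-encoded HMAC SHA-256 of the raw request body.
function sign(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body, "utf8").digest("hex");
}

// Compare the received signature with the expected one in constant time.
function verifySignature(body: string, signatureHex: string, secret: string): boolean {
  const expected = Buffer.from(sign(body, secret), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(expected, received);
}
```

The signature must be computed over the raw body bytes exactly as received; verifying against re-serialised JSON is a common source of false mismatches.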
k8s-ops-toolkit
Helm bundles + observability for Next.js on Kubernetes
Production-grade Helm chart for Next.js workloads with probes, autoscaling, ingress and TLS. Ships with the full observability stack (kube-prometheus-stack, Loki, Promtail, Alertmanager), preconfigured for the common Next.js failure modes. Eight minutes from empty cluster to fully observable.
- ›Helm chart with probes + autoscaling
- ›kube-prometheus-stack preconfigured
- ›Loki + Promtail log pipeline
- ›ingress-nginx + cert-manager + Let's Encrypt
- ›One install script, eight-minute bootstrap
terraform-stack
Vercel + Supabase + Cloudflare + DigitalOcean as code
Solo-engineer cloud stack as code. Apply once and you have a Vercel project, Supabase Postgres, Cloudflare zone with R2 + KV, and an optional DigitalOcean droplet, all wired together for a typical Next.js studio site. The infrastructure under sarmalinux.com, exposed as reusable modules.
- ›Vercel module: projects, env vars, domains
- ›Supabase module: project + DB schema hooks
- ›Cloudflare module: zones, R2, KV, Workers
- ›DigitalOcean droplets for self-hosted bits
- ›Composable root main.tf with sensible defaults
slipstream
Token-efficient coding agent with persistent memory
Replaces the default coding-agent loop with a disciplined one. Lossless context compaction, per-project memory that survives restarts, a live local dashboard showing sessions, sub-agents and token spend. The new flagship.
- ›Token-efficient context compaction (no lost facts)
- ›Per-project persistent memory across sessions
- ›Live local dashboard for sessions and sub-agents
- ›Pluggable model adapters, runs against any chat API
- ›Single binary, drops in alongside existing tooling
forge-infer
Minimal LLM inference server, readable in an afternoon
A small, honest inference server that does the three things the big serving frameworks promise: paged KV-cache, continuous batching, and speculative decoding. Built to learn from and to actually serve private models.
- ›Real paged KV-cache, not faked
- ›Continuous batching with per-request fairness
- ›Speculative decoding with a small draft model
- ›OpenAI-compatible chat completions endpoint
- ›Single Python file per major component
shipyard
Multi-tenant SaaS starter you would actually let bill people
The opinionated starting point for a new SaaS. Tenant isolation enforced at the database layer, RBAC, billing, audit log, per-tenant rate limits. Stop pasting Clerk and Stripe together every time.
- ›Tenant isolation enforced by row-level security
- ›RBAC with roles, permissions, policy checks
- ›Billing via Stripe subscriptions plus usage metering
- ›Append-only audit log on every state change
- ›Per-tenant rate limits and abuse controls
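One common way to implement per-tenant rate limits is a token bucket per tenant. The sketch below assumes that approach purely for illustration; the page does not say which algorithm shipyard uses, and `TokenBucket` and `TenantLimiter` are hypothetical names. The clock is passed in explicitly so the behaviour is easy to test.

```typescript
// A bucket holds up to `capacity` tokens and refills at `refillPerSec` tokens per second.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Take one token if available; return whether the request is allowed.
  tryTake(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One independent bucket per tenant, created lazily on first request.
class TenantLimiter {
  private readonly buckets = new Map<string, TokenBucket>();
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
  }

  allow(tenantId: string, nowMs: number): boolean {
    let bucket = this.buckets.get(tenantId);
    if (!bucket) {
      bucket = new TokenBucket(this.capacity, this.refillPerSec, nowMs);
      this.buckets.set(tenantId, bucket);
    }
    return bucket.tryTake(nowMs);
  }
}
```

Because each tenant has its own bucket, one noisy tenant exhausting its allowance has no effect on the others.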
lsmdb
Log-structured merge-tree storage engine in Go
A teaching-quality storage engine. Small enough to read end to end in an afternoon, real enough to stress with millions of writes. WAL, SSTables, bloom filters, levelled compaction, MVCC snapshots.
- ›Write-ahead log with group commit
- ›Sorted string tables with block-level bloom filters
- ›Levelled compaction with size-tiered fallback
- ›MVCC snapshot reads, lock-free
- ›Crash-safe, fuzzed under random kill-and-restore
raftkv
Raft KV store with a fault-injection harness
A Raft implementation paired with a Jepsen-style harness that checks linearizability under network partitions, leader crashes, slow disks and corrupted writes. The repo you read to actually understand consensus.
- ›Leader election, log replication, snapshotting
- ›Built-in fault-injection harness
- ›Linearizability checker validates every run
- ›Disk write corruption + slow disk scenarios
- ›Replayable failure traces
sandboxd
WebAssembly sandbox for untrusted code
Run user-supplied scripts safely. Deny-by-default host ABI, strict CPU, wall-clock and memory limits, capability-style host functions. The bit you put between your app and a plugin so it cannot read the filesystem by accident.
- ›Deny-by-default host ABI
- ›Strict CPU time, wall-clock, memory limits
- ›Capability-style host functions, no ambient authority
- ›Per-tenant quotas and accounting
- ›Designed for embedding in larger apps
Hosted versions, on the 14th of February 2030.
Every product on this page is open source today and will stay that way. On that date I open the doors to hosted, supported, paid versions for teams that would rather not run them themselves. Same code, same MIT licence underneath, with the operations, monitoring, backups and SLA on me.
Hosted SarmaLink-AI
The multi-provider gateway as a service. Bring your own provider keys or use mine, with usage-based pricing and pooled rate limits.
Hosted slipstream
Team-mode dashboard with shared per-project memory, audit-grade event logs, and SSO. Self-hosted plugin stays free forever.
Managed shipyard
The multi-tenant SaaS starter ready-deployed for you. Postgres + RLS, Stripe billing wired, audit log live, your domain on top.
Hosted ai-eval-runner
Eval results, regression diffs and traces as a dashboard your whole team can read, with CI integration on top of the open core.
Why so far out? The open source comes first. The hosted business comes after the open source is genuinely good. I would rather under-promise on dates than push a half-baked paid tier the day it stops being interesting to maintain. If that date moves, it will move closer, not further away.
Let's build something good.
You've got a problem. I solve problems with software for a living.
The fastest way to find out if we can work together is to talk.
Stack I build with