Next.js on Kubernetes, production-grade in five commands.
A Helm chart for the app and a bootstrap script for the platform. Ingress, TLS, autoscaling, metrics, logs, alerts. None of the YAML.
k8s-ops-toolkit is the platform layer most teams spend a week assembling, written down. The Helm chart deploys your Next.js app with deployment, service, ingress (TLS via cert-manager), HPA, PDB, and a Prometheus ServiceMonitor. The bootstrap script installs ingress-nginx, cert-manager, kube-prometheus-stack, and Loki + Promtail with sane defaults and pre-baked Grafana dashboards. Five commands, about eight minutes, no surprises.
Why this exists
Most teams running Next.js on Kubernetes solve the same five problems in the first month: TLS, autoscaling, metrics, logs, alerts. Each is a few hours; together they are a week of yak-shaving before anyone is comfortable pushing to production.
The bigger ecosystems (Argo, Crossplane, Backstage) solve much larger problems and bring much heavier machinery with them. The lighter starters skip observability entirely. The middle ground is what most teams actually need and rarely package well.
k8s-ops-toolkit is that middle ground. A small Helm chart you can read in twenty minutes plus a bootstrap script for the platform stack. Pre-baked Grafana dashboards, working alert rules, sensible defaults. The week you would have spent, given back.
What it does
Every feature below ships in the public repository today. Clone, configure, run.
Helm chart for Next.js
Deployment, service, ingress with TLS, HPA, PDB, ServiceMonitor. Twenty minutes to read.
cert-manager built-in
Let's Encrypt issuer wired up. Automatic renewal. Default 90-day cert with 14-day expiry alerts.
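As a rough illustration of what a wired-up Let's Encrypt issuer usually involves, the sketch below renders a cert-manager ClusterIssuer from a shell function. The function name `render_cluster_issuer`, the issuer name `letsencrypt-prod`, and the account-key secret name are assumptions made for this example; the resources that `scripts/install.sh` actually creates may differ.

```shell
# Hypothetical helper: prints a ClusterIssuer for Let's Encrypt using the
# HTTP-01 challenge served through ingress-nginx.
# Usage: render_cluster_issuer <email>
render_cluster_issuer() {
  local email="$1"
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod                     # assumed issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${email}
    privateKeySecretRef:
      name: letsencrypt-prod-account-key     # assumed secret name
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
}
```

To try it against a cluster that already runs cert-manager: `render_cluster_issuer you@example.com | kubectl apply -f -`.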
Prometheus + Grafana
kube-prometheus-stack with three pre-baked dashboards: Cluster, Ingress, Next.js app.
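The Next.js dashboard depends on Prometheus scraping the app. A minimal sketch of a ServiceMonitor that scrapes `/api/metrics` follows. The helper name `render_service_monitor`, the port name `http`, the `app.kubernetes.io/name` label, and the `release: kube-prometheus-stack` label are all assumptions for illustration; the chart's own template may use different names.

```shell
# Hypothetical helper: prints a ServiceMonitor that scrapes /api/metrics.
# Usage: render_service_monitor <app-name> [interval]
render_service_monitor() {
  local name="$1" interval="${2:-30s}"
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ${name}
  labels:
    release: kube-prometheus-stack   # assumed: label the Prometheus instance selects on
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ${name}   # assumed Service label
  endpoints:
    - port: http                        # assumed Service port name
      path: /api/metrics
      interval: ${interval}
EOF
}
```

Example: `render_service_monitor my-app 15s | kubectl apply -n default -f -`.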
Loki + Promtail
Log aggregation that does not break the bank. Indexed by label, queryable from Grafana.
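"Indexed by label" means a Loki query starts from a label selector and only then filters the log lines themselves. The sketch below builds such a LogQL query string. The helper name `logql_query` and the labels `app` and `namespace` are assumptions; the labels actually available depend on how Promtail is configured.

```shell
# Hypothetical helper: builds a LogQL query from label pairs (indexed)
# plus an optional substring filter on the log line (not indexed).
# Usage: logql_query <filter-text-or-empty> <label=value>...
logql_query() {
  local filter="$1" selector="" pair
  shift
  for pair in "$@"; do
    # key is everything before the first '=', value everything after it
    selector="${selector:+$selector,}${pair%%=*}=\"${pair#*=}\""
  done
  printf '{%s}' "$selector"
  if [ -n "$filter" ]; then
    printf ' |= "%s"' "$filter"
  fi
  printf '\n'
}
```

For example, `logql_query error app=my-app namespace=default` prints `{app="my-app",namespace="default"} |= "error"`, which can be pasted into Grafana's Explore view or passed to `logcli query`.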
Alertmanager rules
CrashLoopBackOff, ingress 5xx spikes, p99 latency, cert expiry, disk pressure. Wire to Slack or PagerDuty.
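To make one of these rules concrete, here is a sketch of a cert-expiry alert expressed as a PrometheusRule, with the threshold in days converted to seconds. It relies on the `certmanager_certificate_expiration_timestamp_seconds` metric that cert-manager exports. The helper name, rule name, `monitoring` namespace placement, and `release` label are assumptions, and the rules shipped in the repository may be written differently.

```shell
# Hypothetical helper: prints a PrometheusRule that fires when any
# cert-manager certificate expires within <days> days (default 14).
# Usage: render_cert_expiry_rule [days]
render_cert_expiry_rule() {
  local days="${1:-14}"
  local seconds=$(( days * 24 * 3600 ))   # days converted to seconds
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cert-expiry
  namespace: monitoring
  labels:
    release: kube-prometheus-stack   # assumed: label the Prometheus instance selects on
spec:
  groups:
    - name: certificates
      rules:
        - alert: CertificateExpiringSoon
          expr: certmanager_certificate_expiration_timestamp_seconds - time() < ${seconds}
          for: 1h
          labels:
            severity: warning
          annotations:
            summary: TLS certificate expires in under ${days} days
EOF
}
```

With the default of 14 days the expression compares against 1209600 seconds.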
HPA on CPU or RPS
Default CPU autoscaling. Optional pattern for scaling on requests-per-second from the ServiceMonitor.
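A CPU-based autoscaler of this kind can be sketched as an `autoscaling/v2` HorizontalPodAutoscaler. The helper name `render_hpa` and the default values (2 to 10 replicas, 70% CPU) are illustrative assumptions, not the chart's actual defaults.

```shell
# Hypothetical helper: prints an HPA that scales a Deployment on average CPU utilization.
# Usage: render_hpa <deployment-name> [min] [max] [cpu-percent]
render_hpa() {
  local name="$1" min="${2:-2}" max="${3:-10}" cpu="${4:-70}"
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ${name}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${name}
  minReplicas: ${min}
  maxReplicas: ${max}
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: ${cpu}
EOF
}
```

CPU utilization is measured against the container's CPU request, so the Deployment needs `resources.requests.cpu` set for this to work. Scaling on requests per second generally also requires a custom-metrics adapter such as prometheus-adapter, which this sketch does not cover.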
ingress-nginx default
The ingress everyone runs. Documented annotations for body size, websocket, redirects.
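The annotations in question are standard ingress-nginx ones. The sketch below renders an Ingress that raises the upload limit, lengthens proxy timeouts for long-lived websocket connections, and forces HTTPS. The helper name, the issuer name `letsencrypt-prod`, the `-tls` secret suffix, and the Service port 80 are assumptions for this example.

```shell
# Hypothetical helper: prints an Ingress with common ingress-nginx annotations.
# Usage: render_ingress <app-name> <host> [max-body-size]
render_ingress() {
  local name="$1" host="$2" body_size="${3:-10m}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${name}
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod            # assumed issuer name
    nginx.ingress.kubernetes.io/proxy-body-size: "${body_size}"  # upload limit
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"       # keep websockets open
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"       # HTTP to HTTPS
spec:
  ingressClassName: nginx
  tls:
    - hosts: [${host}]
      secretName: ${name}-tls
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ${name}
                port:
                  number: 80   # assumed Service port
EOF
}
```

Annotation values must be quoted strings, which is why the numbers and booleans above are wrapped in double quotes.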
PDB for safe maintenance
Pod disruption budgets so cluster upgrades do not take you down.
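A pod disruption budget is a small object. As a sketch, assuming the pods carry an `app.kubernetes.io/name` label (an assumption; the chart may label pods differently), it could be rendered like this:

```shell
# Hypothetical helper: prints a PDB that keeps at least <min-available>
# pods running during voluntary disruptions such as node drains.
# Usage: render_pdb <app-name> [min-available]
render_pdb() {
  local name="$1" min_available="${2:-1}"
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${name}
spec:
  minAvailable: ${min_available}
  selector:
    matchLabels:
      app.kubernetes.io/name: ${name}   # assumed pod label
EOF
}
```

With a single replica, `minAvailable: 1` blocks node drains entirely, so a budget like this is best paired with at least two replicas.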
No service mesh, by design
Mesh complexity is rarely worth the cost for Next.js workloads. We deliberately do not bundle one.
Plain Helm, no operator
You can read the templates. You can copy them. You can fork them. No magic.
Tech stack
Architecture, in one diagram
The whole system on a single screen. Every box maps to a real folder in the repo.
┌──────────────────────────────────────────────────────────┐
│                         Internet                         │
└──────────────────────────┬───────────────────────────────┘
                           ▼
┌──────────────────────────────────────────────────────────┐
│  ingress-nginx (LoadBalancer)                            │
│    - cert-manager → Let's Encrypt → TLS                  │
└──────────────────────────┬───────────────────────────────┘
                           ▼
┌──────────────────────────────────────────────────────────┐
│  Next.js app (Helm chart)                                │
│    - Deployment + Service + HPA + PDB                    │
│    - /api/health probes, /api/metrics scrape             │
└──────────────────────────┬───────────────────────────────┘
                           │
                           ▼
    ┌─────────────────────┐      ┌─────────────────────┐
    │  Prometheus         │      │  Loki + Promtail    │
    │   - ServiceMonitor  │      │   - log shipping    │
    └──────────┬──────────┘      └──────────┬──────────┘
               ▼                            ▼
              ┌─────────────────────────────┐
              │  Grafana (dashboards)       │
              │  Alertmanager (Slack/PD)    │
              └─────────────────────────────┘
Quick start
From clone to first request in about eight minutes.
git clone https://github.com/sarmakska/k8s-ops-toolkit.git
cd k8s-ops-toolkit

./scripts/install.sh \
  --domain example.com \
  --email you@example.com \
  --slack-webhook https://hooks.slack.com/...

helm install my-app charts/nextjs-app \
  --set image.repository=ghcr.io/you/my-app \
  --set image.tag=v1.0.0 \
  --set ingress.host=app.example.com

kubectl port-forward -n monitoring svc/grafana 3000:80
# Grafana → Cluster Overview, Ingress nginx, Next.js app
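For repeatable deploys, the `--set` flags on the `helm install` step can live in a values file instead. The sketch below writes out the same three settings (`image.repository`, `image.tag`, `ingress.host`); the helper name `render_values` is hypothetical, and any other keys the chart accepts are not shown.

```shell
# Hypothetical helper: prints a Helm values file equivalent to the three
# --set flags used in the quick start.
# Usage: render_values <image-repository> <image-tag> <ingress-host>
render_values() {
  local repo="$1" tag="$2" host="$3"
  cat <<EOF
image:
  repository: ${repo}
  tag: "${tag}"
ingress:
  host: ${host}
EOF
}
```

Usage: `render_values ghcr.io/you/my-app v1.0.0 app.example.com > values.yaml`, then `helm install my-app charts/nextjs-app -f values.yaml`. Keeping the file in version control gives each deploy a reviewable record of what changed.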
Where it fits
The patterns this repository was built around.
First production cluster
Greenfield team going from "we deploy to Vercel" to "we run our own k8s." Skip the week of yak-shaving.
Adding observability later
You already have apps running but no metrics or logs. The bootstrap script gets you instrumented in an afternoon.
Standardising deploys
Pin every Next.js deploy in your org to the same chart. Consistent probes, consistent autoscaling, consistent alerts.
Cost-controlled SaaS infra
A single $70/mo cluster on DigitalOcean hosting as many apps as its nodes have capacity for. Predictable bill, no surprise vendors.
Related products
The wider Sarma Linux toolkit. Every project ships with the same opinions: open source, MIT, real depth, no marketing fluff.
Your platform stack, written down. Five commands. Eight minutes.
Clone the repo, follow the four-step quick start, ship something real.