Eighteen months ago I made a deliberate decision: ship everything in public. Not because open source is virtuous in the abstract, but because, for a solo engineer with no marketing budget and no sales team, the leverage maths favour public code over private.
This post is the honest accounting of what that decision actually produced.
## The 12 repos[^1]
| Repo | Category | Time to ship | Outcome |
|---|---|---|---|
| Sarmalink-ai | AI gateway | 6 weeks | Highest inbound; several consulting conversations; most-starred |
| mcp-server-toolkit | AI tooling | 2 weeks | Good traction as MCP interest grew; steady GitHub traffic |
| local-llm-router | AI tooling | 1 week | Moderate; niche audience but very targeted |
| voice-agent-starter | AI tooling | 4 weeks | Strong interest from real-time voice builders; opened new client conversations |
| agent-orchestrator | AI tooling | 8 weeks | Slower initial traction; interest deepens with every post about it |
| ai-eval-runner | AI tooling | 3 weeks | Niche but sticky; eval engineers are deliberate buyers |
| rag-over-pdf | AI starter | 3 days | High stars/forks ratio; often first contact for RAG work |
| receipt-scanner | Utilities | 2 days | Low stars; a few direct questions; minimal ongoing attention |
| webhook-to-email | Utilities | 1 day | Low stars; occasionally useful in client work; near-zero maintenance |
| staff-portal | HR/ops | 3 weeks | Predates this work; mostly a reference for HR portal clients |
| k8s-ops-toolkit | Infrastructure | 3 weeks | Growing slowly; Kubernetes audience is less browse-and-star |
| terraform-stack | Infrastructure | 2 weeks | Steady; IaC engineers bookmark rather than star |
"Time to ship" is wall-clock weeks including nights and weekends, not full-time-equivalent effort. A "2-day" repo means I sat down on a Saturday and had it out by Sunday evening.
## The leverage model
A solo engineer's constraint is hours. There are roughly 2,000 working hours in a year. If those hours go solely into private client work, the output never compounds: the work is done, the client uses it, you invoice, and the cycle restarts from zero.
Public code compounds differently. The hours I spent on Sarmalink-ai[^2] in weeks 1-6 continue to generate inbound 18 months later. Every blog post about it re-activates the signal. Every engineer who forks it and hits a bug creates a conversation. Every person who finds it through GitHub search and books a call has already pre-qualified themselves.
The asymmetry is stark: private work pays once; public work pays repeatedly, at decreasing marginal cost.
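To make the asymmetry concrete, here is a toy model in Python. Every number in it is an assumption invented for illustration (rate, lead value, decay rate), not a measurement from my books; the point is the shape of the two payoffs, not the totals.

```python
# Toy model of "private pays once, public pays repeatedly".
# Every number below is an illustrative assumption, not real data.

def private_return(hours: float, rate: float = 150.0) -> float:
    """Private client work: each hour is billed exactly once."""
    return hours * rate

def public_return(months: int, leads_per_month: float = 1.0,
                  lead_value: float = 5_000.0, decay: float = 0.96) -> float:
    """A public repo seeds a stream of inbound that slowly fades
    unless re-activated by a new post or an ecosystem shift."""
    total, monthly = 0.0, leads_per_month
    for _ in range(months):
        total += monthly * lead_value
        monthly *= decay  # inbound fades without re-activation
    return total

hours = 240  # roughly the six wall-clock weeks Sarmalink-ai took
print(f"private, paid once:         ${private_return(hours):,.0f}")
print(f"public, same hours, 18 mo:  ${public_return(18):,.0f}")
```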
## Where the returns actually came from
The pattern in the table is clear: AI tooling repos outperformed everything else by a significant margin. High search volume (everyone is building with LLMs right now), a specific problem solved (multi-provider failover, durable workflows, local routing), and a working implementation someone can clone and run immediately: that combination is what generates real inbound.
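As a concrete example of the kind of specific problem I mean, here is a minimal sketch of the multi-provider failover pattern. It is not Sarmalink-ai's actual code; the provider names and `complete` callables stand in for real SDK calls.

```python
# Minimal sketch of the multi-provider failover pattern.
# Not Sarmalink-ai's actual code: provider names and the
# `complete` callables are placeholders for real SDK calls.

from typing import Callable

class AllProvidersFailed(Exception):
    """Raised when every provider in the chain has errored."""

def with_failover(providers: list[tuple[str, Callable[[str], str]]],
                  prompt: str) -> str:
    """Try each provider in priority order; return the first success."""
    errors = []
    for name, complete in providers:
        try:
            return complete(prompt)
        except Exception as exc:  # real code would catch provider-specific errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))

# Usage, with hypothetical wrappers around real provider SDKs:
# reply = with_failover([("primary", call_openai), ("fallback", call_anthropic)], prompt)
```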
Starter repos (rag-over-pdf[^4]) generate high stars and forks but lower-quality conversations. Someone who clones a minimal starter does not necessarily need consulting; they need the code. That is still valuable, but the lead quality differs.
Infrastructure repos (k8s-ops-toolkit, terraform-stack) generate lower star counts but higher-intent visitors. The person who lands on a Terraform module repo and reads the documentation is further along in their buying journey than the person who stars a starter repo. Stars are a vanity metric; the right metric is "did this person reach out with a real problem."
Utility repos (receipt-scanner, webhook-to-email) were worth shipping — a couple of days each — but they do not generate meaningful compounding returns. They exist because they were useful to me in client work and trivially publishable.
## The maintenance cost is real
Open-sourcing a repo is not free. Once it is public, people open issues, ask questions, and expect the README to be accurate. The AI tooling repos in particular require maintenance as the underlying ecosystem (LLM providers, MCP protocol, voice stacks) moves quickly.
My rough estimate: the 12 repos cost about 2-3 hours per month in aggregate maintenance today, which works out to roughly 10-15 minutes per repo. That covers keeping READMEs current, responding to well-formed issues, and making sure the install instructions still work. Issues that are not well-formed I close promptly.
If a repo is not worth 20 minutes a month to maintain, it should be archived. I have not archived any yet, but the utility repos are candidates.
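For what that test looks like in practice, here is a toy sketch. The per-repo minutes and dollar values are hypothetical placeholders; only the 20-minute threshold comes from the rule above.

```python
# Toy version of the archive rule: a repo stays public only if its
# monthly value covers at least 20 minutes of maintenance attention.
# Per-repo minutes and dollar values are hypothetical placeholders.

HOURLY_RATE = 150.0                # assumed consulting rate
THRESHOLD = HOURLY_RATE * 20 / 60  # value of 20 minutes: $50

repos = {
    # name: (maintenance minutes/month, estimated value/month in $)
    "webhook-to-email": (5, 10),
    "receipt-scanner": (10, 30),
    "Sarmalink-ai": (60, 3000),
}

for name, (minutes, value) in repos.items():
    verdict = "keep" if value >= THRESHOLD else "archive candidate"
    print(f"{name}: {minutes} min/month, ~${value}/month -> {verdict}")
```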
## What the data says by category
| Category | Total weeks shipped | Signal quality | Maintenance burden |
|---|---|---|---|
| AI tooling (6 repos) | ~24 weeks | Highest — leads from all 6 | Medium — ecosystem moves fast |
| AI starters (1 repo) | ~0.4 weeks | High stars, lower lead quality | Very low |
| Infrastructure (2 repos) | ~5 weeks | Good — considered buyers | Low |
| Utilities (2 repos) | ~0.5 weeks | Low stars; some direct value | Near-zero |
| HR/ops (1 repo) | ~3 weeks | Niche; reference only now | Low |
AI tooling is the category to invest in: high signal quality, compounding returns, and a manageable maintenance burden, because the problems being solved are relatively stable (the interface between your code and an LLM provider does not change weekly).
Infrastructure repos are slower to accumulate traction but attract considered buyers. An engineer evaluating a Terraform stack is a different buyer from one cloning a demo.
Starters are worth shipping but not worth investing deeply in. A minimal, working implementation ships quickly and generates stars; depth does not add proportionate value.
## The hidden return: learning
The return I undervalued initially is learning. Shipping publicly forces a kind of completeness that private work does not. A private prototype can have rough edges, an incomplete README, a missing error case. A public repo that you are proud to link in a blog post cannot.
That forcing function made me write better code. Not theoretically better — concretely better, because I was about to publish it and someone else would read it.
The agent-orchestrator[^3] is the clearest example. The durable workflow logic went through two full rewrites before I published it, not because the first version was wrong but because writing the documentation revealed gaps in the mental model. Shipping publicly forced me to resolve them.
## The operating thesis
The operating thesis for a solo engineer without a marketing team is: build in public, document thoroughly, ship frequently, maintain honestly. The compound effect of 18 months of public work is larger than the same work would have produced in private.
The repos are the marketing, the portfolio, and the compounding asset simultaneously. There is no equivalent for private code.
## What I would do differently
Ship the AI tooling repos earlier. I waited until they were "good enough" and cost myself months of compounding. The earliest version of Sarmalink-ai was rougher than what I eventually shipped, but it already worked; publishing it sooner, with a clear README and a good description, would have started the returns sooner.
Invest in blog content from the start. A repo without context is a directory of files. A repo with three or four posts explaining the design decisions and the tradeoffs is a product. The posts compound the repo's reach.
Archive the repos that are not worth maintaining. A graveyard of abandoned repos signals low effort just as much as no repos. If something stops being worth 20 minutes a month, archive it and say so in the README.
## The number that matters most
Twelve repos in 18 months. The marginal cost of each subsequent repo is lower than the last because the infrastructure is in place: the GitHub organisation, the documentation templates, the blog post format, the Supabase seed patterns, the CI setup. The fixed cost is already paid.
The next repo will take less time to ship than the first and will land in a GitHub profile that already has 11 others giving it context. That compounding is the whole point.