New flagship of the SarmaLinux line . Open Source . MIT

slipstream

A coding-agent platform plugin that swaps whole-file reads for precise tools, keeps context alive across compaction, and stands up a live local dashboard so I can actually watch the agents work. The runner I ship with.

9 MCP tools . 59 agent skills . 88 tests . MIT license
127.0.0.1:53267
live . local-only
agents
  • main: running
  • sp-shipper: step 4/7
  • sp-reviewer: waiting
activity
  • + sp_symbol(retrieve.ts, retrieveSymbol)
  • + sp_lines(server.ts, 40, 92)
  • ~ sp_remember(decision)
  • + sp_map()
token budget
12% ok
mem 4 facts loaded
plan
  • [x] orient with sp_map
  • [x] slice retrieveSymbol
  • [ ] wire PreCompact hook
  • [ ] verify with /doctor
cp | ctx 12% ok | mem 4 | skill scoped-read | Opus 4.8

The live dashboard, mocked. Real one renders on 127.0.0.1, refreshes over SSE.

Why this exists

A long coding-agent session usually dies one of two ways. The agent reads whole files until the context window fills and starts forgetting the start of its own plan. Or it does good work, the session ends, and every decision it made evaporates. I write small production sites on Cloudflare, Supabase, Vercel and Resend, and I lean on the agent in my IDE for the boring parts. Both of those failure modes were biting me every long day.

The first failure mode is whole-file context bleed. The agent opens a 1,200-line component to change one prop. The budget bleeds. Three prompts later it has paged out the convention we agreed on at the top. Slipstream answers this with a bundled MCP server: a compact sp_map of the project, and sp_symbol / sp_lines for surgical slices. One symbol in, not one file in.

The second failure mode is the compaction cliff. The window summarises, the durable facts go with the noise. I tried writing everything into a single hand-rolled notes file. It rotted within a day. Slipstream answers this with a structured memory store, a PreCompact hook that writes a session digest the instant before the trim, and a signal-ranked recall that reloads only the relevant subset on the next session.

Then I added the thing I actually wanted most: a window into the session. When you fire off a plan and a subagent and walk away, you should be able to glance at a tab and see which agent is on which step and where the budget is. That is the live local dashboard, and it is the headline feature.

Watch the agents work

Session start boots a small 127.0.0.1 server on a free port and prints the URL into chat. Four panels, themed in the SarmaLinux palette, refreshed over SSE.

Agents

Every agent and subagent, its status (running, waiting, done, failed) and the task it is on. Grouped so a subagent's work does not tangle with the main thread.

Activity

The per-agent stream of prompts, tool calls and results as they land. The append-only event log behind it makes replay free.

Token budget

A bar that fills as reads pull bytes into context. With sp_* tools on, the bar crawls. With whole-file reads, it lurches.

Plan + mind map

The current plan and a Mermaid mind map of the session's agents, redrawn as events arrive. Same renderer as /slipstream:mindmap.
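
The panels above refresh over SSE. A minimal sketch of how one event might be framed on the wire, assuming a hypothetical DashboardEvent shape (the real log schema is not shown on this page). Only the id / event / data framing is standard Server-Sent Events; everything else is illustrative.

```typescript
// Hypothetical event shape, for illustration only.
export interface DashboardEvent {
  agent: string;
  kind: "prompt" | "tool_call" | "result";
  detail: string;
}

// Format one event as a Server-Sent Events frame.
// A frame is a run of "field: value" lines terminated by a blank line.
export function toSseFrame(id: number, ev: DashboardEvent): string {
  // JSON.stringify never emits raw newlines, so a single data line is enough.
  const data = JSON.stringify(ev);
  return `id: ${id}\nevent: ${ev.kind}\ndata: ${data}\n\n`;
}
```

Sending the id with each frame is what would let a reconnecting browser tab ask for events after the last one it saw, which fits an append-only log.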

Honest about what it is

It is a local observability dashboard for your session. It watches and visualises, it does not drive. Nothing leaves the machine, there is no telemetry, the bind is 127.0.0.1 only, and obvious secrets are pattern-redacted before they ever reach the log. Auto-open lives behind a setting in .claude/slipstream/dashboard.json, and SLIPSTREAM_DASHBOARD=0 disables it per session.
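
A rough sketch of what pattern redaction can look like before an event is written to the log. The three patterns are my own illustrative assumptions, not slipstream's actual list, and the redact name is hypothetical.

```typescript
// Illustrative patterns only; the real redaction list is not shown on this page.
const SECRET_PATTERNS: RegExp[] = [
  /\bsk_(?:live|test)_[A-Za-z0-9]{8,}\b/g, // Stripe-style secret keys
  /\bgh[pousr]_[A-Za-z0-9]{20,}\b/g, // GitHub-style tokens
  /\b(?:api[_-]?key|token|secret|password)\s*[=:]\s*\S+/gi, // key=value pairs
];

// Replace every match of every pattern with a fixed marker.
export function redact(text: string): string {
  return SECRET_PATTERNS.reduce(
    (out, pattern) => out.replace(pattern, "[REDACTED]"),
    text,
  );
}
```

Pattern matching only catches secrets that look like secrets, which is why it is best treated as a safety net rather than a guarantee.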

Before and after, real token numbers

Numbers from this repository on my machine (Apple Silicon, Node 25), using slipstream's own conservative 3.6 bytes-per-token estimate from src/context/budget.ts.

| Approach | Bytes into context | Approx tokens | Saving |
|---|---|---|---|
| Whole-file Read of src/map/retrieve.ts | 4,841 | ~1,345 | baseline |
| sp_symbol(retrieve.ts, retrieveSymbol) | 1,381 | ~384 | 71% fewer |
| Reading every file in src/ | 146,150 | ~40,597 | naive orient |
| sp_map index instead | 7,821 | ~2,173 | 5.4% of reading everything |

The dashboard's token-budget bar makes this visible while it happens. With the tools on, the bar crawls. With whole-file reads, it lurches. The discipline that prevents sp_symbol from ever returning the whole file lives in src/mcp/tools.ts where it cannot be bypassed.
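
The arithmetic behind the token numbers in this section can be checked with a few lines. This is a sketch, not the code in src/context/budget.ts: the 3.6 bytes-per-token figure is from this page, while rounding to the nearest token is my assumption (it happens to reproduce the quoted figures).

```typescript
// 3.6 bytes per token, written as 36/10 so the division stays in integers
// and avoids floating-point drift at the .5 boundary.
export function estimateTokens(bytes: number): number {
  return Math.round((bytes * 10) / 36);
}

// Percentage of the baseline that a slice avoids pulling into context.
export function percentFewer(sliceBytes: number, baselineBytes: number): number {
  return Math.round((1 - sliceBytes / baselineBytes) * 100);
}

// Share of the baseline that a slice costs, to one decimal place.
export function percentOf(sliceBytes: number, baselineBytes: number): number {
  return Math.round((sliceBytes / baselineBytes) * 1000) / 10;
}
```

For example, estimateTokens(4841) gives 1345 and percentFewer(1381, 4841) gives 71.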

The bundled MCP tools

Nine tools, served over stdio by dist/mcp/index.js. Every one returns the smallest correct thing.

sp_map
sp_map()

The compact project map: every file, its exported symbols and a one-line purpose. No file contents. The agent orients with this before it reads anything.

sp_symbol
sp_symbol(file, symbol)

Just that symbol's source slice, with its doc comment. Walks braces from the declaration line. A single call replaces opening the whole file.
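
A minimal sketch of the brace-walking idea, under stated assumptions: it takes source text rather than a file path so it stays pure, the sliceSymbol name is hypothetical, and it ignores braces inside strings and comments, which a real implementation would have to handle.

```typescript
// Return the source slice for one declaration, including a doc comment
// directly above it, or null if the symbol or its closing brace is not found.
// Declarations without a brace-delimited body are not handled in this sketch.
export function sliceSymbol(source: string, symbol: string): string | null {
  const lines = source.split("\n");
  const escaped = symbol.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const decl = new RegExp(
    "\\b(?:function|class|const|let|interface|type)\\s+" + escaped + "\\b",
  );
  const start = lines.findIndex((line) => decl.test(line));
  if (start === -1) return null;

  // Pull in a /** ... */ block that ends on the line just above.
  let first = start;
  if (first > 0 && (lines[first - 1] ?? "").trim().endsWith("*/")) {
    first--;
    while (first > 0 && !(lines[first] ?? "").trim().startsWith("/**")) first--;
  }

  // Walk braces from the declaration line until depth returns to zero.
  let depth = 0;
  let opened = false;
  for (let i = start; i < lines.length; i++) {
    for (const ch of lines[i] ?? "") {
      if (ch === "{") {
        depth++;
        opened = true;
      } else if (ch === "}") {
        depth--;
      }
    }
    if (opened && depth === 0) return lines.slice(first, i + 1).join("\n");
  }
  return null;
}
```

The function can only ever return the lines between the declaration and its matching brace, so a caller cannot widen it into a whole-file read by accident.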

sp_lines
sp_lines(file, start, end)

Exactly that line range, bounded. No surrounding context, no leakage. For when the slice you want is a block, not a symbol.
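
A sketch of a bounded line slice. The 1-indexed inclusive range and the maxLines cap of 200 are assumptions for illustration; this page does not state the real bound.

```typescript
// Return lines start..end (1-indexed, inclusive), clamped to the file length
// and to a hard cap so one call can never pull in an unbounded range.
export function sliceLines(
  source: string,
  start: number,
  end: number,
  maxLines = 200, // hypothetical cap
): string {
  const lines = source.split("\n");
  const from = Math.max(1, start);
  const to = Math.min(lines.length, end, from + maxLines - 1);
  if (to < from) return "";
  return lines.slice(from - 1, to).join("\n");
}
```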

sp_search
sp_search(query)

Ranked file locations for a query. Returns locations, not contents, so the agent decides what to slice next.

sp_remember
sp_remember(fact)

Write a durable Markdown fact into the memory store under .claude/slipstream/memory/. Survives compaction. Reloaded on next session.

sp_recall
sp_recall(query?)

Read memories back into the turn. With a query, ranks by signal. Without, returns the index. Capped under a ~1,200 token ceiling.

sp_forget
sp_forget(slug)

Remove a stale fact. The MEMORY.md index regenerates so the durable view stays clean.

sp_budget
sp_budget()

The context-budget level (ok / warn / compact) and a conservative token estimate from bytes-into-context at 3.6 bytes per token.

sp_mindmap
sp_mindmap()

The project rendered as a themed Mermaid mind map, returned to chat or written to a self-contained HTML artifact.

Deep dive: MCP-Tools wiki → . Token-Efficiency →

Lossless compaction + smart recall

The two memory features that make a long session survivable. One catches the thread before it is trimmed. The other refuses to dump the whole store back on the next start.

Lossless compaction

The agent platform fires PreCompact just before it summarises and trims the conversation, which is exactly the moment the thread tends to blur. slipstream's hook reconstructs what happened from the dashboard event log, builds a structured digest (open task, decisions, files touched, next step) in src/memory/digest.ts, and writes it to the store as a durable fact.

On the next session start it is reloaded first, so a resumed session picks up where it left off rather than from a lossy summary. The hook is idempotent. If the session is resumed mid-compact, the digest is updated, not duplicated.
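
To make the digest and its idempotence concrete, here is a small sketch. The event shape, the field names and the in-memory Map are all hypothetical stand-ins; the real builder lives in src/memory/digest.ts and writes to the file-based store.

```typescript
// Hypothetical event shape; the real JSONL schema is not documented here.
export type LogEvent =
  | { type: "prompt"; text: string }
  | { type: "decision"; text: string }
  | { type: "tool"; file: string }
  | { type: "next"; text: string };

export interface Digest {
  session: string;
  openTask: string;
  decisions: string[];
  filesTouched: string[];
  nextStep: string;
}

// Reconstruct the digest by walking the event log in order.
export function buildDigest(session: string, events: LogEvent[]): Digest {
  const digest: Digest = {
    session,
    openTask: "",
    decisions: [],
    filesTouched: [],
    nextStep: "",
  };
  for (const ev of events) {
    switch (ev.type) {
      case "prompt":
        digest.openTask = ev.text; // the latest prompt is treated as the open task
        break;
      case "decision":
        digest.decisions.push(ev.text);
        break;
      case "tool":
        if (!digest.filesTouched.includes(ev.file)) digest.filesTouched.push(ev.file);
        break;
      case "next":
        digest.nextStep = ev.text;
        break;
    }
  }
  return digest;
}

// Keyed by session, so writing twice replaces the digest instead of duplicating it.
export function upsertDigest(store: Map<string, Digest>, digest: Digest): void {
  store.set(`digest-${digest.session}`, digest);
}
```

Keying the write on the session is one simple way to get the updated-not-duplicated behaviour described above.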

Smart recall, not load-everything

A naive memory layer dumps the whole store back into context every session, which costs more tokens the larger and more useful it grows. slipstream instead builds a task signal from the git branch, the files changed in the working tree and the last prompt, ranks memories against it (src/memory/recall.ts), and reloads only the relevant subset under a hard ~1,200 token ceiling, plus the MEMORY.md index for the rest.

With no signal it loads nothing and defers to the index, because loading arbitrary memories with no signal is the behaviour we are avoiding.
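
A sketch of signal-ranked recall under a token ceiling. The term-overlap scoring, the Memory and Signal shapes and the greedy selection are my assumptions, not the ranking in src/memory/recall.ts; the 1,200 token ceiling and the empty-signal short-circuit are from the text above.

```typescript
export interface Memory {
  slug: string;
  text: string;
}

export interface Signal {
  branch: string;
  changedFiles: string[];
  lastPrompt: string;
}

const TOKEN_CEILING = 1200;
const BYTES_PER_TOKEN = 3.6;

// Lowercased alphanumeric terms longer than two characters.
function terms(s: string): string[] {
  return s
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 2);
}

export function recall(
  memories: Memory[],
  signal: Signal,
  ceiling = TOKEN_CEILING,
): Memory[] {
  const wanted = new Set(
    terms([signal.branch, ...signal.changedFiles, signal.lastPrompt].join(" ")),
  );
  if (wanted.size === 0) return []; // no signal: load nothing, defer to the index

  // Score each memory by the number of distinct terms it shares with the signal.
  const scored = memories
    .map((m) => ({
      m,
      score: new Set(terms(m.text).filter((t) => wanted.has(t))).size,
    }))
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score);

  // Greedily take the best matches that still fit under the ceiling.
  const picked: Memory[] = [];
  let spent = 0;
  for (const { m } of scored) {
    // String length stands in for byte length; close enough for ASCII notes.
    const cost = Math.ceil(m.text.length / BYTES_PER_TOKEN);
    if (spent + cost > ceiling) continue;
    picked.push(m);
    spent += cost;
  }
  return picked;
}
```

Because the ceiling is enforced at selection time, the cost of recall stays flat no matter how large the store grows.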

Recall, diagrammed

Signal-ranked recall on session start. Empty signal short-circuits to nothing.

[Diagram] Signal-ranked recall: branch + changed files + last prompt rank memories; cap under a token ceiling.

One line in the status bar

Context budget level, durable memory count, active skill, model. The formatting is a pure function, unit-tested in src/statusline, so the bar never lies about the helper underneath.

cp | ctx 12% ok | mem 4 | skill scoped-read | Opus 4.8
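
A sketch of what formatting as a pure function means here. The StatusInput shape and the formatStatusline name are hypothetical, and the prefix field simply carries the leading "cp" from the sample line, whose meaning this page does not state.

```typescript
export interface StatusInput {
  prefix: string; // "cp" in the sample line
  ctxPercent: number;
  level: "ok" | "warn" | "compact";
  memories: number;
  skill: string;
  model: string;
}

// Same input, same string: no clock, no I/O, no hidden state.
export function formatStatusline(s: StatusInput): string {
  return [
    s.prefix,
    `ctx ${Math.round(s.ctxPercent)}% ${s.level}`,
    `mem ${s.memories}`,
    `skill ${s.skill}`,
    s.model,
  ].join(" | ");
}
```

A pure formatter like this is easy to pin in a unit test against an exact expected string.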

Terse output style

A bundled output style under output-styles/slipstream.md tuned for high-signal, low-token answers. Switch to it with /output-style slipstream to spend fewer tokens per turn without losing precision. Pairs with the statusline so the cost stays visible.

/output-style slipstream
# answers go terse, code blocks lean, no fluff

Three shipped subagents

Lean, token-disciplined subagents under agents/, each using the MCP tools rather than whole-file reads. Delegate with the Task tool, for example "use sp-reviewer to check this before I push".

sp-shipper

Scaffold to deployed

Drives a small production site through the integration skills end to end: scaffold, wire auth, set up Supabase with row-level security, deploy to Cloudflare or Vercel, attach a domain, send the first transactional mail. Every shipping skill has a verification gate the agent must run; sp-shipper refuses to advance past a red one.

sp-schema

Postgres + RLS

Designs and migrates a Supabase / Postgres schema with row-level security that denies by default and explicit policies per role. Uses sp_symbol and sp_lines instead of reading whole migration files, so it stays inside budget even on a real codebase.

sp-reviewer

Pre-push guardrail

Runs lint, build, the test suite and a secret scan, then delivers a clear FAIL verdict that blocks the push when something is off. Designed to be invoked as "use sp-reviewer to check this before I push", so the green light is a discrete, auditable step.

Subagents wiki →

Slash commands

Nine commands under commands/. Each is a thin wrapper around the helper, audited in one file, unit-tested where shape matters.

| Command | What it does |
|---|---|
| /slipstream:doctor | End-to-end install verifier. 12+ PASS / FAIL checks: MCP server built and declared, every hook wired, memory store reachable, helper CLI built, statusline and output style present, manifest valid. |
| /slipstream:map | Build the compact project map once. Walks the tree, records exported symbols and a one-line purpose per file, writes the index the agent will read from. |
| /slipstream:remember | Save a durable decision to the memory store. One Markdown file per fact with frontmatter, plus a regenerated MEMORY.md index. |
| /slipstream:recall | Pull facts back into the turn. With a query, signal-ranks against the working tree. Without, returns the durable index for the agent to scan. |
| /slipstream:forget | Drop a stale fact by slug. Index regenerates so the durable view stays clean. Used when a decision is reversed. |
| /slipstream:status | One screen: current plan, context-budget level with recommendation, durable memory count, project map size. |
| /slipstream:mindmap | Render the project as a themed Mermaid mind map. Inline in chat or written to a self-contained HTML artifact under .claude/slipstream/dashboard/. |
| /slipstream:dashboard | Print the dashboard URL again. Useful after a reload. Starting is idempotent so the running server is reused. |
| /slipstream:validate | Run plugin-validate against the manifest. Fails loudly on a malformed skill or a missing hook declaration. |

Architecture, end to end

Hooks, helper, MCP server, map, memory, event log, dashboard. Same diagram as the README, themed for the site.

[Diagram] Architecture: host fires hooks, helper writes JSONL, MCP serves tools, local SSE server renders the dashboard.

Architecture wiki → . Data formats → . Design decisions →

The five pillars

Scoped-read tools, durable memory, lossless compaction, the live dashboard, the statusline. Each one earns its place.

01

Precise tools, not whole-file reads

A bundled MCP server (src/mcp) exposes sp_map, sp_symbol, sp_lines and sp_search. The agent orients with the map and pulls one declaration or one line range. The discipline lives in src/mcp/tools.ts where it cannot be bypassed.

02

Persistent memory with lossless compaction

A file-based store under .claude/slipstream/memory/, one Markdown fact per file plus a regenerated MEMORY.md index. A PreCompact hook writes a structured session digest the instant before the window is trimmed.

03

Lossless context across sessions

On session start, recall reloads the digest first, then signal-ranks the store against the git branch, the working tree and the last prompt, and loads only the relevant subset under a ~1,200 token ceiling.

04

Live local agent dashboard

Session start boots a 127.0.0.1 server and prints the URL into chat. Four panels: agents, the activity stream, the token-budget bar and a Mermaid mind map. The append-only event log doubles as replay.

05

Budget and skill in the statusline

A one-line statusline renders the context-budget level, the durable memory count, the active skill and the model. The formatting is a pure function, unit-tested so the bar never lies.

What is in the box

Everything below is shipped today and covered by the test suite (88 tests across 11 files).

PreCompact session digest

A hook reconstructs the open task, decisions, files touched and next step from the dashboard event log, then writes one structured fact to memory before the window is trimmed. The next session reloads it first.

Signal-ranked recall

On session start, recall builds a task signal from the git branch, the files changed in the working tree and the last prompt, ranks memories against it and reloads only the relevant subset under a hard token ceiling. With no signal it loads nothing.

Hand-rolled MCP server

A small newline-delimited JSON-RPC stdio loop, no SDK dependency. The slice of the protocol in play (initialize, tools/list, tools/call) is small and stable. The request handler is a pure exported function so tests drive it without spawning a process.
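
A sketch of a pure request handler for that slice of the protocol. The single toy tool, the server name and the reply text are hypothetical; the method names, result shapes and JSON-RPC error codes follow the public MCP and JSON-RPC 2.0 specifications as I understand them.

```typescript
export interface RpcRequest {
  jsonrpc: "2.0";
  id?: number | string;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical tool table with one toy entry.
const TOOLS = [
  {
    name: "sp_budget",
    description: "Context budget estimate",
    inputSchema: { type: "object", properties: {} },
  },
];

// Pure: a request object in, a response object (or null) out. No I/O.
export function handleRequest(req: RpcRequest): object | null {
  const id = req.id ?? null;
  const ok = (result: object) => ({ jsonrpc: "2.0", id, result });
  const fail = (code: number, message: string) => ({
    jsonrpc: "2.0",
    id,
    error: { code, message },
  });

  switch (req.method) {
    case "initialize":
      return ok({
        protocolVersion: "2024-11-05",
        capabilities: { tools: {} },
        serverInfo: { name: "sketch-server", version: "0.0.0" },
      });
    case "tools/list":
      return ok({ tools: TOOLS });
    case "tools/call": {
      const name = req.params?.name;
      if (name !== "sp_budget") return fail(-32602, `Unknown tool: ${String(name)}`);
      return ok({ content: [{ type: "text", text: "level: ok" }] });
    }
    default:
      // Notifications carry no id and get no reply.
      if (req.id === undefined) return null;
      return fail(-32601, "Method not found");
  }
}

// Newline-delimited framing: one JSON message per line in, one per line out.
export function handleLine(line: string): string | null {
  const res = handleRequest(JSON.parse(line) as RpcRequest);
  return res === null ? null : JSON.stringify(res) + "\n";
}
```

With the handler split out like this, a stdio loop only has to read lines, call handleLine and write whatever comes back, so tests can exercise the protocol without spawning a process.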

Three shipped subagents

sp-shipper drives a site from scaffold to deployed across integration skills, refusing to advance past a red verification gate. sp-schema designs and migrates a Supabase schema with row-level security that denies by default. sp-reviewer is a pre-push guardrail with a hard FAIL verdict.

Terse output style

A bundled output style tuned for high-signal, low-token answers. Switch to it with /output-style slipstream to spend fewer tokens per turn without losing precision.

Local-only by construction

The dashboard binds 127.0.0.1 on a free port. Nothing leaves the machine. Obvious secrets are pattern-redacted before they reach the event log. No telemetry, no accounts, no hosted layer.

Append-only event log

Every lifecycle hook (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, SubagentStop, Stop, PreCompact) writes one JSON event to .claude/slipstream/dashboard/<session>.jsonl. State is a pure fold over the log, so replay is free.
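
A sketch of state as a pure fold over the log. The hook names are the ones listed above; the agent field and the two-value status are simplifying assumptions, since the real event schema is not shown here.

```typescript
// Hypothetical event fields, for illustration only.
export interface HookEvent {
  hook:
    | "SessionStart"
    | "UserPromptSubmit"
    | "PreToolUse"
    | "PostToolUse"
    | "SubagentStop"
    | "Stop"
    | "PreCompact";
  agent: string;
}

export type AgentStatus = "running" | "done";

// Fold the JSONL log into the latest status per agent.
export function foldLog(jsonl: string): Record<string, AgentStatus> {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as HookEvent)
    .reduce<Record<string, AgentStatus>>((state, ev) => {
      const status: AgentStatus =
        ev.hook === "Stop" || ev.hook === "SubagentStop" ? "done" : "running";
      return { ...state, [ev.agent]: status };
    }, {});
}
```

Replay falls out of this for free: folding the first N lines of the log gives the dashboard state as it was after event N.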

Doctor end to end

/slipstream:doctor checks the MCP server is built and declared, every hook is wired, the memory store is reachable, the helper CLI is built, the statusline, output style and subagents are present, and the plugin manifest is valid. PASS / FAIL per check.

Run it in any IDE

Two layers. The full plugin (skills, hooks, memory, lossless compaction, statusline, live dashboard) runs inside a coding-agent host that loads the plugin format. The MCP tools, the token-saving core, are standard Model Context Protocol and work in any MCP-capable editor.

Full experience

In a plugin-capable agent host

You get the bundled MCP server, the slash commands, the 59 skills, every lifecycle hook, the live dashboard, the statusline, the terse output style and the three subagents. Node 20 or newer on your PATH.

# In your coding-agent host
/plugin marketplace add sarmakska/slipstream
/plugin install slipstream

# Then, in the project
/slipstream:map        # build the project map once
/slipstream:doctor     # verify the install end to end
/slipstream:status     # plan, budget, memory count, map

MCP-only path

In Antigravity, Cursor, Windsurf and other MCP editors

These editors do not load the host's plugin format, so the skills, hooks, slash commands and dashboard are not available there. The nine sp_* tools are. Build the server once, then register the absolute path under your editor's mcpServers block.

# Any MCP-capable editor: Cursor, Windsurf, Antigravity, others
git clone https://github.com/sarmakska/slipstream
cd slipstream
pnpm install
pnpm build

# Register the server (paths vary by editor)
# Cursor: .cursor/mcp.json
# Windsurf: ~/.codeium/windsurf/mcp_config.json
# Antigravity: Settings -> MCP
{
  "mcpServers": {
    "slipstream": {
      "command": "node",
      "args": ["/absolute/path/to/slipstream/dist/mcp/index.js"]
    }
  }
}

What the agent actually does

Instead of whole-file reads

// What sp_map returns at orient time, JSON in chat
{
  "files": [
    { "path": "src/map/retrieve.ts",
      "purpose": "retrieve a symbol or line range from a file",
      "symbols": ["retrieveSymbol", "retrieveLines"] }
  ]
}

// What sp_symbol returns instead of opening the whole file
/** Walk braces from the declaration line to return one symbol slice. */
export function retrieveSymbol(file: string, symbol: string): string { /* ... */ }

Compaction crossing

How slipstream changes the shape of a session

Orient, pull, compact, reload. The hook writes a durable digest the instant before the trim, the next session reloads it first.

[Diagram] Long session into compaction: PreCompact writes a structured digest to memory, SessionStart reloads it first plus a signal-ranked subset.

Run-in-Any-IDE wiki → . Install-in-VS-Code →

vs other context-saving approaches

Honest about the trade-offs. A hand-written instructions file is free but it rots. A summariser is automatic but it is lossy by design. Manual notes are precise but they evaporate the moment you stop typing them.

| Concern | slipstream | Hand-written instructions file | Summariser tool | Manual notes |
|---|---|---|---|---|
| Reads cost | Slice or symbol | Whole-file by default | Whole-file then summarise | Whole-file by default |
| Memory across sessions | Structured store + index | Single hand-written file | Lossy summary | Manual rewrite |
| Compaction safety | PreCompact digest, lossless | Loses what is not in the file | Lossy by design | Loses everything |
| Recall strategy | Signal-ranked, ~1,200 token cap | Always loaded, full file | Whatever the summariser kept | Whatever you remember to paste |
| Watching the agent | Live dashboard + replay | None | None | None |
| Statusline | Budget, mem, skill, model | None | None | None |
| Verification gates | 59 skills, each gated | None | None | Whatever you wrote |
| Data leaves the machine | Never | Never (it is just a file) | Depends on tool | Never |
| License | MIT | Varies | | |

Full comparison page in the wiki, with the design rationale behind each choice: Comparisons →

A guardrailed skill library

Fifty-nine skills under skills/, grouped by area. Each shipping skill carries a verification gate, a real check the agent must pass before advancing. The library targets the stack I actually ship on, not a universal scaffolder.

frontend
7 skills

tailwind, forms, router, dark-mode, responsive-layout, component-library

backend
5 skills

hono-api, zod-validation, error-handling, rate-limit, openapi

supabase
7 skills

init, schema, rls, auth, edge-function, storage, typegen

cloudflare
6 skills

worker, pages, d1, kv, r2, secrets

vercel
4 skills

link, env, preview, deploy

resend
4 skills

setup, domain, transactional, webhook

auth
4 skills

session, password-reset, oauth, rbac

payments
4 skills

stripe-setup, checkout, subscriptions, webhooks

seo
4 skills

meta-tags, open-graph, structured-data, sitemap

analytics
3 skills

plausible, web-vitals, events

git
5 skills

init-repo, feature-branch, conventional-commit, pull-request, release-tag

memory + context
6 skills

memory-capture, memory-recall, memory-prune, scoped-read, context-budget, compact-and-offload

Full catalogue: Skill-Catalogue wiki → . How the engine runs them → . Writing a skill →

/slipstream:doctor

A one-shot end-to-end install verifier. Fifteen checks, each PASS or FAIL with the exact reason. Run it after install, run it after upgrades, run it when something feels off.

Plugin manifest valid: PASS
MCP server built (dist/mcp/index.js): PASS
MCP server declared in plugin.json mcpServers: PASS
SessionStart hook wired: PASS
UserPromptSubmit hook wired: PASS
PreToolUse hook wired: PASS
PostToolUse hook wired: PASS
SubagentStop hook wired: PASS
Stop hook wired: PASS
PreCompact hook wired (lossless compaction): PASS
Memory store reachable: PASS
Helper CLI built: PASS
Statusline command present: PASS
Output style present: PASS
Subagents present (sp-shipper, sp-schema, sp-reviewer): PASS

The doctor walkthrough: Troubleshooting wiki →

Tech stack

Boring on purpose. The MCP path has zero runtime dependencies. The server path uses only node:http, no Express, no socket library. The event store is a JSONL file, not a database.

TypeScript . Node 20+ . MCP (stdio JSON-RPC) . node:http + SSE . JSONL event log . Mermaid . vitest . pnpm . zero runtime deps on MCP path

Frequently asked

More in the wiki FAQ →

Do I have to use the full plugin to get the token savings?

No. The MCP tools (sp_map, sp_symbol, sp_lines, sp_search, sp_remember, sp_recall, sp_forget, sp_budget, sp_mindmap) are standard Model Context Protocol. They register in any MCP-capable editor: Antigravity, Cursor, Windsurf, others. The skills, hooks, dashboard and lossless compaction need the host plugin layer; that is the trade-off.

Is the token budget accurate?

It is a conservative estimate from bytes-into-context at ~3.6 bytes per token, not the real internal counter. It is tuned to warn early and compact a little before it has to. The wording everywhere says "estimate". I would rather be honestly approximate and conservative than precise-looking and wrong.

Does anything leave the machine?

No. The dashboard server binds 127.0.0.1 on a free port. The memory store is files in your project. The event log is a JSONL file. There is no telemetry, no accounts, no hosted layer. Obvious secrets are pattern-redacted before they reach the log, but treat redaction as belt-and-braces rather than a vault.

What happens during a long session?

Reads stay slice-sized because the agent is nudged to use sp_symbol and sp_lines. The PreToolUse hook warns before a large whole-file read. Just before the platform compacts, the PreCompact hook writes a structured digest of the open task, decisions and next step. The next session reloads that digest first and signal-ranks the rest.

Can the dashboard steer the agents?

No, by design. It observes, it does not control. It cannot pause a tool call or redirect a subagent. The honest framing is "a local observability dashboard for your session". If you want a control plane, that is a different product.

Why a hand-rolled MCP server instead of the SDK?

The slice of the protocol the agent loop drives is small and stable: initialize, tools/list, tools/call. A plugin that bundles a server should add as little as possible to the install. The cost is I implement the framing; the benefit is zero runtime dependencies on the MCP path and a server I can audit in one file. The request handler is a pure exported function so tests drive it directly.

Does it work in Cursor, Windsurf, Antigravity?

The MCP tools do, fully. Build the server once, register the absolute path under your editor's mcpServers block, and the agent gains all nine sp_* tools. The full plugin layer (skills, hooks, dashboard, lossless compaction) needs a coding-agent host that loads the plugin format. Use the full host when you can; use the MCP path when you cannot.

Is it production-ready?

It is the runner I ship with. The 88 tests across 11 files cover: dashboard event validity; the concurrency-safe append-only writer under 25 parallel writers; a real SSE server end to end; idempotent start; replay; the real MCP server spawned over stdio for tools/list and an sp_symbol call; the PreCompact digest building and reloading; signal-ranked recall returning only the relevant subset within budget; the pinned statusline string; and doctor running against both the real tree and a deliberately broken one.

MIT . local-only . zero telemetry

Ready to ship a sane long session?

Star the repo, drop it into your agent host, build the map once and run doctor. The next long session stops bleeding tokens and keeps its thread.
