This is the working guide. If you have read the architecture essay^[5] you know what slipstream is and why it exists. This post is about using it day to day: per-editor install, the first session, the sp_ tools, the nine dashboard views, the cross-tab workflow, the brief, the benchmark, the doctors, recipes and gotchas. Twenty minutes from install to your first regenerated benchmark number.

The repo is at github.com/sarmakska/slipstream^[1]. v1.0 is the current release. MIT, 127.0.0.1 only, no telemetry, 321 tests, CI green.

Install in the editor you actually use

Three flavours: Claude Code, MCP via slipstream-setup, and the bare hand-edit. Pick one.

Claude Code (recommended)

/plugin install slipstream

That wires the hooks, the MCP server, the skills, the statusline and the dashboard at once. Restart Claude Code and open any project: the agent has fourteen sp_ tools, and the live dashboard auto-starts on first tool use.

Cursor, Windsurf, Antigravity, VS Code, JetBrains

git clone https://github.com/sarmakska/slipstream
cd slipstream
pnpm install && pnpm build
npx slipstream-setup --editor=auto

--editor=auto detects which config to write based on the dotfiles in the current project (.cursor/, .windsurf/, .antigravity/, .vscode/). You can be explicit: --editor=cursor, --editor=windsurf, etc. The command is idempotent. Re-running it never duplicates an entry. It refuses to double-wire when both the plugin and a project .mcp.json register slipstream, so you cannot break your own install.
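
A minimal sketch of what idempotent means here, assuming the editor config is plain JSON with an mcpServers map. This is not the slipstream-setup source; upsertServer and its types are hypothetical names used for illustration only.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not slipstream-setup's code.
type ServerEntry = { command: string; args: string[] };
type McpConfig = { mcpServers?: Record<string, ServerEntry> };
type UpsertResult = { config: McpConfig; changed: boolean };

// Add or refresh one server entry without ever creating a second copy.
function upsertServer(config: McpConfig, serverName: string, entry: ServerEntry): UpsertResult {
  const existing = config.mcpServers?.[serverName];
  const same =
    existing !== undefined &&
    existing.command === entry.command &&
    existing.args.length === entry.args.length &&
    existing.args.every((arg, i) => arg === entry.args[i]);
  if (same) return { config, changed: false }; // re-run: nothing to write
  return {
    config: { ...config, mcpServers: { ...config.mcpServers, [serverName]: entry } },
    changed: true,
  };
}

const entry: ServerEntry = {
  command: "node",
  args: ["/absolute/path/to/slipstream/dist/mcp/index.js"],
};
const first = upsertServer({}, "slipstream", entry);
const second = upsertServer(first.config, "slipstream", entry);
console.log(first.changed, second.changed); // true false
```

Running the merge twice leaves exactly one slipstream entry, which is the property the setup command promises.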

Restart the editor. The handshake should now advertise fourteen sp_ tools.

Bare hand-edit (last resort)

Open your editor's MCP host config file. Add the slipstream block under mcpServers:

{
  "mcpServers": {
    "slipstream": {
      "command": "node",
      "args": ["/absolute/path/to/slipstream/dist/mcp/index.js"]
    }
  }
}

Restart. The wiki has the exact path per editor^[2].

Verify the install

slipstream doctor
slipstream memory doctor

The first runs seventeen install checks (claude-dir, mcp-build, plugin manifest, hooks wired, PreCompact hook, memory dir, observations dir, cli-build, statusline, output style, subagents, plugin-valid, dashboard port and socket, duplicate-registration, double-emit, stale-dashboard) and prints a one-line fix per failure. If everything passes you are good. If anything fails, the fix line is usually enough to clear it.

The second checks the memory store: total observations, duplicates, stale entries, by-type breakdown. Exits non-zero if the store needs attention, so a CI step can gate on it.

Your first session

Open the editor on any non-trivial repo. Type a prompt that involves reading code:

Refactor the validation logic in src/auth.ts so the password rule is configurable.

What slipstream does next, in order:

The SessionStart hook runs. It builds a bounded knowledge feed (what this project is, how it is organised, the most-connected files from the code graph, what was recently asked, what is remembered) and injects it into the agent's context.
The agent calls sp_map to get the file index. This is roughly 2 KB instead of the 40 KB it would cost to read every file in src/.
The agent calls sp_symbol src/auth.ts validateUser. This returns the function body (around 600 bytes) instead of the 18 KB whole file.
The agent writes the change.
The Stop hook folds the turn into the observation memory, distils a session summary, and posts a status line to the cross-tab bus.

The session ends with one observation written, the file changed, around 5 KB of context spent (versus around 60 KB the default loop would spend).

Now open the dashboard.

sp_dashboard

The agent receives a 127.0.0.1 URL. Click it. Or run slipstream dashboard start from a terminal.

The fourteen sp_ tools, when to call each

Tool	When to call it	What it returns	Typical bytes
sp_map	First call on a new project or after big refactors	The compact project index: files, exported symbols, one-line purpose	1.5 to 4 KB
sp_symbol	You know the name of the thing you need	The body of one named declaration	300 to 1500 B
sp_lines	You need a specific line range	Only the requested slice	100 to 800 B
sp_search	You know the term, not the file	Ranked file:line locations, no bodies	400 to 1200 B
sp_remember	A durable fact worth keeping across sessions	Confirmation; nothing else	< 100 B
sp_recall	Resume a topic from a previous session	Matching memories with their detail	500 B to 4 KB
sp_search_memory	You think slipstream has seen this before	Cheap ranked summaries of relevant observations	200 B to 2 KB
sp_observations	You want full bodies for ids you have filtered down	Full observation detail by id	500 B to 8 KB
sp_timeline	You want the said-and-done of one session	Ordered turns with the tools called and the files touched	1 to 6 KB
sp_budget	Before a big read, or near the warn threshold	served / target tokens, level, recommendation	< 300 B
sp_digest	Cross-editor compaction stand-in (no PreCompact)	Path of the written digest plus token estimate	< 200 B
sp_resume	Start of a new session in an MCP-only editor	The latest digest body for reload	500 B to 4 KB
sp_dashboard	You want the live URL	127.0.0.1 URL, and whether it just started	< 100 B
sp_savings	You want the optimisation tally	savedTokens, scopedReads, pct, dollar cost	< 200 B

Chart: Three reads you should be making instead of Read. Source: pnpm benchmark on real files.

That chart sets the scoped reads against a whole-file Read. Pick the smallest one that answers the question: sp_search first when you do not know the file, sp_symbol when you know the name, sp_lines when you need an exact range, and sp_map once per session to orient. Use Read only when you genuinely need the whole file.
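
The same rule as a small decision function. This is a sketch only: the Need shape and pickRead are hypothetical, and only the tool names come from the table above.

```typescript
// Hypothetical helper: the Need shape is an assumption for illustration.
type Need = {
  knowsFile: boolean;            // do we already know which file holds the answer?
  symbolName?: string;           // a named declaration we are after
  lineRange?: [number, number];  // an exact slice we are after
  needsWholeFile?: boolean;      // we really do need every line
};

type ReadChoice = "sp_search" | "sp_symbol" | "sp_lines" | "sp_map" | "Read";

// Return the smallest read that can answer the question.
function pickRead(need: Need): ReadChoice {
  if (!need.knowsFile) return "sp_search"; // locate first, read second
  if (need.symbolName) return "sp_symbol";
  if (need.lineRange) return "sp_lines";
  if (need.needsWholeFile) return "Read";
  return "sp_map"; // nothing specific yet: orient on the index
}

console.log(pickRead({ knowsFile: false, symbolName: "validateUser" })); // sp_search
console.log(pickRead({ knowsFile: true, symbolName: "validateUser" }));  // sp_symbol
console.log(pickRead({ knowsFile: true, lineRange: [40, 60] }));         // sp_lines
```

Read sits near the bottom of the chain on purpose: it is the fallback, not the default.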

A small example: change a single function

sp_search "validateUser"          // returns src/auth.ts:42
sp_symbol "src/auth.ts" "validateUser"   // returns the 612-byte function body
// agent writes the change with Edit

Total context: about 1.2 KB. Cost of doing the same with whole-file reads: about 22 KB. The ratio holds across most realistic edits.
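
The arithmetic behind that ratio, using only figures quoted in this guide. savedPct is a hypothetical helper, not part of slipstream.

```typescript
// Share of context saved by a scoped read, as a percentage with one decimal place.
function savedPct(scopedKb: number, wholeKb: number): number {
  if (wholeKb <= 0) throw new RangeError("wholeKb must be positive");
  return Math.round((1 - scopedKb / wholeKb) * 1000) / 10;
}

console.log(savedPct(1.2, 22)); // 94.5 (the single-function edit above)
console.log(savedPct(5, 60));   // 91.7 (the first-session figures: about 5 KB versus about 60 KB)
```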

A small example: orient on a fresh project

sp_map                            // 2.3 KB index of files, exports, purpose
sp_recall "auth flow"             // 1.5 KB of relevant durable memories
sp_search_memory "validator"      // 800 B of past observations on this term

Total context to be oriented and ready: under 5 KB. The default loop reads ten files (around 60 KB) to do the same orientation. The chart in the architecture essay shows this exact comparison.

The nine dashboard views, view by view

Sidebar groups into Now, History and Knowledge. Walk through each.

Now: Overview

The plain-English answer to "what is this project?" It shows:

A one-paragraph project summary, downloadable as Markdown via slipstream brief or the Download brief button.
Headline KPIs: sessions, observations, files touched, opt %, durable memories, drift flags.
The dollar cost of tokens saved at the assumed per-million rate (which is stated so the number is honest).

Open this once a day to get the lay of the land.

Now: Live activity

The current session in real time. Six KPI tiles with sparklines. Filterable timeline. Plan. Inline SVG mind map. Per-skill activity panel. In-place token budget editor.

Every tab opens with an insights band: one natural-language paragraph plus three to five bullets that describe the view rather than only tabulate it. Read the band first, drill into the panels second.

If you have multiple Claude Code tabs open, the Agents office strip shows each tab as a character at a desk, with the task it is currently doing and the files in flight. Watching the whole team at once is genuinely useful.

Now: Said and done

The said-to-did timeline of every prompt and every tool call, grouped by turn. The honest record of what the agent actually did versus what you asked.

Use this when you suspect the agent did something it should not have, or when you want to confirm it did the thing you asked.

History: Daily journal

Per-day digest. Six tiles (observations, sessions, files, drift, tools, skills). Top files for the day. Tools used as colour-coded pills. The list of sessions involved, with prev / today / next navigation.

Open this on Monday morning to see what last week looked like. Click any day on the Project stats heatmap to land here pre-filtered.

History: Sessions

Clickable table of every recorded session. Click a row and the session opens: prompts, tool calls, files touched, exchanges, failures, the full timeline. There is also a Download report link that emits a Markdown document: the said-to-did story plus a summary. The honest version of team sharing.

The Sessions tab used to feel empty because the rows were not clickable. v0.31 fixed that.

Knowledge: Project stats

Six project-wide KPIs at the top: sessions, observations, unique files, opt %, memories, drift. Then the 365-day GitHub-style activity heatmap (clickable per day), the file leaderboard with violet-gradient bars, the inline-SVG donut for observations by kind, and the distilled lessons grid.

Open this to see patterns. Files you keep returning to are the candidates for refactoring. The lessons grid surfaces topics that have recurred enough across sessions to be worth remembering as an explicit rule.

Knowledge: Memory

Full-project observation search with kind filter chips (edit, plan, decision, search, map, error, run) and colour-coded result badges. Click any hit to expand the full detail. This is the search layer over the observation store; sp_search_memory calls the same backend.

Use this when you want to find something the agent did three weeks ago.

Knowledge: Memory graph

Files on an outer ring sized by how often they were touched, sessions on an inner ring, an edge wherever a session changed a file. Navigate the memory by relationship rather than by list. Click a file to read its sessions. Click a session to read the files it changed.
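
A rough sketch of the data behind a view like this, assuming one record per session listing the files it changed. The shapes and the buildMemoryGraph name are hypothetical; the dashboard's real model may differ.

```typescript
// Hypothetical input shape: one record per session with the files it changed.
type SessionRecord = { sessionId: string; filesChanged: string[] };
type Edge = { session: string; file: string };
type MemoryGraph = {
  fileTouches: Map<string, number>; // outer ring: node size per file
  sessions: string[];               // inner ring
  edges: Edge[];                    // one edge per (session, file) pair
};

function buildMemoryGraph(records: SessionRecord[]): MemoryGraph {
  const fileTouches = new Map<string, number>();
  const edges: Edge[] = [];
  for (const rec of records) {
    // Assumption: a file counts once per session, however many edits it took.
    for (const file of new Set(rec.filesChanged)) {
      fileTouches.set(file, (fileTouches.get(file) ?? 0) + 1);
      edges.push({ session: rec.sessionId, file });
    }
  }
  return { fileTouches, sessions: records.map((r) => r.sessionId), edges };
}

const memoryGraph = buildMemoryGraph([
  { sessionId: "s1", filesChanged: ["src/auth.ts", "src/auth.ts"] },
  { sessionId: "s2", filesChanged: ["src/auth.ts", "src/dashboard/insights.ts"] },
]);
console.log(memoryGraph.fileTouches.get("src/auth.ts"), memoryGraph.edges.length); // 2 3
```

The load-bearing files are simply the keys with the largest counts.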

This view rewards staring. After about thirty seconds you start to see which files are the load-bearing ones and which sessions were heavy.

Knowledge: Code map

The new interactive d3 code dependency graph. Files as nodes, imports as edges, force-directed with zoom, pan, drag, search, area colouring and god nodes ringed in white. Click any node to read its imports and importers in a side panel.

Open this on a codebase you do not know well. It tells you which files are central faster than git log ever will.

slipstream graph prints the same data on the CLI. /api/codegraph serves it over HTTP, so Claude can read the structure too.
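
For a feel of the structure, here is a sketch that derives importers and flags highly connected files from a plain import table. It is illustrative only: the input shape, the file names, and the degree-threshold definition of a god node are assumptions, not slipstream's actual rule.

```typescript
// Hypothetical input: file -> files it imports.
type ImportTable = Record<string, string[]>;
type NodeInfo = { file: string; imports: string[]; importers: string[]; degree: number };

function buildCodeGraph(table: ImportTable): NodeInfo[] {
  const importers = new Map<string, string[]>();
  const files = new Set<string>(Object.keys(table));
  for (const [file, deps] of Object.entries(table)) {
    for (const dep of deps) {
      files.add(dep); // a file that is only imported is still a node
      importers.set(dep, [...(importers.get(dep) ?? []), file]);
    }
  }
  return [...files].map((file) => {
    const imports = table[file] ?? [];
    const importedBy = importers.get(file) ?? [];
    return { file, imports, importers: importedBy, degree: imports.length + importedBy.length };
  });
}

// Assumption: a god node is any file whose degree reaches a chosen threshold.
function godNodes(nodes: NodeInfo[], minDegree: number): string[] {
  return nodes.filter((n) => n.degree >= minDegree).map((n) => n.file).sort();
}

const codeGraph = buildCodeGraph({
  "src/auth.ts": ["src/util.ts"],
  "src/session.ts": ["src/util.ts", "src/auth.ts"],
  "src/cli.ts": ["src/session.ts", "src/util.ts"],
});
console.log(godNodes(codeGraph, 3).join(", ")); // src/session.ts, src/util.ts
```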

Cross-tab workflow

The cross-tab agent bus is the v1.0 feature most people will notice first. The mechanics:

Open two Claude Code tabs on the same project.
In Tab A, work on the auth refactor.
In Tab B, work on the dashboard polish.
At the end of each turn, each tab posts its open thread, files in flight, and a one-line status to /.claude/slipstream/bus.jsonl.
At the next prompt or session start, each tab sees the other's status.
The agents stop duplicating work. Tab B knows Tab A is touching the auth file and stays out of its way.

The bus is a local file. No daemon, no message queue, no network. Five tests cover it.

The platform does not allow live mid-turn cross-process messaging, so the coordination is turn-boundary, not real-time. That is enough for the duplication problem in practice.
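
The turn-boundary mechanics can be sketched as two pure functions over the text of a JSONL file. The record shape below is hypothetical (this post does not document the real bus.jsonl schema), and actual file I/O is left out.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
type BusEntry = {
  tabId: string;
  postedAt: number; // epoch milliseconds
  openThread: string;
  filesInFlight: string[];
  statusLine: string;
};

// Turn end: serialise one entry as a single line to append to the bus file.
function toBusLine(entry: BusEntry): string {
  return JSON.stringify(entry) + "\n";
}

function parseBusLine(line: string): BusEntry | undefined {
  try {
    return JSON.parse(line) as BusEntry;
  } catch {
    return undefined; // tolerate a partially written line
  }
}

// Next prompt: keep only the newest entry from every other tab.
function latestFromOtherTabs(busText: string, myTabId: string): BusEntry[] {
  const latest = new Map<string, BusEntry>();
  for (const line of busText.split("\n")) {
    if (line.trim() === "") continue;
    const parsed = parseBusLine(line);
    if (!parsed || parsed.tabId === myTabId) continue;
    const prev = latest.get(parsed.tabId);
    if (!prev || parsed.postedAt >= prev.postedAt) latest.set(parsed.tabId, parsed);
  }
  return [...latest.values()];
}

const busText =
  toBusLine({ tabId: "A", postedAt: 1, openThread: "auth refactor", filesInFlight: ["src/auth.ts"], statusLine: "splitting validation" }) +
  toBusLine({ tabId: "B", postedAt: 2, openThread: "dashboard polish", filesInFlight: [], statusLine: "tuning tiles" }) +
  toBusLine({ tabId: "A", postedAt: 3, openThread: "auth refactor", filesInFlight: ["src/auth.ts"], statusLine: "validation module extracted" });
const seenByB = latestFromOtherTabs(busText, "B");
console.log(seenByB.map((e) => `${e.tabId}: ${e.statusLine}`).join("\n"));
// A: validation module extracted
```

An append-only file read at turn boundaries needs no locking protocol beyond tolerating a half-written last line, which is why the sketch skips unparseable lines rather than failing.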

The brief, the graph, the benchmark

Three commands that pay back immediately.

slipstream brief

slipstream brief

Dumps everything slipstream knows about the current project into one Markdown document: what it is, how it is organised, the architecture table, the durable memory store, lessons, instincts and recent work. Useful to share with a new contributor, to feed a colleague who is helping debug, or to pin in your own notes when you are paused on a project for a few weeks. Also available as the Download brief button on the Overview view and as the /api/brief endpoint.

slipstream graph

slipstream graph

Prints the code dependency graph: every file, its imports and importers, with the god nodes flagged. Pipe it to a file to keep a snapshot. Or open the Code map view in the dashboard for the interactive version.

pnpm benchmark

pnpm benchmark

Runs scripts/benchmark-token-savings.mjs against three representative files in the current project. Measures whole-file reads versus scoped symbol reads. Emits a Markdown table. The number is reproducible from a clean clone.

Run it once on your own project, paste the table into your README. That is more persuasive than any marketing claim.

Recipes

Three common workflows worth knowing.

Resume a project after two weeks away

slipstream brief        // gets you the latest project state
sp_recall "where I left off"   // reads the most recent durable memory
sp_resume               // load the latest compaction digest, if there is one

The brief gives you the lay of the whole project. The recall gives you your last working topic. The resume gives you the agent's working state from the session that ended.

Audit what an agent actually did this morning

// In the dashboard:
History > Daily journal > today > pick a session > Download report

The report is a Markdown document of every prompt and every tool call, with a summary. The honest record. Share it in a PR description if useful.

Run a long refactor without losing state to compaction

// At the start, in Claude Code:
sp_remember "Refactor goal: split auth.ts into validation, session and provider modules."
// During the refactor, when sp_budget reports warn:
sp_digest                     // write a lossless digest
// Or just continue, and Claude Code's PreCompact hook will do it for you.
// On the next session:
sp_resume                     // load the digest
sp_recall "Refactor goal"     // get the goal back

The combination of the durable goal and the compaction digest means even a multi-day refactor across many sessions does not lose state.
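
The recipe hinges on noticing the warn level in time. As a sketch of how a served / target check can map to a level and a recommendation: the 0.8 threshold, the level names other than warn, and the recommendation strings are all assumptions, not slipstream's actual values.

```typescript
type BudgetLevel = "ok" | "warn" | "over";
type BudgetReport = { served: number; target: number; level: BudgetLevel; recommendation: string };

// warnAt is a hypothetical default; the real threshold may differ.
function budgetReport(served: number, target: number, warnAt = 0.8): BudgetReport {
  if (target <= 0) throw new RangeError("target must be positive");
  const ratio = served / target;
  if (ratio >= 1) {
    return { served, target, level: "over", recommendation: "write a digest now" };
  }
  if (ratio >= warnAt) {
    return { served, target, level: "warn", recommendation: "write a digest before the next big read" };
  }
  return { served, target, level: "ok", recommendation: "carry on" };
}

console.log(budgetReport(85000, 100000).level); // warn
```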

The doctors

Two of them.

slipstream doctor

slipstream doctor

Seventeen install checks. Each FAIL prints a one-line fix. The checks that catch the most common pitfalls:

mcp-build: dist/mcp/index.js exists. Fix: pnpm install && pnpm build.
duplicate-registration: slipstream wired via both the plugin and a project .mcp.json, with the plugin actually loaded. Fix: pick one path and remove the other.
double-emit: hooks active AND SLIPSTREAM_MCP_EMIT=1. Fix: unset the env var.
stale-dashboard: a previous dashboard build is still running on the recorded port. Fix: sp_dashboard restarts it cleanly.
hooks-wired: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop all registered. Fix: reinstall the plugin.

Run doctor before you ship anything. It catches the 95% case in 30 seconds.

slipstream memory doctor

slipstream memory doctor

Health check for the memory store. Total, duplicates, stale, by-type breakdown. Exits non-zero if the store needs attention. Wire this into a pre-commit hook if you want to catch memory drift early.
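
To make the report concrete, here is a sketch of a check of this kind. The definitions are assumptions for illustration: a duplicate is an observation whose normalised text has been seen before, and stale means older than a cutoff you choose. The real doctor may define both differently.

```typescript
// Hypothetical observation shape.
type StoredObservation = { id: string; kind: string; text: string; recordedAt: number };
type MemoryHealth = {
  total: number;
  duplicates: number;
  stale: number;
  byKind: Map<string, number>;
  exitCode: 0 | 1;
};

function checkMemory(observations: StoredObservation[], now: number, staleAfterMs: number): MemoryHealth {
  const seenText = new Set<string>();
  const byKind = new Map<string, number>();
  let duplicates = 0;
  let stale = 0;
  for (const obs of observations) {
    const key = obs.text.trim().toLowerCase(); // normalise before comparing
    if (seenText.has(key)) duplicates += 1;
    else seenText.add(key);
    if (now - obs.recordedAt > staleAfterMs) stale += 1;
    byKind.set(obs.kind, (byKind.get(obs.kind) ?? 0) + 1);
  }
  // Non-zero exit when anything needs attention, so a CI step or hook can gate on it.
  const exitCode = duplicates > 0 || stale > 0 ? 1 : 0;
  return { total: observations.length, duplicates, stale, byKind, exitCode };
}

const health = checkMemory(
  [
    { id: "o1", kind: "edit", text: "Split auth.ts", recordedAt: 900 },
    { id: "o2", kind: "edit", text: "split auth.ts ", recordedAt: 950 },
    { id: "o3", kind: "decision", text: "Keep the bus as JSONL", recordedAt: 100 },
  ],
  1000,
  500,
);
console.log(health.total, health.duplicates, health.stale, health.exitCode); // 3 1 1 1
```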

The seven gotchas

Things people hit in the first week.

"The dashboard does not start"

Run slipstream doctor. Look for a dashboard-port FAIL. Either kill the squatting process, or call sp_dashboard, which will restart it on a fresh port (v0.6.1 fixed the version-aware restart, so this is reliable now).

"Observation memory is empty in Cursor"

If you installed v0.7.1 or older, this was a real bug. v0.7.2 fixed it: the four memory-reading sp_ tools now call captureObservations with flushOpen on every read in MCP-only mode. Upgrade to v1.0 and the memory populates the moment you query it.

"The agent reads whole files even though slipstream is installed"

The agent has to be told. The using-slipstream skill that ships in v1.0 plus a per-turn hook reminder do the steering. If they are not active in your editor, add the skill manually: @skill using-slipstream at the start of the session. The discipline is the difference between the theoretical savings and the savings you actually observe.

"Multiple Claude Code tabs are duplicating work"

That used to be normal. Upgrade to v1.0. The cross-tab bus posts each tab's open thread and files in flight at turn end; every other tab sees this at its next prompt. If the bus file is missing or empty, run slipstream doctor to check the install.

"The benchmark number does not match the marketing"

Two reasons. First, the headline is per-read efficiency (around 95%), not end-to-end session efficiency (which depends on how often the agent re-reads, recalls or pulls the wrong file). The benchmark script states this plainly. Second, the numbers in this post come from a specific repo (this one); your repo may have different file shapes. Run pnpm benchmark on your own code and use that number.

"I cannot find the local URL after a restart"

/.claude/slipstream/dashboard.url is written on every dashboard start. Cat it. Any statusline, editor extension or shell script can read it.

"The insights band reads as data not prose"

That is a template bug. Open an issue on the repo with the exact paragraph that read wrong, and which view it came from. The insight generators in src/dashboard/insights.ts are pure functions with deterministic templates; fixing the prose is a one-line PR most of the time.
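
To show why such fixes tend to be small, here is a sketch of a pure, deterministic insight generator. Every name and shape in it is hypothetical; it is not code from src/dashboard/insights.ts.

```typescript
// Hypothetical shapes for illustration.
type JournalStats = { day: string; sessions: number; files: number; topFile: string; driftFlags: number };
type InsightBand = { paragraph: string; bullets: string[] };

// Pluralise in one place so "1 sessions" can never reach the band.
function countOf(n: number, noun: string): string {
  return `${n} ${noun}${n === 1 ? "" : "s"}`;
}

// Pure and deterministic: the same stats always produce the same band.
function journalInsight(stats: JournalStats): InsightBand {
  return {
    paragraph: `On ${stats.day} the agent worked across ${countOf(stats.sessions, "session")} and touched ${countOf(stats.files, "file")}.`,
    bullets: [
      `Most-touched file: ${stats.topFile}.`,
      `Sessions recorded: ${stats.sessions}.`,
      stats.driftFlags === 0 ? "No drift flags raised." : `${countOf(stats.driftFlags, "drift flag")} raised.`,
    ],
  };
}

const band = journalInsight({ day: "Monday", sessions: 1, files: 4, topFile: "src/auth.ts", driftFlags: 0 });
console.log(band.paragraph);
// On Monday the agent worked across 1 session and touched 4 files.
```

With no I/O and no randomness, a prose bug is reproducible from the input alone, and the fix is a change to one template string.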

What to do this afternoon

If you have made it this far and slipstream is installed, do these three things now:

Run pnpm benchmark on your project. Paste the table into your README.
Open the dashboard and click through all nine views once. The Code map is the one most likely to change how you think about your repo.
Set up two Claude Code tabs on the same project and try the cross-tab workflow on a small task. The Agents office strip in Live activity is the most fun half-hour you can spend with this tool.

The product page is at sarmalinux.com/products/slipstream. The repo is at github.com/sarmakska/slipstream^[1]. The setup guide for every editor is on the wiki^[2]. The MCP specification slipstream implements is at modelcontextprotocol.io^[3]. The competitive positioning document is in the repo at STRATEGY.md^[4].

If something in here broke for you, please raise an issue. The closed issues all have detailed resolution comments so you can see the bar. Pull requests are welcome and reviewed in days, not weeks.

That is the working guide. Twenty minutes from install to your first regenerated benchmark number. The discipline starts on session two.

---

A note on this post

The commands, file paths, dashboard views and outputs in this post are real and reproducible on a fresh clone of the slipstream repository at the v1.0 release. Token counts come from the checked-in benchmark script. Citations link to the primary source.
