This is the working guide. If you have read the architecture essay[5] you know what slipstream is and why it exists. This post is about using it day to day: per-editor install, the first session, the sp_ tools, the nine dashboard views, the cross-tab workflow, the brief, the benchmark, the doctors, recipes and gotchas. Twenty minutes from install to your first regenerated benchmark number.
The repo is at github.com/sarmakska/slipstream[1]. v1.0 is the current release. MIT, 127.0.0.1 only, no telemetry, 321 tests, CI green.
Install in the editor you actually use
Three flavours: Claude Code, MCP via slipstream-setup, and the bare hand-edit. Pick one.
Claude Code (recommended)
/plugin install slipstream
That wires the hooks, the MCP server, the skills, the statusline and the dashboard at once. Restart Claude Code, open any project, and the agent has fourteen sp_ tools and the live dashboard auto-starts on first tool use.
Cursor, Windsurf, Antigravity, VS Code, JetBrains
git clone https://github.com/sarmakska/slipstream
cd slipstream
pnpm install && pnpm build
npx slipstream-setup --editor=auto
--editor=auto detects which config to write based on the dotfiles in the current project (.cursor/, .windsurf/, .antigravity/, .vscode/). You can be explicit: --editor=cursor, --editor=windsurf, etc. The command is idempotent. Re-running it never duplicates an entry. It refuses to double-wire when both the plugin and a project .mcp.json register slipstream, so you cannot break your own install.
Restart the editor. The handshake should now advertise fourteen sp_ tools.
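A minimal sketch of how dotfile detection and an idempotent write could fit together. This is not the code in slipstream-setup: detectEditor, registerServer and the check order are assumptions for illustration, and only the mcpServers shape comes from this post.

```typescript
// Hypothetical sketch: not the real slipstream-setup implementation.
type Editor = "cursor" | "windsurf" | "antigravity" | "vscode";

interface ServerEntry { command: string; args: string[]; }
interface McpConfig { mcpServers: Record<string, ServerEntry>; }

// Dotfile directories named in the post, checked in a fixed (assumed) order.
const MARKERS: Array<[string, Editor]> = [
  [".cursor", "cursor"],
  [".windsurf", "windsurf"],
  [".antigravity", "antigravity"],
  [".vscode", "vscode"],
];

// Returns the first editor whose marker directory is present, or null.
function detectEditor(projectEntries: string[]): Editor | null {
  for (let i = 0; i < MARKERS.length; i++) {
    if (projectEntries.indexOf(MARKERS[i][0]) !== -1) return MARKERS[i][1];
  }
  return null;
}

// Idempotent: writing the same entry twice yields the same config.
function registerServer(config: McpConfig, name: string, entry: ServerEntry): McpConfig {
  const servers: Record<string, ServerEntry> = {};
  Object.keys(config.mcpServers).forEach((key) => {
    servers[key] = config.mcpServers[key];
  });
  servers[name] = entry; // overwrite in place, never append a duplicate
  return { mcpServers: servers };
}
```

Keying the entry by server name is what makes the re-run safe in this sketch: a second write replaces the first instead of adding to it.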
Bare hand-edit (last resort)
Open your editor's MCP host config file. Add the slipstream block under mcpServers:
{
  "mcpServers": {
    "slipstream": {
      "command": "node",
      "args": ["/absolute/path/to/slipstream/dist/mcp/index.js"]
    }
  }
}
Restart. The wiki has the exact path per editor[2].
Verify the install
slipstream doctor
slipstream memory doctor
The first runs seventeen install checks (claude-dir, mcp-build, plugin manifest, hooks wired, PreCompact hook, memory dir, observations dir, cli-build, statusline, output style, subagents, plugin-valid, dashboard port and socket, duplicate-registration, double-emit, stale-dashboard) and prints a one-line fix per failure. If everything passes you are good. If anything fails, the fix line is usually enough to clear it.
The second checks the memory store: total observations, duplicates, stale entries, by-type breakdown. Exits non-zero if the store needs attention, so a CI step can gate on it.
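To make the CI gate concrete, here is a rough sketch of a health summary with an exit code. The Observation fields, the duplicate rule (same kind and text) and the staleness cutoff are assumptions, not slipstream's real schema.

```typescript
// Hypothetical sketch of a memory-store health summary; field names and
// thresholds are assumptions, not slipstream's real schema.
interface Observation { id: string; kind: string; text: string; ageDays: number; }

interface HealthReport {
  total: number;
  duplicates: number;
  stale: number;
  byType: Record<string, number>;
  exitCode: number;
}

function checkMemory(observations: Observation[], staleAfterDays: number): HealthReport {
  // Null-prototype objects, so no inherited keys can collide with a kind name.
  const seen: Record<string, boolean> = Object.create(null);
  const byType: Record<string, number> = Object.create(null);
  let duplicates = 0;
  let stale = 0;
  observations.forEach((obs) => {
    const key = obs.kind + "\u0000" + obs.text;
    if (seen[key]) duplicates += 1;
    else seen[key] = true;
    if (obs.ageDays > staleAfterDays) stale += 1;
    byType[obs.kind] = (byType[obs.kind] || 0) + 1;
  });
  // A non-zero exit code lets a CI step fail when the store needs attention.
  const exitCode = duplicates > 0 || stale > 0 ? 1 : 0;
  return { total: observations.length, duplicates, stale, byType, exitCode };
}
```

A CI step would run the real command and fail on a non-zero exit; the function above only illustrates what "needs attention" might mean.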
Your first session
Open the editor on any non-trivial repo. Type a prompt that involves reading code:
Refactor the validation logic in src/auth.ts so the password rule is configurable.
What slipstream does next, in order:
- The SessionStart hook runs. It builds a bounded knowledge feed (what this project is, how it is organised, the most-connected files from the code graph, what was recently asked, what is remembered) and injects it into the agent's context.
- The agent calls sp_map to get the file index. This is roughly 2 KB instead of the 40 KB it would cost to read every file in src/.
- The agent calls sp_symbol src/auth.ts validateUser. This returns the function body (around 600 bytes) instead of the 18 KB whole file.
- The agent writes the change.
- The Stop hook folds the turn into the observation memory, distils a session summary, and posts a status line to the cross-tab bus.
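The "bounded" part of the knowledge feed in the first step can be pictured as a budgeted concatenation. This is a sketch only: buildFeed, the section shape and the skip-if-too-large rule are assumptions, and characters stand in for bytes.

```typescript
// Hypothetical sketch of a size-bounded context feed; not slipstream's actual hook.
interface FeedSection { title: string; body: string; }

// Adds sections in priority order and skips any section that would push the
// feed past maxChars (characters are used here as a stand-in for bytes).
function buildFeed(sections: FeedSection[], maxChars: number): string {
  let feed = "";
  for (let i = 0; i < sections.length; i++) {
    const block = "## " + sections[i].title + "\n" + sections[i].body + "\n";
    if (feed.length + block.length > maxChars) continue;
    feed += block;
  }
  return feed;
}
```

The point of the bound is that the feed's cost is known before the session starts, whatever the size of the project.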
The session ends with one observation written, the file changed, around 5 KB of context spent (versus around 60 KB the default loop would spend).
Now open the dashboard.
sp_dashboard
The agent receives a 127.0.0.1 URL. Click it. Or run slipstream dashboard start from a terminal.
The fourteen sp_ tools, when to call each
| Tool | When | What it returns | Typical bytes |
|---|---|---|---|
| sp_map | First call on a new project or after big refactors | The compact project index: files, exported symbols, one-line purpose | 1.5 to 4 KB |
| sp_symbol | You know the name of the thing you need | The body of one named declaration | 300 to 1500 B |
| sp_lines | You need a specific line range | Only the requested slice | 100 to 800 B |
| sp_search | You know the term, not the file | Ranked file:line locations, no bodies | 400 to 1200 B |
| sp_remember | A durable fact worth keeping across sessions | Confirmation; nothing else | < 100 B |
| sp_recall | Resume a topic from a previous session | Matching memories with their detail | 500 B to 4 KB |
| sp_search_memory | You think slipstream has seen this before | Cheap ranked summaries of relevant observations | 200 B to 2 KB |
| sp_observations | You want full bodies for ids you have filtered down | Full observation detail by id | 500 B to 8 KB |
| sp_timeline | You want the said-and-done of one session | Ordered turns with the tools called and the files touched | 1 to 6 KB |
| sp_budget | Before a big read, or near the warn threshold | served / target tokens, level, recommendation | < 300 B |
| sp_digest | Cross-editor compaction stand-in (no PreCompact) | Path of the written digest plus token estimate | < 200 B |
| sp_resume | Start of a new session in an MCP-only editor | The latest digest body for reload | 500 B to 4 KB |
| sp_dashboard | You want the live URL | 127.0.0.1 URL, and whether it just started | < 100 B |
| sp_savings | You want the optimisation tally | savedTokens, scopedReads, pct, dollar cost | < 200 B |
[Chart: the four most common reads. Source: pnpm benchmark on real files]
That chart shows the four most common reads. Pick the smallest one that answers the question. sp_search first when you do not know the file. sp_symbol when you know the name. sp_lines when you need an exact range. sp_map once per session to orient. Read only when you genuinely need the whole file.
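The selection rule above can be written down as a small decision function. It is an illustration of the rule, not part of slipstream: the Need fields and the order of the checks are assumptions, and the once-per-session sp_map call is left out because it is orientation rather than a per-question choice.

```typescript
// Sketch of the "smallest read that answers the question" rule from the text.
interface Need {
  knowsFile: boolean;
  knowsSymbol: boolean;
  knowsLineRange: boolean;
  needsWholeFile: boolean;
}

type ReadTool = "Read" | "sp_symbol" | "sp_lines" | "sp_search";

function pickTool(need: Need): ReadTool {
  if (need.needsWholeFile) return "Read"; // only when the whole file is genuinely needed
  if (need.knowsSymbol) return "sp_symbol"; // you know the name
  if (need.knowsFile && need.knowsLineRange) return "sp_lines"; // you need an exact range
  return "sp_search"; // you know the term, not the file
}
```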
A small example: change a single function
sp_search "validateUser" // returns src/auth.ts:42
sp_symbol "src/auth.ts" "validateUser" // returns the 612-byte function body
// agent writes the change with Edit
Total context: about 1.2 KB. Cost of doing the same with whole-file reads: about 22 KB. The ratio holds across most realistic edits.
A small example: orient on a fresh project
sp_map // 2.3 KB index of files, exports, purpose
sp_recall "auth flow" // 1.5 KB of relevant durable memories
sp_search_memory "validator" // 800 B of past observations on this term
Total context to be oriented and ready: under 5 KB. The default loop reads ten files (around 60 KB) to do the same orientation. The chart in the architecture essay shows this exact comparison.
The nine dashboard views, view by view
Sidebar groups into Now, History and Knowledge. Walk through each.
Now: Overview
The plain-English answer to "what is this project?" It shows:
- A one-paragraph project summary, downloadable as Markdown via slipstream brief or the Download brief button.
- Headline KPIs: sessions, observations, files touched, opt %, durable memories, drift flags.
- The dollar cost of tokens saved at the assumed per-million rate (which is stated so the number is honest).
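The dollar figure is simple arithmetic on the saved-token count. A sketch, with the rate passed in explicitly; the function names are hypothetical and the 3.00 rate in the usage note is only an example, not the rate slipstream assumes.

```typescript
// Converts saved tokens to dollars at an assumed per-million-token rate.
function dollarsSaved(savedTokens: number, usdPerMillionTokens: number): number {
  return (savedTokens / 1000000) * usdPerMillionTokens;
}

// Formats the figure together with the rate so the assumption stays visible.
function formatSaving(savedTokens: number, usdPerMillionTokens: number): string {
  const usd = dollarsSaved(savedTokens, usdPerMillionTokens);
  return "$" + usd.toFixed(2) + " saved at an assumed $" +
    usdPerMillionTokens.toFixed(2) + " per million tokens";
}
```

For example, formatSaving(500000, 3) gives "$1.50 saved at an assumed $3.00 per million tokens". Printing the rate next to the total is what keeps the number checkable.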
Open this once a day to get the lay of the land.
Now: Live activity
The current session in real time. Six KPI tiles with sparklines. Filterable timeline. Plan. Inline SVG mind map. Per-skill activity panel. In-place token budget editor.
Every tab opens with an insights band: one natural-language paragraph plus three to five bullets that describe the view rather than only tabulate it. Read the band first, drill into the panels second.
If you have multiple Claude Code tabs open, the Agents office strip shows each tab as a character at a desk, with the task it is currently doing and the files in flight. Watching the whole team at once is genuinely useful.
Now: Said and done
The said-to-did timeline of every prompt and every tool call, grouped by turn. The honest record of what the agent actually did versus what you asked.
Use this when you suspect the agent did something it should not have, or when you want to confirm it did the thing you asked.
History: Daily journal
Per-day digest. Six tiles (observations, sessions, files, drift, tools, skills). Top files for the day. Tools used as colour-coded pills. The list of sessions involved, with prev / today / next navigation.
Open this on Monday morning to see what last week looked like. Click any day on the Project stats tab's heatmap to land here pre-filtered.
History: Sessions
Clickable table of every recorded session. Click a row, the session opens: prompts, tool calls, files touched, exchanges, failures, the full timeline. Plus a Download report link that emits a Markdown document, the said-to-did story plus a summary. The honest version of team sharing.
The Sessions tab used to feel empty because the rows were not clickable. v0.31 fixed that.
Knowledge: Project stats
Six project-wide KPIs at the top: sessions, observations, unique files, opt %, memories, drift. Then the 365-day GitHub-style activity heatmap clickable per day, the file leaderboard with violet-gradient bars, the inline-SVG donut for observations by kind, and the distilled lessons grid.
Open this to see patterns. Files you keep returning to are the candidates for refactoring. The lessons grid surfaces topics that have recurred enough across sessions to be worth remembering as an explicit rule.
Knowledge: Memory
Full-project observation search with kind filter chips (edit, plan, decision, search, map, error, run) and colour-coded result badges. Click any hit to expand the full detail. This is the search layer over the observation store; sp_search_memory calls the same backend.
Use this when you want to find something the agent did three weeks ago.
Knowledge: Memory graph
Files on an outer ring sized by how often they were touched, sessions on an inner ring, an edge wherever a session changed a file. Navigate the memory by relationship rather than by list. Click a file to read its sessions. Click a session to read the files it changed.
This view rewards staring. After about thirty seconds you start to see which files are the load-bearing ones and which sessions were heavy.
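The data behind a view like this is small. A sketch, assuming one record per session with the files it changed; SessionRecord and MemoryGraph are hypothetical shapes, not the dashboard's real model.

```typescript
// Hypothetical input shape: one record per session with the files it changed.
interface SessionRecord { sessionId: string; files: string[]; }

interface MemoryGraph {
  fileWeights: Record<string, number>; // outer ring: touch count per file
  sessions: string[]; // inner ring
  edges: Array<{ session: string; file: string }>; // session changed file
}

function buildMemoryGraph(records: SessionRecord[]): MemoryGraph {
  const fileWeights: Record<string, number> = Object.create(null);
  const edges: Array<{ session: string; file: string }> = [];
  records.forEach((rec) => {
    rec.files.forEach((file) => {
      fileWeights[file] = (fileWeights[file] || 0) + 1;
      edges.push({ session: rec.sessionId, file: file });
    });
  });
  return { fileWeights, sessions: records.map((r) => r.sessionId), edges };
}
```

In this model the load-bearing files are the ones with the highest weights, and the heavy sessions are the ones with the most outgoing edges.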
Knowledge: Code map
The new interactive d3 code dependency graph. Files as nodes, imports as edges, force-directed with zoom, pan, drag, search, area colouring and god nodes ringed in white. Click any node to read its imports and importers in a side panel.
Open this on a codebase you do not know well. It tells you which files are central faster than git log ever will.
slipstream graph prints the same data on the CLI. /api/codegraph serves it over HTTP, so Claude can read the structure too.
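If you post-process the graph yourself, importers and god nodes fall out of the imports map. A sketch under stated assumptions: the ImportMap shape is hypothetical (it is not the documented /api/codegraph response), and a god node is taken to be any file with at least minDegree connections, since the post does not give the real threshold.

```typescript
// Hypothetical sketch: derive importers from an imports map and flag god nodes.
type ImportMap = Record<string, string[]>; // file -> files it imports

function invertImports(imports: ImportMap): Record<string, string[]> {
  const importers: Record<string, string[]> = Object.create(null);
  Object.keys(imports).forEach((file) => {
    if (!importers[file]) importers[file] = [];
    imports[file].forEach((dep) => {
      if (!importers[dep]) importers[dep] = [];
      importers[dep].push(file);
    });
  });
  return importers;
}

// Degree here is imports plus importers; minDegree is an assumed threshold.
function godNodes(imports: ImportMap, minDegree: number): string[] {
  const importers = invertImports(imports);
  return Object.keys(importers)
    .filter((file) => (imports[file] || []).length + importers[file].length >= minDegree)
    .sort();
}
```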
Cross-tab workflow
The cross-tab agent bus is the v1.0 feature most people will notice first. The mechanics:
- Open two Claude Code tabs on the same project.
- In Tab A, work on the auth refactor.
- In Tab B, work on the dashboard polish.
- At the end of each turn, each tab posts its open thread, files in flight, and a one-line status to ./.claude/slipstream/bus.jsonl
- At the next prompt or session start, each tab sees the other's status.
- The agents stop duplicating work. Tab B knows Tab A is touching the auth file and stays out of its way.
The bus is a local file. No daemon, no message queue, no network. Five tests cover it.
The platform does not allow live mid-turn cross-process messaging, so the coordination is turn-boundary, not real-time. That is enough for the duplication problem in practice.
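Turn-boundary coordination over a JSONL file needs very little machinery. The sketch below works on strings so it stays independent of how the file is read or appended; the BusEntry fields are assumptions, because the post does not give the real bus.jsonl schema.

```typescript
// Hypothetical bus entry; the real bus.jsonl schema is not given in the post.
interface BusEntry {
  tabId: string;
  turn: number;
  openThread: string;
  filesInFlight: string[];
  status: string;
}

// One JSON object per line, appended at the end of a turn.
function encodeEntry(entry: BusEntry): string {
  return JSON.stringify(entry) + "\n";
}

// Latest entry per tab, excluding the reader's own tab. Malformed lines are skipped.
function readOthers(jsonl: string, selfTabId: string): BusEntry[] {
  const latest: Record<string, BusEntry> = Object.create(null);
  jsonl.split("\n").forEach((line) => {
    if (line.trim() === "") return;
    try {
      const entry = JSON.parse(line) as BusEntry;
      if (entry.tabId !== selfTabId) latest[entry.tabId] = entry; // later lines win
    } catch (err) {
      // ignore a partially written or corrupt line
    }
  });
  return Object.keys(latest).map((tabId) => latest[tabId]);
}

// Files another tab has in flight that this tab is about to touch.
function conflicts(others: BusEntry[], plannedFiles: string[]): string[] {
  const busy: string[] = [];
  others.forEach((entry) => {
    entry.filesInFlight.forEach((file) => {
      if (plannedFiles.indexOf(file) !== -1 && busy.indexOf(file) === -1) busy.push(file);
    });
  });
  return busy;
}
```

An append-only line format is a reasonable fit for this job: each writer only adds whole lines, and a reader that meets a half-written line can skip it and catch up on the next turn.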
The brief, the graph, the benchmark
Three commands that pay back immediately.
slipstream brief
slipstream brief
Dumps everything slipstream knows about the current project into one Markdown document: what it is, how it is organised, the architecture table, the durable memory store, lessons, instincts and recent work. Useful to share with a new contributor, to feed a colleague who is helping debug, or to pin in your own notes when you are paused on a project for a few weeks. Also available as the Download brief button on the Overview view and as the /api/brief endpoint.
slipstream graph
slipstream graph
Prints the code dependency graph: every file, its imports and importers, with the god nodes flagged. Pipe it to a file to keep a snapshot. Or open the Code map view in the dashboard for the interactive version.
pnpm benchmark
pnpm benchmark
Runs scripts/benchmark-token-savings.mjs against three representative files in the current project. Measures whole-file reads versus scoped symbol reads. Emits a Markdown table. The number is reproducible from a clean clone.
Run it once on your own project, paste the table into your README. That is more persuasive than any marketing claim.
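For a sense of what goes into such a table, here is a sketch that turns byte measurements into Markdown. It is not scripts/benchmark-token-savings.mjs: the Measurement shape and the column names are assumptions, and the real script's output may differ.

```typescript
// Hypothetical measurement row; the real script's output columns may differ.
interface Measurement { file: string; wholeFileBytes: number; scopedBytes: number; }

// Percentage of bytes avoided by the scoped read, rounded to one decimal place.
function savedPct(m: Measurement): number {
  return Math.round((1 - m.scopedBytes / m.wholeFileBytes) * 1000) / 10;
}

function toMarkdownTable(rows: Measurement[]): string {
  const lines = [
    "| File | Whole file (B) | Scoped read (B) | Saved |",
    "|---|---|---|---|",
  ];
  rows.forEach((m) => {
    lines.push("| " + m.file + " | " + m.wholeFileBytes + " | " + m.scopedBytes +
      " | " + savedPct(m).toFixed(1) + "% |");
  });
  return lines.join("\n");
}
```

With the figures from the first-session example (an 18 KB file and a 600-byte function body), the row works out to 96.7% saved.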
Recipes
Three common workflows worth knowing.
Resume a project after two weeks away
slipstream brief // gets you the latest project state
sp_recall "where I left off" // reads the most recent durable memory
sp_resume // load the latest compaction digest, if there is one
The brief gives you the full project lay. The recall gives you your last working topic. The resume gives you the agent's working state from the session that ended.
Audit what an agent actually did this morning
// In the dashboard:
History > Daily journal > today > pick a session > Download report
The report is a Markdown document of every prompt and every tool call, with a summary. The honest record. Share it in a PR description if useful.
Run a long refactor without losing state to compaction
// At the start, in Claude Code:
sp_remember "Refactor goal: split auth.ts into validation, session and provider modules."
// During the refactor, when sp_budget reports warn:
sp_digest // write a lossless digest
// Or just continue, and Claude Code's PreCompact hook will do it for you.
// On the next session:
sp_resume // load the digest
sp_recall "Refactor goal" // get the goal back
The combination of the durable goal and the compaction digest means even a multi-day refactor across many sessions does not lose state.
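The "when sp_budget reports warn" step depends on a level derived from served versus target tokens. A sketch of that logic, with loud caveats: only the warn level is named in this post, so the ok and over levels, the warnAt fraction and the recommendation wording are all assumptions.

```typescript
// Hypothetical levels: only "warn" is named in the post.
type BudgetLevel = "ok" | "warn" | "over";

interface BudgetReport {
  served: number;
  target: number;
  level: BudgetLevel;
  recommendation: string;
}

// warnAt is an assumed fraction of the target (for example 0.8).
function budgetReport(served: number, target: number, warnAt: number): BudgetReport {
  const ratio = served / target;
  let level: BudgetLevel = "ok";
  let recommendation = "Continue with scoped reads.";
  if (ratio >= 1) {
    level = "over";
    recommendation = "Write a digest with sp_digest and start a new session.";
  } else if (ratio >= warnAt) {
    level = "warn";
    recommendation = "Write a digest with sp_digest before the next big read.";
  }
  return { served, target, level, recommendation };
}
```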
The doctors
Two of them.
slipstream doctor
slipstream doctor
Seventeen install checks. Each FAIL prints a one-line fix. The checks that catch the most common pitfalls:
- mcp-build: dist/mcp/index.js exists. Fix: pnpm install && pnpm build.
- duplicate-registration: slipstream wired via both the plugin and a project .mcp.json, with the plugin actually loaded. Fix: pick one path and remove the other.
- double-emit: hooks active AND SLIPSTREAM_MCP_EMIT=1. Fix: unset the env var.
- stale-dashboard: a previous dashboard build is still running on the recorded port. Fix: sp_dashboard restarts it cleanly.
- hooks-wired: SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop all registered. Fix: reinstall the plugin.
Run doctor before you ship anything. It catches the 95% case in 30 seconds.
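Two of those checks are simple enough to express as predicates, which shows why each failure can carry a one-line fix. This is a sketch, not the doctor's code: InstallState and its fields are hypothetical, while the conditions and fix lines follow the descriptions above.

```typescript
// Hypothetical sketch of two doctor checks; input fields are assumptions.
interface InstallState {
  hooksActive: boolean;
  mcpEmitEnv: string | undefined; // value of SLIPSTREAM_MCP_EMIT, if set
  pluginLoaded: boolean;
  projectMcpJsonRegisters: boolean;
}

interface CheckResult { name: string; ok: boolean; fix: string; }

// Fails when hooks are active and SLIPSTREAM_MCP_EMIT=1 at the same time.
function checkDoubleEmit(state: InstallState): CheckResult {
  const ok = !(state.hooksActive && state.mcpEmitEnv === "1");
  return { name: "double-emit", ok, fix: ok ? "" : "unset SLIPSTREAM_MCP_EMIT" };
}

// Fails when the loaded plugin and a project .mcp.json both register slipstream.
function checkDuplicateRegistration(state: InstallState): CheckResult {
  const ok = !(state.pluginLoaded && state.projectMcpJsonRegisters);
  return { name: "duplicate-registration", ok, fix: ok ? "" : "pick one path and remove the other" };
}

// Returns one line per failure so the caller decides how to report it.
function failures(results: CheckResult[]): string[] {
  return results.filter((r) => !r.ok).map((r) => "FAIL " + r.name + ": " + r.fix);
}
```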
slipstream memory doctor
slipstream memory doctor
Health check for the memory store. Total, duplicates, stale, by-type breakdown. Exits non-zero if the store needs attention. Wire this into a pre-commit hook if you want to catch memory drift early.
The seven gotchas
Things people hit in the first week.
"The dashboard does not start"
Run slipstream doctor. Look for dashboard-port FAIL. Either kill the squatting process, or call sp_dashboard which will restart it on a fresh port (v0.6.1 fixed the version-aware restart so this is reliable now).
"Observation memory is empty in Cursor"
If you installed v0.7.1 or older, this was a real bug. v0.7.2 fixed it: the four memory-reading sp_ tools now call captureObservations with flushOpen on every read in MCP-only mode. Upgrade to v1.0 and the memory populates the moment you query it.
"The agent reads whole files even though slipstream is installed"
The agent has to be told. The using-slipstream skill that ships in v1.0 plus a per-turn hook reminder do the steering. If they are not active in your editor, add the skill manually: @skill using-slipstream at the start of the session. The discipline is the difference between the theoretical savings and the savings you actually observe.
"Multiple Claude Code tabs are duplicating work"
That used to be normal. Upgrade to v1.0. The cross-tab bus posts each tab's open thread and files in flight at turn end; every other tab sees this at its next prompt. If the bus file is missing or empty, run slipstream doctor to check the install.
"The benchmark number does not match the marketing"
Two reasons. First, the headline is per-read efficiency (around 95%), not end-to-end session efficiency (which depends on how often the agent re-reads, recalls or pulls the wrong file). The benchmark script states this plainly. Second, the numbers in this post come from a specific repo (this one); your repo may have different file shapes. Run pnpm benchmark on your own code and use that number.
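One way to see the gap between the two numbers is to compute both. The sketch below is illustrative only: the baseline definition (what whole-file reads of the needed files would cost once) and the sample session in the usage note are assumptions, not output from the benchmark script.

```typescript
// Rounds a fraction to a percentage with one decimal place.
function pct(fraction: number): number {
  return Math.round(fraction * 1000) / 10;
}

// Per-read efficiency: one scoped read against the whole file it replaces.
function perReadEfficiency(scopedBytes: number, wholeFileBytes: number): number {
  return pct(1 - scopedBytes / wholeFileBytes);
}

// Session efficiency: everything actually served (re-reads and wrong-file
// pulls included) against an assumed whole-file baseline for the same task.
function sessionEfficiency(servedBytesPerRead: number[], baselineBytes: number): number {
  const served = servedBytesPerRead.reduce((sum, bytes) => sum + bytes, 0);
  return pct(1 - served / baselineBytes);
}
```

Take a 600-byte symbol read against an 18 KB file: perReadEfficiency(600, 18000) is 96.7. Now suppose the session reads that symbol three times, calls sp_map once (2000 bytes) and pulls one wrong 18 KB file, against a 60 KB baseline: sessionEfficiency([600, 600, 600, 2000, 18000], 60000) is 63.7. Same tool, same repo, very different number.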
"I cannot find the local URL after a restart"
The dashboard URL is written to a file on every dashboard start. Cat it. Any statusline, editor extension or shell script can read it.
"The insights band reads as data not prose"
That is a template bug. Open an issue on the repo with the exact paragraph that read wrong, and which view it came from. The insight generators in src/dashboard/insights.ts are pure functions with deterministic templates; fixing the prose is a one-line PR most of the time.
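To show why such a fix tends to be small, here is what a pure, template-driven insight generator might look like. It is a hypothetical sketch, not the code in src/dashboard/insights.ts: DayStats, dailyInsight and the wording are all invented for illustration.

```typescript
// Hypothetical sketch of a deterministic insight template.
interface DayStats { sessions: number; observations: number; topFile: string; driftFlags: number; }

interface Insight { paragraph: string; bullets: string[]; }

// Handles the singular case so counts read as prose rather than data.
function plural(count: number, noun: string): string {
  return count + " " + noun + (count === 1 ? "" : "s");
}

// Pure: the same stats always produce the same text.
function dailyInsight(stats: DayStats): Insight {
  const paragraph =
    "Today saw " + plural(stats.sessions, "session") + " and " +
    plural(stats.observations, "observation") +
    ", with most activity in " + stats.topFile + ".";
  const bullets = [
    "Sessions recorded: " + stats.sessions,
    "Most-touched file: " + stats.topFile,
    stats.driftFlags === 0 ? "No drift flagged" : plural(stats.driftFlags, "drift flag") + " to review",
  ];
  return { paragraph, bullets };
}
```

Because the output depends only on the input, a prose bug reproduces from the stats alone, and the fix is usually an edit to one template string.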
What to do this afternoon
If you have made it this far and slipstream is installed, do these three things now:
- Run pnpm benchmark on your project. Paste the table into your README.
- Open the dashboard and click through all nine views once. The Code map is the one most likely to change how you think about your repo.
- Set up two Claude Code tabs on the same project and try the cross-tab workflow on a small task. The Agents office view is the most fun half-hour you can spend with this tool.
The product page is at sarmalinux.com/products/slipstream. The repo is at github.com/sarmakska/slipstream[1]. The setup guide for every editor is on the wiki[2]. The MCP specification slipstream implements is at modelcontextprotocol.io[3]. The competitive positioning document is in the repo at STRATEGY.md[4].
If something in here broke for you, please raise an issue. The closed issues all have detailed resolution comments so you can see the bar. Pull requests are welcome and reviewed in days, not weeks.
That is the working guide. Twenty minutes from install to your first regenerated benchmark number. The discipline starts on session two.
---
A note on this post
The commands, file paths, dashboard views and outputs in this post are real and reproducible on a fresh clone of the slipstream repository at the v1.0 release. Token counts come from the checked-in benchmark script. Citations link to the primary source.