Substrate: A Local-First Knowledge Layer for AI Agents

The core insight: Knowledge that survives across agent sessions is filesystem state, not vector state. Embeddings hide retrieval; markdown on disk doesn't. Substrate is what happens when you take that constraint seriously and design the smallest layer that satisfies it.

The problem, stated precisely

Coding agents — Claude Code, Cursor, Zed, Codex — are good at the next 30 minutes. They're bad at the next 30 days. Three structural reasons, none of which are fixed by a bigger model:

Symptom	Why it happens
Session compacts, reasoning chain lost	Conversation memory is volatile; nothing pins decisions to disk
CLAUDE.md grows past 200 lines, becomes a junk drawer	One flat file, no dates, tags, or provenance
Vector recall returns "relevant" chunks you can't audit	Embedding ranking is opaque; you can't force a specific note in
Wikis (Notion, Obsidian) don't expose a programmatic API agents can use	Human-shaped retrieval — browse, click, scroll

The common thread: the storage layer doesn't match the agent's retrieval model. Substrate fixes that by collapsing the layer into a directory of markdown files plus an MCP server.

Architecture

Loading diagram...

Three boundaries:

Humans write through the CLI. Every add/edit triggers an editor session and an automatic git commit.
Agents read through the MCP server. Five tools, stdio transport, no network surface.
The filesystem is the contract. No daemon, no database, no embeddings index. Both the CLI and the MCP server are thin adapters over ~/.substrate/bundles/.

Anatomy of a bundle

markdown

---
id: 2026-05-24-deploy-staging-landmines
created: 2026-05-24T19:30:00+05:30
tags: [deployment, fly, gotcha]
context_refs: []
---

# What broke

asyncpg refused TLS with `?ssl=disable` — needed `?sslmode=disable`.
Fly.io internal DNS requires the `.internal` suffix, not the public hostname.

The format is intentionally trivial: YAML frontmatter for indexable metadata, markdown for the body. Bundles sit under ~/.substrate/bundles/YYYY-MM-DD/<slug>.md. There is no kind field, no schema beyond id/created/tags/context_refs. Semantics live in the body and the tag set, not in the type system.

The retrieval model

Loading diagram...

The retrieval is exact, not statistical. get_bundle("2026-05-12-prompt-run-migration") always returns that bundle. search_bundles("migration") is substring + tag match — no ranking, no embeddings, no surprises. When an agent fetches a bundle, you can read the same bytes it received.

Four usage patterns that paid off

I've been running substrate against my main repo for two weeks. Four patterns earned their keep; everything else was scaffolding:

Cross-session handoffs. Write a bundle at session end with what was decided and what's half-done. Next session, the agent calls get_by_date and resumes. Compaction stops costing reasoning.
Pre-authored prompts. Author a detailed playbook (migrations, releases, PR review rubric) once. Tag it prompt,playbook. When the user mentions it, the agent fetches and follows verbatim. Your calm self leaves instructions for your rushed self's agent.
Shared conventions across parallel agents. Write the API error envelope once. Three agents working on three branches all converge on the same shape because they all search the same convention bundle.
Debugging breadcrumbs. When a three-hour bug gets resolved, write the root cause. Six days later, the agent finds it in 80ms instead of re-spelunking.

If your workload doesn't hit at least one of these patterns three times a week, substrate is overkill and a folder of markdown is fine.

How agents discover bundles

Loading diagram...

This decision tree lives in 12 lines of agent instructions inside CLAUDE.md (or AGENTS.md, or your host's equivalent). The agent doesn't need to be retrained; it just needs to know when to reach for the tool.

Design invariants

Five non-negotiables, derived from a year of watching agent-memory tools accumulate accidental complexity:

Files on disk are source-of-truth. Every index, cache, and dashboard is disposable and rebuildable from the markdown.
Git is the history. No custom audit log, no version field — git log is the trail.
Boring formats. Markdown + YAML frontmatter. No proprietary schema, no vendor lock-in.
Falsifiable retrieval. usage.log records every fetch so unused bundles surface as deletion candidates.
MCP-native, not MCP-only. The CLI works offline, in scripts, in CI. MCP is one adapter, not the only one.

The implementation is small enough to audit in an afternoon: ~600 lines of Python for the CLI, ~400 for the MCP server, ~600 for the dashboard renderer. No services, no migrations, no upgrade paths that can break your bundles.

Roadmap

Launched as v0.2.0; current release lives on PyPI as substrate-kb. Source: github.com/xlreon/substrate.

Work I'd take PRs on first:

Windows clipboard support (current implementation shells out to pbcopy)
Confirmed MCP configs for Windsurf, Cody, Claude Desktop
Community-curated bundle packs — deployment checklists, code-review rubrics, ADR templates
First-class per-project stores alongside the global one

Try it

bash

uv tool install substrate-kb
substrate init
claude mcp add substrate -- substrate-mcp
substrate add "my first handoff" --tag handoff

Four commands. The next time your session compacts, the next session won't start from zero.

If it earns its keep, star the repo.

The problem, stated precisely

Architecture

Anatomy of a bundle

The retrieval model

Four usage patterns that paid off

How agents discover bundles

Design invariants

Roadmap

Try it

Sidharth Satapathy