The PM Wiki: Why I Built an LLM-Maintained Knowledge Base for Product Managers

In April, Andrej Karpathy published a 75-line idea file he called the "LLM Wiki." Thousands of engineers forked it. Nobody built it for product managers. So I did — and open-sourced it.

The pattern is simple to state: instead of asking AI to re-retrieve your documents on every question, an AI agent maintains a knowledge wiki — integrating every new source, updating cross-references, flagging contradictions. Knowledge compounds instead of evaporating. Reading it, I kept thinking the same thing: this describes a PM's job better than an engineer's.

The loop every PM runs

A customer says something in a call. A metric moves. A stakeholder takes a position. A competitor ships. Six weeks later you're writing a PRD or defending a roadmap, and you're reconstructing all of it from memory, Slack scrollback, and half-remembered decks.

Three specific failures hide inside that loop:

1. Nothing compounds. Every document starts from scratch. The synthesis you did for last quarter's review is buried in a deck nobody will open again.
2. Evidence and claims get separated. "Users want X" survives; which users, how many, and what they actually said doesn't. PRDs become plausible-sounding instead of defensible.
3. Decision amnesia. "Why did we choose X over Y?" gets asked forever, and the rationale (the options weighed, the evidence, what would change our mind) lives nowhere.

We already know the two standard fixes, and we know why they fail. Traditional wikis — Confluence, Notion — die because maintenance grows faster than value. RAG-style "chat with your docs" tools don't accumulate; they re-retrieve fragments per question. The fix is an agent that maintains the knowledge base — because for an LLM, the cost of updating fifteen cross-referenced pages is near zero.

The PM version: four things, tracked forever

The engineering forks of Karpathy's idea track codebases and papers. The PM version needed its own entity model. I kept it deliberately minimal: four entity types, each with a job.

Problem pages. One per pain point, stated in the user's language — who has it, severity, a running mention count, and the evidence chain: every source that supports (or contradicts) it. Every claim links to who said it, when, and in what words.
A decision log. Context, options considered with the evidence for each, rationale, decider — and reversal conditions: the observable facts that would reopen the decision. Never deleted; superseded decisions link forward.
An assumption register. Every load-bearing belief, with a status — untested / validated / weakening / invalidated — and a dated history of every status flip with the source that caused it. This is your risk register.
An open-questions queue. Every contradiction and evidence gap the weekly health check finds becomes a tagged research question: ask users, check data, ask a stakeholder, run an experiment. Your next interview script generates itself from this queue.

Personas, competitors, metrics, and bets emerge later — from evidence, not from an empty template. That restraint is the point: the schema stays small enough that the agent can hold the whole system honest.
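The PM Wiki itself is a rulebook for an agent, not a program, so the following is only an illustrative sketch of how the four entity types relate to each other. Every class and field name here is an assumption made for the example, not the repo's actual schema, and counting a "mention" as one evidence entry is likewise a simplification.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    UNTESTED = "untested"
    VALIDATED = "validated"
    WEAKENING = "weakening"
    INVALIDATED = "invalidated"


@dataclass
class Evidence:
    source: str            # hypothetical source id, e.g. a transcript filename
    speaker: str           # who said it
    said_on: date          # when
    quote: str             # in what words
    supports: bool = True  # False when the source contradicts the claim


@dataclass
class Problem:
    statement: str         # stated in the user's language
    who: str
    severity: str
    evidence: list[Evidence] = field(default_factory=list)

    @property
    def mention_count(self) -> int:
        # Assumption: every evidence entry counts as one mention.
        return len(self.evidence)


@dataclass
class Decision:
    context: str
    options: dict[str, list[Evidence]]  # option name -> evidence for it
    rationale: str
    decider: str
    reversal_conditions: list[str]      # observable facts that reopen it
    superseded_by: Optional["Decision"] = None  # never deleted, linked forward


@dataclass
class Assumption:
    belief: str
    status: Status = Status.UNTESTED
    # Dated history of status flips: (when, new status, source that caused it)
    history: list[tuple[date, Status, str]] = field(default_factory=list)


@dataclass
class OpenQuestion:
    question: str
    tag: str  # "ask users", "check data", "ask a stakeholder", "run an experiment"
```

In practice these would be wiki pages the agent edits rather than Python objects; the sketch only makes the links between claims, evidence, and status history explicit.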

Deliverables become queries

The division of labor is strict. Sources — interview transcripts, feedback digests, analytics readouts, meeting notes, competitor intel — are immutable; you curate what goes in. The wiki is the compounding artifact the agent maintains: every ingest updates cross-references, bumps evidence counts, and flags contradictions, so Thursday's synthesis already includes Tuesday's interview. Outputs are generated on demand.
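To make the ingest step concrete, here is a minimal sketch of what "bumps evidence counts and flags contradictions" could look like if it were written as code. It assumes the agent has already extracted (problem statement, supports?) pairs from a source; `ProblemPage`, `ingest`, and the source ids are hypothetical names, and the source itself is never modified.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemPage:
    statement: str
    supporting: list[str] = field(default_factory=list)     # source ids
    contradicting: list[str] = field(default_factory=list)  # source ids

    @property
    def mention_count(self) -> int:
        return len(self.supporting) + len(self.contradicting)


def ingest(wiki: dict[str, ProblemPage], source_id: str,
           claims: list[tuple[str, bool]]) -> list[str]:
    """Fold one source into the wiki and return any contradiction flags."""
    flags = []
    for statement, supports in claims:
        # Create the page on first mention, otherwise update it in place.
        page = wiki.setdefault(statement, ProblemPage(statement))
        (page.supporting if supports else page.contradicting).append(source_id)
        # A page with evidence on both sides is flagged for review.
        if page.supporting and page.contradicting:
            flags.append(f"Contradiction on '{statement}': "
                         f"{page.supporting} vs {page.contradicting}")
    return flags


# Usage with made-up sources: Tuesday's interview, then Thursday's readout.
wiki: dict[str, ProblemPage] = {}
first_flags = ingest(wiki, "interview-tue", [("Onboarding is too long", True)])
second_flags = ingest(wiki, "analytics-thu", [("Onboarding is too long", False)])
```

After the second call the page carries both sources, which is the sense in which Thursday's synthesis already includes Tuesday's interview.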

And that last layer is where the payoff lands. A PRD or stakeholder brief that used to take days of re-gathering context is generated in minutes from evidence that's already synthesized, cross-referenced, and cited. "Why did we choose X over Y?" is answered in seconds, forever, with the original options, evidence, and reversal conditions attached.
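As a rough illustration of "deliverables as queries", the sketch below answers the "Why did we choose X over Y?" question by looking it up in a decision log instead of rewriting it from memory. The `Decision` fields, the `explain` function, and the sample entry are all invented for the example; in the real system the agent reads wiki pages rather than calling a function.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    title: str
    chosen: str
    options: dict[str, list[str]]  # option name -> cited evidence
    rationale: str
    reversal_conditions: list[str]


def explain(log: list[Decision], chosen: str, rejected: str) -> str:
    """Answer 'Why did we choose X over Y?' from the decision log."""
    for d in log:
        if d.chosen == chosen and rejected in d.options:
            lines = [f"Decision: {d.title}", f"Rationale: {d.rationale}"]
            for option, evidence in d.options.items():
                lines.append(f"Option {option}: " + "; ".join(evidence))
            lines.append("Reopen if: " + "; ".join(d.reversal_conditions))
            return "\n".join(lines)
    return "No logged decision matches."


# Made-up entry, loosely in the spirit of a fitness app with a churn problem.
log = [Decision(
    title="Reduce early churn",
    chosen="streak reminders",
    options={
        "streak reminders": ["interview-03: 'I just forget to open the app'"],
        "discounts": ["analytics-02: discount cohort churned at a similar rate"],
    },
    rationale="Forgetting, not price, shows up in the evidence.",
    reversal_conditions=["reminder opt-out becomes the norm among new users"],
)]
answer = explain(log, "streak reminders", "discounts")
print(answer)
```

The answer comes back with the options, the evidence for each, and the reversal conditions still attached, because they were stored together when the decision was made.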

"PRDs and stakeholder briefs stop being documents you write. They become queries against evidence you've already accumulated."

— TheGlocalPM

The pressure test

Before publishing, I pressure-tested the system the only way that counts: I handed the rulebook to a completely fresh AI agent with a one-line prompt and no other context. It checked a decision's reversal conditions before touching it, refused to change an assumption's status without my approval, and generated a fully-cited PRD from the worked example. The rulebook held.
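One of the behaviors in that test, refusing to change an assumption's status without approval, is the kind of rule that can be expressed as a simple gate. The sketch below is an assumption about how such a rule might look in code, with hypothetical names (`set_status`, `ApprovalRequired`, `approved_by`); the actual rulebook states it in prose for the agent to follow.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

STATUSES = {"untested", "validated", "weakening", "invalidated"}


class ApprovalRequired(Exception):
    """Raised when a status flip is attempted without human sign-off."""


@dataclass
class Assumption:
    belief: str
    status: str = "untested"
    # Each entry: (date, old status, new status, source, approver)
    history: list[tuple] = field(default_factory=list)


def set_status(assumption: Assumption, new_status: str, source: str,
               approved_by: Optional[str] = None,
               on: Optional[date] = None) -> None:
    """Flip a status only with explicit approval, and record why."""
    if new_status not in STATUSES:
        raise ValueError(f"Unknown status: {new_status}")
    if approved_by is None:
        # The agent may propose a change, but a human has to approve it.
        raise ApprovalRequired(
            f"Proposed '{assumption.status}' -> '{new_status}' "
            f"based on {source}; awaiting approval.")
    assumption.history.append(
        (on or date.today(), assumption.status, new_status, source, approved_by))
    assumption.status = new_status
```

Calling `set_status(a, "weakening", source="interview-07")` raises `ApprovalRequired` and leaves the assumption untouched; adding `approved_by="pm"` applies the change and appends a dated history entry.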

That matters because the schema — one file of rules, conventions, and workflows — is the real product. It's what turns a generic chatbot into a disciplined wiki maintainer. The repo ships it as a copy-paste template, alongside a full worked example (a fictional fitness app with a churn problem) and a quick-start guide any PM can follow in about fifteen minutes. No code involved anywhere: if you can use a chat tool and a folder, you can run this.

The honest cost

The system isn't free. It costs roughly thirty focused minutes a week: feeding in the sources you decide matter, and running a weekly health check where the agent surfaces contradictions, weakening assumptions, and gaps. The agent does the bookkeeping that kills every Confluence wiki; you keep the judgment. That trade — bookkeeping to the machine, judgment to the human — is exactly the shape I think AI-augmented product work should take, and it's the same argument I made in The Future of AI in Product Management: the PM's job shifts from coordinator to orchestrator.
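The weekly health check can be pictured as a pass over the wiki that turns contradictions, gaps, and weakening assumptions into tagged questions. The following is a minimal sketch under that reading; the dictionary layout and the mapping from finding to tag are assumptions chosen for illustration, not rules taken from the schema.

```python
def health_check(problems: list[dict], assumptions: list[dict]) -> list[tuple[str, str]]:
    """Return (tag, question) pairs for the open-questions queue."""
    questions = []
    for p in problems:
        if p["supporting"] and p["contradicting"]:
            # Sources disagree: go back to users to resolve it.
            questions.append(("ask users", f"Sources disagree on: {p['statement']}"))
        elif not p["supporting"]:
            # A claim with no supporting evidence is an evidence gap.
            questions.append(("check data", f"No evidence yet for: {p['statement']}"))
    for a in assumptions:
        if a["status"] == "weakening":
            questions.append(("run an experiment", f"Retest assumption: {a['belief']}"))
    return questions


# Made-up wiki state for the example.
problems = [
    {"statement": "Logging workouts takes too long",
     "supporting": ["interview-01"], "contradicting": ["analytics-02"]},
    {"statement": "Users want social features",
     "supporting": [], "contradicting": []},
]
assumptions = [{"belief": "Reminders reduce churn", "status": "weakening"}]

for tag, question in health_check(problems, assumptions):
    print(f"[{tag}] {question}")
```

The output is the raw material for the queue; deciding which questions are worth the next interview slot stays with the PM.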

Clone it, fork it, run it

Everything is free and open source under MIT: the schema, the entity model, the rituals, and the worked example. To the best of my knowledge it's the first open-source instantiation of the LLM Wiki pattern purpose-built for product management — and PM work is arguably the best fit for the pattern anywhere.

Credit where it belongs: the pattern is Andrej Karpathy's. The PM instantiation is my contribution back.

Get the PM Wiki on GitHub →

Ali Mahmoud · Lead Product Manager
