How this site is built and updated
The Open-Source AI Stack is a living map of ten production-pipeline layers and five cross-cutting meta-layers, with the projects, news, and funding flowing through each. A scheduled agent updates the news once a day at 08:00 Pacific.
Data posture (per feature)
The site historically described itself as "pure BYOK, no PII, no server." That language is retired. The self-paced course at /learn needs a logged-in profile to track progress and produce the take-home Personal Notes; the rest of the site remains server-less in the relevant sense. Per feature:
- Reference site (the stack pages, grants, news, glossary): static HTML. No server-side state, no cookies needed to read.
- In-site chat agent (the floating bubble): pure BYOK. The API key you paste on /settings stays in your browser localStorage. No proxy, no shared key.
- Daily news routine: a scheduled agent updates the news MDX files; no user data involved.
- Course at /learn: email-and-password profile at Supabase. Saves your progress, the synthesis paragraphs you write, and the chat dialogues with the course agent. Self-service delete on /learn/profile.
Specifics are documented at /privacy: what's collected, where it lives, who has access, and how to export or delete.
The agent and what it does
A scheduled remote Claude agent runs three stages each day:
- Fetch: it pulls about forty layer-specific RSS and Atom feeds (GitHub releases, vendor blogs, academic newsletters) plus about twenty HTML-scrape sources for vendor blogs without RSS. Items go into an inbox keyed on a content hash.
- Dedupe and route: URLs are normalized, SimHash clustering removes near-duplicates, and each remaining item is classified by stack layer.
- Summarize and publish: per-layer summaries are generated with prompt-cached system prompts, an editorial letter is drafted, and the day's MDX file is committed to the repo.
Vercel rebuilds and the site goes live.
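The dedupe step can be illustrated with a minimal Python sketch. It is not the agent's actual code: the function names, the `utm_` tracking-parameter rule, the 64-bit fingerprint width, and the distance threshold of 3 are all assumptions chosen for illustration. It shows the three ideas named above: normalizing a URL, deriving a content hash to key the inbox, and comparing SimHash fingerprints to spot near-duplicates.

```python
import hashlib
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: query parameters with these prefixes are tracking noise.
TRACKING_PREFIXES = ("utm_",)


def normalize_url(url: str) -> str:
    """Lowercase scheme and host, drop tracking params, fragment, trailing slash."""
    parts = urlsplit(url.strip())
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(sorted(query)), "")
    )


def content_hash(url: str, title: str) -> str:
    """Hypothetical inbox key: SHA-256 over the normalized URL and title."""
    payload = f"{normalize_url(url)}\n{title.strip().lower()}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def simhash(text: str, bits: int = 64) -> int:
    """Fingerprint whose bits reflect the majority vote of per-token hashes."""
    counts = [0] * bits
    for token in re.findall(r"\w+", text.lower()):
        token_hash = int.from_bytes(hashlib.md5(token.encode("utf-8")).digest()[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (token_hash >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a: str, text_b: str, max_distance: int = 3) -> bool:
    """Treat two texts as near-duplicates when their fingerprints are close."""
    return hamming(simhash(text_a), simhash(text_b)) <= max_distance
```

Under these assumptions, two links to the same post that differ only in tracking parameters or a trailing slash produce the same inbox key, while rewritten copies of one announcement tend to land within a few bits of each other and can be grouped into one cluster.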
What is on each layer page
Each of the 15 layer pages is a five-in-one view:
- What it is: a 3-5 sentence editorial intro.
- Key projects: a curated catalog (5-10 per layer, ordered open-first, with license, maturity, and a one-line description) sourced from data/projects.yaml in the repo.
- Latest news: the count of items at this layer in today's daily issue, with a link out.
- Grants flowing in: the most recent grants tagged to this layer, sourced from data/grants.yaml.
- Learn more: a curated reading list (papers, posts, talks, docs) from data/reading-lists.yaml.
A sidebar carries quick-glance metadata: lock-in vector, sovereignty relevance, related layers.
Glossary
About 145 technical terms are defined at /glossary (for example, mixture of experts or PagedAttention), grouped by stack layer and cross-referenced by alias. Each entry has a 30-word summary (the hover-card definition) and a deeper page with a 3-4 paragraph senior-engineer explanation. Inline terms tagged in prose show a definition on hover or tap; the card has a "Chat about this" button that opens the in-site chat with the term as context. The chat agent's read_glossary tool fetches a term's full body by canonical slug or any alias.
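Lookup by canonical slug or alias can be sketched as follows. Only the tool name read_glossary comes from the site; the `GlossaryEntry` fields, the `build_index` helper, the function signature, and the sample entry are hypothetical and may not match the real implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class GlossaryEntry:
    # Hypothetical shape of one glossary entry.
    slug: str
    summary: str  # the short hover-card definition
    body: str     # the longer full-page explanation
    aliases: List[str] = field(default_factory=list)


def build_index(entries: Iterable[GlossaryEntry]) -> Dict[str, GlossaryEntry]:
    """Map the canonical slug and every alias (case-insensitive) to its entry."""
    index: Dict[str, GlossaryEntry] = {}
    for entry in entries:
        for key in [entry.slug, *entry.aliases]:
            index[key.strip().lower()] = entry
    return index


def read_glossary(index: Dict[str, GlossaryEntry], term: str) -> Optional[str]:
    """Return the full body for a slug or alias, or None when the term is unknown."""
    entry = index.get(term.strip().lower())
    return entry.body if entry else None


# Illustrative data only; the alias list is invented for the example.
index = build_index([
    GlossaryEntry(
        slug="mixture-of-experts",
        summary="Each token activates only a fraction of total parameters.",
        body="Full explanation goes here.",
        aliases=["MoE", "mixture of experts"],
    )
])
```

Resolving aliases once at index-build time keeps each lookup a single dictionary access, so the slug and every alias cost the same to resolve.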
What the agent does not do
It does not modify the taxonomy. It does not write opinion. It does not auto-publish unverified claims. Items that lack a primary source go to a needs-review queue instead of being published.
Editorial voice
Neutral-observational. The editorial letter at the top of each day describes what moved; per-item summaries are factual and source-linked. A linter rejects vocabulary that drifts into AI-slop ("delve," "tapestry," "landscape," "fascinating") and marketing buzzwords ("transformative," "robust," "leveraging," "utilize").
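A vocabulary check of this kind can be approximated in a few lines of Python. The word list below is taken from the examples above; the function name, the whole-word matching rule, and the output format are assumptions, and the real linter may work differently.

```python
import re
from typing import List, Tuple

# Words quoted in the editorial-voice description above.
BANNED = {
    "delve", "tapestry", "landscape", "fascinating",
    "transformative", "robust", "leveraging", "utilize",
}

# Whole-word, case-insensitive match. Inflected forms such as "delves"
# are not caught by this simple pattern.
PATTERN = re.compile(r"\b(" + "|".join(sorted(BANNED)) + r")\b", re.IGNORECASE)


def lint(text: str) -> List[Tuple[int, str]]:
    """Return (line number, lowercased word) for every banned word found."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            findings.append((line_no, match.group(0).lower()))
    return findings
```

A draft passes when `lint` returns an empty list; any finding points back to the line to rewrite.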
Source list
Per-layer feeds plus a few aggregators. AI News by smol AI (news.smol.ai) is the spine; SemiAnalysis, Interconnects, Latent Space, Import AI, and The Batch are the highest-signal newsletters across layers. GitHub release feeds (vLLM, SGLang, llama.cpp, MCP, OpenHands, Aider) plus vendor blogs cover the rest. The full source list is at data/sources.yaml in the repo.
Grants methodology
Funder profiles come from public announcement pages, RFP texts, and grants-announcement posts. Grant entries link to the primary source where possible. The funded-vs-underfunded rollup attributes each grant to one or more stack layers based on the project's primary focus; cross-layer grants count toward each layer. Underfunded areas are synthesized from a May 2026 ecosystem research pass, cross-checked against currently-funded projects; project-shape descriptions are concrete enough that a grant application could quote them.
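The attribution rule, where a cross-layer grant counts toward each of its layers, can be shown with a short sketch. The record shape (`name`, `layers`) and the sample grants are hypothetical; the actual schema of data/grants.yaml is not described here.

```python
from collections import Counter
from typing import Dict, Iterable, List


def rollup_by_layer(grants: Iterable[Dict[str, List[str]]]) -> Counter:
    """Count grants per stack layer; a grant tagged to several layers counts once in each."""
    counts: Counter = Counter()
    for grant in grants:
        # set() guards against a layer listed twice on the same grant.
        for layer in set(grant["layers"]):
            counts[layer] += 1
    return counts


# Illustrative records only, not real grants.
example_grants = [
    {"name": "Grant A", "layers": ["runtime"]},
    {"name": "Grant B", "layers": ["runtime", "agents"]},
]
counts = rollup_by_layer(example_grants)
```

Because cross-layer grants are counted in every layer they touch, the per-layer counts can sum to more than the number of grants.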
Grants coverage and limitations
The grants page is a curated map, not a comprehensive index of every open-source-AI grant ever made. Coverage is intentionally stronger in some places than others, because the audience this site is built for is "someone who used to fund Bitcoin OSS and is now considering open-source AI."
Strong coverage: cypherpunk-adjacent and sovereignty-positioned funders (HRF, OpenSats, Cosmos Institute, Foresight Institute, Block / Goose), major AI-safety funders (Open Philanthropy, FLI, SFF, Manifund, Lightspeed Grants), and major OSS infrastructure programs (NLnet NGI Zero, Sovereign Tech Fund, Mozilla Builders, Linux Foundation AAIF). For these we have most known recent grants and most relevant context.
Thin coverage: academic NSF and DOE grants, regional public funders outside the US / EU / UK, individual fellowship-style grants under $50K, hyperscaler in-house grant programs (most are not externally accessible anyway), and any grants in languages other than English where the announcement did not get covered in English-language outlets.
Three routines keep this honest, on different cycles:
- Weekly grants-watch (Mondays 09:00 PT): surfaces new grant announcements from RSS feeds of funders we already track, to a review queue.
- Monthly grants-discovery (1st of each month 11:00 PT): actively searches for funders and grants we do NOT already track, to a separate candidate queue.
- Quarterly grants-audit (1st of Mar / Jun / Sep / Dec 10:00 PT): re-verifies every existing entry against its live source for dead links and fact drift, plus a per-funder coverage pass that flags funders where our grant count appears low relative to what their grants page lists.
None of the three routines auto-publish to the site. Findings go to data/inbox/ and require human review before they land in the canonical YAML files. Missing a funder or a grant? File an issue on GitHub.
Open methodology
The site, the agent prompt, the source list, the layer taxonomy, and the per-day issues are all in the repo. Pull requests welcome. Errata channel: file a GitHub issue.
Provenance
The stack taxonomy and the layer overview prose started from a personal wiki built by Austin in May 2026 (a Karpathy-style LLM-wiki, separately maintained). The wiki seeded the foundations; this site is the public, daily-updated, polished form.