Learn
Three self-paced tracks. The first walks the stack layer by layer and explains why openness matters at each one. The second covers how LLMs actually work: the mechanics, from the inference loop to fine-tuning. The third covers how to run the stack on hardware you control. Pick any one, or do all three in any order.
Bottom-up, from infrastructure to protocols. Each module ends with a Probe dialog and a summary you write yourself; the course produces a downloadable map of what you learned.
The model-side foundation. Tokens, transformers, attention, the KV cache, decoding, chat templates, long context, RAG, tool use, fine-tuning. Start with the loop; the rest follows.
VRAM math, memory bandwidth tiers, quantization formats, inference engines, hardware strategy, production serving, benchmarking. Skim or read in order.