Glossary

post-training

Everything that happens after pretraining ends: supervised fine-tuning, preference optimization, red-teaming, distillation, and the safety work that turns a base model into a shippable assistant.

Category: Training. Also: Safety and Guardrails, Weights.

The umbrella term for the training work that follows pretraining. By 2024, frontier labs published increasingly detailed post-training recipes: multi-stage SFT, preference optimization, rejection sampling against learned reward models, capability-specific fine-tuning for code or math, and dedicated alignment passes for refusal and tone.
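As an illustration of one step named above, rejection sampling against a reward model can be sketched as best-of-n selection: draw several candidate responses, score each one, and keep the highest-scoring response as a training example. This is a minimal sketch, not any lab's published recipe. The names `rejection_sample`, `generate`, and `reward` are hypothetical, and the callables stand in for a real sampling model and a learned reward model.

```python
from typing import Callable, Tuple


def rejection_sample(
    prompt: str,
    generate: Callable[[str], str],       # hypothetical: samples one response from the policy
    reward: Callable[[str, str], float],  # hypothetical: learned reward model score
    n: int = 8,
) -> Tuple[str, float]:
    """Draw n candidate responses and return the best one with its score."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Sample n candidates for the same prompt.
    candidates = [generate(prompt) for _ in range(n)]
    # Score every candidate with the reward model.
    scored = [(reward(prompt, c), c) for c in candidates]
    # Keep the top-scoring candidate; the rest are rejected.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score
```

In a pipeline of this shape, the kept (prompt, response) pairs would typically be added to the next round of supervised fine-tuning data.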

Post-training quality is now what separates open-weights base models from production assistants. The Llama 3 release notes describe roughly ten million instruction examples and millions of preference pairs in the post-training set; Qwen3 reports a multi-phase recipe involving distinct math, reasoning, and conversational stages.
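To make the role of preference pairs concrete, the sketch below computes a pairwise preference loss for a single (chosen, rejected) pair. It uses the Direct Preference Optimization (DPO) objective, picked here purely as an illustration; this entry does not say which objective any particular lab used. The function name, argument names, and the default `beta` are assumptions, and the inputs are assumed to be summed token log-probabilities of each response under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; controls how far the policy may drift from the reference
) -> float:
    """DPO loss for one preference pair: -log(sigmoid(beta * margin))."""
    # How much more the policy likes each response than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_gain - rejected_gain)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference model.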

For open-source fine-tuners, the implication is that pretrained bases are plentiful and cheap to access, but matching the post-training quality of a hosted assistant is the hard part. Open recipes (Tülu, Zephyr, Nous Hermes, OpenHermes) document workable post-training pipelines for practitioners who do not have a frontier-lab data budget.

