Glossary
verifiable inference
An inference architecture that provides cryptographic proof the claimed model produced the claimed output, via TEE attestation, zero-knowledge proofs (ZKML), or proof-of-sample-correctness schemes.
The class of techniques that move “the operator says this is what the
model returned” to “here is mathematical evidence.” Three families.
TEE-based: the inference runs inside a TEE and the operator presents
an attestation plus signed output. ZK-based (ZKML): the inference
generates a zero-knowledge proof that some specific model produced the
output from the input. Sampling-based: a referee periodically re-runs
sampled requests, and the operator's stake is slashed if results diverge.
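The sampling-based family can be sketched as follows. This is a minimal illustration, not a real protocol: `reference_model`, `Operator`, `audit`, and `detection_probability` are hypothetical names, the "model" is a deterministic hash so that re-execution is exactly comparable, and the slashing rule (cut the stake by a fixed fraction on the first divergence) is an assumption.

```python
import hashlib
import random
from dataclasses import dataclass


def reference_model(prompt: str) -> str:
    """Hypothetical deterministic model: the output is a short hash of the prompt."""
    return hashlib.sha256(prompt.encode()).hexdigest()[:16]


@dataclass
class Operator:
    """Hypothetical operator holding a slashable stake."""
    stake: float
    cheat: bool = False

    def infer(self, prompt: str) -> str:
        # A cheating operator returns something other than the claimed model's output.
        if self.cheat:
            return "cheap:" + prompt
        return reference_model(prompt)


def audit(operator, prompts, sample_rate, slash_fraction, rng):
    """Serve all prompts, re-run a random sample on the referee, slash on divergence.

    Returns (number_of_requests_checked, was_slashed).
    """
    served = [(p, operator.infer(p)) for p in prompts]
    checked = 0
    for prompt, claimed in served:
        if rng.random() >= sample_rate:
            continue  # this request was not sampled for re-execution
        checked += 1
        if reference_model(prompt) != claimed:
            operator.stake *= 1.0 - slash_fraction  # assumed slashing rule
            return checked, True
    return checked, False


def detection_probability(n_requests, cheat_fraction, sample_rate):
    """Chance that at least one cheated request is sampled, assuming independence."""
    return 1.0 - (1.0 - cheat_fraction * sample_rate) ** n_requests


if __name__ == "__main__":
    rng = random.Random(0)
    cheater = Operator(stake=100.0, cheat=True)
    print(audit(cheater, ["a", "b", "c"], 1.0, 0.5, rng), cheater.stake)
```

The design point the sketch makes is that sampling trades certainty for cost: a low `sample_rate` is cheap, but the operator is only caught with the probability given by `detection_probability`, so the stake at risk has to outweigh the expected gain from cheating.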
For agentic payments and on-chain agents, verifiable inference is the hard problem. A smart contract paying an agent for a task wants evidence the agent actually did the work, not just an off-chain claim. ZKML is cryptographically clean but adds 10,000x compute overhead in the worst case; TEE-based is fast but trusts the hardware vendor; sampling adds latency but works on commodity infrastructure.
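The "pay only on evidence" requirement can be illustrated with a TEE-style check. This is a simplified sketch under loud assumptions: the names `Evidence`, `sign`, `verify`, `settle`, and `EXPECTED_MEASUREMENT` are hypothetical, and a shared-key HMAC stands in for the attestation signature. A real deployment would verify an asymmetric signature chained to the hardware vendor's root of trust, which is exactly the trust dependency noted above.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Assumption: the verifier knows the measurement (hash) of the approved
# model and inference-engine build ahead of time.
EXPECTED_MEASUREMENT = hashlib.sha256(b"model-v1|engine-build-42").hexdigest()


@dataclass
class Evidence:
    measurement: str  # hash of the code and model claimed to be running
    request: str
    output: str
    tag: str          # MAC binding measurement, request, and output together


def sign(key: bytes, measurement: str, request: str, output: str) -> str:
    """Stand-in for the enclave signing its output (HMAC, not a real attestation)."""
    # Fields are joined with a unit separator; the sketch assumes they never contain it.
    msg = "\x1f".join([measurement, request, output]).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def verify(key: bytes, ev: Evidence) -> bool:
    """Accept only if the expected code ran and the tag covers this exact output."""
    if ev.measurement != EXPECTED_MEASUREMENT:
        return False
    expected_tag = sign(key, ev.measurement, ev.request, ev.output)
    return hmac.compare_digest(ev.tag, expected_tag)


def settle(key: bytes, ev: Evidence, price: float) -> float:
    """Escrow logic: release the payment only when the evidence verifies."""
    return price if verify(key, ev) else 0.0


if __name__ == "__main__":
    key = b"demo-key"
    tag = sign(key, EXPECTED_MEASUREMENT, "summarize X", "summary of X")
    ev = Evidence(EXPECTED_MEASUREMENT, "summarize X", "summary of X", tag)
    print(settle(key, ev, 5.0))
```

Binding the measurement, request, and output under one tag is the important choice here: evidence for one output cannot be replayed to claim payment for a different output or a different model build.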
In 2026 no approach dominates. Production confidential AI uses TEEs; research and on-chain settings experiment with ZKML; large hosted inference still relies on contractual rather than cryptographic verification.