The Open-Source AI Stack
Glossary

verifiable inference

An inference architecture that provides cryptographic proof that the claimed model produced the claimed output, via TEE attestation, zero-knowledge proofs (ZKML), or proof-of-sample-correctness schemes.

Category: Identity and Trust (also: Runtime). Also known as: verifiable AI inference.

The class of techniques that move “the operator says this is what the model returned” to “here is mathematical evidence.” There are three families. TEE-based: the inference runs inside a TEE (a hardware-isolated CPU region that the host OS and the operator cannot inspect or modify), and the operator presents an attestation plus a signed output. ZK-based (ZKML): the inference generates a zero-knowledge proof that a specific model produced the output from the input. Sampling-based: requests are periodically re-run on a referee, and the operator is slashed if the results diverge.
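The sampling-based family is the easiest of the three to sketch in a few lines. The example below is a minimal illustration, not a real protocol: `Operator`, `audit`, `toy_model`, and the stake accounting are all hypothetical names, and it assumes the model is deterministic (for example greedy decoding with a fixed seed) so that a referee re-run can be compared byte for byte.

```python
import hashlib
import random
from dataclasses import dataclass


def output_digest(model_id: str, prompt: str, output: str) -> str:
    """Commit to (model, input, output) with a single SHA-256 digest."""
    payload = "\x1f".join([model_id, prompt, output]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class Operator:
    stake: int            # hypothetical bonded collateral
    slashed: bool = False


def toy_model(prompt: str) -> str:
    """Stand-in for a deterministic model; it just reverses the prompt."""
    return prompt[::-1]


def audit(operator, model_id, served, reference_model, sample_rate, rng):
    """Re-run a random sample of served requests; slash on the first mismatch.

    `served` is a list of (prompt, claimed_output) pairs.
    Returns the number of requests the referee actually re-ran.
    """
    checked = 0
    for prompt, claimed in served:
        if rng.random() >= sample_rate:
            continue  # this request was not sampled
        checked += 1
        expected = output_digest(model_id, prompt, reference_model(prompt))
        if output_digest(model_id, prompt, claimed) != expected:
            operator.stake = 0      # assumption: the full stake is forfeited
            operator.slashed = True
            break
    return checked


if __name__ == "__main__":
    cheater = Operator(stake=100)
    audit(cheater, "demo-model", [("hello", "not the real output")],
          toy_model, sample_rate=1.0, rng=random.Random(0))
    print(cheater)
```

With a sample rate below 1.0 a single bad response can go unchecked, so the deterrent comes from the stake being large relative to the gain from cheating, not from every response being verified.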

For agentic payments and on-chain agents, verifiable inference is the hard problem. A smart contract paying an agent for a task wants evidence the agent actually did the work, not just an off-chain claim. ZKML is cryptographically clean but adds 10,000x compute overhead in the worst case; TEE-based inference is fast but trusts the hardware vendor; sampling adds latency but works on commodity infrastructure.
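To make the payment case concrete, here is a rough sketch of the check a payer might run before releasing funds in the TEE-based flow. It is written in plain Python, not contract code, and every name (`Evidence`, `sign`, `should_pay`) is hypothetical. One loud assumption: an HMAC with a shared key stands in for the enclave's signature. A real deployment would verify an asymmetric signature whose key is bound to a vendor-signed attestation quote, which is exactly the hardware-vendor trust mentioned above.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    measurement: str  # hash of model + engine build, as reported by attestation
    output: str       # the result the agent claims to have produced
    signature: str    # tag over (measurement, task, output)


def sign(key: bytes, measurement: str, task: str, output: str) -> str:
    """Stand-in for the enclave signing its output (HMAC, NOT a real TEE signature)."""
    msg = "\x1f".join([measurement, task, output]).encode("utf-8")
    return hmac.new(key, msg, hashlib.sha256).hexdigest()


def should_pay(ev: Evidence, task: str, expected_measurement: str, key: bytes) -> bool:
    """Release payment only if the attested build and the signed output both check out."""
    # 1. The attested measurement must match the model the payer agreed to pay for.
    if not hmac.compare_digest(ev.measurement.encode(), expected_measurement.encode()):
        return False
    # 2. The signature must bind this output to this task and this measurement.
    expected_sig = sign(key, ev.measurement, task, ev.output)
    return hmac.compare_digest(ev.signature.encode(), expected_sig.encode())


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical; a real enclave key never leaves the TEE
    model = hashlib.sha256(b"model-v1").hexdigest()
    ev = Evidence(model, "42", sign(key, model, "task-1", "42"))
    print(should_pay(ev, "task-1", model, key))
    print(should_pay(ev, "task-2", model, key))
```

Binding the task identifier into the signed message is what stops an agent from replaying evidence for one paid task against another.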

In 2026 no approach dominates. Production confidential AI uses TEEs; research and on-chain settings experiment with ZKML; large hosted inference still relies on contractual rather than cryptographic verification.
