GLOSSARY

The physical-AI proof vocabulary

Plain-English definitions of the terms behind WorldFlux — physical AI, world models, evidence packs, and the standards they map to.

Physical AI
AI that perceives and acts in the physical world — robots, humanoids, and autonomous machines — as opposed to purely digital AI such as chatbots. Reliability matters because failures happen in the real world, not just on a screen.
World model
A model that learns the dynamics of an environment so it can predict the outcome of actions. World models are a foundation for robot planning and control.
Vision-language-action (VLA) model
A robot-control model that maps camera images and a natural-language instruction directly to robot actions. OpenVLA and Physical Intelligence's π0 are examples.
Proof layer
The independent layer that turns an AI evaluation into signed, verifiable evidence others can trust. It is the middle ground between asking reviewers to accept self-reported metrics and handing over the model itself, and it is the category WorldFlux defines.
Evidence pack
A tamper-evident bundle of a claim, the test protocol, the evidence (metrics, logs, artifacts), and the provenance, cryptographically signed and shareable as an expiring, revocable link anyone can re-verify.
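As a rough illustration of what "tamper-evident" means here, the sketch below hashes a bundle with the four parts named above and shows that any later edit changes the digest. It is a minimal sketch, not the WorldFlux format: the field names, the values, and the `pack_digest` helper are all hypothetical, and a real pack would additionally carry a cryptographic signature over the digest.

```python
import copy
import hashlib
import json


def pack_digest(pack: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of the pack."""
    # Sorted keys and fixed separators make the encoding deterministic.
    canonical = json.dumps(pack, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical field names and values, for illustration only.
pack = {
    "claim": "Policy X completes the task suite at the reported success rate",
    "protocol": {"benchmark": "LIBERO", "seed": 0},
    "evidence": {
        "success_rate": 0.744,
        "log_sha256": hashlib.sha256(b"run log").hexdigest(),
    },
    "provenance": {"ran_on": "customer hardware", "ran_by": "example-team"},
}

# The digest is the value a signing step would cover.
recorded = pack_digest(pack)

# A reviewer recomputes the digest; any edit to the bundle changes it.
tampered = copy.deepcopy(pack)
tampered["evidence"]["success_rate"] = 0.99

print(pack_digest(pack) == recorded)      # True
print(pack_digest(tampered) == recorded)  # False
```

Hashing a canonical encoding rather than the raw file means two reviewers who serialize the same bundle differently still arrive at the same digest.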
Bring-your-own-compute (BYO-compute) evaluation
Running an evaluation on your own hardware so your model weights, keys, and data never leave it. The evaluation tool ingests only the outputs, never the model itself.
Chain of custody
A verifiable record of how a result was produced and by whom — from the run on your hardware to the signed evidence pack — so a reviewer can trust the result without trusting the vendor.
Sigstore
An open-source project for cryptographically signing software and other artifacts so that anyone can verify their origin and integrity. WorldFlux signs every evidence pack with Sigstore.
ML bill of materials (ML-BOM, CycloneDX)
A machine-readable inventory of the models, datasets, and dependencies that make up an AI system, expressed in the CycloneDX standard. It ships inside each WorldFlux evidence pack.
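For a sense of what such an inventory can look like, the sketch below builds a minimal CycloneDX-style document in Python and prints it as JSON. The component names and versions are hypothetical placeholders, and the layout assumes CycloneDX 1.5, which defines the `machine-learning-model` and `data` component types. The contents of an actual WorldFlux ML-BOM are not specified here.

```python
import json

# Minimal ML-BOM sketch. All component names and versions are hypothetical.
bom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        # The model under evaluation.
        {"type": "machine-learning-model", "name": "example-policy", "version": "0.1.0"},
        # A dataset the model was trained or evaluated on.
        {"type": "data", "name": "example-demonstrations"},
        # A software dependency.
        {"type": "library", "name": "example-dependency", "version": "1.0.0"},
    ],
}

# Group component names by type, as a reviewer might when scanning the inventory.
by_type = {}
for component in bom["components"]:
    by_type.setdefault(component["type"], []).append(component["name"])

print(json.dumps(bom, indent=2))
print(by_type)
```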
LIBERO
A standard robot-manipulation benchmark suite used to evaluate vision-language-action policies across a set of tasks.
Deployment gap
The difference between benchmark or demo performance and real-world reliability. WorldFlux makes it measurable: OpenVLA scored 74.4% on the standard LIBERO suite but 24.4% once the scene was changed, a drop of 50 percentage points.
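The arithmetic behind those two figures can be worked through as below. This is only a sketch: the `deployment_gap` helper and the choice to report both an absolute and a relative gap are assumptions, not a WorldFlux metric definition.

```python
def deployment_gap(benchmark_rate: float, shifted_rate: float):
    """Return (absolute gap in percentage points, relative drop as a fraction)."""
    absolute = benchmark_rate - shifted_rate
    relative = absolute / benchmark_rate
    return absolute, relative


# Success rates in percent, taken from the definition above.
absolute, relative = deployment_gap(74.4, 24.4)
print(f"{absolute:.1f} percentage points, {relative:.1%} relative drop")
# prints "50.0 percentage points, 67.2% relative drop"
```

The absolute gap is the easier number to quote. The relative drop shows that roughly two thirds of the benchmark success rate did not survive the scene change.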
EU AI Act, Article 11
The provision of the EU AI Act that requires providers of high-risk AI systems to draw up technical documentation, showing that the system meets the Act's requirements, before it is placed on the market or put into service. WorldFlux evidence packs are designed to map to it.