FREQUENTLY ASKED

Questions about proving physical AI

What WorldFlux is, how it turns an AI test into evidence anyone can verify, and how your models, data, and keys stay yours.

What is WorldFlux?

WorldFlux is the proof layer for physical AI. It is a bring-your-own-compute control plane that turns world-model and robot-policy (vision-language-action) evaluations into signed, independently verifiable evidence packs, without ever taking custody of your models, data, or keys. You run the evaluation on your own hardware, and WorldFlux packages and signs the result so others can trust it.

How do you prove a robot or AI model is safe to deploy?

A polished demo and a spreadsheet of scores are not proof, because no one downstream can independently verify them. With WorldFlux, you run your model's evaluation on your own hardware. WorldFlux then packages the claim, the test protocol, the evidence, and its provenance into a tamper-evident file, signs it with Sigstore, and gives you an expiring, revocable link. Customers, insurers, and regulators open that link and re-verify the signature themselves.

Does WorldFlux see my model weights or training data?

No. WorldFlux is bring-your-own-compute: your weights, keys, and credentials never leave your hardware. It ingests what your evaluation already produced and never re-runs or hosts your model. That is the core difference from hosted evaluation services, which require you to upload your model.

What is an evidence pack?

An evidence pack is a tamper-evident bundle of four things — the claim, the test protocol, the evidence (metrics, logs, artifacts), and the provenance — cryptographically signed with Sigstore and shipped with a CycloneDX ML bill-of-materials. Anyone with the share link can re-check the signature; links expire after 30 days by default and can be revoked.
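
To make "tamper-evident" concrete, here is a minimal Python sketch. It is not WorldFlux's actual format or API: the `EvidencePack` class, its field names, and the sample values are all hypothetical. It only illustrates how hashing a canonical serialization of the four parts lets anyone detect a changed value. Real packs are signed with Sigstore, which this sketch does not attempt to reproduce.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EvidencePack:
    """Hypothetical in-memory model of the four parts of an evidence pack."""

    claim: str
    protocol: dict
    evidence: dict
    provenance: dict

    def digest(self) -> str:
        # Serialize with sorted keys and fixed separators so identical
        # content always produces identical bytes, then hash those bytes.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative values only; they do not come from a real evaluation.
pack = EvidencePack(
    claim="pick-and-place success rate >= 0.90",
    protocol={"episodes": 200, "seed": 7},
    evidence={"success_rate": 0.93},
    provenance={"harness": "LeRobot", "hardware": "on-prem"},
)

# The same pack with one metric altered after the fact.
tampered = EvidencePack(
    claim=pack.claim,
    protocol=pack.protocol,
    evidence={"success_rate": 0.99},
    provenance=pack.provenance,
)

print(pack.digest() == tampered.digest())
```

Changing a single metric changes the digest, so a signature made over the original digest would no longer match the altered pack.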

How is WorldFlux different from an experiment tracker like Weights & Biases or MLflow?

Experiment trackers record the numbers you report yourself, which a skeptical buyer has no reason to trust. WorldFlux produces independent, signed evidence that a third party can verify without taking your word for it. Use a tracker to manage your own experiments; use WorldFlux when you need to prove a result to someone else.

How is WorldFlux different from a hosted evaluation service?

Hosted evaluation services require you to upload your model, which serious teams often cannot do. WorldFlux never takes custody of your IP — you keep your weights and keys on your own hardware, and WorldFlux signs the evidence your evaluation produced. It is the neutral layer between self-reported metrics and handing over your model.

Does WorldFlux help with the EU AI Act and other compliance frameworks?

Yes. WorldFlux produces evidence in the shape regulators and buyers ask for, mappable to the EU AI Act (Article 11 technical documentation), the NIST AI RMF, ISO 42001, SOC 2, and GDPR. WorldFlux does not certify compliance; it makes the evidence inspectable, so you get a signed, re-verifiable record rather than a rubber stamp.[4][3]

Which models, frameworks, and benchmarks does WorldFlux support?

WorldFlux packages evaluation results from the models teams actually build on (NVIDIA Cosmos, NVIDIA Isaac GR00T, Physical Intelligence π, OpenVLA, V-JEPA 2, and SmolVLA) and ingests output from real robotics test harnesses, including LeRobot, OpenPI, GR00T, and MuJoCo. As with every evaluation, the models themselves run on your hardware, not on WorldFlux.

How much does WorldFlux cost?

WorldFlux starts free: you can use the CLI within a free quota. There are two paid plans: Pro, which is metered and aimed at solo labs running tests weekly, and Team, which is seat-based and includes member roles and audit-log retention. Design-partner pilots are available for teams that need a signed URL and a written go/no-go memo in under two weeks.

How do reviewers verify a WorldFlux evidence pack?

They open the share link and re-check the Sigstore signature themselves — no raw logs and no access to your model are required. Because the pack is signed and carries its provenance, verification does not depend on trusting WorldFlux either. Share links expire after 30 days by default and can be revoked at any time.
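
The link lifecycle described above (a 30-day default expiry plus revocation) can be sketched as follows. This is an illustrative model only: `ShareLink`, its fields, and `is_active` are hypothetical names, and how WorldFlux actually stores or enforces link state is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Default lifetime taken from the text: links expire after 30 days.
DEFAULT_TTL = timedelta(days=30)


@dataclass
class ShareLink:
    """Hypothetical record for an expiring, revocable share link."""

    created_at: datetime
    ttl: timedelta = DEFAULT_TTL
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        # A link is usable only if it has not been revoked
        # and its lifetime has not yet elapsed.
        return not self.revoked and now < self.created_at + self.ttl


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
link = ShareLink(created_at=created)

print(link.is_active(created + timedelta(days=29)))  # still within the 30 days
print(link.is_active(created + timedelta(days=31)))  # past the default expiry

link.revoked = True
print(link.is_active(created + timedelta(days=1)))   # revoked before expiry
```

Expiry and revocation only control access to the pack. In this model they have no effect on whether a copy already downloaded still verifies against its signature.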

Why does physical-AI evaluation matter now?

Physical AI is scaling fast while regulation is making evidence mandatory. Goldman Sachs raised its humanoid-robot forecast sixfold in a year to $38B by 2035, and Morgan Stanley projects a roughly $5T market by 2050. The EU AI Act now requires technical documentation for high-risk AI before sale, and Gartner projects AI-governance platform spending will pass $1B by 2030.[2][1][4][3]

Is WorldFlux available now, and how do I start?

WorldFlux is in beta and taking on a small number of design partners. You can start free today: install the CLI with `pip install worldflux`, or book a call to discuss a pilot.