Governance
Trust is a process, published in full.
A standard earns authority only by being radically open. Here is how.
The five axes of trust
01 · Full disclosure of evaluation data
Every score and its underlying data are published in the open.
02 · Full disclosure of decision logs
How each verdict was reached is recorded in a public, append-only log.
03 · 7-day dispute SLA
Recommenders and reviewers can dispute a score; we respond within 7 days.
04 · Academic citation circuit
CC BY 4.0 + DOI + ORCID + /cite: built to be referenced and challenged.
05 · Citation-tracking transparency
Where AIRVS is cited is tracked openly at /citations.
Standard evolution policy
The standard is frozen — and alive
v1.0.0 is frozen, but a frozen standard that never evolves is dead. AIRVS evolves under Semantic Versioning on a fixed, public cadence — every change goes through an RFC comment period before it ships.
| Cadence | Scope | Comment period |
|---|---|---|
| Quarterly · 4x/yr | PATCH fixes to evaluation axes | 30-day public comment |
| Semi-annual · 2x/yr | MINOR — add a new axis, backward-compatible | 30-day public comment, BC preserved |
| Annual · 1x/yr | MAJOR review of the object definition (v2.0+ candidate) | 60-day comment + 6-month dual run |
| Every 3 years | External audit (journal / standards-body background) | Independent reviewers |
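As a rough illustration, the comment windows in the cadence table can be encoded as data. The Python sketch below uses hypothetical names (`COMMENT_DAYS`, `comment_closes`) and assumes a comment period closes exactly N calendar days after it opens, which the policy itself does not spell out. The 6-month dual run and the external audit are left out.

```python
from datetime import date, timedelta

# Comment periods in days, taken from the cadence table above.
COMMENT_DAYS = {"PATCH": 30, "MINOR": 30, "MAJOR": 60}


def comment_closes(change_kind: str, opened: date) -> date:
    """Return the assumed closing date of the public comment period.

    Assumption: the window is `opened + N calendar days`.
    """
    try:
        days = COMMENT_DAYS[change_kind]
    except KeyError:
        raise ValueError(f"unknown change kind: {change_kind!r}") from None
    return opened + timedelta(days=days)


print(comment_closes("MAJOR", date(2025, 1, 1)))  # 2025-03-02
```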
SemVer
What a version number means
MAJOR
Breaking change to the object definition or scoring model.
MINOR
Backward-compatible addition — e.g. a new evaluation axis.
PATCH
Clarification or fix that changes no results.
Verifications are locked to the version that scored them. A later version never silently re-scores past records; re-evaluation is published as a separate, dated appendix.
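One way to picture version locking is an immutable record plus a separate appendix entry. The Python sketch below is illustrative only: `Verification`, `AppendixEntry`, `bump_kind`, and `reevaluate` are hypothetical names, not part of the AIRVS standard, and the record fields are assumptions.

```python
from dataclasses import dataclass
from datetime import date


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'v1.2.3' or '1.2.3' into (major, minor, patch)."""
    major, minor, patch = (int(p) for p in version.removeprefix("v").split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Classify the change between two versions as MAJOR, MINOR, or PATCH."""
    o, n = parse_version(old), parse_version(new)
    if n <= o:
        raise ValueError("new version must be greater than the old one")
    if n[0] != o[0]:
        return "MAJOR"
    if n[1] != o[1]:
        return "MINOR"
    return "PATCH"


@dataclass(frozen=True)  # frozen: a scored record cannot be mutated later
class Verification:
    record_id: str
    standard_version: str  # the version that scored this record
    verdict: str


@dataclass(frozen=True)
class AppendixEntry:
    record_id: str
    original_version: str
    rescored_under: str
    verdict: str
    published: date


def reevaluate(original: Verification, new_version: str,
               new_verdict: str, published: date) -> AppendixEntry:
    """Publish a re-evaluation as a dated appendix; the original is untouched."""
    return AppendixEntry(original.record_id, original.standard_version,
                         new_version, new_verdict, published)
```

Because `Verification` is frozen, a re-score can only produce a new `AppendixEntry`; the original verdict and its version stay exactly as published.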
Read the v1.0.0 standard →
Our decision rule (v0 · draft)
How the four dimensions become one verdict
The standard fixes the label vocabulary; each evaluator publishes their own mapping. This is MC AI Labs' rule, applied before any evaluation and version-locked. Provisional at D+0 (process + coherence only); Confirmed at D+90 (adds outcome).
- 🔴 Hallucinated: the Accuracy/Hallucination axis fails on a non-existent source or ticker, or a fabricated figure. (hard override)
- 🟠 Unreliable: ≤3 of 6 process axes pass, or (Confirmed) D+90 excess < -3pp vs benchmark.
- 🟡 Questionable: 4 of 6 axes pass, or macro/micro coherence is Partial with no counter-scenario.
- 🔵 Acceptable: 5 of 6 axes pass, coherence ≥ Partial, and (Confirmed) D+90 return ≥ 0.
- 🟢 Trustworthy: all 6 axes pass, coherence Sufficient, and (Confirmed) D+90 return ≥ 0 with non-negative excess.
Draft v0, to be frozen and version-locked at first real publication. The rule is intentionally simple and conservative; it is published so anyone can replicate or contest a verdict.
Machine-readable rule (JSON) →
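To make the rule easier to replicate, here is a minimal Python sketch of one possible reading of it. It is not the published machine-readable rule. Several points are assumptions that the draft text leaves open: the `Evaluation` field names, a lowest coherence level called "Insufficient", treating the hallucination check as a separate flag, evaluating labels worst-first, reading the Acceptable axis count as a minimum, and falling back to Questionable when no higher label's conditions are met. Provisional verdicts (no D+90 data yet) skip the outcome conditions, as the rule describes.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed ordering; "Insufficient" is a hypothetical name for the lowest level.
COHERENCE_RANK = {"Insufficient": 0, "Partial": 1, "Sufficient": 2}


@dataclass(frozen=True)
class Evaluation:
    hallucination_failed: bool      # fabricated source, ticker, or figure
    axes_passed: int                # process axes passed, 0..6
    coherence: str                  # key of COHERENCE_RANK
    has_counter_scenario: bool
    d90_return_pp: Optional[float] = None  # None means Provisional (D+0)
    d90_excess_pp: Optional[float] = None  # excess vs benchmark, in pp


def verdict(e: Evaluation) -> str:
    confirmed = e.d90_return_pp is not None and e.d90_excess_pp is not None
    rank = COHERENCE_RANK[e.coherence]

    # Hard override: checked before anything else.
    if e.hallucination_failed:
        return "Hallucinated"
    if e.axes_passed <= 3 or (confirmed and e.d90_excess_pp < -3.0):
        return "Unreliable"
    if e.axes_passed == 4 or (e.coherence == "Partial"
                              and not e.has_counter_scenario):
        return "Questionable"

    # Outcome conditions apply only once D+90 data exists.
    return_ok = (not confirmed) or e.d90_return_pp >= 0
    excess_ok = (not confirmed) or e.d90_excess_pp >= 0

    if e.axes_passed == 6 and rank == 2 and return_ok and excess_ok:
        return "Trustworthy"
    if e.axes_passed >= 5 and rank >= 1 and return_ok:
        return "Acceptable"
    # Assumption: cases the draft does not cover resolve conservatively.
    return "Questionable"
```

For example, `verdict(Evaluation(False, 6, "Sufficient", True, 2.0, -1.0))` yields "Acceptable" under this reading: all axes pass and the return is non-negative, but the negative excess rules out Trustworthy.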