AIRVS — The Recommendation Provenance Standard  ·  v1.0.0  ·  DOI: 10.5281/zenodo.20391984

Governance

Trust is a process, published in full.

A standard earns authority only by being radically open. Here is how.

The five axes of trust

01

Full disclosure of evaluation data

Every score and its underlying data are published in the open.

02

Full disclosure of decision logs

How a verdict was reached is recorded and public — append-only.

03

7-day dispute SLA

Recommenders and reviewers can dispute a score; we respond within 7 days.

04

Academic citation circuit

CC BY 4.0 + DOI + ORCID + /cite — built to be referenced and challenged.

05

Citation-tracking transparency

Where AIRVS is cited is tracked openly at /citations.
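
One way to make the "append-only" property of a decision log checkable by outsiders is to chain each entry to the hash of the one before it. The sketch below only illustrates that idea: hash chaining, the `DecisionLog` class, and the entry fields are assumptions made for the example, not something the AIRVS standard specifies.

```python
import hashlib
import json


class DecisionLog:
    """Hypothetical append-only log: each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self._entries = []

    def append(self, payload: dict) -> str:
        """Add an entry and return its hash. Existing entries are never rewritten."""
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = json.dumps(payload, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode("utf-8")).hexdigest()
        self._entries.append({"prev": prev, "body": body, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or removed entry breaks it."""
        prev = self.GENESIS
        for entry in self._entries:
            expected = hashlib.sha256((prev + entry["body"]).encode("utf-8")).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

With a scheme like this, anyone holding a copy of the log can recompute the chain and detect a retroactive edit, which is the property a public decision log needs.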

Standard evolution policy

The standard is frozen — and alive

v1.0.0 is frozen, but a frozen standard that never evolves is dead. AIRVS evolves under Semantic Versioning on a fixed, public cadence — every change goes through an RFC comment period before it ships.

Cadence | Scope | Comment period
Quarterly · 4x/yr | PATCH fixes to evaluation axes | 30-day public comment
Semi-annual · 2x/yr | MINOR: add a new axis, backward-compatible | 30-day public comment, backward compatibility preserved
Annual · 1x/yr | MAJOR review of the object definition (v2.0+ candidate) | 60-day comment + 6-month dual run
Every 3 years | External audit (journal / standards-body background) | Independent reviewers
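
As a small illustration of the cadence table, the comment-period lengths can be turned into a date calculation. This is a sketch under assumptions: the function name and the convention that the period closes exactly N days after it opens are hypothetical, and the 6-month dual run required for MAJOR changes is not modelled.

```python
from datetime import date, timedelta

# Comment-period lengths in days, as listed in the cadence table.
COMMENT_DAYS = {"PATCH": 30, "MINOR": 30, "MAJOR": 60}


def comment_period_closes(opened: date, change_kind: str) -> date:
    """Return the assumed closing date of the public comment period.

    Assumption: the period closes exactly N days after it opens.
    MAJOR changes additionally require a 6-month dual run, which is
    not computed here.
    """
    return opened + timedelta(days=COMMENT_DAYS[change_kind])
```

For example, under these assumptions a PATCH RFC opened on 2025-01-01 would close for comment on 2025-01-31.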

SemVer

What a version number means

MAJOR

Breaking change to the object definition or scoring model.

MINOR

Backward-compatible addition — e.g. a new evaluation axis.

PATCH

Clarification or fix that changes no results.

Verifications are locked to the version that scored them. A later version never silently re-scores past records; re-evaluation is published as a separate, dated appendix.
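
The version-lock rule can be pictured as immutable records plus a separate appendix type. The following is a minimal sketch, assuming hypothetical record shapes (`Verification`, `ReevaluationAppendix`) and field names; the standard's actual object definition may differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Verification:
    """A scored record, locked to the standard version that produced it."""
    record_id: str
    standard_version: str
    label: str


@dataclass(frozen=True)
class ReevaluationAppendix:
    """A dated re-evaluation published alongside the original, never over it."""
    record_id: str
    standard_version: str
    label: str
    published_on: date


def reevaluate(original: Verification, new_version: str,
               new_label: str, published_on: date) -> ReevaluationAppendix:
    # The original record is left untouched; the new result is a separate object.
    return ReevaluationAppendix(original.record_id, new_version, new_label, published_on)
```

Because both dataclasses are frozen, an attempt to overwrite a past label raises an error instead of silently re-scoring the record.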

Our decision rule (v0 · draft)

How the four dimensions become one verdict

The standard fixes the label vocabulary; each evaluator publishes its own mapping from evidence to label. This is MC AI Labs' rule, applied before any evaluation and version-locked. A verdict is Provisional at D+0 (process + coherence only) and Confirmed at D+90 (adds outcome).

  1. 🔴 Hallucinated: the Accuracy/Hallucination axis fails on a non-existent source, ticker, or fabricated figure (hard override).
  2. 🟠 Unreliable: ≤3 of 6 process axes pass, or (Confirmed) D+90 excess < -3pp vs benchmark.
  3. 🟡 Questionable: 4 of 6 axes pass, or macro/micro coherence is Partial with no counter-scenario.
  4. 🔵 Acceptable: 5 of 6 axes pass, coherence ≥ Partial, and (Confirmed) D+90 return ≥ 0.
  5. 🟢 Trustworthy: all 6 axes pass, coherence Sufficient, and (Confirmed) D+90 return ≥ 0 with non-negative excess.

Draft v0 — to be frozen and version-locked at first real publication. The rule is intentionally simple and conservative; it is published so anyone can replicate or contest a verdict. Machine-readable rule (JSON) →
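
To show how the five-step rule could be replicated, here is one possible reading of it in code. It is a sketch, not the published machine-readable rule, and it rests on several assumptions that the text above does not settle:

- the field names and the `Coherence.INSUFFICIENT` level are hypothetical;
- the hallucination check is treated as a separate flag rather than one of the six counted axes;
- the three negative labels are checked first, in listed order, then Trustworthy, then Acceptable;
- "5 of 6" is read as "at least 5", so a 6-of-6 record that misses Trustworthy can still be Acceptable;
- a Provisional record (no D+90 data) skips the outcome conditions;
- any combination the rule does not cover falls back to Questionable.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Coherence(IntEnum):
    INSUFFICIENT = 0  # assumed name for any level below Partial
    PARTIAL = 1
    SUFFICIENT = 2


@dataclass(frozen=True)
class Evidence:
    hallucination_failed: bool
    axes_passed: int                         # number of process axes passed, 0..6
    coherence: Coherence
    has_counter_scenario: bool
    d90_return_pct: Optional[float] = None   # None means still Provisional
    d90_excess_pp: Optional[float] = None    # excess vs benchmark, percentage points


def verdict(e: Evidence) -> str:
    confirmed = e.d90_return_pct is not None and e.d90_excess_pp is not None

    if e.hallucination_failed:               # hard override
        return "Hallucinated"
    if e.axes_passed <= 3 or (confirmed and e.d90_excess_pp < -3):
        return "Unreliable"
    if e.axes_passed == 4 or (
        e.coherence == Coherence.PARTIAL and not e.has_counter_scenario
    ):
        return "Questionable"

    # Outcome conditions apply only once the record is Confirmed.
    return_ok = not confirmed or e.d90_return_pct >= 0
    excess_ok = not confirmed or e.d90_excess_pp >= 0

    if (e.axes_passed == 6 and e.coherence == Coherence.SUFFICIENT
            and return_ok and excess_ok):
        return "Trustworthy"
    if e.axes_passed >= 5 and e.coherence >= Coherence.PARTIAL and return_ok:
        return "Acceptable"
    return "Questionable"                    # assumed fallback for uncovered cases
```

Under these assumptions, a record with all six axes passed and Sufficient coherence is Trustworthy while Provisional, but drops to Acceptable at D+90 if its return is non-negative and its excess is slightly negative. Gaps such as the fallback case are the kind of detail a published rule lets readers contest.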