# /delegate & /validate v3 Spec – Explicit Redundancy & Consensus

> Terminology note: this spec describes **v3 of the `/delegate` and `/validate` endpoints** (and `factcheck`), not a “Router v3”. The router continues to evolve under the hood, but the public-facing versioning lives on the **endpoint contracts**.

***

### 1. Goal

v3 turns **redundancy + consensus** into explicit, configurable knobs at the **endpoint layer** for:

* `POST /api/v3/delegate`
* `POST /api/v3/validate`
* `POST /api/v3/factcheck` (name TBD, same pattern)

Instead of “one request → one session → whatever redundancy that session config happens to have”, v3 lets callers say:

* how many **independent sessions** should run the job (1 / 3 / 5), and
* how their results should be **aggregated** into a single verdict/answer.

Think of v3 as:

> “Agent-ready consensus”: explicit redundancy knobs + structured consensus metadata for `/delegate`, `/validate`, and `factcheck`.

***

### 2. v2 vs v3 – Mental Model

**v2 (today)**

* You call `/api/v2/delegate` or `/api/v2/validate` with:
  * v2 payloads: `objective/input/execution/policy` (delegate), `claim/policy/context` (validate).
  * A single `session_id`.
* That session may internally:
  * fan out to multiple miners,
  * run PoI/PoUW,
  * apply some redundancy.
* But from the endpoint’s point of view:
  * It is still **one session**, and
  * Consensus is **implicit** inside that session.

**v3 (this spec)**

* You call `/api/v3/delegate` or `/api/v3/validate` with:
  * a v3 payload that includes the **same core contract** as v2 *plus*
  * a new `consensus` block that explicitly asks for redundancy across **1 / 3 / 5 sessions**.
* The router:
  * selects the appropriate sessions,
  * runs the job across **multiple independent sessions** when requested,
  * aggregates results,
  * returns a **final answer + per-replica evidence + consensus metadata**.

***

### 3. Request Shape – Redundant Sessions as Policy

Every v3 request gains a `consensus` block (shape, not final schema):

* `replicas`
  * `1`, `3`, or `5`.
  * `1` runs a single session, as v2 does today; the response still carries `consensus` metadata.
  * `3` and `5` trigger multi-session runs.
* `session_pool`
  * A list of candidate `session_id`s such as `["201", "202", "203"]`.
  * Future: allow session “labels” or “profiles” instead of raw ids.
* `aggregation`
  * `majority` initially.
  * Future v4: `weighted`, `median`, `median_of_means`, `model_ensemble`, etc.
* `disagreement_policy`
  * How to behave when replicas disagree:
    * `return_all` – return all replica results + consensus summary (default, best for agents).
    * `fail_hard` – treat disagreement as an error.
    * `best_effort` – still pick a winner but annotate low confidence.

Router behavior for v3 endpoints:

1. `replicas = 1`
   * Behaves similarly to v2 today:
     * choose a single `session_id` (from `session_pool` or policy),
     * run the job there,
     * return a single result,
     * still attach `consensus` metadata (with `replicas = 1`).
2. `replicas = 3` or `5`
   * Router:
     * picks 3 or 5 sessions (from `session_pool`, routing policy, or both),
     * sends the same logical job to each session independently,
     * waits for all replicas (or enough of them to reach a majority) to finish within the request timeout,
     * aggregates their outputs into:
       * a **final answer/verdict**, and
       * a **per-replica evidence bundle + agreement metrics**.
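The multi-replica path above can be sketched with `asyncio`. This is a minimal sketch, not the router's implementation: `run_on_session` is a hypothetical placeholder for the real per-session dispatch, the record fields (`replica_id`, `status`, `output`) are invented for the example, and the sketch waits for all replicas up to the timeout rather than returning early once enough have finished.

```python
import asyncio


async def run_on_session(session_id: str, job: dict) -> dict:
    """Hypothetical placeholder: a real router would dispatch `job` to the session here."""
    await asyncio.sleep(0)
    return {"verdict": "VALID"}


async def fan_out(session_ids, job, timeout_s, runner=run_on_session):
    """Send the same job to every session and collect one record per replica."""
    tasks = {asyncio.create_task(runner(sid, job)): sid for sid in session_ids}
    _, pending = await asyncio.wait(list(tasks), timeout=timeout_s)

    # Replicas that missed the deadline are cancelled and reported as TIMEOUT.
    for task in pending:
        task.cancel()
    if pending:
        await asyncio.gather(*pending, return_exceptions=True)

    results = []
    for replica_id, (task, sid) in enumerate(tasks.items()):
        record = {"replica_id": replica_id, "session_id": sid}
        if task in pending:
            record["status"] = "TIMEOUT"
        elif task.exception() is not None:
            record["status"] = "ERROR"  # assumed label; the spec only names OK and TIMEOUT
        else:
            record.update(status="OK", output=task.result())
        results.append(record)
    return results


if __name__ == "__main__":
    records = asyncio.run(fan_out(["201", "202", "203"], {"claim": "example"}, timeout_s=5.0))
    print([r["status"] for r in records])
```

Keeping one record per selected session, even for timeouts, lets the fan-in step report a verdict entry for every replica.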

Applies to:

* `/api/v3/delegate` – redundant **execution** runs (multiple workflow executions).
* `/api/v3/validate` – redundant **verification** runs (multiple “judges”).
* `/api/v3/factcheck` – redundant **world/fact** checks (generalizing the current 1/3 pattern into a standard 1/3/5 contract).

***

### 4. Response Shape – Consensus Metadata

Every v3 response includes a `consensus` section, so agents can see **how** the answer was formed.

Example fields (shape, not final):

* `replicas`
  * How many independent runs were executed (`1`, `3`, or `5`).
* `agreement`
  * Fractional agreement, such as `0.67` for 2/3, `0.8` for 4/5.
* `verdicts`
  * An array of per-replica verdicts or coarse result labels:
    * For `/validate`: `["VALID", "VALID", "INVALID"]`.
    * For `factcheck`: `["TRUE", "TRUE", "FALSE"]`.
    * For `/delegate`: task-level statuses like `["OK", "OK", "TIMEOUT"]`.
* `strategy`
  * Aggregation strategy used (e.g. `"majority"` in v3).
* `confidence`
  * A derived confidence score combining:
    * agreement ratio,
    * tier,
    * per-session reliability (future: v4 hooks into PoI/PoUW reputation).

Examples:

* `/api/v3/delegate`
  * “2 out of 3 runs produced equivalent structured results under policy X; final output is consensus-merged; `confidence = 0.82`.”
* `/api/v3/validate` or `factcheck`
  * “3 out of 5 runs agreed that the claim is TRUE under this policy; `agreement = 0.6`, `confidence = 0.78`, `strategy = "majority"`.”

Key point: the caller doesn’t just get “an answer”. They also see **how many independent runs agreed**, **how they were aggregated**, and **how confident the network is**.
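To make the response fields concrete, here is a small sketch that derives a `consensus` section from per-replica verdicts. The function name `build_consensus` is hypothetical, and the `confidence` value is only a placeholder equal to the agreement ratio; the real score is expected to also reflect tier and per-session reliability.

```python
from collections import Counter


def build_consensus(verdicts, strategy="majority"):
    """Assemble an illustrative `consensus` response section from replica verdicts."""
    if not verdicts:
        raise ValueError("at least one replica verdict is required")
    # Share of replicas that returned the most common verdict.
    _, votes = Counter(verdicts).most_common(1)[0]
    agreement = round(votes / len(verdicts), 2)
    return {
        "replicas": len(verdicts),
        "agreement": agreement,
        "verdicts": list(verdicts),
        "strategy": strategy,
        # Placeholder assumption: confidence mirrors agreement in this sketch.
        "confidence": agreement,
    }


print(build_consensus(["VALID", "VALID", "INVALID"]))
```

With three replicas and one dissenting verdict, the `agreement` field comes out as `0.67`, matching the 2/3 example above.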

***

### 5. Endpoint-Level Versioning (v2 vs v3)

To keep things clean:

* v2 endpoints stay as:
  * `/api/v2/delegate`
  * `/api/v2/validate`
* v3 endpoints are:
  * `/api/v3/delegate`
  * `/api/v3/validate`
  * `factcheck` will follow the same convention when promoted.

Both sets can coexist:

* v2 = “contract-first, policy-aware, mostly single-session.”
* v3 = “contract-first **plus** explicit multi-session redundancy + consensus metadata.”

MCP / A2A / x402 mappings follow the same pattern:

* Tools like `cortensor_delegate_v3`, `cortensor_validate_v3` mirror the REST v3 semantics.
* x402 adds the same `consensus` semantics but under a pay-per-call wrapper.

***

### 6. What Needs to Change in the Router (Behavioral Summary)

To support `/delegate` & `/validate` v3, the router needs to:

1. **Session selection for 1 / 3 / 5 replicas**
   * Given a v3 request:
     * resolve the **candidate session set** from:
       * `session_pool` (explicit),
       * routing rules (e.g., “safe tier → sessions with heavy models + redundancy”),
       * optionally ERC-8004 / Corgent/Bardiel policies.
     * choose `replicas` distinct sessions that:
       * are live, healthy, and correctly configured,
       * satisfy redundancy / policy constraints (e.g., different miners / hardware slices if possible).
2. **Fan-out & fan-in**
   * For `replicas > 1`:
     * **fan-out**:
       * send the logical request to each selected session; each session receives a v2-like payload (objective/claim/policy) plus some v3 metadata (replica id, correlation id).
     * **fan-in**:
       * collect per-replica responses (including intermediate traces if requested),
       * normalize them into a common shape,
       * compute:
         * agreement metrics,
         * aggregated verdict or merged answer,
         * confidence score,
       * attach the full `consensus` block.
3. **Consensus strategy (v3 baseline)**
   * Start simple:
     * `aggregation = "majority"` for:
       * discrete verdicts (`VALID`/`INVALID`, `TRUE`/`FALSE`/`UNSURE`),
       * simple success/error flags on `/delegate`.
     * For numeric scores (e.g., risk scores), use:
       * simple average,
       * or majority on bucketed ranges.
   * Edge cases:
     * all replicas fail → `disagreement_policy` + error handling.
     * some replicas time out → ignore or treat as negative evidence depending on policy.
4. **`disagreement_policy`**
   * `return_all`:
     * always return all per-replica outputs + consensus block, even on disagreement.
   * `fail_hard`:
     * treat significant disagreement as `HTTP 409` or similar, with embedded replica evidence.
   * `best_effort`:
     * still provide a winner (majority), but include a low `confidence` and full per-replica evidence so agents can decide.
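The three `disagreement_policy` modes can be sketched as a single decision function. Several details here are assumptions: the spec leaves "significant disagreement" undefined, so the sketch uses a hypothetical `min_agreement` threshold (default `1.0`, meaning any dissent counts), and the body fields `final`, `low_confidence`, and `error` are invented names.

```python
from collections import Counter


def apply_disagreement_policy(verdicts, policy="return_all", min_agreement=1.0):
    """Return (http_status, body) for a list of per-replica verdicts."""
    if not verdicts:
        raise ValueError("at least one replica verdict is required")
    # Majority winner; on a tie, Counter keeps the first verdict encountered.
    winner, votes = Counter(verdicts).most_common(1)[0]
    agreement = votes / len(verdicts)
    body = {"final": winner, "agreement": agreement, "verdicts": list(verdicts)}

    if agreement >= min_agreement:
        return 200, body  # no significant disagreement, so the policy is not consulted
    if policy == "return_all":
        return 200, body  # everything is returned; the caller decides what to do
    if policy == "best_effort":
        return 200, {**body, "low_confidence": True}
    if policy == "fail_hard":
        return 409, {
            "error": "replica_disagreement",
            "agreement": agreement,
            "verdicts": list(verdicts),  # embedded replica evidence
        }
    raise ValueError(f"unknown disagreement_policy: {policy!r}")


status, body = apply_disagreement_policy(["VALID", "VALID", "INVALID"], "fail_hard")
print(status, body["error"])
```

A caller that tolerates one dissenting replica out of three could pass `min_agreement=0.6` so that only a deeper split triggers the policy.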

Implementation detail: the **endpoint contract** doesn’t mandate the exact confidence formula. It only guarantees that `replicas`, `agreement`, `verdicts`, `strategy`, and `confidence` are present and consistent with one another.
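One possible reading of "present and consistent" is sketched below as a client-side checker. The specific rules are assumptions rather than part of the contract: one verdict entry per replica, scores within `[0, 1]`, and `agreement` equal to the share of the most common verdict within rounding.

```python
from collections import Counter

REQUIRED_FIELDS = ("replicas", "agreement", "verdicts", "strategy", "confidence")


def check_consensus_block(block: dict) -> list:
    """Return a list of problems; an empty list means the block passes."""
    missing = [name for name in REQUIRED_FIELDS if name not in block]
    if missing:
        return [f"missing fields: {', '.join(missing)}"]

    problems = []
    verdicts = block["verdicts"]
    if block["replicas"] not in (1, 3, 5):
        problems.append("replicas must be 1, 3, or 5")
    if len(verdicts) != block["replicas"]:
        problems.append("verdicts must hold one entry per replica")
    for name in ("agreement", "confidence"):
        if not 0.0 <= block[name] <= 1.0:
            problems.append(f"{name} must lie in [0, 1]")
    if verdicts:
        # Assumed rule: agreement is the share of the most common verdict.
        top_share = max(Counter(verdicts).values()) / len(verdicts)
        if abs(top_share - block["agreement"]) > 0.01:
            problems.append("agreement does not match the verdict counts")
    return problems
```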

***

### 7. How Corgent & Bardiel Use v3

* **Corgent**
  * Primary **infra-native surface** for ERC-8004 and agent frameworks.
  * Will expose v3 semantics in its own APIs and onchain/offchain contracts:
    * For `/delegate v3`: “run this agent step across 3 sessions and give me consensus result + traces.”
    * For `/validate v3`: “judge this claim under policy X with 5 independent verifiers.”
  * Can map EIP-8183 / future EIPs to consensus metadata (e.g., storing validation artifacts that reference v3 consensus outputs).
* **Bardiel**
  * Virtual-native **service agent** that can **consume v3** under the hood:
    * For Virtual/GAME flows:
      * pick `replicas` based on risk (e.g., 1 for casual, 3 or 5 for economic actions),
      * use consensus metadata to decide whether to escalate, ask a human, or retry.
    * For ERC-8004 contexts:
      * can present itself as an 8004 agent that internally calls `/validate v3` for “oracle-grade” decisions, while exposing a simpler interface to users.

Both rely on the **same v3 endpoint contracts**, but frame them differently:

* Corgent = infra contract for agents & protocols.
* Bardiel = productized agent that uses v3 consensus internally for UX and safety.
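As a hedged illustration of how a consumer such as Bardiel might act on v3, the sketch below picks a replica count from a risk label and chooses a follow-up from the consensus metadata. The risk labels, the `0.8` confidence threshold, and the action names are hypothetical and are not defined by this spec.

```python
# Hypothetical risk labels mapped to the replica counts allowed by v3.
REPLICAS_BY_RISK = {"casual": 1, "economic": 3, "high_value": 5}


def pick_replicas(risk: str) -> int:
    """Choose how many independent sessions to request for a given risk label."""
    return REPLICAS_BY_RISK[risk]


def next_action(consensus: dict, min_confidence: float = 0.8) -> str:
    """Decide what to do with a v3 result based on its consensus metadata."""
    if consensus["confidence"] >= min_confidence:
        return "proceed"
    if consensus["replicas"] < 5:
        return "escalate"  # rerun with a higher replica count
    return "ask_human"  # already at maximum redundancy and still unsure


print(pick_replicas("economic"), next_action({"replicas": 3, "confidence": 0.6}))
```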

***

### 8. Roadmap Context (v2 → v3 → v4)

* **v2 (current)**
  * `/delegate v2`, `/validate v2`: structured payloads, policy tiers, single-session focus, implicit consensus inside a session.
* **v3 (this spec)**
  * `/delegate v3`, `/validate v3`, `factcheck v3`:
    * explicit redundancy / replicas at the endpoint,
    * multi-session routing,
    * structured consensus metadata,
    * uniform semantics across REST / MCP / A2A / x402.
* **v4 (later)**
  * builds on v3:
    * pluggable aggregation strategies,
    * reusable validation artifacts (referenceable IDs),
    * dynamic redundancy (auto-escalate/de-escalate based on risk, cost, and PoI/PoUW reputation),
    * deeper integration with ERC-8004 and EIP-8183-style proof formats.

For now, v3’s job is simple but important:

> Make redundancy and consensus **explicit, visible, and programmable** at the `/delegate` and `/validate` endpoint layer, so agents can stop guessing and start **tuning trust vs cost** with real knobs.
