Private / Encrypted Inference

Draft / WIP – this document captures the current plan for private / encrypted inference on Cortensor, starting with dedicated sessions and evolving toward SDK-native Web3 flows and, ultimately, TEE-backed confidential execution.

It complements the main architecture docs:

  • Technical: technical-architecture/data-management/private-encrypted-inference

  • Phase roadmap: roadmap/testnet-phase-3


1. Goals & Scope

The private / encrypted inference roadmap is designed to:

  • Let users encrypt prompts and results end-to-end for selected workflows.

  • Keep router & miners blind to plaintext wherever possible, while still routing correctly.

  • Start with config-driven dedicated sessions and gradually extend to:

    • Ephemeral / pooled nodes (dynamic assignment)

    • A unified key-issuance engine

    • SDK-native Web3 integration (wallet-based auth + policy)

    • TEE-backed confidential execution (Nitro / TDX) as the highest-assurance mode

We talk about “versions” in terms of the privacy/key system, not the router API:

  • V0 – dedicated sessions, router-managed allowlist + key derivation (env-based).

  • V0.5 – move allowlist to onchain contract, router becomes contract-enforced.

  • V1 – ephemeral / pooled node support (assignment-aware keys).

  • V2 – unified engine for all session types (pluggable policies).

  • V3 – SDK-native Web3 private inference (wallet + policy).

  • V4 – TEE-backed private inference (Nitro / TDX nodes as final privacy layer).

All versions assume offchain payload v2 for encrypted content (so rotation/backfill is possible).


2. V0 – Dedicated Sessions, Router-Managed Policy & Key Issuance

2.1 What V0 Supports

V0 focuses on dedicated-node sessions where the router knows the full path (User ↔ Router ↔ Miner) and can safely manage encryption keys.

New auth endpoints (concept):

  • POST /api/v1/auth/payload_enc_key/session → issue a session-level encryption key derived from session_id.

  • POST /api/v1/auth/payload_enc_key/task → issue a task-level encryption key derived from session_id + task_id.

Auth flow:

  1. Caller sends:

    • address (EOA)

    • signature over canonical scope string

    • scope fields:

      • For the session endpoint, the caller signs the string: "session_id"

      • For the task endpoint, the caller signs the string: "session_id:task_id"

  2. Router:

    • Verifies the signature with verify_addr_str_message(scope_string).

    • Checks address against ENCRYPTION_ALLOWED_LIST (V0 policy source).

    • If authorized, derives a payload_enc_key deterministically from:

      • ENCRYPTION_SEED

      • Scope (for example "101" or "101:9001").

  3. Router returns:

    • payload_enc_key (or a wrapped form)

    • scope / scope_type (session or task)

    • key_version (see keyring below)

Apps use this key to encrypt/decrypt payloads client-side; the router only sees encrypted blobs + metadata and routes them like any other payload.
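The request-building side of this flow can be sketched as follows. The endpoint paths and scope formats are from this section; the request body field names are assumptions:

```python
def scope_string(session_id, task_id=None):
    """Canonical scope: "session_id" for session keys, "session_id:task_id" for task keys."""
    return str(session_id) if task_id is None else f"{session_id}:{task_id}"

def key_request(address, signature, session_id, task_id=None):
    """Build the endpoint path and body for a payload_enc_key request.

    `signature` is the caller's EOA signature over the canonical scope string.
    Body field names here are illustrative, not a fixed API contract.
    """
    scope = scope_string(session_id, task_id)
    endpoint = ("/api/v1/auth/payload_enc_key/task" if task_id is not None
                else "/api/v1/auth/payload_enc_key/session")
    body = {"address": address, "signature": signature, "scope": scope}
    return endpoint, body

# Example: task-level key request for session 101, task 9001 → scope "101:9001".
endpoint, body = key_request("0xAbC...", "0xSig...", 101, 9001)
```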

2.2 Deterministic Key Derivation

Current prototype (short-term acceptable):

  • payload_key = SHA256(ENCRYPTION_SEED + ":" + scope)

Longer term, prefer HKDF with explicit versioning:

  • payload_key_vN = HKDF(seed_vN, info = "cts:v0:" + scope)

Key properties:

  • Deterministic for the router (given seed + scope).

  • Not guessable from session_id / task_id alone.

  • Security depends on strength of ENCRYPTION_SEED_vN:

    • Use at least 32 bytes of high-entropy randomness.

    • Rotate if exposed or suspected compromised.
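Both derivations can be sketched in a few stdlib lines. The "cts:v0:" info prefix is from this section; encoding it as bytes and using an empty HKDF salt are assumptions:

```python
import hashlib
import hmac

def derive_v0(seed: bytes, scope: str) -> bytes:
    # Current prototype: payload_key = SHA256(ENCRYPTION_SEED + ":" + scope).
    return hashlib.sha256(seed + b":" + scope.encode()).digest()

def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32) -> bytes:
    # RFC 5869 HKDF (extract + expand) over SHA-256, empty salt.
    prk = hmac.new(b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block = b"", b""
    for i in range(1, -(-length // 32) + 1):
        block = hmac.new(prk, block + info + bytes([i]), hashlib.sha256).digest()
        okm += block
    return okm[:length]

def derive_versioned(seed_vn: bytes, scope: str) -> bytes:
    # Preferred: payload_key_vN = HKDF(seed_vN, info = "cts:v0:" + scope).
    return hkdf_sha256(seed_vn, b"cts:v0:" + scope.encode())
```

Determinism falls out directly: the same seed and scope always yield the same key, while distinct scopes (for example "101" vs "101:9001") yield unrelated keys.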

2.3 Encrypted Payload Metadata (Offchain v2)

Encrypted payloads follow an offchain payload v2 pattern with explicit metadata, for example:
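A sketch of the v2 envelope, using the field list from Section 7; the alg value and placeholder values are illustrative, not a fixed schema:

```json
{
  "alg": "AES-256-GCM",
  "scope_type": "task",
  "session_id": 101,
  "task_id": 9001,
  "key_version": "v2",
  "nonce": "<base64>",
  "tag": "<base64>",
  "ciphertext": "<base64>",
  "created_at": "<ISO-8601 timestamp>"
}
```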

Rules:

  • Router does not see plaintext; it only uses:

    • session_id / task_id

    • scope_type

    • key_version

  • All encrypted prompts/results must use this v2 metadata format.

  • Plaintext may be kept separately only in trusted tooling during backfill/rotation.


3. V0 Key Rotation Model (Critical)

We must never run the network on a single long-lived seed.

3.1 Keyring Model

Router keeps a keyring:

  • active_version = vN

  • legacy_versions = [vN-1, vN-2, ...] (decrypt-only)

Behavior:

  • Issuance:

    • /auth/payload_enc_key/* always derives keys with active ENCRYPTION_SEED_vN.

    • Response includes key_version = "vN".

  • Encryption client-side:

    • Clients persist key_version inside the encrypted payload metadata.

  • Decryption:

    • Router first uses key_version from metadata.

    • Only if needed does it fall back to range mapping or legacy assumptions during migration.

3.2 Backfill & Retirement

For offchain v2 payloads:

  1. Background worker:

    • Reads ciphertext with legacy key_version.

    • Decrypts using the corresponding legacy seed.

    • Re-encrypts with active seed/version.

    • Updates stored payload + metadata in-place, so the offchain payload URN/ID stays the same.

  2. Once the SLO window passes and backfill coverage is acceptable:

    • Mark old versions as retired (no decrypt).

    • Optionally purge old ciphertexts or mark them invalid.

3.3 If a Seed is Stolen

If ENCRYPTION_SEED_vK is compromised:

  • Assume that ciphertexts under that version are compromised if an attacker can access them.

  • Immediate steps:

    • Rotate active_version to vK+1.

    • Stop issuing keys for vK.

    • Revoke affected sessions/leases as needed.

  • Recovery options:

    • For offchain v2 payloads:

      • Re-encrypt from trusted source data, or

      • Re-encrypt from ciphertexts decryptable with known-good key material.

    • For onchain:

      • Only store policy + references, never raw secrets or seeds.


4. V0 Policy – ENCRYPTION_ALLOWED_LIST (Dedicated Sessions)

4.1 Env Format

V0 uses an env-based allowlist to control which addresses can request keys for which scopes:

Semantics:

  • session_id:addr,addr → session-level allowlist.

    • Example: 101:0x11...,0x22... → addresses allowed for session 101.

  • session_id-task_id:addr,addr → task-level allowlist.

    • Example: 101-9001:0x44...,0x55... → addresses allowed for session 101, task 9001.

Rules:

  • No spaces; addresses must be 0x + 40 hex chars.

  • Router parses this once and uses it to grant/deny key issuance.
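A minimal parser for this format; the `;` entry separator is an assumption, and validation of the 0x + 40 hex rule is omitted for brevity:

```python
def parse_allowlist(raw: str) -> dict:
    """Parse "scope:addr,addr;scope:addr,..." into scope -> allowed addresses.

    Scopes are "101" (session-level) or "101-9001" (task-level).
    """
    out = {}
    for entry in filter(None, raw.split(";")):
        scope, _, addrs = entry.partition(":")
        out[scope] = {a for a in addrs.split(",") if a}
    return out

# Parsed once at startup, then used to grant/deny key issuance.
acl = parse_allowlist("101:0xAa,0xBb;101-9001:0xCc")
```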

This is V0 only and intended for dedicated-node sessions where operator and app are known/trusted.


5. V0.5 – Contract-Backed Allowlist

To move beyond static env-based policy, we introduce V0.5:

  • Authorization becomes:

    • signature_valid AND contract_policy_allows(scope, address).

  • Session-level and task-level grants are stored in an onchain contract.

  • Router:

    • Still verifies signatures and derives keys.

    • Queries contract (with caching) for allow/deny decisions.

  • Policy updates:

    • Emit events from the contract.

    • Routers subscribe or poll for updates to invalidate caches.

Env allowlist is kept as:

  • Emergency override / bootstrap mode.

  • Local dev / test environment fallback.
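The router-side caching described above might look like the following sketch; contract_policy_allows is this doc's conceptual check, while the class shape, TTL, and event-driven invalidation hook are assumptions:

```python
import time

class CachedContractAcl:
    """Cache allow/deny decisions read from the onchain policy contract."""

    def __init__(self, contract_allows, ttl=60.0):
        self.contract_allows = contract_allows  # callable: (scope, address) -> bool
        self.ttl = ttl
        self.cache = {}                         # (scope, address) -> (decision, fetched_at)

    def allows(self, scope, address):
        key = (scope, address)
        hit = self.cache.get(key)
        if hit and time.time() - hit[1] < self.ttl:
            return hit[0]                       # fresh cached decision
        decision = self.contract_allows(scope, address)
        self.cache[key] = (decision, time.time())
        return decision

    def invalidate(self, scope=None):
        # Called when a contract policy event arrives (subscribe or poll).
        if scope is None:
            self.cache.clear()
        else:
            self.cache = {k: v for k, v in self.cache.items() if k[0] != scope}
```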


6. V1 – Ephemeral / Dynamic Node Support

V1 extends private inference to ephemeral or pooled nodes (node-pool sessions).

Challenges:

  • Miner assignments change over time.

  • We can’t rely on static address lists alone.

Design points:

  1. Assignment-aware policy

    • Router confirms that a requesting miner:

      • Is currently assigned to the session/task for this epoch.

      • Has a valid lease or job assignment record.

  2. Replay resistance

    • Scope string extended for key issuance, for example:

      • "session:task:epoch:expires_at"

    • Caller signs this extended scope.

  3. Short-lived grants

    • Keys for ephemeral sessions/tasks should:

      • Be valid only for short windows (for example, minutes).

      • Align with job assignment lifetimes.

V1 still uses the same keyring, derivation style, and metadata format — the difference is how policy decides who gets keys.
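The extended scope and its expiry check can be sketched as follows; the field order is from this section, while integer unix timestamps for expires_at are an assumption:

```python
import time

def v1_scope(session_id, task_id, epoch, expires_at):
    # Extended scope string: "session:task:epoch:expires_at".
    return f"{session_id}:{task_id}:{epoch}:{expires_at}"

def grant_valid(scope, now=None):
    """Short-lived grant check: reject once expires_at has passed.

    Assignment and epoch validation would be layered on top of this.
    """
    now = time.time() if now is None else now
    expires_at = int(scope.rsplit(":", 1)[-1])
    return now < expires_at
```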


7. V2 – Unified Key-Issuance Engine

V2 unifies dedicated and ephemeral sessions under a single framework:

  • Same auth endpoints:

    • /api/v1/auth/payload_enc_key/session

    • /api/v1/auth/payload_enc_key/task

  • Same signature semantics:

    • Canonical scope strings; explicit scope_type.

  • Same metadata:

    • alg, scope_type, session_id, task_id, key_version, nonce, tag, ciphertext, created_at.

  • Same audit model:

    • Every issuance recorded as an event (who, what scope, which key_version).

Policy becomes pluggable:

  • EnvAdapter – env-based allowlist (V0).

  • ContractAclAdapter – onchain ACL for sessions/tasks (V0.5).

  • AssignmentAdapter – dynamic assignment info for ephemeral sessions (V1).

Router composes these adapters to compute:

  • authorized = signature_valid AND env_policy_allows(...) AND contract_policy_allows(...) AND assignment_policy_allows(...)

Encrypted workloads continue to use offchain payload v2 to keep rotation/backfill feasible at scale.
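The AND-composition above can be sketched with the adapter names from this section; the allows(scope, address) interface is an assumption:

```python
class EnvAdapter:
    """V0: env-based allowlist."""
    def __init__(self, allowlist):
        self.allowlist = allowlist            # scope -> set of addresses
    def allows(self, scope, address):
        return address in self.allowlist.get(scope, set())

class ContractAclAdapter:
    """V0.5: stands in for a cached onchain ACL read."""
    def __init__(self, acl):
        self.acl = acl                        # (scope, address) -> bool
    def allows(self, scope, address):
        return self.acl.get((scope, address), False)

def authorized(signature_valid, adapters, scope, address):
    # authorized = signature_valid AND every configured policy allows(...).
    return signature_valid and all(a.allows(scope, address) for a in adapters)
```

An AssignmentAdapter for V1 would plug into the same list, consulting live lease/assignment records instead of static data.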


8. V3 – SDK-Native Private Inference (Web3)

V3 focuses on a Web3 SDK–native experience:

  • Web3 client/SDK:

    • Manages wallet-based signing of scope strings.

    • Manages policy registration:

      • Which sessions/tasks require encryption.

      • Which addresses/roles can receive keys.

    • Manages key request + caching flows.

  • Router:

    • Remains:

      • Policy enforcement engine.

      • Key issuance service.

    • Does not hold long-lived plaintext DEKs beyond what’s needed to derive them.

  • Onchain:

    • Stores transparent policy state and references:

      • Which addresses are allowed.

      • Which sessions/tasks are “private-required”.

    • Never stores raw secrets or seeds.

  • 3-party patterns (optional, later):

    • User ↔ Miner ↔ Oracle flow for:

      • Encrypted inference.

      • Verifiable output checking.

    • Converge to envelope encryption:

      • One DEK per session/task.

      • DEK wrapped separately for each authorized recipient.


9. V4 – TEE-Backed Confidential Inference (Nitro / TDX)

V4 introduces TEE-backed nodes as the final, strongest privacy form: even when payloads must be decrypted for model execution, they are only ever decrypted inside a trusted execution environment (TEE) such as AWS Nitro Enclaves or Intel TDX.

9.1 Why TEE Nodes Matter

Encrypted payloads + key control protect data in transit and at rest, but at some point, models need plaintext inside RAM to run.

V4 aims to guarantee that:

  • Decryption and inference happen only inside a TEE with:

    • Hardware-backed isolation from the host OS/hypervisor.

    • Attestation that proves code + configuration to the router and/or caller.

  • Node operators (and cloud providers) cannot inspect plaintext payloads, only resource usage.

This becomes the “final privacy form” for high-sensitivity workloads.

9.2 Initial Target: AWS Nitro Enclaves

Past experience:

  • We’ve used AWS Nitro and Intel TDX for user data privacy projects.

  • Nitro is a practical first target because:

    • Once you have a Docker image, it is “one command away” to build a Nitro image.

    • It is straightforward to run a TEE-enabled container on AWS as an enclave.

Initial V4 rollout plan:

  • Start with Nitro-enabled miners:

    • Build enclave-compatible images from existing miner containers.

    • Expose a TEE runtime that can:

      • Receive encrypted payloads.

      • Obtain DEKs or wrapped keys only after successful attestation.

      • Run inference locally inside the enclave.

  • Later, extend to Intel TDX and other TEE platforms where available.

9.3 Key & Attestation Flow (High-Level)

A typical V4 flow (simplified):

  1. Miner node boots a TEE (Nitro/TDX) with a known enclave image.

  2. Enclave generates an attestation document describing:

    • Code hash / image hash.

    • Configuration (e.g., models, routes).

  3. Router (or a separate attestation service) verifies the attestation:

    • Ensures the enclave matches an allowlisted image/config.

  4. For an encrypted job:

    • Router derives or unwraps a DEK for the scope (session/task) as in V2/V3.

    • Router encrypts the DEK to the enclave’s public key (or uses TEE-specific key exchange).

    • Encrypted payload + wrapped DEK are delivered to the enclave.

  5. Inside the enclave:

    • Enclave unwraps DEK.

    • Decrypts payload.

    • Runs inference using local models.

    • Optionally re-encrypts results to the user’s key or to a new DEK.

  6. Router sees only:

    • TEE attestation claims.

    • Encrypted payloads and results.

    • Metering and success/failure status.

TEE nodes still integrate with:

  • Existing keyring versions (key_version).

  • Offchain payload v2 metadata (same envelope).

  • Policy adapters (e.g., only certain sessions/tasks may require TEE execution).

9.4 How V4 Fits the Roadmap

V4 builds on previous versions:

  • V0–V2:

    • Define scoped keys, rotation, and metadata.

  • V3:

    • Adds Web3 SDK and clean policy registration.

  • V4:

    • Uses those same scoped DEKs and policies.

    • Adds a TEE “execution shape” for miners:

      • Some sessions/tasks marked as privacy_mode = "TEE_REQUIRED" or similar.

      • Router routes those only to TEE-capable miners.

Over time:

  • Nitro-based TEE miners can be the first production path.

  • Intel TDX support can be added for on-prem or other cloud vendors.

  • Full design doc will extend this section with:

    • Attestation formats.

    • Routing policies.

    • Combined “TEE + encrypted payload” patterns.


10. Long-Term Rotation & Backfill at Scale

TEE-backed private inference doesn’t remove the need for operational key hygiene. Over the long run, Cortensor needs a rotation/backfill story that works at network scale, especially if a seed or key material is suspected to be compromised.

Key considerations:

  • Backfill jobs must preserve offchain URNs/IDs

    • For most apps, URNs or IDs pointing at offchain payloads (S3 objects, IPFS CIDs behind a pinning layer, etc.) should remain stable.

    • Backfill workers should:

      • Fetch existing ciphertext by URN.

      • Decrypt using the legacy key_version.

      • Re-encrypt with the active version.

      • Write back to the same URN / storage key so upstream references do not break.

  • Priority tiers for backfill

    • Not all encrypted data is equal:

      • “Hot” session/task payloads that agents still need.

      • “Warm” history needed for audits/repairs.

      • “Cold” archives that can be invalidated with lower user impact.

    • Backfill runners should support:

      • Priority queues by namespace / project / app.

      • Configurable SLOs (for example: 95% of hot data re-encrypted within 24h).

  • Compromised-key scenarios

    • If a seed or specific key_version is suspected compromised:

      • Immediately mark that version as “compromised, decrypt-only”.

      • Block new issuance for that version (only decrypt for backfill).

      • Kick off backfill against payloads tagged with that key_version.

      • Offer app-level knobs:

        • “Hard invalidate” (refuse to decrypt) for the most sensitive flows.

        • “Best-effort backfill” for less critical archives.

  • Shard-aware backfill

    • In a multi-router or multi-region world, backfill should be sharded:

      • Each router (or worker pool) handles a subset of URNs or key ranges.

      • Progress tracked via a central index (for example, “encrypted payload index” keyed by key_version).

      • Operators can see:

        • How many payloads per key_version remain.

        • Estimated time to completion.

  • TEE + rotation interplay

    • When TEE nodes are in use:

      • Backfill workers can optionally run inside TEEs as well, so decrypt/re-encrypt never leaves an enclave.

      • A hybrid approach is also possible:

        • Use a privileged “maintenance enclave” with separate attestation + rate limits for rotation jobs.

    • Over time, high-sensitivity tenants may require:

      • “All decrypt/re-encrypt operations must happen within TEE nodes only.”
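The in-place re-encryption step at the heart of backfill can be sketched as follows. The cipher here is an illustrative SHA-256 counter keystream so the example stays stdlib-only; a real worker would use an AEAD such as AES-256-GCM with fresh nonces, and the record/store shapes are assumptions:

```python
import hashlib

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Illustrative stream cipher: XOR with a SHA-256 counter keystream.
    # NOT production crypto; stands in for the real AEAD.
    stream = b""
    for i in range(-(-len(data) // 32)):
        stream += hashlib.sha256(key + nonce + i.to_bytes(4, "big")).digest()
    return bytes(a ^ b for a, b in zip(data, stream))

def backfill_one(store, urn, seeds, active_version):
    """Re-encrypt one offchain v2 payload in place.

    The URN stays stable; only the ciphertext and key_version behind it change.
    """
    rec = store[urn]
    legacy_seed = seeds[rec["key_version"]]               # decrypt-only legacy key
    plain = keystream_xor(legacy_seed, rec["nonce"], rec["ciphertext"])
    rec["ciphertext"] = keystream_xor(seeds[active_version], rec["nonce"], plain)
    rec["key_version"] = active_version                   # write back to same URN
```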

The main takeaway:

  • Keys will rotate.

  • Some keys may become compromised.

  • The design assumes:

    • Offchain payload v2 is always re-encryptable in place.

    • URNs/IDs stay stable while the ciphertext behind them gets upgraded.

    • Rotation/backfill is a first-class, observable process — not an afterthought.


11. Concrete V0 Improvements (Next Work Items)

To harden V0 and prepare for V0.5/V1/V2/V3/V4:

  1. Add key_version to responses:

    • /auth/payload_enc_key/session

    • /auth/payload_enc_key/task

  2. Require clients to persist key_version in encrypted payload metadata.

  3. Introduce router keyring env format:

    • ENCRYPTION_SEED_ACTIVE and ENCRYPTION_SEED_LEGACY_* (or similar).

  4. Implement decrypt flow with key_version-first resolution:

    • Use key_version in metadata.

    • Only fall back to legacy/range mapping during migration.

  5. Add backfill worker for offchain v2 ciphertext re-encryption to active version (preserving URNs).

  6. Enforce that encrypted sessions/tasks must use offchain payload v2 for prompts/results.


12. Summary

  • V0 gives dedicated-session private inference: router-managed policy, deterministic per-scope keys, and env-based allowlists.

  • V0.5 moves policy into a contract-backed allowlist, with env as emergency fallback.

  • V1 extends privacy to ephemeral / dynamic node pools, with assignment-aware rules and short-lived grants.

  • V2 unifies everything behind a single key-issuance engine and pluggable policy adapters, using the same metadata and audit model.

  • V3 layers on a Web3 SDK–native experience and envelope encryption, turning private inference into a first-class, programmable feature for onchain + offchain apps.

  • V4 adds TEE-backed confidential inference (AWS Nitro / Intel TDX) so that decryption + execution happen only inside hardware-backed enclaves, giving the strongest privacy guarantees for high-sensitivity workloads.

  • Across all versions, key rotation and backfill are treated as ongoing operational duties:

    • Encrypted payloads live in offchain payload v2.

    • URNs remain stable while ciphertext is upgraded.

    • Compromised keys can be contained via rotation + re-encryption rather than breaking references or rewriting app-level contracts.

This roadmap is intentionally incremental: start with dedicated sessions and env-based allowlists, then steadily layer in contracts, dynamic node pools, unified engines, SDKs, and finally TEE-backed execution — without breaking the core guarantees users expect from private inference on Cortensor.
