
Embedding Vector Distance

Cortensor leverages Embedding Vector Distance as a core mechanism in its Proof of Inference (PoI) validation process. This approach ensures consistency, reliability, and trustworthiness of AI inference outputs across decentralized nodes, forming the foundation of quality assurance in the Cortensor network.


What is Embedding Vector Distance?

Embedding vector distance refers to the mathematical measurement of similarity between two or more output vectors produced by AI models. When identical inputs are processed by the same model across multiple nodes, their outputs—despite slight variations due to computational or environmental factors—should exhibit high similarity. This similarity is quantified using techniques like cosine similarity or Euclidean distance, providing a robust method to evaluate output consistency.
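In standard notation, for two embedding vectors a and b, the two metrics mentioned above are:

```latex
\cos(\theta) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
\qquad
d(a, b) = \lVert a - b \rVert_2 = \sqrt{\sum_i (a_i - b_i)^2}
```

Cosine similarity ranges from -1 to 1 and depends only on the angle between the vectors, while Euclidean distance is non-negative and also sensitive to vector magnitude.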


How It Works in Cortensor

  1. Input Distribution: A predefined input (Input A) is sent to multiple miner nodes running identical AI models (e.g., LLaMA or LLaVA). Each node processes this input independently.

  2. Output Generation: Each node produces an inference output based on the input and model. Due to decentralized computation and hardware variability, the outputs may slightly differ in form or structure.

  3. Embedding Creation: The inference outputs are transformed into embedding vectors, which capture their semantic meaning in a numerical format. These embeddings are model-agnostic representations of the output content.

  4. Similarity Measurement: Using embedding distance techniques, the system measures the similarity between the embeddings generated by different nodes. For example:

    • Cosine Similarity: Measures the angle between two vectors, where a value close to 1 indicates high similarity.

    • Euclidean Distance: Measures the straight-line distance between two vectors, where smaller values indicate higher similarity.

  5. Reliability Check: Nodes whose outputs deviate beyond a predefined threshold (e.g., a cosine similarity score below 0.85) may be flagged for inconsistent or unreliable performance.
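The similarity and reliability checks in steps 4 and 5 can be sketched as follows. This is a minimal illustration using only the Python standard library; the embedding step itself is assumed to happen upstream (via an embedding model), and the 0.85 threshold is the example value from step 5, not a fixed network parameter:

```python
import math

SIMILARITY_THRESHOLD = 0.85  # example threshold from step 5

def cosine_similarity(a, b):
    # Angle-based similarity: values close to 1 indicate high similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Straight-line distance: smaller values indicate higher similarity.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def flag_unreliable(embeddings, reference, threshold=SIMILARITY_THRESHOLD):
    """Return the node IDs whose embedding falls below the similarity
    threshold relative to a reference embedding (e.g. a majority centroid)."""
    return [node_id for node_id, emb in embeddings.items()
            if cosine_similarity(emb, reference) < threshold]
```

Which metric (and threshold) is appropriate depends on the embedding model and task; cosine similarity is the more common choice for text embeddings because it ignores vector magnitude.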


Why Embedding Vector Distance is Essential

  • Ensures Consistency: By running identical inputs across multiple nodes and measuring the similarity of outputs, Cortensor ensures consistent task execution.

  • Validates Reliability: Embedding distance acts as a safeguard against dishonest nodes or those failing to perform properly.

  • Enhances Decentralized Trust: In a decentralized network, embedding distance provides a quantifiable and verifiable measure of output reliability, fostering trust between miners and users.

  • Facilitates Redundancy: By producing multiple inference variants and validating them against one another, the network can select the most reliable outputs or offer users multiple valid results.


Role in Proof of Inference (PoI)

Embedding vector distance forms the backbone of Cortensor's PoI validation process:

  • Redundancy and Validation: Nodes are tasked with the same inference job. The outputs are compared to ensure they align within acceptable similarity thresholds.

  • Consensus Formation: A majority agreement based on embedding similarity establishes a validated inference result.

  • Cheat Detection: Nodes producing significantly different embeddings are flagged, ensuring the network maintains high reliability and accountability.
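The three mechanisms above can be combined into a single consensus pass. The sketch below assumes each node's output has already been embedded; the pairwise-agreement quorum rule and all names are illustrative, not Cortensor's actual implementation:

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def poi_consensus(embeddings, threshold=0.85):
    """Validate nodes by pairwise embedding agreement.

    embeddings: dict mapping node_id -> embedding vector.
    A node is validated if it agrees (similarity >= threshold) with
    more than half of its peers; the rest are flagged as outliers.
    Returns (validated_node_ids, flagged_node_ids).
    """
    nodes = list(embeddings)
    agreement = {n: 0 for n in nodes}
    for a, b in combinations(nodes, 2):
        if cosine_similarity(embeddings[a], embeddings[b]) >= threshold:
            agreement[a] += 1
            agreement[b] += 1
    quorum = (len(nodes) - 1) / 2  # more than half of the peers
    validated = [n for n in nodes if agreement[n] > quorum]
    flagged = [n for n in nodes if n not in validated]
    return validated, flagged
```

This captures the redundancy, consensus, and cheat-detection roles in one step: honest nodes cluster together and cross-validate each other, while a deviating node fails to reach quorum and is flagged.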


Use Cases

  1. Multi-Node AI Inference: When users request AI tasks, multiple miners process the same input to ensure consistency. Embedding vector distance helps validate the outputs.

  2. Synthetic Data Validation: For synthetic data generation, embedding similarity ensures that the produced data matches the expected output characteristics of the generating model.

  3. Quality Assurance in Training: Developers can use embedding distances to evaluate the performance and robustness of models deployed across diverse hardware in the network.


Future Enhancements

Embedding vector distance will evolve as Cortensor introduces Node Reputation Systems:

  • Time-Series Analysis: Track node reliability over time using embedding metrics.

  • Dynamic Thresholds: Adjust similarity thresholds dynamically based on the complexity of tasks or user-defined accuracy preferences.

  • Cross-Model Validation: Expand embedding comparisons across different but compatible models to support heterogeneous miner nodes.


Technical Illustration

Let’s consider an example:

  • Input: "What is the capital of France?"

  • Nodes: 5 miners running the same LLaMA model.

  • Outputs:

    • Node 1: "Paris is the capital of France."

    • Node 2: "The capital of France is Paris."

    • Node 3: "Paris is France's capital."

    • Node 4: "The capital of Paris is France." (anomaly)

    • Node 5: "France's capital city is Paris."

  • Embeddings:

    • Node 1: [0.23, 0.11, 0.87, ...]

    • Node 2: [0.24, 0.10, 0.86, ...]

    • Node 3: [0.22, 0.12, 0.88, ...]

    • Node 4: [0.45, 0.32, 0.74, ...] (outlier)

    • Node 5: [0.21, 0.10, 0.89, ...]

  • Similarity Scores:

    • Nodes 1, 2, 3, 5: High similarity (~0.95 cosine similarity)

    • Node 4: Low similarity (~0.65 cosine similarity)

Result: Nodes 1, 2, 3, and 5 are validated. Node 4 is flagged as an outlier.
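The walkthrough above can be reproduced in code. Note that the three-component vectors here are illustrative stand-ins rather than real embeddings (actual embeddings have hundreds of dimensions, and the truncated prefixes shown above are too short to separate the outlier on their own), so Node 4's vector is chosen to point in a visibly different direction:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical low-dimensional embeddings for the five node outputs.
embeddings = {
    "node1": [0.23, 0.11, 0.87],
    "node2": [0.24, 0.10, 0.86],
    "node3": [0.22, 0.12, 0.88],
    "node4": [0.80, 0.55, 0.20],  # the anomalous "capital of Paris" output
    "node5": [0.21, 0.10, 0.89],
}

reference = embeddings["node1"]
flagged = [node for node, emb in embeddings.items()
           if cosine_similarity(reference, emb) < 0.85]
# flagged -> ["node4"]; the other nodes score well above the threshold.
```

In practice the reference would not be a single node's embedding but something more robust, such as the centroid of the majority cluster.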


Conclusion

Embedding vector distance ensures Cortensor can maintain high-quality, reliable AI inference results in an untrusted decentralized environment. By integrating this technique into PoI validation, Cortensor sets a standard for trust and accountability in decentralized AI.
