AI Inference

AI inference is the core capability of the Cortensor network, which distributes AI computations across a decentralized set of nodes. This section describes the mechanisms and processes behind inference and how they address performance, inclusivity, and security.

Overview

Cortensor's AI inference uses a distributed network of miner nodes to run AI models. The system supports a range of hardware, from low-end devices to high-end GPUs, so that a broad set of participants can contribute. The primary model family currently supported is Llama 3, available in both quantized and regular (non-quantized) versions; the quantized versions allow lower-end devices to contribute effectively.

AI Inference Process

Task Initiation:

Users create sessions and submit prompts through router nodes.
Router nodes verify session parameters, including payment and model specifications, before processing the request.
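As a rough illustration of the verification step, the sketch below checks a session's model and payment fields before a request is accepted. The `Session` fields, the `SUPPORTED_MODELS` set, and the balance rule are all hypothetical; the source does not specify a session schema or pricing rule.

```python
from dataclasses import dataclass

# Hypothetical model identifiers, chosen only for this example.
SUPPORTED_MODELS = {"llama-3", "llama-3-quantized"}


@dataclass
class Session:
    session_id: str
    model: str
    prepaid_balance: float  # assumed unit: network tokens
    price_per_task: float   # assumed flat price per inference task


def verify_session(session: Session) -> list[str]:
    """Return a list of problems; an empty list means the session is accepted."""
    problems = []
    if session.model not in SUPPORTED_MODELS:
        problems.append(f"unsupported model: {session.model}")
    if session.price_per_task <= 0:
        problems.append("price_per_task must be positive")
    if session.prepaid_balance < session.price_per_task:
        problems.append("insufficient balance for one task")
    return problems
```

Returning a list of problems rather than a single boolean lets a router report every failed check to the user in one response.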

Task Allocation:

Router nodes dynamically allocate inference tasks to suitable miner nodes.
Allocation algorithms consider node performance, current workload, and specific task requirements to optimize resource utilization.
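The allocation criteria above (performance, current workload, task requirements) can be sketched as a filter followed by a score. This is a minimal illustration, not Cortensor's actual algorithm: the `MinerNode` fields and the scoring formula are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class MinerNode:
    node_id: str
    performance: float  # assumed benchmark score, higher is better
    active_tasks: int   # current workload
    memory_gb: float    # assumed stand-in for "task requirements"


def pick_node(nodes, required_memory_gb):
    """Pick the eligible node with the best load-adjusted performance."""
    # Task requirements: drop nodes that cannot hold the model at all.
    eligible = [n for n in nodes if n.memory_gb >= required_memory_gb]
    if not eligible:
        return None
    # Performance is discounted by the number of tasks already running.
    return max(eligible, key=lambda n: n.performance / (1 + n.active_tasks))
```

With this scoring, an idle mid-range node can be preferred over a fast but busy one, which is the kind of trade-off the allocation step is described as making.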

Inference Execution:

Miner nodes perform the assigned AI inference tasks.
Tasks are segmented into smaller subtasks to enhance processing efficiency and balance the workload.
Model quantization allows lower-end devices to handle inference tasks, promoting inclusivity.
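The source does not say how tasks are segmented. One simple possibility, shown here purely as an assumption, is to split a batch of prompts into fixed-size subtasks that can be handed to different miner nodes. The function name `segment` and the `max_per_subtask` parameter are hypothetical.

```python
def segment(items, max_per_subtask):
    """Split a list of work items into consecutive chunks of bounded size."""
    if max_per_subtask < 1:
        raise ValueError("max_per_subtask must be at least 1")
    return [
        items[i:i + max_per_subtask]
        for i in range(0, len(items), max_per_subtask)
    ]
```

For example, `segment(["p1", "p2", "p3", "p4", "p5"], 2)` yields three subtasks, the last of which holds the single leftover prompt.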

Result Submission:

Miner nodes submit the results securely through encrypted channels.
Results are sent to the router nodes for initial aggregation and verification.
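One plausible reading of "initial aggregation" is that the router reassembles subtask outputs in their original order and rejects incomplete sets. The sketch below assumes subtasks are identified by a zero-based index and produce text; both assumptions go beyond what the source states.

```python
def aggregate(results: dict[int, str], expected: int) -> str:
    """Join subtask outputs by index, refusing to aggregate if any are missing."""
    missing = [i for i in range(expected) if i not in results]
    if missing:
        raise ValueError(f"missing subtasks: {missing}")
    # Results may arrive in any order; the index restores the original order.
    return "".join(results[i] for i in range(expected))
```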

Validation:

Validation nodes or other miner nodes verify the inference results.
Validation methods include semantic checks, embedding comparisons, and checksum verifications.
Users can configure the level of validation required, trading cost against accuracy.
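Two of the validation methods named above, checksum verification and embedding comparison, can be sketched with the standard library alone. This is an illustration under assumptions: the embedding vectors are taken as given (a real validator would obtain them from an embedding model), and the `0.9` similarity threshold is an arbitrary example value.

```python
import hashlib
import math


def checksum(text: str) -> str:
    """SHA-256 digest of the result text, used for exact-match checks."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def validate(result_text, reference_text, result_vec, reference_vec, threshold=0.9):
    """Accept on an exact checksum match, else fall back to embedding similarity."""
    if checksum(result_text) == checksum(reference_text):
        return True
    return cosine_similarity(result_vec, reference_vec) >= threshold
```

The checksum catches byte-identical results cheaply, while the embedding comparison tolerates outputs that differ in wording but are close in meaning, which matters because generated text is often not reproduced exactly across nodes.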

Result Delivery:

Validated results are delivered to users through their preferred channels.
The router node ensures secure and efficient result delivery while maintaining user privacy.

Security and Privacy

Encrypted Communication:

All communications within the network are encrypted to ensure data privacy and integrity.
Router nodes manage encryption and decryption, ensuring secure interactions between clients and miner nodes.

Validation and Verification:

Validation nodes verify the accuracy of AI inference results.
Users can specify the level of validation they require; the chosen level affects the cost of a request and the confidence they can place in its output.
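To make the cost and accuracy trade-off concrete, the sketch below models validation levels as a number of independent validators and adds a per-validator fee to the base price. The level names, validator counts, and fee structure are all hypothetical; the source does not describe how validation is priced.

```python
from enum import Enum


class ValidationLevel(Enum):
    # Value = assumed number of independent validators for that level.
    NONE = 0
    BASIC = 1
    STRICT = 3


def estimate_cost(base_fee, validator_fee, level):
    """Base inference fee plus one validator fee per validator used."""
    return base_fee + validator_fee * level.value
```

Under these assumed numbers, `estimate_cost(1.0, 0.25, ValidationLevel.STRICT)` comes to `1.75`, against `1.0` with validation disabled.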

Inclusivity through Quantization

Model Quantization:

Cortensor employs model quantization to support a diverse range of hardware, including lower-end devices.
Quantization allows devices with limited computational power to perform inference tasks, which enlarges the pool of usable nodes and improves the network's scalability and resource utilization.
Supporting Llama 3 in both quantized and regular (non-quantized) versions lets nodes with different hardware capabilities participate and execute tasks efficiently.
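For readers unfamiliar with quantization, the sketch below shows the general idea using symmetric 8-bit quantization: floating-point weights are mapped to small integers plus one shared scale factor, which cuts memory use at the cost of a bounded rounding error. This is a generic textbook scheme shown for illustration; the source does not state which quantization method Cortensor's Llama 3 builds use.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]
```

Each weight stored this way takes one byte instead of the two or four bytes of a 16-bit or 32-bit float, and the reconstruction error per weight is at most about half of `scale`. Smaller memory and bandwidth requirements are what make it feasible for lower-end devices to hold and run a model.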

