# Cortensor Portal

> Status: Draft specification and roadmap reference\
> Scope: Hosted Cortensor Portal product built on top of managed router fleets\
> Purpose: Define the intended product shape, system architecture, and phased rollout path for Cortensor Portal

***

### 1. Overview

Cortensor Portal is a proposed **hosted API product** built on top of managed Cortensor router infrastructure.

Instead of requiring users to operate their own router nodes, sessions, and supporting infrastructure, the portal provides a **managed control plane and API gateway** where developers can:

* sign up and authenticate
* create and revoke API keys
* call a stable hosted inference API
* view usage, limits, and model access
* consume Cortensor as a product rather than raw infrastructure

The intended V1 mental model is:

> **An OpenAI-style hosted inference API backed by managed Cortensor router pools**

This framing keeps the developer experience simple while preserving the flexibility of Cortensor’s underlying network.

***

### 2. Product Goals

The Portal should provide a stable hosted surface for inference while hiding most raw Cortensor topology and operational complexity.

#### 2.1 Primary Goals

* Expose a **simple hosted API** for inference.
* Allow users to interact with Cortensor **without running their own router**.
* Centralize:
  * API key management
  * entitlement checks
  * rate limits
  * quota logic
  * usage and billing state
* Keep router pools and backend session layouts **replaceable** without breaking user integrations.

#### 2.2 Non-Goals for Early Versions

The first versions should not try to expose the full complexity of the network.

Avoid making early Portal versions:

* a thin dashboard directly attached to one raw router
* a power-user infra console for raw session orchestration
* a marketplace-like capacity routing product on day one
* a “support every model and every backend mode” surface

Early phases should prioritize **hosted inference simplicity** over exposing all Cortensor-native controls.

***

### 3. Core Design Principles

#### 3.1 Separate Product Logic from Router Logic

The Portal should **not** be just a frontend attached directly to one router node.

Instead, the system should be logically separated into three layers:

1. **Portal / Product Layer**
2. **Control / Gateway Layer**
3. **Router Fleet / Inference Layer**

This separation matters because:

* business logic should not live inside router nodes
* customer API keys should not directly hit raw routers
* model routing and router pools should be replaceable
* billing, rate limiting, policy enforcement, and observability should stay centralized

#### 3.2 Stable Product Surface, Replaceable Backend

The Portal API should remain stable even if:

* router pools change
* session IDs change
* models are re-mapped internally
* capacity is rebalanced or drained

Users should integrate with **Portal-facing model aliases and API semantics**, not router internals.

#### 3.3 Centralized Enforcement, Decentralized Execution

The Portal should centralize:

* authentication
* authorization
* API key handling
* rate limiting
* quota checks
* billing usage capture
* product policies

The Cortensor router fleet remains the backend for:

* inference execution
* routing into the network
* session-level orchestration
* future validation / delegation / advanced agent flows

***

### 4. High-Level Architecture

The recommended high-level flow is:

**User → Portal API / Gateway → Portal control logic → router pool → Cortensor router backend**

Avoid exposing this shape as the primary product surface:

**User → raw router node**

#### 4.1 Logical Layers

**Portal / Product Layer**

User-facing application layer for:

* auth and account management
* API key CRUD
* usage dashboard
* model catalog / documentation
* quotas / plans / billing visibility
* playgrounds and future product UX

**Control / Gateway Layer**

Hot-path request layer responsible for:

* validating Portal API keys
* checking model entitlement
* applying rate limits
* applying quota checks
* selecting backend router pool
* request normalization
* usage event emission
* stable response shaping

**Router Fleet / Inference Layer**

Managed Cortensor backend consisting of:

* multiple router nodes
* model-specific router pools
* dedicated-backed session groups underneath
* load balancing / health / draining logic
* future burst / fallback / hybrid capacity logic

***

### 5. System Components

#### 5.1 Portal Frontend

The frontend web application should provide:

* sign-in / auth
* org/account views
* API key creation and revocation
* usage and limits dashboard
* model catalog
* docs / examples / playground
* future billing / credits management

#### 5.2 Portal Backend

Portal app backend responsibilities:

* user/org session handling
* API key metadata CRUD
* dashboard APIs
* usage summaries
* billing/account state
* admin tooling
* coordination with gateway and durable storage

#### 5.3 Inference Gateway

The gateway is the critical hot-path service. Responsibilities:

* receive customer inference requests
* validate Portal API keys
* look up entitlement / plan access
* enforce rate limits
* enforce quotas
* map Portal model alias → router pool
* forward request to router fleet
* normalize / wrap responses
* emit usage and metering events
* stamp request IDs and trace metadata

#### 5.4 Router Fleet Manager

Whether implemented as a separate service or as internal control logic, the system needs a router-fleet management function that understands:

* router health
* router pool health
* session-to-model mapping
* live vs draining pools
* degraded pools
* failover behavior
* future autoscaling / replacement logic

#### 5.5 Durable Product Data Store

Durable storage is needed for:

* identity
* orgs
* API key metadata
* plan state
* usage ledgers
* request logs
* router pool metadata
* model access policies

Supabase is a strong candidate here.

#### 5.6 Hot-Path Runtime State Layer

A separate fast state layer should be used for:

* rate limiting
* burst smoothing
* short-TTL key policy cache
* short-window request counters
* fast deny/allow checks

Examples:

* Redis
* Upstash
* managed equivalent fast key-value system

***

### 6. Product Positioning

#### 6.1 Initial Positioning

For the first Portal milestone, the product should be framed as a:

> **Hosted inference product**

rather than:

> **Managed gateway into the Cortensor network**

Reasoning:

* easier to explain
* easier to support
* easier to price
* easier to keep consistent while backend evolves
* reduces exposure of raw router/session/network concepts too early

#### 6.2 Longer-Term Positioning Evolution

As the Portal matures, it can gradually expose more Cortensor-native concepts where useful, especially for:

* advanced routing control
* performance / trust tiers
* agent-native workflows
* future validation / delegation APIs
* hybrid dedicated + ephemeral offerings

The roadmap should therefore evolve from:

* **V1:** hosted inference first\
  to
* **later versions:** managed gateway into Cortensor network capabilities where product maturity supports it

***

### 7. Suggested Model Offering Strategy

#### 7.1 V1 Model Catalog

V1 should start with a **narrow and curated** model catalog.

Initial candidates:

* OSS 20B
* OSS 120B
* Gemma 4 family
* newer Qwen family

#### 7.2 Why Start Narrow

Reasons to start with a smaller set:

* easier operational predictability
* easier debugging
* simpler support burden
* cleaner pricing/quota design
* clearer product messaging
* lower risk than exposing every model supported by raw Cortensor infra

#### 7.3 Model Pool Abstraction

Internally, model families should map to **router pools**.

Example conceptual pools:

* `oss-20b pool`
* `oss-120b pool`
* `gemma4 pool`
* `qwen pool`

Then the gateway resolves a Portal-facing model alias to a backend pool.

Example:

* customer requests `gemma-4-26b`
* gateway maps it to `gemma4 router pool`
* load balancer / fleet manager selects a healthy router in that pool
* router uses dedicated-backed sessions underneath

This keeps internal topology hidden from customers.

#### 7.4 Later Model Expansion

Later milestones can expand model coverage in stages:

* more open-source families
* tiered performance variants
* possible premium / trust-optimized models
* selective hybrid or burst-backed offerings

Model expansion should be tied to:

* operational maturity
* quota/billing clarity
* supportability

***

### 8. Capacity Strategy – Dedicated vs Ephemeral

#### 8.1 V1 Recommendation: Dedicated-Backed First

For the first hosted version, the recommendation is:

> Use **dedicated-backed router pools** for the primary product path.

Reasons:

* more predictable latency
* more consistent output behavior
* easier quota/billing semantics
* easier failure analysis
* better supportability
* easier to set customer expectations

#### 8.2 Capacity Roadmap

The Portal capacity strategy should evolve in phases:

* **V1:** dedicated-backed hosted inference
* **V2:** hybrid bursting for selected lower-cost or flexible tiers
* **V3:** more dynamic marketplace-like capacity strategy

#### 8.3 Tradeoff Summary

**Dedicated nodes**

* cleaner
* more predictable
* more supportable
* better for early commercial productization

**Ephemeral nodes**

* more flexible
* potentially more cost-efficient
* better fit for dynamic capacity markets
* but introduces more variability and operational complexity

***

### 9. Supabase in the Architecture

Supabase is a reasonable choice for the **Portal-side source of truth**.

#### 9.1 Good Uses for Supabase

Supabase fits well for:

* user auth
* organizations
* org membership
* API key metadata
* model access policies
* router pool metadata
* usage events
* request logs
* billing ledger
* subscriptions / credits / plan state
* internal/admin metadata

#### 9.2 Likely Early Tables / Entities

* `users`
* `organizations`
* `organization_members`
* `api_keys`
* `model_access_policies`
* `router_pools`
* `router_pool_models`
* `usage_events`
* `request_logs`
* `billing_ledger`
* `plan_subscriptions`

#### 9.3 Important Caveat

Supabase should be the **durable source of truth**, but **not** the only hot-path enforcement layer.

This distinction matters because raw DB checks on every request do not scale well operationally.

***

### 10. Hot Path Enforcement Model

#### 10.1 Why Not Raw DB Reads Per Request

Naively checking the DB on every inference request for:

* key validity
* plan access
* model entitlement
* quota exhaustion
* rate limit counters

may work for a very small MVP, but becomes painful quickly.

Problems:

* higher latency
* DB contention on bursty traffic
* fragile rate-limit counters
* difficult horizontal scaling
* awkward burst handling
* slower request-time policy decisions

#### 10.2 Recommended Split

Use the DB for:

* key metadata
* org/plan state
* source-of-truth quotas
* durable usage ledger
* revocation state
* durable billing state

Use a runtime fast layer for:

* rate limiting
* burst control
* short-window counters
* hot key policy cache
* fast request allow/deny decisions

This is the core Portal hot-path principle.

***

### 11. Rate Limit, Quota, and Billing

These should be treated as **separate product primitives** and rolled out with increasing sophistication over time.

#### 11.1 Rate Limit

Purpose:

* anti-abuse
* load smoothing
* burst control

Examples:

* requests per second
* requests per minute
* concurrent request caps

Operationally:

* enforced in the gateway hot path
* backed by fast runtime state

#### 11.2 Quota

Purpose:

* enforce plan consumption limits
* implement prepaid / monthly limits

Examples:

* requests per month
* tokens per month
* daily image generations
* monthly credit caps

Operationally:

* durable source of truth in DB/ledger
* hot-path checks can rely on cached snapshots
* should not depend on naive synchronous DB increments for every request at scale

#### 11.3 Billing Ledger

Purpose:

* durable auditable source for accounting and product logic

Examples:

* request count
* token count
* model-specific usage
* premium-tier surcharges
* latency / SLA dimensions

Operationally:

* durable store
* potentially written asynchronously from gateway usage events

#### 11.4 Billing Milestone Guidance

Suggested rollout:

* **V1:** prepaid credits or monthly quota
* **V2:** refined quota tiers / model-specific pricing
* **V3+:** richer billing dimensions if product traction justifies it

***

### 12. API Key Architecture

The Portal should expose **Portal-native API keys**, not raw router-native credentials.

#### 12.1 Recommended Key Design

* only store **hashed** keys
* keep short prefixes for display / search
* support:
  * revocation
  * expiration
  * scopes
  * environment segmentation

Suggested key families:

* `pk_live_xxx`
* `pk_test_xxx`

Possible scopes:

* inference only
* usage read-only
* model-specific access
* environment-specific access

This makes the product feel clean and lets the gateway enforce future rules more safely.

#### 12.2 Portal Key Flow

* customer uses Portal API key
* gateway verifies key and status
* gateway resolves metadata / entitlement
* gateway never passes customer key directly to raw router nodes

***

### 13. Logical Service Split

Even if V1 ships as one deployable codebase, the design should think in three logical services.

#### 13.1 Portal App Backend

Responsibilities:

* auth
* API key CRUD
* dashboard
* org/account management
* admin tooling
* usage views

#### 13.2 Inference Gateway

Responsibilities:

* receive inference calls
* authenticate keys
* apply rate limits / quota checks
* map model alias to router pool
* normalize responses
* emit request / usage events

#### 13.3 Router Fleet Manager

Responsibilities:

* router health awareness
* router pool state
* draining / maintenance / failover
* future autoscaling
* router replacement
* pool scoring and selection metadata

This logical split keeps product logic separate from infrastructure logic even if early implementation is consolidated.

***

### 14. Gateway Strategy Roadmap

The architecture should allow several gateway implementations, but the spec should define a milestone path rather than present them as permanent parallel choices.

#### 14.1 Milestone 1 – Fast Product Launch Path

A practical early milestone can use:

* Supabase
* lightweight Portal backend
* managed API key / rate-limit support such as Unkey or similar
* managed router pools underneath

Purpose:

* accelerate first external Portal MVP
* minimize time to first commercial product surface

#### 14.2 Milestone 2 – Portal-Controlled Gateway Layer

As Portal becomes more central, the recommended architecture shifts toward:

* Supabase for durable product/account state
* **custom lightweight gateway**
* Redis / Upstash / equivalent for runtime hot-path enforcement
* managed router pools underneath

Purpose:

* preserve custom control over:
  * model routing
  * quota semantics
  * usage and billing logic
  * response normalization
* reduce dependence on third-party gateway product semantics

#### 14.3 Milestone 3 – More Advanced Gateway / Edge / Enterprise Paths

Later versions may add or evaluate:

* edge gateway deployments (e.g. Cloudflare-style)
* managed gateway products where useful
* enterprise-style gateways if needed
* more advanced policy / analytics / audit controls

These should be treated as **later roadmap expansions**, not early blockers.

***

### 15. Recommended Architecture Milestones

#### 15.1 V1 – Hosted Inference Portal

Target shape:

* Supabase for auth and durable product DB
* lightweight gateway layer
* fast runtime state for rate limits
* managed router pools
* dedicated-backed sessions per model family
* narrow model catalog
* simple prepaid credits or quota

Primary objective:

* launch a usable hosted inference product with strong operational clarity

#### 15.2 V2 – Hybrid Productization

Target additions:

* hybrid dedicated + selective burst capacity
* richer quota and billing semantics
* stronger router fleet scoring and traffic shifting
* improved admin and operational tooling
* broader but still curated model catalog

Primary objective:

* improve efficiency and product flexibility without breaking the Portal-facing API

#### 15.3 V3 – More Dynamic Capacity & Network Exposure

Target additions:

* more dynamic capacity strategies
* deeper Cortensor-native routing visibility for advanced users
* possible exposure of trust/performance tiers
* more agent-native or programmable product surfaces

Primary objective:

* evolve from hosted inference product into a more powerful managed gateway into the Cortensor network where appropriate

***

### 16. Request Flow (Recommended V1 Baseline)

Suggested request flow:

1. client sends request with Portal API key
2. gateway validates key and key status
3. gateway checks cached entitlement / key policy
4. gateway enforces rate limit via fast runtime state
5. gateway checks quota / plan allowance
6. gateway maps requested model alias to router pool
7. gateway forwards request to chosen router group
8. router serves inference via managed dedicated session(s)
9. gateway emits usage / metering event
10. durable usage ledger and request logs are updated

Optional later refinements:

* asynchronous billing ledger aggregation
* event queue between gateway and usage processor
* reconciliation jobs comparing gateway records and router completion records

***

### 17. Suggested Portal MVP Scope

#### 17.1 User-Facing V1 Features

* sign-in / auth
* org/account identity
* API key create / revoke
* simple docs / API reference
* minimal usage dashboard
* few launch models only
* basic quota / error messaging

#### 17.2 Backend V1 Features

* API key verification
* revocation support
* per-key rate limiting
* simple quota checks
* model allowlist / entitlement
* request logging with request IDs
* router pool selection
* basic admin visibility

#### 17.3 V1 Billing Primitive

Keep billing simple initially:

* prepaid credits, or
* monthly quota / subscription

Avoid launching with too many billing dimensions too early unless there is a strong product reason.

***

### 18. Operational Requirements

Even a narrow V1 should include basic operability from the beginning.

#### 18.1 Strongly Recommended Early Capabilities

* request ID tracing
* per-key usage metering
* per-model latency dashboards
* per-model error dashboards
* per-router-pool health dashboards
* admin ability to disable a model quickly
* admin ability to drain / pause a router pool quickly

These are necessary for real product operation, not just a technical demo.

***

### 19. Risks to Watch

#### 19.1 Product / Architecture Risks

* mixing business logic into router nodes
* exposing raw session/router topology too early
* supporting too many models too early
* making direct DB reads part of every hot-path decision
* underbuilding observability
* unclear quota / billing semantics

#### 19.2 Infrastructure Risks

* weak router pool health awareness
* poor failover behavior
* no clear separation between customer API and backend router fleet
* difficulty reconciling usage / completion state if metering is underdesigned

#### 19.3 Product Experience Risks

* too much Cortensor-native vocabulary in user-facing APIs/docs
* customer confusion about models / tiers / limits
* internal backend behavior leaking into the product surface

***

### 20. Roadmap Guidance

#### 20.1 Near-Term Product Direction

The immediate roadmap should favor:

* hosted inference first
* dedicated-backed router pools
* narrow model set
* centralized Portal control plane
* simple product semantics
* strong observability from day one

#### 20.2 Medium-Term Direction

Once the core hosted inference product is stable, the roadmap should add:

* stronger custom gateway control
* hybrid capacity tiers
* richer metering / quota / pricing logic
* more operational automation around router fleets

#### 20.3 Long-Term Direction

As the Portal matures, it can evolve toward:

* more dynamic capacity strategies
* deeper Cortensor-native features for advanced users
* more sophisticated product tiers and policy knobs
* broader agent-ready and network-aware surfaces

The roadmap should therefore preserve **architectural control and replaceability** from the beginning, even if early milestones remain intentionally simple.

***

### 21. Open Questions for Next Iteration

#### 21.1 Product Questions

* Should V1 be framed entirely as hosted inference, or should some Cortensor-native concepts remain visible?
* What should the first pricing primitive be?
  * prepaid
  * subscription quota
  * hybrid
* Should Portal expose latency/performance tiers in V1?

#### 21.2 Gateway Questions

* How quickly should Portal move from MVP gateway choices into a more custom gateway layer?
* Should per-key quota be:
  * request-based
  * token-based
  * both
* How much response normalization should happen in the Portal vs pass-through from router responses?

#### 21.3 Fleet Questions

* How should router pool health be tracked and scored?
* How should traffic shift during degradation or maintenance?
* When should ephemeral capacity be introduced, and for which product tiers?

#### 21.4 Data / Metering Questions

* What should be stored synchronously vs asynchronously?
* How should usage reconciliation be handled if gateway success and router completion differ?
* How should model access policies be versioned when offerings change?

***

### 22. Working Summary

Cortensor Portal should be built as a **managed hosted API product** in front of router fleets, not as a thin dashboard directly attached to a router node.

A strong V1 architecture is:

* Supabase for auth and durable product state
* a dedicated gateway layer for customer inference traffic
* a fast runtime state layer for rate limiting and short-window policy checks
* managed router pools behind load balancers
* dedicated-backed session groups for a small model catalog

Later milestones can expand into hybrid capacity, richer pricing/quotas, and more network-native controls.

This keeps the **Portal API stable** while allowing the underlying Cortensor router/session topology to evolve over time.

***


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.cortensor.network/community-and-ecosystem/products-and-agents/cortensor-portal.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
