Cortensor Portal V1 Detailed Spec 07 - Router Pools / Model Product Layer

Status

Draft

Purpose

This document defines how Portal V1 should present models as product-facing offerings while routing them to managed backend router pools.

The goal is to keep the customer-facing model surface simple while preserving flexibility in the backend fleet and session topology.


1. Summary

Portal V1 should expose a small curated model catalog and map those product-facing model names onto managed router pools backed by dedicated sessions.

Customers should see:

  • stable product model names

Customers should not need to see:

  • raw router node topology

  • underlying session IDs

  • router allocation details

  • fleet internals or model-hosting complexity

The Portal layer should act as the product abstraction that translates clean model names into backend router pools.


2. Goals

This layer should aim to:

  • keep the model surface simple

  • make routing replaceable behind a stable API

  • support predictable hosted inference

  • reduce operational complexity for V1

  • preserve room to evolve backend topology later without breaking users


3. Initial Model Catalog

The initial product catalog should likely stay narrow.

Good candidates:

  • OSS 20B

  • OSS 120B

  • Gemma 4 family

  • Qwen family

Why start narrow

A smaller launch model set makes V1:

  • easier to explain

  • easier to support

  • easier to price

  • easier to scale incrementally

  • easier to monitor operationally

Portal V1 should optimize for clarity and predictability, not maximum model breadth.


4. Pool Model

Conceptually, Portal V1 should have backend groups like:

  • oss-20b pool

  • oss-120b pool

  • gemma4 pool

  • qwen pool

Each pool represents a managed backend serving group for a specific model family or hosted offering.

The Portal backend should map customer-facing model aliases onto those pools.

Conceptual flow

  • customer requests a product-facing model alias

  • Portal control layer resolves that alias to a pool

  • a healthy router target is selected from that pool

  • the request is forwarded to backend inference infrastructure

This allows product-level stability while backend routing remains flexible.
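
The first three steps of the flow above can be sketched as follows. This is a minimal illustration, not the Portal implementation: the alias table, the pool inventory, the placeholder URLs, and the names RouterTarget, Pool, and resolve_target are all assumptions made for the example. "First healthy target wins" stands in for whatever selection policy V1 adopts, and the forwarding step is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class RouterTarget:
    """One router endpoint inside a pool (hypothetical shape)."""
    url: str
    healthy: bool = True


@dataclass
class Pool:
    """A managed serving group for one model family or hosted offering."""
    name: str
    targets: list[RouterTarget] = field(default_factory=list)


# Hypothetical alias table: product-facing model name -> pool name.
ALIAS_TO_POOL = {
    "gpt-oss-20b": "oss-20b",
    "gpt-oss-120b": "oss-120b",
}

# Hypothetical pool inventory; the URLs are placeholders.
POOLS = {
    "oss-20b": Pool("oss-20b", [
        RouterTarget("http://router-a.internal", healthy=False),
        RouterTarget("http://router-b.internal"),
    ]),
    "oss-120b": Pool("oss-120b", [RouterTarget("http://router-c.internal")]),
}


class UnknownModel(Exception):
    """The requested alias is not in the product catalog."""


class NoHealthyTarget(Exception):
    """The pool exists but has no healthy router target right now."""


def resolve_target(alias: str) -> RouterTarget:
    """Resolve alias -> pool -> first healthy target (assumed policy)."""
    pool_name = ALIAS_TO_POOL.get(alias)
    if pool_name is None:
        raise UnknownModel(alias)
    for target in POOLS[pool_name].targets:
        if target.healthy:
            return target
    raise NoHealthyTarget(pool_name)
```

The caller only ever supplies the alias; the pool name and target URL stay inside the Portal layer.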


5. Dedicated-Backed Recommendation

Dedicated-backed sessions should be the primary serving foundation for V1.

Reasons

  • more predictable latency

  • more predictable output behavior

  • easier support

  • easier debugging

  • easier quota and billing design

  • easier to reason about for a hosted product surface

Portal V1 is meant to feel like a stable hosted API product, and dedicated-backed pools best support that in the first release.

Ephemeral Capacity

Ephemeral capacity can remain a later evolution path rather than the primary V1 foundation.

Likely progression:

  • V1: dedicated-backed pools only

  • later: selective hybrid or burst strategies where useful


6. Product Model Naming

Portal should expose stable model names that are easy to document and call.

The naming strategy should prioritize:

  • clarity

  • consistency

  • product framing

Naming principles

Product names should:

  • feel hosted and stable

  • not expose router/session internals

  • be easy to use in docs, SDKs, and examples

  • stay consistent even if backend mappings change

Examples of the kind of naming Portal might use:

  • gpt-oss-20b

  • gpt-oss-120b

  • gemma-4-26b

  • qwen-...

The exact aliases can be finalized later, but the key rule is:

Customers integrate with product names, not with pool names or session IDs.
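
One way to keep the naming principles enforceable is a small check applied to any proposed public alias. The sketch below is an assumption, not a decided rule: the pattern (lowercase letters and digits separated by hyphens) and the reserved-word list are hypothetical, chosen only to match the example names above.

```python
import re

# Assumed format: lowercase alphanumeric segments joined by single hyphens.
_ALIAS_PATTERN = re.compile(r"^[a-z][a-z0-9]*(?:-[a-z0-9]+)*$")

# Assumed reserved words: terms that would leak backend internals into a name.
_RESERVED_SEGMENTS = {"pool", "router", "session"}


def is_valid_alias(alias: str) -> bool:
    """Return True if the alias is well-formed and hides backend internals."""
    if not _ALIAS_PATTERN.match(alias):
        return False
    # Reject any hyphen-separated segment that names an internal concept.
    return not (_RESERVED_SEGMENTS & set(alias.split("-")))
```

Under these assumptions, gpt-oss-20b and gemma-4-26b pass, while a name such as oss-20b-pool is rejected because it exposes a pool.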


7. Routing Responsibilities

The Portal control layer should own:

  • resolving the requested model alias

  • identifying the eligible backend pool

  • picking a healthy target

  • hiding internal topology from customers

This preserves freedom to:

  • move pools

  • rotate routers

  • rebalance capacity

  • change session topology

  • adjust backend infrastructure over time

without breaking customer integrations.

Key architectural rule

Model routing should be:

  • Portal-owned

  • not hardcoded ad hoc across the stack

  • not exposed directly through the customer API surface
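
The practical effect of this rule is that a backend change is a single update in one Portal-owned place and is invisible to customers. A minimal sketch, assuming a hypothetical ModelRouting class and a hypothetical replacement pool named oss-20b-v2:

```python
class ModelRouting:
    """Single Portal-owned source of truth for alias -> pool (hypothetical)."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self._mapping = dict(mapping)  # copy so callers cannot mutate it

    def public_aliases(self) -> list[str]:
        """The only part of the mapping a customer-facing surface may show."""
        return sorted(self._mapping)

    def resolve(self, alias: str) -> str:
        """Internal lookup used by the routing path."""
        if alias not in self._mapping:
            raise LookupError(f"unsupported model: {alias}")
        return self._mapping[alias]

    def remap(self, alias: str, new_pool: str) -> None:
        """Move an existing alias to a different pool; the alias is unchanged."""
        if alias not in self._mapping:
            raise LookupError(f"unsupported model: {alias}")
        self._mapping[alias] = new_pool


routing = ModelRouting({"gpt-oss-20b": "oss-20b", "gemma-4-26b": "gemma4"})
before = routing.public_aliases()
routing.remap("gpt-oss-20b", "oss-20b-v2")  # hypothetical replacement pool
after = routing.public_aliases()
```

Here before and after are identical, which is the property the rule protects: pools can move while the customer-visible alias set stays fixed.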


8. Health and Availability

Portal-side routing should eventually be aware of:

  • healthy pools

  • draining pools

  • paused pools

  • degraded pools

V1 stance

V1 does not need an advanced health-scoring system at launch.

However, the architecture should allow:

  • pool-level health awareness

  • target selection that avoids obviously bad backends

  • future pool drain / pause behavior

This means even if the first implementation is simple, the design should assume that routing decisions may later depend on:

  • pool health

  • maintenance state

  • capacity state

  • temporary model disablement
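
A simple first implementation could treat pool state as an enum and filter on it. The sketch below is illustrative only. It assumes that degraded pools still serve traffic as a last resort while draining and paused pools accept no new requests, and it models temporary disablement as a set of pool names; none of these policies are fixed by this spec.

```python
from enum import Enum


class PoolState(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    DRAINING = "draining"
    PAUSED = "paused"


# Assumption: lower rank is preferred; states missing here take no new traffic.
_PREFERENCE = {PoolState.HEALTHY: 0, PoolState.DEGRADED: 1}


def eligible_pools(
    states: dict[str, PoolState],
    disabled: frozenset[str] = frozenset(),
) -> list[str]:
    """Return pool names that may take new requests, most preferred first."""
    ranked = []
    for name, state in states.items():
        if name in disabled:
            continue  # stand-in for temporary model disablement
        rank = _PREFERENCE.get(state)
        if rank is not None:
            ranked.append((rank, name))
    # Sort by preference rank, then by name for a deterministic order.
    return [name for _, name in sorted(ranked)]
```

Because the routing decision goes through one function, a richer health score can later replace the rank table without touching callers.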


9. UX Implications

The Portal web app should present:

  • a list of supported models

  • optional notes on each model family

  • short descriptions of each hosted model, where useful

The Portal UI should avoid:

  • deep infra controls

  • raw router IDs

  • raw session IDs

  • pool topology panels

  • backend fleet details in the customer-facing experience

The customer should see a hosted model catalog, not an infrastructure layout.

Good V1 UI behavior

  • simple model list in docs or onboarding

  • optional model-family badges or short descriptions

  • stable product names used consistently across:

    • UI

    • docs

    • API examples
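
One way to keep infrastructure details out of the customer-facing experience is to project catalog records through an explicit allow-list before they reach the UI, docs, or API responses. The record shape, field names, and placeholder values below are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Internal catalog record (hypothetical shape)."""
    name: str                     # stable product model name
    family: str                   # model family label
    description: str              # short hosted model description
    pool: str                     # internal: backend pool name
    session_ids: tuple[str, ...]  # internal: backing session IDs


# Allow-list: only these fields may appear in customer-facing output.
PUBLIC_FIELDS = ("name", "family", "description")


def public_view(entry: CatalogEntry) -> dict[str, str]:
    """Return only the customer-safe fields of a catalog entry."""
    return {field: getattr(entry, field) for field in PUBLIC_FIELDS}


entry = CatalogEntry(
    name="gpt-oss-20b",
    family="example-family",             # placeholder value
    description="Placeholder description.",
    pool="oss-20b",
    session_ids=("session-placeholder",),
)
card = public_view(entry)
```

An allow-list fails safe: a new internal field added to the record later stays hidden unless someone deliberately adds it to PUBLIC_FIELDS.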


10. Operational Implications

This layer also defines some backend operational expectations.

Pool design expectations

Portal should eventually know:

  • which pools exist

  • which product models map to which pools

  • whether a pool is:

    • live

    • draining

    • paused

    • degraded

Why this matters

This enables:

  • controlled rollout of new models

  • temporary disabling of unstable model groups

  • safer maintenance workflows

  • support tooling that remains aligned with the product layer

Even if these controls are not fully surfaced in V1, the model-to-pool mapping should still be treated as durable Portal-side product metadata.
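
Treating the mapping as durable metadata implies a record that can be validated, stored, and reloaded. The sketch below assumes a hypothetical ModelPoolMapping record and uses JSON only as a stand-in for whatever store the data-model spec defines; the state names are taken from the list above.

```python
import json
from dataclasses import asdict, dataclass, replace

# Pool states as listed in this section.
POOL_STATES = {"live", "draining", "paused", "degraded"}


@dataclass(frozen=True)
class ModelPoolMapping:
    """One durable alias -> pool record (hypothetical shape)."""
    alias: str
    pool: str
    state: str = "live"

    def __post_init__(self) -> None:
        # Reject unknown states at construction time, including on reload.
        if self.state not in POOL_STATES:
            raise ValueError(f"unknown pool state: {self.state}")


def dump_mappings(mappings: list[ModelPoolMapping]) -> str:
    """Serialize records to JSON text."""
    return json.dumps([asdict(m) for m in mappings], sort_keys=True)


def load_mappings(text: str) -> list[ModelPoolMapping]:
    """Rebuild validated records from JSON text."""
    return [ModelPoolMapping(**row) for row in json.loads(text)]


mappings = [
    ModelPoolMapping("gpt-oss-20b", "oss-20b"),
    ModelPoolMapping("gpt-oss-120b", "oss-120b", state="degraded"),
]
restored = load_mappings(dump_mappings(mappings))

# Temporarily disabling a model group is a state change, not a deletion.
paused = replace(restored[0], state="paused")
```

Keeping the state on the record means maintenance and rollout actions are edits to product metadata and leave the customer-facing alias untouched.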


11. Open Questions

Open questions for later refinement:

  • which exact launch model aliases should be public?

  • should the UI show model families or individual concrete model names?

  • how much pool health should be surfaced outside admin tools?

  • should some model aliases intentionally hide exact backend size/version for product simplicity?

  • when should ephemeral or hybrid capacity become visible in the product layer, if ever?


12. Relationship to Other Specs

This spec connects directly to:

  • 04-free-plan-rate-limits-and-gateway.md

  • 05-portal-backend-control-plane.md

  • 06-data-model-and-durable-state.md

  • 08-usage-metering-and-billing.md

  • 09-admin-and-ops.md

It also supports the UI/UX and docs specs, since model naming and supported offerings affect:

  • onboarding

  • docs examples

  • API examples

  • pricing / quota communication


13. Working Summary

Portal V1 should expose a small hosted model catalog and route those product-facing names into managed backend router pools.

The backend should start with:

  • model-specific pools

  • dedicated-backed sessions

  • simple, healthy-target selection

  • Portal-owned model alias mapping

In one sentence:

The Router Pools / Model Product Layer gives Portal V1 a stable customer-facing model catalog while keeping the underlying router pools, sessions, and fleet topology hidden and replaceable.
