Cortensor Portal V1 Detailed Spec 07 - Router Pools / Model Product Layer
Status
Draft
Purpose
This document defines how Portal V1 should present models as product-facing offerings while routing them to managed backend router pools.
The goal is to keep the customer-facing model surface simple while preserving flexibility in the backend fleet and session topology.
1. Summary
Portal V1 should expose a small curated model catalog and map those product-facing model names onto managed router pools backed by dedicated sessions.
Customers should see:
stable product model names
Customers should not need to see:
raw router node topology
underlying session IDs
router allocation details
fleet internals or model-hosting complexity
The Portal layer should act as the product abstraction that translates clean model names into backend router pools.
2. Goals
This layer should aim to:
keep the model surface simple
make routing replaceable behind a stable API
support predictable hosted inference
reduce operational complexity for V1
preserve room to evolve backend topology later without breaking users
3. Recommended Launch Model Shape
The initial product catalog should likely stay narrow.
Good candidates:
OSS 20B
OSS 120B
Gemma 4 family
Qwen family
Why start narrow
A smaller launch model set makes V1:
easier to explain
easier to support
easier to price
easier to scale incrementally
easier to monitor operationally
Portal V1 should optimize for clarity and predictability, not maximum model breadth.
4. Pool Model
Conceptually, Portal V1 should have backend groups like:
oss-20b pool
oss-120b pool
gemma4 pool
qwen pool
Each pool represents a managed backend serving group for a specific model family or hosted offering.
The Portal backend should map customer-facing model aliases onto those pools.
Conceptual flow
customer requests a product-facing model alias
Portal control layer resolves that alias to a pool
a healthy router target is selected from that pool
the request is forwarded to backend inference infrastructure
This allows product-level stability while backend routing remains flexible.
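The conceptual flow above can be sketched in Python. This is a minimal illustration under stated assumptions, not the Portal implementation: the alias and pool names, the `RouterTarget` shape, the in-memory tables, and `resolve_target` are all hypothetical, and the final forwarding step is left out.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterTarget:
    """One backend router inside a pool (hypothetical shape)."""
    router_id: str
    healthy: bool


# Hypothetical alias -> pool mapping, owned by the Portal control layer.
ALIAS_TO_POOL = {
    "gpt-oss-20b": "oss-20b",
    "gpt-oss-120b": "oss-120b",
}

# Hypothetical pool membership; a real system would load this from durable state.
POOLS = {
    "oss-20b": [RouterTarget("r1", healthy=False), RouterTarget("r2", healthy=True)],
    "oss-120b": [RouterTarget("r3", healthy=True)],
}


class UnknownModelError(Exception):
    """The requested alias is not in the product catalog."""


class NoHealthyTargetError(Exception):
    """The pool exists but has no healthy router target right now."""


def resolve_target(alias: str) -> RouterTarget:
    # Step 1-2: resolve the product-facing alias to a backend pool.
    pool_name = ALIAS_TO_POOL.get(alias)
    if pool_name is None:
        raise UnknownModelError(alias)
    # Step 3: pick any healthy target from that pool.
    healthy = [t for t in POOLS[pool_name] if t.healthy]
    if not healthy:
        raise NoHealthyTargetError(alias)
    return random.choice(healthy)
```

The customer only ever supplies the alias; the pool name and router ID stay inside the control layer, so either table can change without touching the customer-facing API.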
5. Dedicated-Backed Recommendation
Dedicated-backed sessions should be the primary serving foundation for V1.
Reasons
more predictable latency
more predictable output behavior
easier support
easier debugging
easier quota and billing design
easier to reason about for a hosted product surface
Portal V1 is meant to feel like a stable hosted API product, and dedicated-backed pools best support that in the first release.
Ephemeral Capacity
Ephemeral capacity can remain a later evolution path rather than the primary V1 foundation.
Likely progression:
V1: dedicated-backed pools only
later: selective hybrid or burst strategies where useful
6. Product Model Naming
Portal should expose stable model names that are easy to document and call.
The naming strategy should prioritize:
clarity
consistency
product framing
Naming principles
Product names should:
feel hosted and stable
not expose router/session internals
be easy to use in docs, SDKs, and examples
stay consistent even if backend mappings change
Examples of the kind of naming Portal might use:
gpt-oss-20b
gpt-oss-120b
gemma-4-26b
qwen-...
The exact aliases can be finalized later, but the key rule is:
Customers integrate with product names, not with pool names or session IDs.
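One way to enforce the naming principles is a small check run whenever an alias is added to the catalog. The sketch below is only an illustration: the lowercase-hyphenated pattern and the list of forbidden tokens are assumptions, not rules fixed by this spec.

```python
import re

# Assumed convention: lowercase alphanumeric segments joined by single hyphens.
ALIAS_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")

# Assumed tokens that would leak backend concepts into a product name.
FORBIDDEN_TOKENS = {"pool", "router", "session"}


def is_valid_alias(alias: str) -> bool:
    """Return True if the alias follows the assumed product naming rules."""
    if not ALIAS_PATTERN.fullmatch(alias):
        return False
    # Reject names whose segments mention backend internals.
    return not (set(alias.split("-")) & FORBIDDEN_TOKENS)
```

With these assumed rules, `gpt-oss-20b` and `gemma-4-26b` pass, while a name such as `oss-20b-pool` is rejected because it exposes a pool concept.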
7. Routing Responsibilities
The Portal control layer should own:
resolving the requested model alias
identifying the eligible backend pool
picking a healthy target
hiding internal topology from customers
This preserves freedom to:
move pools
rotate routers
rebalance capacity
change session topology
adjust backend infrastructure over time
without breaking customer integrations.
Key architectural rule
Model routing should be:
Portal-owned
not hardcoded ad hoc across the stack
not exposed directly through the customer API surface
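To illustrate the last rule, the sketch below strips routing internals from a backend result before it reaches the customer. The field names (`pool`, `router_id`, `session_id`) and the function name are hypothetical; the spec does not define a response schema.

```python
# Hypothetical internal keys that must never reach the customer API surface.
INTERNAL_KEYS = frozenset({"pool", "router_id", "session_id"})


def to_customer_response(internal: dict, alias: str) -> dict:
    """Build the public response from an internal backend result."""
    # Drop every routing-internal field.
    public = {k: v for k, v in internal.items() if k not in INTERNAL_KEYS}
    # Always report the product-facing alias the customer asked for.
    public["model"] = alias
    return public
```

Keeping this filter in one Portal-owned place means a backend change that adds new internal fields needs only a single update to stay hidden.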
8. Health and Availability
Portal-side routing should eventually be aware of:
healthy pools
draining pools
paused pools
degraded pools
V1 stance
V1 does not need an advanced health-scoring system at launch.
However, the architecture should allow:
pool-level health awareness
target selection that avoids obviously bad backends
future pool drain / pause behavior
This means even if the first implementation is simple, the design should assume that routing decisions may later depend on:
pool health
maintenance state
capacity state
temporary model disablement
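A simple first implementation might model pool state as an enum and filter on it. The sketch below assumes, purely for illustration, that live pools are preferred, degraded pools are a fallback, and draining or paused pools take no new requests; the spec does not fix these semantics, and `pick_pool` is a hypothetical name.

```python
from enum import Enum
from typing import Dict, Optional


class PoolState(Enum):
    LIVE = "live"
    DRAINING = "draining"
    PAUSED = "paused"
    DEGRADED = "degraded"


# Assumption: only these states accept new requests, in this order of preference.
ROUTABLE_PREFERENCE = [PoolState.LIVE, PoolState.DEGRADED]


def pick_pool(candidates: Dict[str, PoolState]) -> Optional[str]:
    """Return the first candidate pool in the most preferred routable state."""
    for state in ROUTABLE_PREFERENCE:
        for name, pool_state in candidates.items():
            if pool_state is state:
                return name
    # Every candidate is draining or paused.
    return None
```

Because the decision depends only on a state value, richer signals such as capacity or maintenance windows could later feed the same state without changing the call site.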
9. UX Implications
The Portal web app should present:
a supported model list
short model-family notes, where useful
short hosted model descriptions, where useful
The Portal UI should avoid:
deep infra controls
raw router IDs
raw session IDs
pool topology panels
backend fleet details in the customer-facing experience
The customer should see a hosted model catalog, not an infrastructure layout.
Good V1 UI behavior
simple model list in docs or onboarding
optional model-family badges or short descriptions
stable product names used consistently across the UI, docs, and API examples
10. Operational Implications
This layer also defines some backend operational expectations.
Pool design expectations
Portal should eventually know:
which pools exist
which product models map to which pools
whether each pool is live, draining, paused, or degraded
Why this matters
This enables:
controlled rollout of new models
temporary disabling of unstable model groups
safer maintenance workflows
support tooling that remains aligned with the product layer
Even if these controls are not fully surfaced in V1, the model-to-pool mapping should still be treated as durable Portal-side product metadata.
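As a sketch of what that durable metadata could look like, the record below carries the alias, its pool, and two flags covering rollout and temporary disablement. The field names, the JSON encoding, and the helper functions are assumptions for illustration; the actual schema belongs to the data-model spec.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ModelMapping:
    """Hypothetical durable record linking a product alias to a backend pool."""
    alias: str
    pool: str
    public: bool = False   # controlled rollout: not listed until set to True
    enabled: bool = True   # temporary disablement of an unstable model group


def public_catalog(mappings: List[ModelMapping]) -> List[str]:
    """Aliases customers should currently see, in a stable order."""
    return sorted(m.alias for m in mappings if m.public and m.enabled)


def dump_mappings(mappings: List[ModelMapping]) -> str:
    """Serialize mappings for storage (JSON is an assumed format)."""
    return json.dumps([asdict(m) for m in mappings], sort_keys=True)


def load_mappings(raw: str) -> List[ModelMapping]:
    """Rebuild mappings from their stored form."""
    return [ModelMapping(**item) for item in json.loads(raw)]
```

Flipping `enabled` to `False` removes a model from the catalog without deleting its mapping, which keeps support tooling and the product layer pointing at the same record.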
11. Open Questions
Open questions for later refinement:
which exact launch model aliases should be public?
should the UI show model families or individual concrete model names?
how much pool health should be surfaced outside admin tools?
should some model aliases intentionally hide exact backend size/version for product simplicity?
when should ephemeral or hybrid capacity become visible in the product layer, if ever?
12. Relationship to Other Specs
This spec connects directly to:
04-free-plan-rate-limits-and-gateway.md
05-portal-backend-control-plane.md
06-data-model-and-durable-state.md
08-usage-metering-and-billing.md
09-admin-and-ops.md
It also supports the UI/UX and docs specs, since model naming and supported offerings affect:
onboarding
docs examples
API examples
pricing / quota communication
13. Working Summary
Portal V1 should expose a small hosted model catalog and route those product-facing names into managed backend router pools.
The backend should start with:
model-specific pools
dedicated-backed sessions
simple, healthy-target selection
Portal-owned model alias mapping
In one sentence:
The Router Pools / Model Product Layer gives Portal V1 a stable customer-facing model catalog while keeping the underlying router pools, sessions, and fleet topology hidden and replaceable.