Cortensor Portal V1 Detailed Spec 11 - Portal Gateway / Runtime API / Ops Spec
Status
Draft
Purpose
This document consolidates the Portal V1 runtime gateway requirements.
It tracks the pieces that sit between customer API requests and managed Cortensor router pools:
Portal gateway / business API
API key verification and rate limiting
quota and entitlement checks
router-pool selection
usage metering ledger
observability
admin / ops controls
security policy
Payment is intentionally deferred for the initial free-plan implementation, but this spec keeps the usage ledger shaped so billing can be added later without redesigning the request path.
1. Summary
The Portal Gateway is the product runtime layer for Cortensor Portal.
It should be a small service owned by the Portal product layer. It receives customer inference requests, verifies access, selects the correct router pool, forwards the request, records usage, and returns a stable product-facing response.
The gateway should not be replaced by:
raw Nginx routing
raw Supabase reads
direct customer access to router nodes
Nginx can proxy and load balance. Supabase can store durable state. Unkey can manage API keys and rate limits. The gateway owns the Cortensor-specific business logic that ties those systems together.
2. Goals
The Portal Gateway should aim to:
provide one stable product API entry point for Portal users
keep raw router nodes and session IDs hidden from normal customers
verify API keys before any router work is performed
enforce per-key and per-org rate limits / quotas
map public model aliases to managed router pools
emit durable usage events for billing, audits, support, credits, and disputes
expose enough observability to debug customer and router issues
support operational controls such as disabling keys and draining router pools
keep payment optional for the first free-plan launch
3. Non-Goals
The following are explicitly out of scope for the first Portal Gateway release:
full enterprise API gateway platform in V1
complex paid billing implementation in the first free-plan version
user-controlled router/session topology
exposing raw router internals in normal API responses
perfect global load balancing across every router process
replacing Unkey, Clerk, Supabase, or Nginx with full custom equivalents in V1
The gateway should be small, clear, and product-driven, not an overbuilt infrastructure platform.
4. Recommended V1 Architecture
Recommended runtime path:
Recommended Component Ownership
Clerk: Portal user auth, sessions, org membership, user/org identity
Supabase: durable product state, usage ledger, billing ledger, router-pool metadata
Unkey: API key issuance, verification, revocation, rate limits, key analytics
Portal Gateway: request authorization, entitlement, quota, model routing, metering, response normalization
Nginx / reverse proxy: TLS, upstream groups, router-node load balancing
Router nodes: Cortensor inference execution and session-level routing
Key Architecture Principle
The customer should only know:
a stable Portal API
a stable model alias
a stable key format
stable product-facing errors
They should not need to know:
which router served the request
which session executed it
whether a pool is draining
backend topology or fleet composition
5. Gateway Responsibilities
The gateway should own the following V1 runtime responsibilities:
Request identity: Generate or propagate request_id for every request
API key verification: Verify customer key through Unkey
Key status: Reject revoked, paused, expired, or disabled keys
Rate limiting: Enforce per-key RPM and any model-specific limits
Quota: Check org/account allowance before forwarding
Entitlement: Check whether org/key can use requested model
Model routing: Resolve public model alias to router pool
Router target selection: Pick healthy backend target or pool endpoint
Forwarding: Proxy request to router while preserving needed headers/timeouts
Response normalization: Return stable customer-facing response shape
Usage metering: Record durable usage event after request outcome is known
Support visibility: Preserve enough data for debugging without leaking secrets
Additional Responsibilities Implied by Product Ownership
The gateway should also own:
request validation
customer-safe error formatting
request tracing correlation
product-level feature-flag interpretation
future free-plan and paid-plan branching logic
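The "generate or propagate request_id" responsibility can be sketched as below. This is a minimal illustration under stated assumptions: the header name `x-request-id`, the `req_` prefix, and the accepted ID pattern are hypothetical choices, not values defined by this spec.

```python
import re
import uuid
from typing import Mapping

# Assumed header name; the spec does not fix one.
REQUEST_ID_HEADER = "x-request-id"
# Assumed policy: only reuse short, log-safe IDs supplied by the caller.
_SAFE_ID = re.compile(r"[A-Za-z0-9_.-]{8,64}")


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Reuse a well-formed incoming request ID, otherwise mint a new one."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == REQUEST_ID_HEADER and _SAFE_ID.fullmatch(value):
            return value
    # Hypothetical ID format: "req_" followed by 32 hex characters.
    return f"req_{uuid.uuid4().hex}"
```

Rejecting malformed incoming IDs keeps caller-controlled text out of logs and usage rows while still allowing end-to-end correlation when the caller sends a clean ID.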
6. Public API Surface
Recommended narrow V1 public API:
If scope needs to stay tighter, V1 can start with:
and expand later.
Public Request Shape Example
Or for chat-style requests:
The customer should not need to know:
router hostname
session ID
pool internals
dedicated vs ephemeral backend details
7. Request Lifecycle
Required V1 request lifecycle:
Receive request.
Assign request_id.
Parse API key.
Verify key with Unkey.
Resolve org/account/key metadata.
Check key status.
Check rate limit.
Check model entitlement.
Check free-plan quota / credits.
Resolve model alias to router pool.
Select router target.
Forward request to router.
Capture response status, latency, and usage.
Write durable usage event.
Return normalized response.
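The ordering above can be sketched as a chain of checks that short-circuit before any router work happens. This is a minimal illustration, not the gateway's implementation: `Request`, `GatewayError`, the error-code strings, and the in-memory `KEYS` and `POOLS` tables are hypothetical stand-ins for Unkey, Supabase, and the router call. The rate-limit step and the usage write are left out to keep the sketch short.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional


class GatewayError(Exception):
    """Raised by any pre-forwarding check; carries a customer-facing status."""

    def __init__(self, http_status: int, code: str):
        super().__init__(code)
        self.http_status = http_status
        self.code = code


@dataclass
class Request:
    api_key: Optional[str]
    model: str


# Hypothetical in-memory stand-ins for Unkey and Supabase lookups.
KEYS = {"key_live_1": {"org_id": "org_1", "status": "active",
                       "models": {"demo-model"}, "quota_left": 1}}
POOLS = {"demo-model": "pool_a"}


def handle(req: Request, forward: Callable[[str, Request], str]) -> dict:
    request_id = str(uuid.uuid4())              # assign request_id
    if not req.api_key:                         # parse API key
        raise GatewayError(401, "missing_api_key")
    meta = KEYS.get(req.api_key)                # verify key, resolve metadata
    if meta is None:
        raise GatewayError(401, "invalid_api_key")
    if meta["status"] != "active":              # check key status
        raise GatewayError(403, "key_disabled")
    if req.model not in meta["models"]:         # check model entitlement
        raise GatewayError(403, "model_not_allowed")
    if meta["quota_left"] <= 0:                 # check free-plan quota
        raise GatewayError(402, "quota_exhausted")
    pool_id = POOLS.get(req.model)              # resolve alias to router pool
    if pool_id is None:
        raise GatewayError(503, "no_healthy_router")
    output = forward(pool_id, req)              # forward only after all checks
    meta["quota_left"] -= 1
    return {"request_id": request_id, "model": req.model, "output": output}
```

Because every rejection is raised before `forward` is called, invalid or blocked requests never reach a router pool in this sketch.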
Important Behavior Rules
Invalid or blocked requests should not reach router pools.
Usage should be recorded for:
accepted requests
failed router attempts
completed requests
A successful inference response should not be failed only because an async usage write had a transient problem.
The failed write must be retried and surfaced to ops.
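One way to honor the last two rules is to catch the write failure, leave the inference response untouched, and park the event on a retry queue whose depth ops can watch. The sketch below assumes a hypothetical `write_usage_event` callable and an in-process queue; a production gateway would likely need a durable queue instead.

```python
from collections import deque
from typing import Callable

# Hypothetical in-process retry queue; not durable across restarts.
retry_queue: deque = deque()


def record_usage(event: dict, write_usage_event: Callable[[dict], None]) -> bool:
    """Try to persist a usage event; on failure, enqueue it for retry.

    Returns True if the write succeeded and False if it was queued.
    Never raises, so a metering problem cannot fail a successful inference.
    """
    try:
        write_usage_event(event)
        return True
    except Exception as exc:
        retry_queue.append({"event": event, "error": repr(exc), "attempts": 1})
        return False


def drain_retries(write_usage_event: Callable[[dict], None]) -> int:
    """Retry each queued event once; return how many are still pending."""
    for _ in range(len(retry_queue)):
        item = retry_queue.popleft()
        try:
            write_usage_event(item["event"])
        except Exception as exc:
            item["attempts"] += 1
            item["error"] = repr(exc)
            retry_queue.append(item)
    return len(retry_queue)
```

The return value of `drain_retries` is a natural metric to alert on: a queue that does not shrink means metering is failing and needs attention.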
8. Core Flows
8.1 Success Flow
8.2 Blocked Key Flow
8.3 Rate-Limited Flow
8.4 Quota-Exhausted Flow
8.5 Router Failure Flow
8.6 Durable Write Failure Flow
9. Rate Limit and Quota
Rate limiting is required for V1, even with a free plan.
Recommended V1 Model
Per-key RPM (Unkey): prevent request bursts
Per-key daily cap (Unkey or gateway): stop single key abuse
Per-org free allowance (Gateway + Supabase): product quota
Per-model access (Gateway + Supabase): entitlement
Per-IP fallback limit (Gateway or edge layer): abuse protection when key behavior is suspicious
Response Semantics
Rate-limit response:
Quota exhausted response:
For a free-plan MVP, 402 can mean free allowance exhausted even before paid billing is enabled.
If that feels too payment-oriented during beta, 403 is acceptable temporarily, but the long-term product shape likely wants 402.
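The difference between the two rejections can be sketched as follows: a short-window limit yields 429, while an exhausted allowance yields 402 (or 403 during beta). This is an illustration only. In the recommended model Unkey enforces per-key RPM, so `FixedWindowLimiter` and `check_limits` are hypothetical local stand-ins, and the fixed 60-second window is an assumed algorithm.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FixedWindowLimiter:
    """Per-key requests-per-minute limit using fixed 60-second windows."""
    rpm: int
    # Maps (key_id, window_index) to the request count in that window.
    # Old windows are never pruned in this sketch.
    counts: dict = field(default_factory=dict)

    def allow(self, key_id: str, now: float) -> bool:
        window = int(now // 60)
        used = self.counts.get((key_id, window), 0)
        if used >= self.rpm:
            return False
        self.counts[(key_id, window)] = used + 1
        return True


def check_limits(limiter: FixedWindowLimiter, key_id: str,
                 org_remaining_units: float, now: float,
                 quota_status: int = 402) -> Optional[int]:
    """Return the HTTP status to reject with, or None to continue."""
    if not limiter.allow(key_id, now):
        return 429                      # short-window burst protection
    if org_remaining_units <= 0:
        return quota_status             # 402 long term, 403 during beta
    return None
```

Keeping `quota_status` as a parameter lets the beta-versus-long-term choice be a configuration change rather than a code change.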
10. Usage Metering Ledger
Unkey can provide usage visibility, but Portal still needs its own durable usage ledger in Supabase.
Why Portal Needs Its Own Ledger
Reasons include:
billing reconciliation
credit burn-down
support investigations
abuse analysis
customer usage export
disputes and refunds later
router pool quality analysis
internal cost accounting
Recommended usage_events Fields
id: durable event ID
request_id: correlate gateway/router/logs
org_id: account owner
user_id: optional dashboard actor
api_key_id: key attribution
api_key_prefix: safe display/debug value
model_alias: customer-requested model
router_pool_id: selected pool
router_node_id: selected router target, if known
router_session_id: optional internal field
status: accepted, rejected, completed, failed, timeout
http_status: final HTTP status
prompt_tokens: request tokens, if available
completion_tokens: response tokens, if available
total_tokens: total tokens, if available
usage_units: normalized product usage unit
estimated_cost_units: internal cost estimate
latency_ms: gateway total latency
router_latency_ms: backend latency, if measured
error_code: stable error code
created_at: event time
V1 can start with partial token data if router responses do not always include perfect usage values. The schema should still reserve these fields so the gateway and billing layer do not require redesign later.
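A minimal sketch of the event record, using the field names listed above. The Python types, which fields are required, and the rule that derives `total_tokens` are assumptions for illustration; the spec only names the fields and their purpose.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

VALID_STATUSES = {"accepted", "rejected", "completed", "failed", "timeout"}


def _now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass
class UsageEvent:
    # Assumed to be known for every recorded request.
    request_id: str
    org_id: str
    api_key_id: str
    api_key_prefix: str
    model_alias: str
    status: str
    http_status: int
    latency_ms: int
    # Optional or best-effort fields; None means "not available".
    user_id: Optional[str] = None
    router_pool_id: Optional[str] = None
    router_node_id: Optional[str] = None
    router_session_id: Optional[str] = None
    prompt_tokens: Optional[int] = None
    completion_tokens: Optional[int] = None
    total_tokens: Optional[int] = None
    usage_units: Optional[float] = None
    estimated_cost_units: Optional[float] = None
    router_latency_ms: Optional[int] = None
    error_code: Optional[str] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: str = field(default_factory=_now_iso)

    def __post_init__(self) -> None:
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status}")
        # Derive the total only when both parts are present.
        if (self.total_tokens is None and self.prompt_tokens is not None
                and self.completion_tokens is not None):
            self.total_tokens = self.prompt_tokens + self.completion_tokens
```

`asdict(event)` yields a plain dictionary with all 21 fields, which is convenient for inserting a row while leaving unavailable token data as null.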
11. Payment and Billing Scope
Initial Portal V1 can launch as a free-plan product.
Payment is deferred from the first implementation, but the gateway should still produce usage events as if billing will exist later.
Recommended V1 Billing Stance
no card required at first
free quota enforced by gateway
durable usage ledger written from day one
plans table supports free, internal, partner, and future paid plans
billing ledger can exist with zero-dollar / free events
Deferred Paid-Billing Integrations
These can come later:
Stripe cards / subscriptions
Stripe stablecoin payments
Coinbase Commerce / Business
prepaid credits
monthly quotas with overage
Design Rule
Do not wait for payment implementation to build metering. Metering is needed immediately.
12. Router Pool Selection
The gateway should route by model alias, not by customer-provided session ID.
Example:
Recommended V1 Router-Pool Metadata
pool_id: stable internal ID
model_aliases: customer-facing names served by pool
router_targets: URLs or upstream names
status: active, draining, paused, disabled
weight: future weighted routing
health_state: healthy, degraded, unhealthy
timeout_seconds: pool-specific timeout
supports_streaming: capability flag
There are two routing layers:
Gateway / Nginx balances across router nodes.
Router nodes may internally balance across sessions using env-configured session pools.
The customer should never need to know those internals.
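Alias-based selection over the metadata above might look like the sketch below. The selection policy is an assumption, not something this spec defines: only `active` pools take new requests, `unhealthy` pools are skipped, `healthy` is preferred over `degraded`, and higher `weight` breaks ties. The default field values are also illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RouterPool:
    pool_id: str
    model_aliases: List[str]
    router_targets: List[str]
    status: str = "active"            # active, draining, paused, disabled
    health_state: str = "healthy"     # healthy, degraded, unhealthy
    weight: int = 1
    timeout_seconds: int = 60
    supports_streaming: bool = False


def select_pool(pools: List[RouterPool], model_alias: str) -> Optional[RouterPool]:
    """Pick a pool for a public model alias, or None if none is usable."""
    candidates = [
        p for p in pools
        if model_alias in p.model_aliases
        and p.status == "active"          # assumed: draining pools get no new work
        and p.health_state != "unhealthy"
        and p.router_targets
    ]
    if not candidates:
        return None
    # Healthy pools sort before degraded ones; higher weight wins within a tier.
    return min(candidates, key=lambda p: (p.health_state != "healthy", -p.weight))
```

A `None` result corresponds to the "no healthy router target" case, which the gateway would surface as a 503 rather than exposing pool state to the customer.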
13. Failure Semantics
Stable error behavior should be defined early.
Missing API key: 401; no key provided
Invalid API key: 401; key not verified
Revoked/disabled key: 403; known key but blocked
Model not allowed: 403; entitlement failure
Rate limit exceeded: 429; short-window protection
Quota exhausted: 402 or 403; prefer 402 long term
No healthy router target: 503; pool unavailable
Router timeout: 504; backend exceeded timeout
Router failed: 502; upstream error
Usage write failed: 200/502 based on inference result; enqueue retry, do not hide metering failure internally
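The failure cases above can be held in a single lookup so every code path returns the same status for the same condition. The status codes come from the list above; the error-code strings and the `http_status_for` helper are hypothetical names chosen for this sketch.

```python
# Error-code names are illustrative; only the status codes come from the spec.
# "Usage write failed" is intentionally absent: it never changes the status
# the customer sees, which follows the inference result instead.
ERROR_STATUS = {
    "missing_api_key": 401,
    "invalid_api_key": 401,
    "key_disabled": 403,
    "model_not_allowed": 403,
    "rate_limit_exceeded": 429,
    "quota_exhausted": 402,
    "no_healthy_router": 503,
    "router_timeout": 504,
    "router_failed": 502,
}


def http_status_for(error_code: str, quota_status: int = 402) -> int:
    """Map a stable error code to its HTTP status.

    quota_status lets a deployment use 403 for quota exhaustion during
    a free beta while keeping 402 as the long-term default.
    """
    if quota_status not in (402, 403):
        raise ValueError("quota_status must be 402 or 403")
    if error_code == "quota_exhausted":
        return quota_status
    return ERROR_STATUS[error_code]
```

An unknown code raises `KeyError` here, which makes a missing mapping fail loudly in tests instead of silently defaulting to a generic status.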
Stable Error Response Shape
Suggested productized shape:
Suggested codes:
14. Observability
Observability is required in V1 because the gateway becomes the customer support boundary.
Minimum Request Log Fields
request_id
org_id
api_key_id
model_alias
router_pool_id
router_node_id
status
http_status
latency_ms
router_latency_ms
usage_units
error_code
Minimum Metrics
requests per key/org/model: usage visibility
latency per model/pool: performance tracking
error rate per router pool: backend health
timeout rate: pool/runtime tuning
failed task rate: router quality
pool saturation: capacity planning
cost/usage units per request: future billing
Request IDs should be propagated to router calls when possible.
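A structured log line carrying exactly the minimum request log fields could be built as below. This is a sketch under assumptions: JSON as the log format and the `build_log_line` helper are illustrative choices, and rejecting unknown fields is one possible way to keep stray values such as auth headers out of logs.

```python
import json

# The minimum request log fields listed in this section.
LOG_FIELDS = (
    "request_id", "org_id", "api_key_id", "model_alias",
    "router_pool_id", "router_node_id", "status", "http_status",
    "latency_ms", "router_latency_ms", "usage_units", "error_code",
)


def build_log_line(**values) -> str:
    """Serialize one request log record with a fixed set of keys.

    Fields that are not supplied are emitted as null so every line has
    the same shape; unexpected fields are rejected rather than logged.
    """
    unknown = set(values) - set(LOG_FIELDS)
    if unknown:
        raise ValueError(f"unexpected log fields: {sorted(unknown)}")
    record = {name: values.get(name) for name in LOG_FIELDS}
    return json.dumps(record)
```

A fixed key set makes the per-key, per-model, and per-pool metrics above straightforward to derive from logs, since every line can be parsed the same way.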
15. Admin / Ops Controls
V1 does not need a polished internal admin console, but the backend must support operational controls.
Required Controls
Disable API key: backend endpoint or admin script
Disable org/account: backend endpoint or admin script
Inspect usage: query/API endpoint
Inspect request by ID: query/API endpoint
Disable model access: config/table update
Pause router pool: config/table update
Drain router pool: config/table update
View pool health: internal endpoint/dashboard
Useful Admin Views Later
keys by org
usage by org/key/model
recent failed requests
router pool health
quota state
metering retry queue
Ops Principle
Backend capability first, polished admin UI later.
16. Security Policy
Security must be explicit because the gateway handles customer keys, billing-adjacent state, and router access.
Required V1 Policies
API keys: never store raw keys in Supabase; use Unkey IDs/prefixes/metadata
Secrets: keep auth-provider, Unkey, Supabase service role, and payment-provider secrets server-side only
Service role: only gateway/backend uses Supabase service role
Org access: every dashboard API checks authenticated user/org membership
Router access: routers are not directly customer-facing where possible
Webhooks: verify signatures for external auth / key / billing systems
Audit logs: record admin actions and key lifecycle changes
Logs: redact auth headers, raw API keys, and sensitive payloads when required
RLS: use RLS for browser-readable tables; keep sensitive writes backend-owned
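The log-redaction policy can be illustrated with a few small helpers. The header list, the `[REDACTED]` marker, the bearer-token pattern, and the 8-character prefix length are all assumptions made for this sketch, not values set by the spec.

```python
import re

# Assumed set of headers that must never be logged verbatim.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie", "set-cookie"}
_BEARER = re.compile(r"(bearer\s+)\S+", re.IGNORECASE)


def redact_headers(headers: dict) -> dict:
    """Return a copy of headers that is safe to log."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def redact_text(text: str) -> str:
    """Mask bearer tokens that appear inside free-form log text."""
    return _BEARER.sub(r"\g<1>[REDACTED]", text)


def key_prefix(raw_key: str, visible: int = 8) -> str:
    """Keep only a short display prefix of a raw API key."""
    return raw_key[:visible] + "..."
```

Applying these at the logging boundary, rather than at each call site, reduces the chance that a new code path logs a raw key by accident.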
Recommended Auth Split
Avoid split-brain auth where multiple systems act as the authoritative human identity layer.
17. Data Dependencies
The gateway depends on these durable concepts:
organizations
Portal users
organization memberships
API key metadata / shadow records
model catalog
model entitlement rules
router pools
router targets
plan / quota state
usage events
request logs
audit logs
Unkey can own key verification, but Supabase should still store the Portal-facing key metadata needed for dashboard display and internal reporting.
18. Prototype Checklist
The first useful prototype should prove the complete request path.
Required Prototype
create an API key
verify the key at gateway
enforce a simple rate limit
accept one public model alias
resolve model alias to one router pool
forward to a local/test router endpoint
return normalized response
write one usage event
show usage row in a simple query/dashboard
Stretch Prototype
simulate quota exhaustion
revoke key and confirm access fails
mark pool disabled and confirm model returns unavailable
retry one alternate router target after failure
log request ID through gateway and router
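The "retry one alternate router target" stretch item could be prototyped as below. `forward_with_retry`, `RouterUnavailable`, and the `send` callable are hypothetical names; the sketch also assumes that retrying an inference request on a second target is acceptable, which may not hold if the first attempt partially executed.

```python
from typing import Any, Callable, List, Sequence, Tuple


class RouterUnavailable(Exception):
    """Raised when every attempted router target failed."""

    def __init__(self, attempts: List[Tuple[str, str]]):
        super().__init__(f"all {len(attempts)} router attempt(s) failed")
        self.attempts = attempts


def forward_with_retry(targets: Sequence[str],
                       send: Callable[[str], Any],
                       max_attempts: int = 2):
    """Try targets in order, stopping at the first success.

    Returns (target, response, failed), where failed lists the
    (target, error) pairs for attempts that did not succeed, so each
    failed router attempt can still be recorded as usage.
    """
    failed: List[Tuple[str, str]] = []
    for target in list(targets)[:max_attempts]:
        try:
            response = send(target)
        except Exception as exc:
            failed.append((target, repr(exc)))
            continue
        return target, response, failed
    raise RouterUnavailable(failed)
```

With the default `max_attempts=2`, a request tries the primary target and at most one alternate, matching the single-retry scope of the checklist item.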
19. Open Questions
Should quota exhaustion return 402 from the beginning, or use 403 during free beta?
Should V1 usage units be request-based, task-based, token-based, or weighted by model?
Should the gateway synchronously write usage_events, or write to a queue first?
What is the minimum router health signal needed for pool selection?
Should customer responses include router/request debug fields under a debug flag?
How much Unkey usage analytics should be mirrored into Supabase?
Do we need per-project keys in addition to per-organization keys for V1?
20. External References
Useful external references for later implementation comparison:
Supabase Auth
Supabase Row Level Security
Clerk docs
Clerk Organizations
Unkey docs
Unkey rate limiting
NGINX HTTP load balancing
NGINX reverse proxy
Stripe stablecoin payments
Coinbase Commerce docs
(These can be expanded into actual linked references in the documentation site if desired.)
21. Relationship To Other Specs
This spec consolidates and operationalizes:
03-api-key-management.md
04-free-plan-rate-limits-and-gateway.md
05-portal-backend-control-plane.md
06-data-model-and-durable-state.md
07-router-pools-and-model-product-layer.md
08-usage-metering-and-billing.md
09-admin-and-ops.md
It should be treated as the main runtime tracking document for the Portal Gateway.
22. Working Summary
Portal V1 Gateway should be treated as the runtime product boundary for the hosted Portal API.
It exists to:
authenticate and authorize requests
apply rate limits and quota semantics
resolve product model aliases to router pools
route requests into managed Cortensor infrastructure
emit durable usage and request data
present stable product-facing responses and errors
In one sentence:
The Portal Gateway is the runtime layer that turns Portal-issued credentials, product-level models, and simple allowance rules into a stable hosted Cortensor API in front of managed router pools.