Rate Limits & Quotas

Three independent layers, each with a different purpose:

Layer	Scope	Limit	Hit behavior
Plan quota	Per-customer, per-billing-period	`maxConcurrentConnections` + `maxMessagesPerPeriod`	Overages off: new connections rejected (4010), publish/send soft-dropped. Overages on: billed at overage rate.
Token bucket	Per-connection	100 msg/sec sustained, 200 burst (default; plan-tunable)	Hard close with 4011 — abuse bound, not billing.
Per-IP connect	Per-source-IP	60 attempts per 60s (default; configurable)	HTTP 429 rejection at upgrade time (no WS handshake).

Plus a set of hard ceilings that apply to everyone:

Limit	Value
Max inbound message size	64 KB per frame (close code 1009)
Max channels per connection	100
Max peerId length	128 bytes, printable ASCII
Max channel name length	256 bytes
Max `requestId` length	128 bytes
Max `peerMetadata` serialized	4 KB (rejected at JWT verify if larger)
Max `metadata` serialized	8 KB
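
The ceilings above can be checked on the client before anything is sent. The following is a minimal sketch, not the server's validation logic. It assumes that 1 KB means 1024 bytes, that "printable ASCII" means code points 0x20 to 0x7E, and that `metadata` is measured as compact UTF-8 JSON. The function and constant names are hypothetical.

```python
import json

# Ceilings copied from the table above. 1 KB = 1024 bytes is an assumption.
KB = 1024
MAX_FRAME_BYTES = 64 * KB
MAX_PEER_ID_BYTES = 128
MAX_CHANNEL_NAME_BYTES = 256
MAX_REQUEST_ID_BYTES = 128
MAX_METADATA_BYTES = 8 * KB


def utf8_len(text: str) -> int:
    """Length of text in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def is_printable_ascii(text: str) -> bool:
    # Assumption: "printable ASCII" covers 0x20 (space) through 0x7E (~).
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


def find_violations(*, peer_id: str, channel: str, request_id: str,
                    metadata: dict, frame: bytes) -> list[str]:
    """Return the names of the fields that exceed a documented ceiling."""
    problems = []
    if utf8_len(peer_id) > MAX_PEER_ID_BYTES or not is_printable_ascii(peer_id):
        problems.append("peerId")
    if utf8_len(channel) > MAX_CHANNEL_NAME_BYTES:
        problems.append("channel")
    if utf8_len(request_id) > MAX_REQUEST_ID_BYTES:
        problems.append("requestId")
    # Assumption: the server measures metadata as compact JSON in UTF-8.
    serialized = json.dumps(metadata, separators=(",", ":"), ensure_ascii=False)
    if utf8_len(serialized) > MAX_METADATA_BYTES:
        problems.append("metadata")
    if len(frame) > MAX_FRAME_BYTES:
        problems.append("frame")
    return problems
```

An empty list means the message passes these local checks. The server remains the authority, so its serialization may measure sizes slightly differently.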

Plan quota — concurrent + messages

Every plan caps two axes:

maxConcurrentConnections — peak simultaneous WebSocket connections from this app
maxMessagesPerPeriod — total messages (publish + send) accumulated over the billing period

Both are read from the customer's signalling plan (dashboard pricing page). The signalling server caches the plan and refreshes it roughly every 60s, so a change made in the dashboard can take about that long to take effect.

What "over plan" means

Situation	Connection request	Publish/send request
Within plan	Allowed	Allowed
Over plan, overages OFF	Rejected with close code 4010	Soft-drop with `error: over_message_quota`. Connection stays open.
Over plan, overages ON, balance OK	Allowed; billed at overage rate	Allowed; billed at overage rate
Over plan, overages ON, balance exhausted, auto-recharge OFF	Rejected (4010)	Soft-drop with `error: over_message_quota`.
Over plan, overages ON, balance exhausted, auto-recharge ON	Auto-recharge runs; if it succeeds, allowed	Same

Overage rates are visible on the dashboard billing page for your app. Free tier has hard caps and no overages (overagesAllowed: false).
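
The "over plan" table can be read as a small decision function. This is an illustrative sketch of the table, not the server's code. `Outcome`, `decide`, and `blocked_behavior` are hypothetical names. Treating a failed auto-recharge as blocked is an assumption, because the table only describes the success case.

```python
from enum import Enum


class Outcome(Enum):
    ALLOWED = "allowed"
    BILLED = "allowed, billed at overage rate"
    BLOCKED = "blocked"


def decide(over_plan: bool, overages_on: bool, balance_ok: bool,
           auto_recharge_on: bool, recharge_succeeds: bool = False) -> Outcome:
    """Map one row of the 'over plan' table to an outcome."""
    if not over_plan:
        return Outcome.ALLOWED
    if not overages_on:
        return Outcome.BLOCKED
    if balance_ok:
        return Outcome.BILLED
    if auto_recharge_on and recharge_succeeds:
        # The table says only "allowed" for this row.
        return Outcome.ALLOWED
    # Covers auto-recharge off. A failed recharge landing here is an assumption.
    return Outcome.BLOCKED


def blocked_behavior(request_kind: str) -> str:
    """What BLOCKED looks like for each request kind, per the table."""
    if request_kind == "connect":
        return "close 4010"
    return "soft-drop over_message_quota"  # publish or send; connection stays open
```

For example, `decide(True, True, False, False)` returns `Outcome.BLOCKED`, which means a 4010 close for a connect and a soft drop for a publish or send.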

Over-message-quota is a soft drop, not a close

When you hit maxMessagesPerPeriod with overages off, every subsequent publish/send is rejected with an error message carrying code: "over_message_quota". The WebSocket stays open — your other peers can keep receiving from the channels they're subscribed to, and you can keep subscribing/unsubscribing. Only publish and send are blocked.

This means a chat app's "receiving" half keeps working even if the same user is over the message quota for sending.
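
On the client, a soft drop could be handled by pausing outbound traffic while leaving the socket and subscriptions alone. The sketch below assumes server messages are JSON objects. It checks both a `code` and an `error` field because the exact envelope is not specified here. `SendGate` is a hypothetical helper, not part of any SDK.

```python
import json

OVER_QUOTA = "over_message_quota"


class SendGate:
    """Tracks whether publish/send should be attempted. Subscriptions are unaffected."""

    def __init__(self) -> None:
        self.sending_blocked = False

    def on_server_message(self, raw: str) -> None:
        msg = json.loads(raw)
        # Assumption: the quota error arrives as a JSON object with the code
        # in either a "code" or an "error" field.
        if isinstance(msg, dict) and OVER_QUOTA in (msg.get("code"), msg.get("error")):
            self.sending_blocked = True

    def may_send(self) -> bool:
        return not self.sending_blocked

    def reset(self) -> None:
        """Call when the quota is known to have reset, e.g. a new billing period."""
        self.sending_blocked = False
```

A chat UI could consult `may_send()` to disable its input box while continuing to render incoming messages.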

Token bucket — per-connection abuse bound

Independent of billing. Each connection gets its own token bucket:

Refill rate: 100 tokens/sec (default; per-plan-tunable)
Burst capacity: 200 tokens (default; refill rate × 2)
Cost: 1 token per publish or send

If your connection sustains more than the refill rate, the bucket drains. Once it is empty, the next publish/send triggers a hard close with code 4011 (Over Message Rate).

This is not billing. It's there to prevent a runaway client from drowning the server, and to give us a clean lever for terminating clearly misbehaving connections. Most apps won't come close: chat is typically far below 1 msg/sec/user, and even cursor sync at 30 Hz stays well under the default 100 msg/sec refill rate.

If your legitimate use case needs sustained > 100 msg/sec per peer (e.g., very-high-frequency telemetry from a single device), contact sales for a tunable rate.
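
Because an empty bucket means a hard close, a client that might burst can mirror the bucket locally and hold back before the server does. The sketch below is a generic token bucket using the default numbers from this section (100 tokens/sec, 200 burst, 1 token per message). It is a client-side guard under those assumptions, not the server's implementation, and `TokenBucket` is a hypothetical name.

```python
import time


class TokenBucket:
    """Client-side mirror of the per-connection bucket (defaults from this section)."""

    def __init__(self, rate: float = 100.0, burst: float = 200.0, clock=time.monotonic):
        self.rate = rate      # tokens refilled per second
        self.burst = burst    # maximum tokens the bucket can hold
        self.clock = clock    # injectable for tests
        self.tokens = burst   # start full
        self.last = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Take `cost` tokens if available. Return False instead of sending."""
        self._refill()
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Call `try_acquire()` before each publish or send, and queue or drop the message when it returns False. Running the local bucket slightly below the server's rate leaves headroom for clock and network jitter.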

Per-IP connect rate limit

Anti-abuse protection for the connect endpoint. Without it, an outside host hammering the endpoint with bad tokens could exhaust JWT-verify CPU.

Default: 60 connect attempts per 60s per source IP.
Hit behavior: HTTP 429 at the upgrade response, no WebSocket handshake.
Source IP: X-Forwarded-For if the deploy is behind trusted proxies (configured server-side), else the socket's remote address.

This limit never affects in-flight connections, only new connect attempts. Customers behind a single corporate NAT shouldn't hit it; values are tuned for the common case where many users come from many IPs.
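
A client that receives HTTP 429 on connect should avoid retrying in a tight loop, since every retry counts as another attempt. One common approach is exponential backoff with full jitter, sketched below. The base delay, and the 60s cap chosen to match the default window length, are assumptions rather than documented recommendations.

```python
import random


def backoff_delays(attempts: int = 5, base: float = 1.0, cap: float = 60.0,
                   rng=random.random) -> list[float]:
    """Full-jitter backoff: delay i is uniform in [0, min(cap, base * 2**i))."""
    return [rng() * min(cap, base * 2 ** i) for i in range(attempts)]
```

Sleep for each delay in turn between reconnect attempts, and stop as soon as the upgrade succeeds.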

Where to inspect your usage

Dashboard — Realtime Messaging → Usage shows peak concurrent vs limit, messages used vs limit, and overage cost so far this period. Values are live-ish (≤60s lag from the last server flush).
REST API — GET /v1/usage returns the same composite shape, programmatically. See REST API → Usage.
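
A minimal sketch of calling the usage endpoint from a script follows. Only the path `GET /v1/usage` comes from this page. The base URL, the bearer-token header, and the function names are placeholders, and the response is returned as parsed JSON without assuming any field names. Check REST API → Usage for the real host, authentication scheme, and response shape.

```python
import json
import urllib.request

# Hypothetical placeholder: this page names only the path, not the host.
BASE_URL = "https://api.example.com"


def build_usage_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the GET /v1/usage request. The bearer-token auth is an assumption."""
    return urllib.request.Request(
        f"{base_url}/v1/usage",
        headers={"Authorization": f"Bearer {api_key}", "Accept": "application/json"},
        method="GET",
    )


def fetch_usage(api_key: str, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(build_usage_request(api_key, base_url), timeout=timeout) as resp:
        return json.load(resp)
```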
