Rate Limits & Quotas
Three independent layers, each with a different purpose:
| Layer | Scope | Limit | Hit behavior |
|---|---|---|---|
| Plan quota | Per-customer, per-billing-period | maxConcurrentConnections + maxMessagesPerPeriod | Overages off: new connections rejected with 4010, publish/send soft-dropped. Overages on: billed at overage rate. |
| Token bucket | Per-connection | 100 msg/sec sustained, 200 burst (default; plan-tunable) | Hard close with 4011 — abuse bound, not billing. |
| Per-IP connect | Per-source-IP | 60 attempts per 60s (default; configurable) | HTTP 429 rejection at upgrade time (no WS handshake). |
Plus hard ceilings that apply to everyone:
| Limit | Value |
|---|---|
| Max inbound message size | 64 KB per frame (close code 1009) |
| Max channels per connection | 100 |
| Max peerId length | 128 bytes, printable ASCII |
| Max channel name length | 256 bytes |
| Max requestId length | 128 bytes |
| Max peerMetadata serialized | 4 KB (rejected at JWT verify if larger) |
| Max metadata serialized | 8 KB |
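Checking these ceilings on the client before sending avoids a round trip that ends in a rejection or a 1009 close. The following is a minimal sketch, not part of any SDK: the function names and the frame field names (channel, requestId, metadata) are hypothetical, "KB" is assumed to mean 1024 bytes, "printable ASCII" is assumed to be 0x20 through 0x7E, and serialized sizes are assumed to be measured on UTF-8 encoded JSON.

```python
import json
from typing import List

# Ceilings copied from the table above. "KB" is ASSUMED to mean 1024 bytes.
MAX_FRAME_BYTES = 64 * 1024
MAX_PEER_ID_BYTES = 128
MAX_CHANNEL_NAME_BYTES = 256
MAX_REQUEST_ID_BYTES = 128
MAX_METADATA_BYTES = 8 * 1024


def is_printable_ascii(value: str) -> bool:
    """True if every character is in 0x20-0x7E (ASSUMED definition)."""
    return all(0x20 <= ord(ch) <= 0x7E for ch in value)


def check_peer_id(peer_id: str) -> List[str]:
    """Return the ceiling violations for a peerId (empty list means OK)."""
    problems = []
    if not is_printable_ascii(peer_id):
        problems.append("peerId must be printable ASCII")
    if len(peer_id.encode("utf-8")) > MAX_PEER_ID_BYTES:
        problems.append("peerId exceeds 128 bytes")
    return problems


def check_outbound(frame: dict) -> List[str]:
    """Return ceiling violations for a HYPOTHETICAL JSON frame."""
    problems = []
    if len(frame.get("channel", "").encode("utf-8")) > MAX_CHANNEL_NAME_BYTES:
        problems.append("channel name exceeds 256 bytes")
    if len(frame.get("requestId", "").encode("utf-8")) > MAX_REQUEST_ID_BYTES:
        problems.append("requestId exceeds 128 bytes")
    if "metadata" in frame:
        metadata_size = len(json.dumps(frame["metadata"]).encode("utf-8"))
        if metadata_size > MAX_METADATA_BYTES:
            problems.append("metadata exceeds 8 KB serialized")
    # The whole frame is measured last, as it would go on the wire.
    if len(json.dumps(frame).encode("utf-8")) > MAX_FRAME_BYTES:
        problems.append("frame exceeds 64 KB")
    return problems
```

A caller would send only when check_outbound returns an empty list, and otherwise split or shrink the payload.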
Plan quota — concurrent + messages
Every plan caps two axes:
- maxConcurrentConnections: peak simultaneous WebSocket connections from this app
- maxMessagesPerPeriod: total messages (publish + send) accumulated over the billing period
Both are read from the customer's signalling plan (dashboard pricing page). The signalling server's plan cache refreshes roughly every 60s, so a change made in the dashboard can take up to about a minute to take effect.
What "over plan" means
| Situation | Connection request | Publish/send request |
|---|---|---|
| Within plan | Allowed | Allowed |
| Over plan, overages OFF | Rejected with close code 4010 | Soft-drop with error: over_message_quota. Connection stays open. |
| Over plan, overages ON, balance OK | Allowed; billed at overage rate | Allowed; billed at overage rate |
| Over plan, overages ON, balance exhausted, auto-recharge OFF | Rejected (4010) | Soft-drop with error: over_message_quota. |
| Over plan, overages ON, balance exhausted, auto-recharge ON | Auto-recharge runs; if it succeeds, allowed | Same |
Overage rates are visible on the dashboard billing page for your app. Free tier has hard caps and no overages (overagesAllowed: false).
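The table above can be read as a small decision function. The sketch below is illustrative only and is not the server's implementation: PlanState, Outcome, and decide are hypothetical names. It also makes two assumptions the table does not spell out: a successful auto-recharge is treated as "allowed, billed at overage rate", and a failed auto-recharge is treated the same as auto-recharge being off.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ALLOWED = "allowed"
    ALLOWED_BILLED = "allowed, billed at overage rate"
    REJECTED_4010 = "connection rejected with close code 4010"
    SOFT_DROP = "soft-drop with error over_message_quota"


@dataclass(frozen=True)
class PlanState:
    over_plan: bool
    overages_on: bool = False        # Free tier: always False
    balance_ok: bool = True
    auto_recharge_on: bool = False
    recharge_succeeds: bool = False  # only consulted when auto-recharge runs


def decide(state: PlanState, is_connect: bool) -> Outcome:
    """Map a plan state to the outcome for a connect or publish/send request."""
    if not state.over_plan:
        return Outcome.ALLOWED
    # What "blocked" looks like depends on the request type.
    blocked = Outcome.REJECTED_4010 if is_connect else Outcome.SOFT_DROP
    if not state.overages_on:
        return blocked
    if state.balance_ok:
        return Outcome.ALLOWED_BILLED
    # Balance exhausted: only a successful auto-recharge lets the request through.
    # ASSUMPTION: a failed recharge behaves like auto-recharge OFF.
    if state.auto_recharge_on and state.recharge_succeeds:
        return Outcome.ALLOWED_BILLED
    return blocked
```

For example, decide(PlanState(over_plan=True), is_connect=False) yields Outcome.SOFT_DROP, matching the "overages OFF" row for publish/send.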
Over-message-quota is a soft drop, not a close
When you hit maxMessagesPerPeriod with overages off, every subsequent publish/send is rejected with an error message carrying code: "over_message_quota". The WebSocket stays open — your other peers can keep receiving from the channels they're subscribed to, and you can keep subscribing/unsubscribing. Only publish and send are blocked.
This means a chat app's "receiving" half keeps working even if the same user is over the message quota for sending.
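On the client, the practical consequence is that an over_message_quota error should pause outbound traffic without tearing down the connection. Here is a minimal sketch of that pattern, assuming errors arrive as JSON objects shaped like {"type": "error", "code": "over_message_quota"}. That shape and the PublishGate class are assumptions for illustration; check Error Codes for the real payload.

```python
import json
from typing import Callable, List


class PublishGate:
    """Pauses publish/send after a quota error while inbound delivery continues."""

    def __init__(self) -> None:
        self.sending_blocked = False
        self.received: List[dict] = []

    def on_message(self, raw: str) -> None:
        """Handle one inbound frame from the (still open) WebSocket."""
        msg = json.loads(raw)
        # HYPOTHETICAL error shape; only the code value comes from the docs.
        if msg.get("type") == "error" and msg.get("code") == "over_message_quota":
            self.sending_blocked = True
            return
        # Everything else is normal traffic and keeps flowing.
        self.received.append(msg)

    def try_send(self, send: Callable[[str], None], payload: dict) -> bool:
        """Send unless the quota is exhausted; returns whether it was sent."""
        if self.sending_blocked:
            return False
        send(json.dumps(payload))
        return True

    def reset(self) -> None:
        """Call when the app knows the quota is available again."""
        self.sending_blocked = False
```

The send argument stands in for whatever WebSocket library is in use, so the gate stays independent of the transport. When to call reset (new billing period, overages enabled) is left to the application.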
Token bucket — per-connection abuse bound
Independent of billing. Each connection gets its own token bucket:
- Refill rate: 100 tokens/sec (default; per-plan-tunable)
- Burst capacity: 200 tokens (default; refill rate × 2)
- Cost: 1 token per publish or send
If your connection sustains > refill rate, the bucket drains. Once empty, the next publish/send triggers a hard close with code 4011 (Over Message Rate).
This is not billing. It exists to stop a runaway client from drowning the server and to give us a clean lever for terminating clearly misbehaving connections. Most apps won't come close: chat is typically well under 1 msg/sec per user, and even cursor sync at 30 Hz stays below the 100 tokens/sec refill rate, so the bucket never drains.
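To see how a given send rate interacts with the default bucket, it can help to simulate it. The following is a generic token-bucket sketch using the default numbers above (100 tokens/sec refill, 200 capacity, 1 token per message). It is not the server's code: the class and function names are hypothetical, and it assumes the bucket starts full and refills continuously.

```python
from typing import Optional


class TokenBucket:
    """Generic token bucket with an explicit clock, so it is easy to simulate."""

    def __init__(self, refill_per_sec: float = 100.0, capacity: float = 200.0) -> None:
        self.refill_per_sec = refill_per_sec
        self.capacity = capacity
        self.tokens = capacity  # ASSUMPTION: a new connection starts with a full bucket
        self.last = 0.0

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # on the real server this is where the 4011 close would happen


def first_rejected(rate_hz: float, seconds: float) -> Optional[int]:
    """Index of the first evenly spaced message that finds the bucket empty, or None."""
    bucket = TokenBucket()
    for i in range(int(rate_hz * seconds)):
        if not bucket.try_consume(now=i / rate_hz):
            return i
    return None
```

Under these assumptions, first_rejected(30, 60) and first_rejected(100, 60) both return None, while a client sending steadily at 300 msg/sec empties the bucket in roughly one second. A client could run the same class locally as a pacer and delay sends instead of risking a close.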
If your legitimate use case needs sustained > 100 msg/sec per peer (e.g., very-high-frequency telemetry from a single device), contact sales for a tunable rate.
Per-IP connect rate limit
Anti-abuse protection for the connect endpoint. Without it, a host hammering the endpoint with bad tokens could exhaust JWT-verify CPU.
- Default: 60 connect attempts per 60s per source IP.
- Hit behavior: HTTP 429 at the upgrade response, no WebSocket handshake.
- Source IP: X-Forwarded-For if the deploy is behind trusted proxies (configured server-side), else the socket's remote address.
This is not visible to in-flight connections — only to new connect attempts. Customers behind a single corporate NAT shouldn't hit this; values are tuned for the common case where many users come from many IPs.
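The two moving parts here, choosing the source IP and counting attempts against it, can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: all names are hypothetical, the X-Forwarded-For handling uses the common "rightmost untrusted hop" rule, and the counter is a sliding window in which rejected attempts are not recorded. The docs only give the 60-per-60s default, so the real windowing may differ.

```python
import ipaddress
from collections import defaultdict, deque
from typing import Deque, Dict, List, Optional


def resolve_source_ip(remote_addr: str, x_forwarded_for: Optional[str],
                      trusted_proxies: List[str]) -> str:
    """Pick the IP to rate-limit on; trust the header only behind a trusted proxy."""
    trusted = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(ip) -> bool:
        return any(ip in net for net in trusted)

    if not x_forwarded_for or not is_trusted(ipaddress.ip_address(remote_addr)):
        return remote_addr
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk right to left: the first hop that is not one of our proxies is the client.
    for hop in reversed(hops):
        try:
            ip = ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr  # malformed header: fall back to the socket address
        if not is_trusted(ip):
            return str(ip)
    return remote_addr


class ConnectRateLimiter:
    """Sliding-window counter: at most max_attempts per window_sec per source IP."""

    def __init__(self, max_attempts: int = 60, window_sec: float = 60.0) -> None:
        self.max_attempts = max_attempts
        self.window_sec = window_sec
        self._attempts: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, source_ip: str, now: float) -> bool:
        attempts = self._attempts[source_ip]
        # Drop attempts that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_sec:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # ASSUMPTION: rejected attempts are not themselves counted
        attempts.append(now)
        return True


def upgrade_status(limiter: ConnectRateLimiter, source_ip: str, now: float) -> int:
    """HTTP status for the upgrade request: 101 to proceed, 429 when limited."""
    return 101 if limiter.allow(source_ip, now) else 429
```

Walking the header from the right means a client cannot dodge the limit by prepending a fake address, since only hops appended by trusted proxies are skipped.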
Where to inspect your usage
- Dashboard: Realtime Messaging → Usage shows peak concurrent vs limit, messages used vs limit, and overage cost so far this period. Values are live-ish (≤60s lag from the last server flush).
- REST API: GET /v1/usage returns the same composite shape, programmatically. See REST API → Usage.
See also
- Close Codes for what each rate-limit hit looks like on the wire
- Error Codes for the soft-drop error: over_message_quota payload
- Quickstart if you haven't connected yet