IoT Telemetry
MQTT-style pub/sub for devices, sensors, vehicles, cameras. Devices publish telemetry to per-device channels; backend processes or dashboards subscribe to roll-ups.
SignallingClient is the right class for this — no WebRTC, no per-peer state, multiple channels per connection. Smaller and simpler than MeteredPeer.
The SDK runs in Node, the browser, and any environment with WebSocket. For devices running constrained runtimes (Raspberry Pi, ESP32 with MicroPython, etc.) where @metered-ca/peer isn't a fit, the raw-WebSocket protocol is documented at Wire Format. The SDK and raw clients interoperate.
Architecture
Devices                                                                Backend / dashboards
─────────                                                              ─────────────────────
[ Sensor 1 ]──publish──┐
[ Sensor 2 ]──publish──┼──► fleet/sensor-{id}/telemetry ──subscribe──► [ Dashboard ]
[ Sensor 3 ]──publish──┘                                               [ Alerter ]

[ Sensor 1 ]◄──subscribe──── fleet/sensor-1/cmd ──publish──── [ Control plane ]
Devices publish telemetry on fleet/sensor-{id}/telemetry. Backend services subscribe to fleet/sensor-*/telemetry (wildcard) for the firehose. To send commands to one device, the backend publishes to fleet/sensor-{id}/cmd; the device subscribes to its own command channel.
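The naming scheme above is easy to get subtly wrong when channel strings are built by hand in several places. A minimal sketch of centralizing it follows; the helper names (`telemetryChannel`, `commandChannel`, `parseChannel`) are hypothetical and not part of the SDK.

```javascript
// Hypothetical helpers; the names are illustrative, not part of the SDK.
const FLEET_PREFIX = "fleet";

// Channel a device publishes its readings on.
function telemetryChannel(deviceId) {
  return `${FLEET_PREFIX}/${deviceId}/telemetry`;
}

// Channel a device listens on for commands.
function commandChannel(deviceId) {
  return `${FLEET_PREFIX}/${deviceId}/cmd`;
}

// Split "fleet/<deviceId>/<kind>" into its parts; returns null for anything else.
function parseChannel(channel) {
  const match = /^fleet\/([^/]+)\/(telemetry|cmd)$/.exec(channel);
  return match ? { deviceId: match[1], kind: match[2] } : null;
}
```

Both the device and the backend can share these helpers, so a rename of the namespace only has to happen in one file.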
Device side
import { SignallingClient } from "@metered-ca/peer";

const DEVICE_ID = process.env.DEVICE_ID; // e.g. "sensor-42"

const client = new SignallingClient({
  tokenProvider: async () => {
    // Fetch a JWT from your provisioning service.
    const r = await fetch(
      `https://devices.yourapp.com/api/provision?deviceId=${DEVICE_ID}`,
      { headers: { Authorization: `Bearer ${process.env.DEVICE_SECRET}` } },
    );
    return (await r.json()).token;
  },
  reconnect: {
    maxAttempts: Infinity, // devices never give up
    maxDelayMs: 60_000, // cap backoff at 1 min; re-test the network often
  },
});

await client.connect();

// Listen for commands directed at us.
await client.subscribe(`fleet/${DEVICE_ID}/cmd`);
client.on("message", ({ data }) => {
  if (data.type === "reboot") rebootDevice();
  if (data.type === "calibrate") calibrate(data.params);
});

// Publish telemetry on a schedule.
setInterval(async () => {
  const reading = readSensor();
  try {
    await client.publish(`fleet/${DEVICE_ID}/telemetry`, {
      ts: Date.now(),
      temp: reading.temperature,
      humidity: reading.humidity,
    });
  } catch (e) {
    // Connection might be reconnecting. Buffer locally + flush on connected.
    bufferLocally(reading);
  }
}, 5_000);

// Flush buffered readings on reconnect.
client.on("connected", ({ isReconnect }) => {
  if (isReconnect) flushBuffer(client, DEVICE_ID);
});
JWT claims for the device: channels: ["fleet/${DEVICE_ID}/**"], permissions: ["publish", "subscribe"]. The device can only publish and subscribe under its own namespace.
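On the provisioning side, the claims above might be assembled like this before being signed with whatever JWT library you use. This is a sketch under assumptions: `buildDeviceClaims` is a hypothetical helper, the one-hour lifetime is arbitrary, and `iat`/`exp` are the standard JWT timestamp claims rather than anything specific to this SDK. Check the Authentication page for the full claim schema.

```javascript
// Hypothetical helper for the provisioning service; not an SDK function.
// Claim names "channels" and "permissions" follow the ones listed above.
function buildDeviceClaims(deviceId, ttlSeconds = 3600) {
  if (!/^[A-Za-z0-9_-]+$/.test(deviceId)) {
    // Reject IDs containing "/" or "*" so a device can't widen its own scope.
    throw new Error(`invalid device id: ${deviceId}`);
  }
  const now = Math.floor(Date.now() / 1000);
  return {
    sub: deviceId, // one sub per device
    channels: [`fleet/${deviceId}/**`], // own namespace only
    permissions: ["publish", "subscribe"],
    iat: now, // standard JWT issued-at claim (seconds)
    exp: now + ttlSeconds, // standard JWT expiry claim (seconds)
  };
}
```

Validating the device ID before interpolating it into the channel pattern matters here: an ID such as `*` would otherwise turn a per-device scope into a fleet-wide one.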
Backend / dashboard side
import { SignallingClient } from "@metered-ca/peer";

const client = new SignallingClient({
  tokenProvider: async () => mintBackendJwt(),
});

await client.connect();

// Firehose subscribe for the whole fleet.
await client.subscribe("fleet/sensor-*/telemetry");
client.on("message", ({ channel, data }) => {
  // channel is "fleet/sensor-42/telemetry"
  const deviceId = channel.match(/^fleet\/(sensor-[^/]+)\/telemetry$/)?.[1];
  updateDashboard(deviceId, data);
});

// Send a command to one device.
async function rebootDevice(deviceId) {
  await client.publish(`fleet/${deviceId}/cmd`, { type: "reboot" });
}
Backend JWT: channels: ["fleet/**"], permissions: ["publish", "subscribe"]. Wider scope; secrets stay on the server.
High-frequency telemetry — what to watch out for
If you're publishing more than ~10 messages / sec / device:
Message quota costs add up. Every publish counts against your account's per-period message limit. A device publishing every 100 ms is 36,000 messages / hour. Make sure your plan covers your fleet × frequency.
Bundle readings into bigger messages. Instead of one publish per reading, accumulate 5–10 readings client-side and publish a batch. This saves message-count quota and server CPU.
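const batch = [];

// Sample every 100 ms.
setInterval(() => batch.push(readSensor()), 100);

// Publish once per second.
setInterval(async () => {
  if (batch.length === 0) return;
  const readings = batch.splice(0); // take everything collected so far
  try {
    await client.publish(`fleet/${DEVICE_ID}/telemetry`, {
      ts: Date.now(),
      readings,
    });
  } catch (e) {
    batch.unshift(...readings); // publish failed: put the readings back for the next tick
  }
}, 1_000);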
Server-routed publish has limits. The outbound message size cap is welcome.maxMessageSize (server default 64 KB). A large batch might overflow; split or compress it.
Don't use the SDK as your storage layer. Messages are best-effort and ephemeral. If a subscriber is offline when a message is published, it misses the message. For durable storage, write to a database (TimescaleDB, ClickHouse, etc.); the signalling layer is for live coordination, not history.
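Two of the points above reduce to small calculations: how many messages a fleet produces, and how to keep a batch under the size cap. The sketch below shows one way to do each. `messagesPerHour`, `payloadBytes`, and `splitBatch` are hypothetical helpers, and it is an assumption that the cap applies to roughly the JSON-serialized payload; if the server measures the whole wire frame, leave some headroom below the reported limit.

```javascript
// Messages a fleet sends per hour at a given publish interval.
function messagesPerHour(deviceCount, publishIntervalMs) {
  return deviceCount * (3_600_000 / publishIntervalMs);
}

// Size in bytes of a payload once serialized as UTF-8 JSON.
function payloadBytes(payload) {
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}

// Split readings into groups so that makePayload(group) stays within maxBytes.
// Re-serializes on every step, which is fine for small batches.
function splitBatch(readings, maxBytes, makePayload) {
  const tooBig = (group) => payloadBytes(makePayload(group)) > maxBytes;
  const chunks = [];
  let current = [];
  for (const reading of readings) {
    current.push(reading);
    if (!tooBig(current)) continue;
    current.pop(); // the last reading pushed the group over the limit
    if (current.length === 0 || tooBig([reading])) {
      throw new Error("a single reading exceeds maxBytes");
    }
    chunks.push(current);
    current = [reading];
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

A device would then publish each chunk as its own message, for example by looping over `splitBatch(readings, limit, (r) => ({ ts: Date.now(), readings: r }))` with `limit` set a little below the server's reported maximum.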
Reliability — what to do when the network drops
Devices in the field lose connectivity all the time. The SDK's auto-reconnect handles most of it. Your code owns:
Local buffering on disconnect
Every publish while state !== "connected" will reject. Buffer locally and flush on reconnect:
const buffer = [];
const MAX_BUFFERED = 10_000;

function enqueue(item) {
  buffer.push(item);
  if (buffer.length > MAX_BUFFERED) buffer.shift(); // bound the buffer: drop the oldest
}

async function safePublish(channel, data) {
  if (client.state !== "connected") {
    enqueue({ channel, data, ts: Date.now() });
    return;
  }
  try {
    await client.publish(channel, data);
  } catch (e) {
    enqueue({ channel, data, ts: Date.now() }); // connection dropped mid-publish
  }
}

client.on("connected", async ({ isReconnect }) => {
  if (!isReconnect) return;
  while (buffer.length > 0) {
    const item = buffer.shift();
    try {
      await client.publish(item.channel, item.data);
    } catch (e) {
      buffer.unshift(item); // re-queue; the next reconnect retries
      return;
    }
  }
});
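This gives you at-least-once delivery for transient drops, as long as the buffer doesn't overflow: a publish that fails after the server already accepted it is sent again on reconnect, so subscribers can see duplicates. If duplicates matter, attach a messageId to every message and dedup on it in your backend.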
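Dedup on a messageId can be sketched in two small pieces: the device assigns an ID once, when a reading is first created (not when it is re-sent), and the consumer remembers recently seen IDs. Both helpers below are hypothetical, and the `messageId` field is something your own payload would carry; nothing here is provided by the SDK.

```javascript
// Device side: IDs stay unique across reboots as long as bootTs differs.
function createIdGenerator(deviceId, bootTs = Date.now()) {
  let seq = 0;
  return () => `${deviceId}:${bootTs}:${seq++}`;
}

// Consumer side: remembers the most recent `capacity` IDs and reports repeats.
function createDeduper(capacity = 10_000) {
  const seen = new Set(); // a Set iterates in insertion order, so its first entry is the oldest
  return function isDuplicate(messageId) {
    if (seen.has(messageId)) return true;
    seen.add(messageId);
    if (seen.size > capacity) {
      seen.delete(seen.values().next().value); // evict the oldest ID
    }
    return false;
  };
}

// In the backend's message handler, assuming the payload carries messageId:
//   if (isDuplicate(data.messageId)) return;
```

The in-memory set only covers a single backend process and a bounded window of IDs. With several consumers, or when a database is the final destination, a unique constraint on the messageId column is the more robust place to dedup.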
Bound the buffer
A device that's been offline for hours could buffer megabytes. Cap your buffer in MB or count of messages and decide your drop policy (oldest, newest, sample evenly).
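A minimal sketch of a count-bounded buffer with two of the drop policies mentioned above. `BoundedBuffer` and the policy names are hypothetical; a byte-based cap or an even-sampling policy would follow the same shape.

```javascript
class BoundedBuffer {
  constructor(maxItems, policy = "drop-oldest") {
    if (policy !== "drop-oldest" && policy !== "drop-newest") {
      throw new Error(`unknown policy: ${policy}`);
    }
    this.maxItems = maxItems;
    this.policy = policy;
    this.items = [];
    this.dropped = 0; // worth reporting once the device is back online
  }

  push(item) {
    if (this.items.length < this.maxItems) {
      this.items.push(item);
      return;
    }
    this.dropped += 1;
    if (this.policy === "drop-oldest") {
      this.items.shift(); // make room by discarding the oldest entry
      this.items.push(item);
    }
    // "drop-newest": keep what is already buffered and discard the incoming item
  }

  // Remove and return everything, oldest first.
  drain() {
    return this.items.splice(0);
  }
}
```

Dropping the oldest keeps the freshest picture of the device, which suits dashboards. Dropping the newest preserves the moments right after the outage began, which can matter more when you are diagnosing why the device went offline.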
Daemon-style reconnect
reconnect: { maxAttempts: Infinity, maxDelayMs: 60_000 }
The default 100 attempts will exhaust on a long outage. Devices typically want to retry forever. The 60 s cap keeps the device polite when the issue is server-side.
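To see why a finite attempt count runs out, it helps to add up the waits. The sketch below assumes a schedule that doubles from a base delay up to the cap; the SDK's actual schedule (base delay, jitter) is not specified here and may differ, and `retryWindowMs` is a hypothetical helper.

```javascript
// Total time spent waiting across all attempts, under an assumed schedule:
// the delay doubles from baseDelayMs and is capped at maxDelayMs.
function retryWindowMs(maxAttempts, baseDelayMs, maxDelayMs) {
  if (!Number.isFinite(maxAttempts)) return Infinity; // retrying forever never runs out
  let total = 0;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    total += Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
  }
  return total;
}

// 100 attempts, 1 s assumed base delay, 60 s cap:
const windowMs = retryWindowMs(100, 1_000, 60_000); // 5_703_000 ms, about 95 minutes
```

Under these assumptions, 100 attempts cover roughly an hour and a half of outage, well short of a six-hour one.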
For the full reconnect playbook, see Reconnect Best Practices.
Use a separate connection per concern
A device that both publishes telemetry AND receives commands can use one connection for both (subscribe + publish on the same SignallingClient). That's the simplest setup and works for most devices.
For very high-frequency telemetry, you might split: one connection for fast publishes, one for command-handling. This isolates them so a burst of publishes doesn't slow command receipt. Most devices don't need this — start with one connection and split only if you hit limits.
Pitfalls
maxAttempts: 100 on a device. Defaults are tuned for browsers. Devices need Infinity (or a very high number); otherwise a 6-hour network outage takes the device offline permanently.
No local buffering. Every publish during a disconnect throws away data. If your application cares about completeness, buffer.
Treating subscribers as durable. A subscriber that comes online after a message was published doesn't see the message. Persist anything you can't afford to lose.
High-frequency publishes without batching. Burns message quota fast. Batch every 1–5 seconds when feasible.
No backpressure on the publish path. If the device's CPU produces faster than the network can send, your publish queue grows. Bound it.
Using peerMetadata for device identity. peerMetadata is visible to every peer in the channel. Don't put serial numbers, MAC addresses, or PII there. Use the JWT's sub (peerId) as the device identifier; it's set by your trusted backend.
One JWT for many devices. Each device should have its own JWT with its own sub. Sharing a JWT means all those devices have the same peerId, which breaks peer-joined events and directed sends to any one of them.
No clock-sync on ts fields. Device clocks drift. Use client.serverTime (from the connected event) as a calibration anchor if you need timestamps that align with the server's view.
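For the clock-sync pitfall, one simple calibration is to record the offset between server time and the local clock at connect time and apply it to every later timestamp. This sketch assumes `client.serverTime` is an epoch time in milliseconds and ignores network latency; `createServerClock` is a hypothetical helper.

```javascript
// Returns a function that reports server-aligned time in milliseconds.
// localNow is injectable so the clock can be tested or swapped.
function createServerClock(serverTimeMs, localNow = Date.now) {
  const offsetMs = serverTimeMs - localNow(); // positive when the device clock runs behind
  return () => localNow() + offsetMs;
}

// Usage sketch (assumes serverTime is epoch milliseconds):
//   let now = Date.now;
//   client.on("connected", () => { now = createServerClock(client.serverTime); });
//   ...
//   await client.publish(channel, { ts: now(), ...reading });
```

Re-anchoring on every connected event keeps the offset fresh, since a drifting device clock makes an old offset stale over time.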
See also
- SignallingClient reference
- Reconnect Best Practices: daemon-style settings
- Authentication: per-device JWT provisioning
- Raw WebSocket version: for devices the SDK doesn't run on