Scaling Hybrid Classrooms: Reducing Costs by 78% with WebRTC & HLS
Introduction
Building real-time collaborative applications at scale is one of the most challenging problems in modern web engineering. When we set out to create a virtual classroom platform capable of hosting thousands of concurrent users, we quickly discovered that no single technology could solve the problem elegantly. WebRTC delivers the low-latency, interactive experience users expect, but it doesn't scale beyond a few hundred participants. Traditional streaming protocols like HLS can reach millions of viewers, but the 6-10 second latency makes real-time interaction impossible.
This is the story of how we built a hybrid architecture that combines the best of both worlds—and the engineering decisions that made it possible.
The Problem: WebRTC's Scalability Ceiling
WebRTC excels at low-latency, bidirectional communication. It powers Google Meet and countless other browser-based video conferencing applications. However, WebRTC was designed for small-group communication, not for one-to-many broadcast.
A typical LiveKit SFU (Selective Forwarding Unit) can handle 200-300 participants before experiencing degradation. Beyond this point, several issues emerge:
Meanwhile, our virtual classroom platform needed to support 1000+ concurrent viewers while maintaining real-time interactivity for teachers and active students. A traditional lecture might have one teacher, 10-20 active participants asking questions, and hundreds of passive viewers watching the stream.
The challenge became clear: How do we combine the intimacy of WebRTC with the scalability of broadcast streaming?
The Hybrid Solution: Two-Tier Architecture
We designed a two-tier architecture that separates users by their role and interaction level:
| Tier | Technology | Latency | Capacity | Use Case |
|------|------------|---------|----------|----------|
| Interactive | LiveKit WebRTC | ~100ms | ~200 users | Teachers, active students |
| Passive | HLS via Egress | ~6-10s | Unlimited | Observers, late joiners |
The key insight is that in most educational scenarios, only a small percentage of users need bidirectional communication at any given time. The majority are passive consumers who benefit from the reliability and scalability of HLS.
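One way to picture this split is as a small routing decision made when a user joins. The sketch below is illustrative only: the `Role` names, the `assignTier` function, and the 200-seat cap (borrowed from the capacity column above) are assumptions rather than the platform's actual logic.

```typescript
type Role = 'teacher' | 'activeStudent' | 'observer';
type Tier = 'interactive' | 'passive';

// Assumed soft cap, taken from the ~200-user figure in the table above.
const INTERACTIVE_CAPACITY = 200;

/**
 * Decide which tier a joining user lands in.
 * Teachers always get WebRTC, active students get it while seats remain,
 * and everyone else watches the HLS stream.
 */
function assignTier(
  role: Role,
  interactiveCount: number,
  capacity: number = INTERACTIVE_CAPACITY
): Tier {
  if (role === 'teacher') return 'interactive';
  if (role === 'activeStudent' && interactiveCount < capacity) return 'interactive';
  return 'passive';
}

console.log(assignTier('activeStudent', 12)); // interactive
console.log(assignTier('observer', 12)); // passive
```

Keeping the decision in one pure function makes it easy to change the cap or add roles without touching the connection code.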
Architecture Diagram
┌─────────────────────────────────────────────────────────────┐
│                     LiveKit SFU Cluster                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │   Teacher   │  │  Student A  │  │  Student B  │          │
│  │  (publish)  │  │  (pub/sub)  │  │  (pub/sub)  │          │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘          │
│         └────────────────┼────────────────┘                 │
│                          ▼                                  │
│                 ┌─────────────────┐                         │
│                 │ Egress Service  │                         │
│                 │   (Composite)   │                         │
│                 └────────┬────────┘                         │
└──────────────────────────┼──────────────────────────────────┘
                           ▼
                  ┌─────────────────┐
                  │   HLS Origin    │
                  │    (S3/CDN)     │
                  └────────┬────────┘
                           │
           ┌───────────────┼───────────────┐
           ▼               ▼               ▼
     ┌───────────┐   ┌───────────┐   ┌───────────┐
     │ Viewer 1  │   │ Viewer 2  │   │ Viewer N  │
     │   (HLS)   │   │   (HLS)   │   │   (HLS)   │
     └───────────┘   └───────────┘   └───────────┘
       1000+ Passive Viewers (Scalable via CDN)
The ha-api Service: Orchestration Layer
The ha-api service is the brain of our hybrid architecture. Built with Node.js and Express, it handles:
Here's how we initialize HLS egress when a room is created:
import { EgressClient, SegmentedFileProtocol } from 'livekit-server-sdk';

const egressClient = new EgressClient(
  process.env.LIVEKIT_URL!,
  process.env.LIVEKIT_API_KEY!,
  process.env.LIVEKIT_API_SECRET!
);

// Start a room-composite egress that writes HLS segments to S3 and
// return the egress ID so the stream can be stopped later.
async function startRoomEgress(roomName: string): Promise<string> {
  const egress = await egressClient.startRoomCompositeEgress(
    roomName,
    {
      segmentOutputs: [{
        protocol: SegmentedFileProtocol.HLS_PROTOCOL,
        filenamePrefix: `hls/${roomName}/stream`,
        playlistName: 'playlist.m3u8',
        segmentDuration: 4, // seconds per HLS segment
        s3: {
          bucket: process.env.S3_BUCKET!,
          region: process.env.S3_REGION!,
          accessKey: process.env.S3_ACCESS_KEY!,
          secret: process.env.S3_SECRET_KEY!,
        },
      }],
    },
    {
      layout: 'speaker',
      audioOnly: false,
      videoOnly: false,
    }
  );
  return egress.egressId;
}
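The `segmentDuration` setting is a main driver of the HLS delay quoted earlier. A rough back-of-the-envelope model, assuming the player keeps a couple of complete segments buffered and that encoding plus upload adds a fixed overhead, looks like this. The function name and the example values for `bufferedSegments` and `pipelineOverhead` are illustrative guesses, not measurements from our system.

```typescript
/**
 * Rough latency model for segmented HLS.
 * A player cannot start a segment before it is fully written, and it
 * usually holds a few segments in its buffer before starting playback.
 */
function estimateHlsLatencySeconds(
  segmentDuration: number,   // seconds per segment (4 in the egress config)
  bufferedSegments: number,  // assumed number of segments the player buffers
  pipelineOverhead: number   // assumed encode + upload + CDN delay, in seconds
): number {
  return segmentDuration * bufferedSegments + pipelineOverhead;
}

console.log(estimateHlsLatencySeconds(4, 2, 0.5)); // 8.5
```

Under these assumptions, shortening segments lowers latency roughly linearly, at the cost of more requests per viewer and less encoder efficiency.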
Deep Dive: NATS JetStream Signaling
The most interesting engineering challenge we faced was this: How does a passive HLS viewer (who has no WebRTC connection to the room) signal the moderator that they want to speak?
In a traditional WebRTC setup, participants use data channels or the signaling server to send messages. But our passive viewers are completely disconnected from the LiveKit infrastructure—they're just watching an HLS stream through a CDN.
We solved this with a side-channel signaling system using NATS JetStream.
Why NATS JetStream?
We evaluated several options for our signaling backbone:
| Technology | Pros | Cons |
|------------|------|------|
| Redis Pub/Sub | Simple, fast | No persistence, no replay |
| Kafka | Durable, scalable | Heavy, complex setup |
| RabbitMQ | Mature, reliable | Doesn't fit event-streaming model |
| NATS JetStream | Lightweight, persistent, exactly-once | Perfect fit |
NATS JetStream gave us the best of all worlds: the simplicity of Redis with the durability of Kafka, all in a single lightweight binary.
Signal Flow Architecture
┌──────────────────┐                          ┌──────────────────┐
│  Passive Viewer  │                          │    Moderator     │
│   (HLS Client)   │                          │ (WebRTC Client)  │
└────────┬─────────┘                          └────────┬─────────┘
         │                                             │
         │ HTTP POST /api/rooms/{id}/raise-hand        │
         ▼                                             │
┌──────────────────┐                                   │
│      ha-api      │                                   │
│   (Express.js)   │                                   │
└────────┬─────────┘                                   │
         │                                             │
         │ js.publish('room.{id}.signal.raise-hand')   │
         ▼                                             │
┌──────────────────┐                                   │
│  NATS JetStream  │                                   │
│     (Stream)     │                                   │
└────────┬─────────┘                                   │
         │                                             │
         │ Consumer subscription                       │
         ▼                                             │
┌──────────────────┐                                   │
│      ha-api      │──────── SSE Push ─────────────────▶
│  (SSE Endpoint)  │        'RAISE_HAND' event         │
└──────────────────┘                                   ▼
                                              ┌──────────────────┐
                                              │  Moderator sees  │
                                              │  raise-hand UI   │
                                              └──────────────────┘
Implementation Details
NATS Stream Configuration:
import { connect, RetentionPolicy, StorageType } from 'nats';

async function setupNatsStreams() {
  const nc = await connect({ servers: process.env.NATS_URL });
  const jsm = await nc.jetstreamManager();

  // Create stream for room signals
  await jsm.streams.add({
    name: 'ROOM_SIGNALS',
    subjects: ['room.*.signal.*'], // e.g. room.<roomId>.signal.raise-hand
    retention: RetentionPolicy.Limits,
    storage: StorageType.Memory,
    max_age: 3600 * 1e9, // 1 hour in nanoseconds
    max_msgs_per_subject: 1000,
  });

  return nc;
}
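To make the subject filter concrete, here is a minimal sketch of NATS-style wildcard matching, where `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. It assumes the stream is meant to capture subjects of the form `room.*.signal.*`. The `subjectMatches` helper is hypothetical and written for illustration; the NATS server does this matching itself.

```typescript
/** Returns true when `subject` matches a NATS-style `pattern`. */
function subjectMatches(pattern: string, subject: string): boolean {
  const p = pattern.split('.');
  const s = subject.split('.');
  for (let i = 0; i < p.length; i++) {
    // '>' must be the last pattern token and needs at least one remaining token.
    if (p[i] === '>') return s.length > i;
    if (i >= s.length) return false;
    // '*' matches any single token; anything else must match literally.
    if (p[i] !== '*' && p[i] !== s[i]) return false;
  }
  return p.length === s.length;
}

console.log(subjectMatches('room.*.signal.*', 'room.42.signal.raise-hand')); // true
console.log(subjectMatches('room.*.signal.*', 'room.42.user.7.event')); // false
```

The second call is worth noting: a per-user subject such as `room.42.user.7.event` would not fall under this filter, so events published there need their own subject entry or a separate stream.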
Publishing a Raise-Hand Event:
app.post('/api/rooms/:roomId/raise-hand', authenticate, async (req, res) => {
  const { roomId } = req.params;
  const { userId, displayName } = req.user;

  const subject = `room.${roomId}.signal.raise-hand`;
  const payload = JSON.stringify({
    type: 'RAISE_HAND',
    userId,
    displayName,
    timestamp: Date.now(),
    metadata: {
      viewerType: 'passive',
      connectionId: req.headers['x-connection-id'],
    },
  });

  await js.publish(subject, payload, {
    // Message ID used for deduplication. The timestamp makes it unique per
    // request; use a stable ID if retried requests should be dropped.
    msgID: `raise-hand-${userId}-${Date.now()}`,
  });

  res.json({ success: true, message: 'Hand raised successfully' });
});
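JetStream deduplicates on the message ID within a configurable duplicate window (two minutes by default), so the choice of ID decides what counts as a repeat. The small in-memory model below is only a sketch of that behaviour for reasoning about ID schemes. `DuplicateWindow` is a hypothetical class, not part of the NATS client.

```typescript
/** Minimal in-memory model of a message-ID duplicate window. */
class DuplicateWindow {
  private seen: Record<string, number> = {}; // msgID -> time first accepted (ms)
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  /** Returns true if the message is accepted, false if it is a duplicate. */
  accept(msgId: string, now: number): boolean {
    // Forget IDs that have aged out of the window.
    for (const id of Object.keys(this.seen)) {
      if (now - this.seen[id] >= this.windowMs) delete this.seen[id];
    }
    if (this.seen[msgId] !== undefined) return false;
    this.seen[msgId] = now;
    return true;
  }
}

const dedup = new DuplicateWindow(120000); // 2 minutes

// A stable ID: a retry one second later is dropped.
console.log(dedup.accept('raise-hand-room42-u7', 0)); // true
console.log(dedup.accept('raise-hand-room42-u7', 1000)); // false

// A timestamped ID: every request looks new.
console.log(dedup.accept('raise-hand-u7-0', 0)); // true
console.log(dedup.accept('raise-hand-u7-1000', 1000)); // true
```

A stable ID protects against double clicks and client retries, while a timestamped ID lets the same user raise a hand again immediately. Which trade-off is right depends on how the moderator UI treats repeated requests.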
SSE Endpoint for Moderators:
app.get('/api/rooms/:roomId/events', authenticate, requireModerator, async (req, res) => {
  const { roomId } = req.params;

  // Set up SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.flushHeaders();

  // Subscribe to room signals
  const consumer = await js.consumers.get('ROOM_SIGNALS', `moderator-${roomId}`);
  const messages = await consumer.consume();

  // Register cleanup before entering the loop: the loop below only ends
  // once the subscription is stopped.
  req.on('close', () => {
    messages.stop();
  });

  for await (const msg of messages) {
    const event = msg.json<{ type: string }>(); // decode the payload bytes as JSON
    res.write(`event: ${event.type}\n`);
    res.write(`data: ${JSON.stringify(event)}\n\n`);
    msg.ack();
  }
});
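The two `res.write` calls above produce one Server-Sent Events frame: an `event:` line, one or more `data:` lines, and a blank line to terminate the frame. As a hedged sketch, that formatting can be pulled into a pure helper that also handles payloads containing newlines. The `formatSseEvent` name and its optional `id` parameter are assumptions added for illustration.

```typescript
/** Serialize one Server-Sent Events frame. */
function formatSseEvent(type: string, data: string, id?: string): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${type}`);
  // Each line of the payload needs its own "data:" field.
  for (const line of data.split(/\r?\n/)) {
    lines.push(`data: ${line}`);
  }
  // A blank line marks the end of the frame.
  return lines.join('\n') + '\n\n';
}

const frame = formatSseEvent('RAISE_HAND', JSON.stringify({ userId: 'u7' }));
console.log(JSON.stringify(frame));
// "event: RAISE_HAND\ndata: {\"userId\":\"u7\"}\n\n"
```

A helper like this is easy to unit test, and the same function can serve both the moderator stream and the per-user stream.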
The Seamless Promotion Flow
The crown jewel of our architecture is the promotion flow: the ability to upgrade a passive HLS viewer to an active WebRTC participant within a couple of seconds, without a page reload or loss of context.
The User Journey
canPublish: true and canSubscribe: true
Client-Side Implementation (React)
import { useEffect, useState, useCallback } from 'react';
import { LiveKitRoom, VideoConference } from '@livekit/components-react';
import HLSPlayer from './HLSPlayer';

type ViewerMode = 'passive' | 'interactive' | 'transitioning';

interface RoomViewerProps {
  roomId: string;
  hlsUrl: string;
  userId: string;
}

export function RoomViewer({ roomId, hlsUrl, userId }: RoomViewerProps) {
  const [viewerMode, setViewerMode] = useState<ViewerMode>('passive');
  const [livekitToken, setLivekitToken] = useState<string | null>(null);
  const [isHandRaised, setIsHandRaised] = useState(false);

  // SSE connection for receiving events
  useEffect(() => {
    const eventSource = new EventSource(
      `/api/rooms/${roomId}/user-events?userId=${userId}`
    );

    eventSource.addEventListener('PROMOTE', (event) => {
      const data = JSON.parse(event.data);
      console.log('Promotion received!', data);
      setViewerMode('transitioning');
      setLivekitToken(data.token);
      // Small delay to ensure clean transition
      setTimeout(() => {
        setViewerMode('interactive');
        setIsHandRaised(false);
      }, 500);
    });

    eventSource.addEventListener('DEMOTE', () => {
      console.log('Demotion received');
      setViewerMode('transitioning');
      setTimeout(() => {
        setLivekitToken(null);
        setViewerMode('passive');
      }, 500);
    });

    eventSource.onerror = (error) => {
      console.error('SSE connection error:', error);
      // Implement reconnection logic
    };

    return () => eventSource.close();
  }, [roomId, userId]);

  const handleRaiseHand = useCallback(async () => {
    try {
      const response = await fetch(`/api/rooms/${roomId}/raise-hand`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
      });
      // fetch only rejects on network errors, so check the status explicitly
      if (!response.ok) throw new Error(`Request failed: ${response.status}`);
      setIsHandRaised(true);
    } catch (error) {
      console.error('Failed to raise hand:', error);
    }
  }, [roomId]);

  // Render based on current mode
  if (viewerMode === 'transitioning') {
    return (
      <div>Connecting to live session...</div>
    );
  }

  if (viewerMode === 'interactive' && livekitToken) {
    return (
      <LiveKitRoom
        token={livekitToken}
        serverUrl={process.env.NEXT_PUBLIC_LIVEKIT_URL}
        connect={true}
        audio={true}
        video={true}
      >
        <VideoConference />
      </LiveKitRoom>
    );
  }

  // Passive mode - HLS viewer
  // (wrapper elements and the HLSPlayer `src` prop are assumed here)
  return (
    <div>
      <HLSPlayer src={hlsUrl} />
      {/* Raise Hand Button */}
      <button
        onClick={handleRaiseHand}
        disabled={isHandRaised}
        className={`px-6 py-3 rounded-full font-medium transition-all ${
          isHandRaised
            ? 'bg-yellow-500 text-black'
            : 'bg-blue-600 hover:bg-blue-700 text-white'
        }`}
      >
        {isHandRaised ? '✋ Hand Raised' : '🙋 Raise Hand'}
      </button>
      {/* Passive mode indicator */}
      <div>📺 Watching (HLS)</div>
    </div>
  );
}
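The component moves between three modes, and the transitions are easier to reason about as an explicit state machine. The reducer below is a sketch of that idea, not the component's actual implementation. The `ViewerEvent` type, the `target` field, and the `TRANSITION_DONE` event (standing in for the 500 ms timer) are assumptions introduced here.

```typescript
type ViewerMode = 'passive' | 'transitioning' | 'interactive';
type ViewerEvent = 'PROMOTE' | 'DEMOTE' | 'TRANSITION_DONE';

interface ViewerState {
  mode: ViewerMode;
  // Where a running transition will end up.
  target: 'passive' | 'interactive';
}

const initialState: ViewerState = { mode: 'passive', target: 'passive' };

/** Pure transition function: events that make no sense in the current mode are ignored. */
function reduceViewerState(state: ViewerState, event: ViewerEvent): ViewerState {
  switch (event) {
    case 'PROMOTE':
      return state.mode === 'passive'
        ? { mode: 'transitioning', target: 'interactive' }
        : state;
    case 'DEMOTE':
      return state.mode === 'interactive'
        ? { mode: 'transitioning', target: 'passive' }
        : state;
    case 'TRANSITION_DONE':
      return state.mode === 'transitioning'
        ? { mode: state.target, target: state.target }
        : state;
    default:
      return state;
  }
}

let state = initialState;
state = reduceViewerState(state, 'PROMOTE');
state = reduceViewerState(state, 'TRANSITION_DONE');
console.log(state.mode); // interactive
```

Because the function is pure, it could be passed to React's `useReducer`, and a duplicate `PROMOTE` arriving while the user is already interactive becomes a no-op instead of a second reconnect.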
Server-Side Promotion Handler
import { AccessToken } from 'livekit-server-sdk';

app.post('/api/rooms/:roomId/promote/:userId', authenticate, requireModerator, async (req, res) => {
  const { roomId, userId } = req.params;

  // Generate LiveKit token with publishing permissions
  const token = new AccessToken(
    process.env.LIVEKIT_API_KEY!,
    process.env.LIVEKIT_API_SECRET!,
    {
      identity: userId,
      ttl: '24h',
    }
  );
  token.addGrant({
    room: roomId,
    roomJoin: true,
    canPublish: true,
    canSubscribe: true,
    canPublishData: true,
  });

  // toJwt() returns a Promise in livekit-server-sdk v2; awaiting is harmless on v1
  const jwt = await token.toJwt();

  // Publish promotion event via NATS
  await js.publish(`room.${roomId}.user.${userId}.event`, JSON.stringify({
    type: 'PROMOTE',
    token: jwt,
    roomId,
    timestamp: Date.now(),
  }));

  res.json({ success: true, message: 'User promoted successfully' });
});
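Since `roomId` and `userId` come straight from the URL and are interpolated into a NATS subject, it may be worth validating them first: subject tokens cannot contain dots, wildcards, or whitespace, and a value like `*` would change which subject is addressed. The helpers below are a minimal sketch of such a guard. `toSubjectToken` and `userEventSubject` are hypothetical names, not part of the NATS client or of the handler above.

```typescript
/** Reject values that are unsafe to use as a single NATS subject token. */
function toSubjectToken(value: string): string {
  // Dots split tokens, '*' and '>' are wildcards, whitespace is not allowed.
  if (value.length === 0 || /[\s.*>]/.test(value)) {
    throw new Error(`Invalid subject token: ${JSON.stringify(value)}`);
  }
  return value;
}

/** Build the per-user event subject used for promotion events. */
function userEventSubject(roomId: string, userId: string): string {
  return `room.${toSubjectToken(roomId)}.user.${toSubjectToken(userId)}.event`;
}

console.log(userEventSubject('42', 'u7')); // room.42.user.u7.event
```

In an Express handler, a thrown error here would typically be translated into a 400 response before anything is published.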
Performance Results and Lessons Learned
After extensive load testing and production deployment, here are our results:
Metrics
| Metric | Target | Achieved |
|--------|--------|----------|
| Max concurrent users | 1,000 | 1,247 |
| Promotion latency | <5s | 2.3s avg |
| HLS stream delay | <15s | 8.2s avg |
| Infrastructure cost vs pure WebRTC | -50% | -78% |
| Client CPU usage (passive) | <10% | 4.2% |
Key Lessons
Conclusion
Building scalable real-time applications requires thinking beyond single-technology solutions. Our hybrid WebRTC-HLS architecture demonstrates that by intelligently combining technologies based on use case, we can achieve both the interactivity users expect and the scalability businesses require.
The key architectural decisions that made this possible:
This approach has allowed us to scale our virtual classroom platform from hundreds to thousands of users while actually reducing infrastructure costs. The same patterns could be applied to webinars, live events, gaming spectator modes, or any scenario where you need to combine real-time interaction with broadcast scale.
---
Have questions about implementing similar architectures? We'd love to hear from you in the comments below.