Streaming Protocols
We’ve highlighted the most popular protocols in the previous topic. While those are versatile and suitable for various use cases, they aren’t particularly optimized for streaming media data like video or audio. Given the increasing importance of media streaming today, let’s take a brief look at some protocols better suited for this purpose.
WebRTC
WebRTC (Web Real-Time Communication) is a set of protocols and browser APIs that enables peer-to-peer communication for audio, video, and data. Its core principle is to connect clients directly, so that media does not have to pass through a central server.
Public Address
Most clients sit behind a router that performs NAT (Network Address Translation), so from the outside a client appears as the router’s public address plus an ephemeral port.
In theory, if two clients can exchange their public addresses, they may establish a direct connection.
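To make the mapping concrete, here is a toy model of a NAT table. It is only a sketch: the class name `ToyNat`, the addresses, and the starting port are illustrative assumptions, and real routers differ in how they allocate and expire mappings.

```python
from itertools import count


class ToyNat:
    """Toy NAT: maps each private (ip, port) to the router's public ip plus an ephemeral port."""

    def __init__(self, public_ip: str, first_port: int = 49152):
        self.public_ip = public_ip
        self._next_port = count(first_port)  # ephemeral ports handed out in order
        self._table = {}                     # (private ip, port) -> (public ip, port)

    def translate(self, private_addr: tuple[str, int]) -> tuple[str, int]:
        """Return the public address for an outgoing connection, creating a mapping if needed."""
        if private_addr not in self._table:
            self._table[private_addr] = (self.public_ip, next(self._next_port))
        return self._table[private_addr]


nat = ToyNat("1.2.3.4")
print(nat.translate(("192.168.1.10", 5000)))  # ('1.2.3.4', 49152)
```

The client itself only knows `192.168.1.10:5000`; the pair on the right is what the rest of the internet sees.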
STUN Server
STUN (Session Traversal Utilities for NAT) is a lightweight service that helps reveal a machine’s public address.
Why not rely solely on the router’s address? Because the nearest router might not be public-facing; sometimes it only serves a local network.
By using a STUN server, clients can discover their public addresses and attempt to establish direct connections. We’ll cover how they exchange these addresses in the next section.
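A rough sketch of what a STUN exchange looks like on the wire may help here. The message layout follows RFC 5389 (Binding request, XOR-MAPPED-ADDRESS attribute), but the function names are our own, only IPv4 is handled, and the server reply is simulated locally instead of being fetched from a real STUN server.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442   # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes) -> bytes:
    """20-byte STUN header with no attributes."""
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)


def build_binding_response(transaction_id: bytes, ip: str, port: int) -> bytes:
    """Simulated server reply: the client's address as the server saw it (IPv4 only)."""
    x_port = port ^ (MAGIC_COOKIE >> 16)
    x_addr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, x_port, x_addr)
    attr = struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value
    header = struct.pack("!HHI12s", BINDING_SUCCESS, len(attr), MAGIC_COOKIE, transaction_id)
    return header + attr


def parse_public_address(response: bytes) -> tuple[str, int]:
    """Extract (ip, port) from the XOR-MAPPED-ADDRESS attribute of a Binding response."""
    msg_type, length, cookie, _tid = struct.unpack("!HHI12s", response[:20])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN Binding success response")
    offset = 20
    while offset < 20 + length:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS:
            _reserved, family, x_port, x_addr = struct.unpack("!BBHI", value[:8])
            if family != 0x01:
                raise ValueError("only IPv4 is handled in this sketch")
            port = x_port ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", x_addr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    raise ValueError("no XOR-MAPPED-ADDRESS attribute found")


tid = os.urandom(12)
request = build_binding_request(tid)                   # what the client would send
response = build_binding_response(tid, "1.2.3.4", 80)  # simulated server reply
print(parse_public_address(response))  # ('1.2.3.4', 80)
```

In a real client, `request` would be sent over UDP to a STUN server and `response` would come back from the network; everything else stays the same.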
TURN Server
Sometimes, addresses provided by a STUN server aren’t enough. If two clients haven’t communicated before, many routers are configured to reject packets from unfamiliar addresses. If both clients reject each other, a direct connection becomes impossible.
A TURN (Traversal Using Relays around NAT) server helps in these cases by acting as a relay server between clients.
For example, Client 1 and Client 2 can both connect to a TURN server.
This establishes the TURN server as a recognized intermediary, allowing them to indirectly exchange messages by relaying them through it.
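The relaying idea can be sketched with an in-memory stand-in. `ToyRelay` and its methods are hypothetical names for illustration; a real TURN server works over the network and adds allocations, permissions, and authentication, all of which are left out here.

```python
class ToyRelay:
    """In-memory stand-in for a relay: clients connect, then send through it."""

    def __init__(self):
        self._inboxes = {}  # client id -> list of (sender, payload)

    def connect(self, client_id: str) -> None:
        """A client opens an outbound connection to the relay, which its router allows."""
        self._inboxes[client_id] = []

    def send(self, sender: str, recipient: str, payload: bytes) -> None:
        """Forward a payload to the recipient; both sides must already be connected."""
        if sender not in self._inboxes or recipient not in self._inboxes:
            raise KeyError("both clients must be connected to the relay first")
        self._inboxes[recipient].append((sender, payload))

    def receive(self, client_id: str) -> list:
        """Drain and return everything relayed to this client."""
        messages, self._inboxes[client_id] = self._inboxes[client_id], []
        return messages


relay = ToyRelay()
relay.connect("client1")
relay.connect("client2")
relay.send("client1", "client2", b"hello")
print(relay.receive("client2"))  # [('client1', b'hello')]
```

Neither client ever addresses the other directly; each one only talks to the relay it has already connected to.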
Interactive Connectivity Establishment (ICE)
Interactive Connectivity Establishment (ICE) is responsible for identifying potential pathways for peer-to-peer connections.
Both clients (let’s call them A and B) gather possible ways to let the other connect. These are called ICE candidates and usually include:
The local address
A public address obtained via a STUN server
TURN candidates for fallback when direct connections fail
ICE A:
Local address: 192.168.1.1
Public address: 1.2.3.4:80
TURN candidates: [3.3.3.3:70, 4.4.4.4:90]
ICE B:
Local address: 192.168.1.1
Public address: 4.5.6.7:90
TURN candidates: [5.5.5.5:70, 4.4.4.4:90]
Signaling
Finally, the clients exchange their ICE candidates through signaling, a separate mechanism that WebRTC itself does not define.
This could be a lightweight WebSocket server, or it could be done manually via QR codes, chat messages, etc.
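Because signaling is just "get this text to the other side", the candidates only need a serializable form. Below is a minimal sketch, assuming a made-up JSON layout and a hypothetical `Candidate` type (the port on the local address is also invented). Real WebRTC signaling additionally carries session descriptions (SDP offers and answers), which are omitted here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Candidate:
    kind: str  # "host" (local), "srflx" (learned via STUN) or "relay" (via TURN)
    ip: str
    port: int


def encode_candidates(candidates: list) -> str:
    """Turn candidates into a text message that any channel can carry."""
    return json.dumps([asdict(c) for c in candidates])


def decode_candidates(message: str) -> list:
    """Rebuild the candidate list on the receiving side."""
    return [Candidate(**item) for item in json.loads(message)]


candidates_a = [
    Candidate("host", "192.168.1.1", 5000),
    Candidate("srflx", "1.2.3.4", 80),
    Candidate("relay", "3.3.3.3", 70),
]
message = encode_candidates(candidates_a)  # send via WebSocket, QR code, chat...
print(decode_candidates(message) == candidates_a)  # True
```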
Once candidates are exchanged, they attempt to connect using the most efficient path: Local address (if on the same network) → Public address → TURN server (as a last resort).
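The preference order above can be expressed with the candidate priority formula from RFC 8445. This is a simplified sketch: the type preference values are the ones the RFC recommends, but the candidate tuples are illustrative, and real ICE goes further by pairing local and remote candidates and running connectivity checks on each pair.

```python
# Type preferences recommended by RFC 8445 (higher is tried first).
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


def candidate_priority(kind: str, local_preference: int = 65535, component_id: int = 1) -> int:
    """priority = 2^24 * type preference + 2^8 * local preference + (256 - component id)."""
    return (TYPE_PREFERENCE[kind] << 24) + (local_preference << 8) + (256 - component_id)


candidates = [
    ("relay", "3.3.3.3", 70),
    ("srflx", "1.2.3.4", 80),
    ("host", "192.168.1.1", 5000),
]
ordered = sorted(candidates, key=lambda c: candidate_priority(c[0]), reverse=True)
print([c[0] for c in ordered])  # ['host', 'srflx', 'relay']
```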
WebRTC Use Cases
How do clients discover STUN and TURN servers? Some large services (like Google) offer public STUN servers, and applications typically list the servers to use in their WebRTC configuration. It’s also possible to deploy custom servers if needed.
WebRTC is a powerful protocol that removes the need for a central media server and pushes more responsibility onto the clients. That said, it’s best suited for one-to-one scenarios. In complex applications, like group calls with hundreds of participants, it can quickly overwhelm end-user devices or home networks.
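A quick calculation shows why group calls strain clients. The sketch assumes a full mesh, where every participant sends its stream directly to every other participant; `mesh_load` is a name made up for this example.

```python
def mesh_load(participants: int) -> tuple[int, int]:
    """Return (total peer connections, outgoing streams per client) for a full mesh."""
    total_connections = participants * (participants - 1) // 2
    uploads_per_client = participants - 1
    return total_connections, uploads_per_client


for n in (2, 10, 100):
    print(n, mesh_load(n))
# 2 (1, 1)
# 10 (45, 9)
# 100 (4950, 99)
```

With 100 participants, each device would have to encode and upload 99 copies of its own stream, which is far more than a typical home connection can sustain.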
HTTP Live Streaming (HLS)
HLS is a media streaming protocol developed by Apple that delivers video and audio over ordinary HTTP. Unlike protocols such as WebSocket, which depend on a continuous connection to a central live server, HLS offers a resilient and distributed system.
It works through segmentation, splitting audio or video into small, independent segments (files), usually a few seconds long.
- These segments are stored independently, potentially on different servers.
- A master record (in HLS itself, an .m3u8 playlist file; conceptually, something like a SQL row) manages and indexes these segments.
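As a rough illustration of such an index, the sketch below builds a master record from a list of segment files. The `Segment` type, the `build_master_record` helper, and the example.com-style URLs are assumptions made for this example, not part of HLS.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    url: str         # where the file lives; hosts may differ per segment
    start: float     # offset in seconds from the beginning of the media
    duration: float  # length of the segment in seconds


def build_master_record(urls: list, durations: list) -> list:
    """Index segments by accumulating their durations into start offsets."""
    record, start = [], 0.0
    for url, duration in zip(urls, durations):
        record.append(Segment(url, start, duration))
        start += duration
    return record


record = build_master_record(
    ["https://cdn-a.example/Segment_1.mp4", "https://cdn-b.example/Segment_2.mp4"],
    [10.0, 10.0],
)
print(record[1].start)  # 10.0
```

Since each entry carries its own URL, nothing requires the segments to live on the same machine.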
To play a video, the user first fetches the master record. When seeking a specific moment, only the necessary segments are downloaded.
For example, to watch the 11th second, only Segment_2.mp4 would be retrieved.
In fact, to maintain a smooth experience, several sequential segments are usually preloaded in advance.
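Seeking and preloading can be sketched in a few lines. The numbers are assumptions chosen for illustration: 10-second segments numbered from 1, six segments in total, and two segments preloaded ahead of the current one.

```python
import math


def segments_to_fetch(second: int, segment_length: int = 10,
                      total_segments: int = 6, preload: int = 2) -> list:
    """Names of the segment containing the Nth second plus the next `preload` segments."""
    if not 1 <= second <= segment_length * total_segments:
        raise ValueError("second is outside the media")
    current = math.ceil(second / segment_length)   # segments are numbered from 1
    last = min(current + preload, total_segments)  # do not run past the end
    return [f"Segment_{i}.mp4" for i in range(current, last + 1)]


print(segments_to_fetch(11))  # ['Segment_2.mp4', 'Segment_3.mp4', 'Segment_4.mp4']
```

Under these assumptions, the player downloads Segment_2.mp4 to start playback and quietly fetches the next two in the background.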