Communication Protocols
The communication protocol fundamentally shapes the way a service is built. With so many types available, selecting the right one requires a thorough understanding of their workflows.
Hypertext Transfer Protocol (HTTP)
Hypertext Transfer Protocol (HTTP) is built on top of Transmission Control Protocol (TCP) and is the most common choice in many systems.
The concept is simple: the client sends a request and immediately receives the associated response.
HTTP/1.0
The initial version of HTTP establishes a separate TCP connection for each request.
Creating a TCP connection is resource-intensive, especially when using SSL/TLS. This becomes inefficient when clients need to make multiple requests simultaneously, as numerous connections will be established as a result.
HTTP/1.1
HTTP/1.1 introduced an improvement by keeping a connection open for a short duration before closing it. This behavior is controlled by the Keep-Alive header, which specifies the connection’s lifespan.
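For illustration, here is what the relevant headers can look like on the wire; the timeout and max values below are arbitrary examples:

```http
GET /docs HTTP/1.1
Host: example.com
Connection: keep-alive

HTTP/1.1 200 OK
Connection: keep-alive
Keep-Alive: timeout=5, max=100
```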
HTTP has some potential drawbacks:
- Synchronous limitation: HTTP requires waiting for the request to complete, which is inefficient for long-running tasks that are better handled asynchronously.
- One-way communication: Requests always originate from the client side, with no mechanism for the server to actively send messages back.
However, its simplicity and light weight make it beneficial in many scenarios. The protocol is ideal for straightforward communication between the client and server sides, such as in:
- Client-facing services.
- Public APIs exposed to external systems.
Polling
Short Polling
To enable bidirectional communication with HTTP, a naive approach involves having clients continuously send requests to the server to pull new notifications.
This approach is known as Short Polling. It is highly inefficient in terms of bandwidth: hundreds of requests might be made just to retrieve a single notification.
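A minimal client-side sketch of this pattern; the endpoint and the one-second interval are assumptions for illustration:

```typescript
// Short polling: ask the server for news on a fixed interval,
// even when there is nothing new to fetch.
async function pollOnce(): Promise<void> {
  const res = await fetch("https://api.example.com/notifications"); // assumed endpoint
  const notifications: string[] = await res.json();
  notifications.forEach((n) => console.log("received:", n));
}

// Most responses come back empty, yet each one costs a full round trip.
setInterval(() => pollOnce().catch(console.error), 1000);
```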
Long Polling
To improve efficiency, the server side should hold each request for a short duration before responding; this brief retention significantly reduces the number of unnecessary requests. This pattern is known as Long Polling.
Long Polling is a traditional method for receiving real-time notifications from the server side. Since requests originate from the client side, Long Polling is well-suited for:
- Decoupling the server from the client side and increasing its availability.
- Backpressure-aware clients, allowing them to control their polling behavior autonomously, such as setting delays between polls or specifying the number of messages per poll.
Long Polling is best implemented as a stateless service. Connections are short-lived, so clients can conveniently switch to any server and pull data from a shared store.
For example, the instances of a stateless service share and poll the same store.
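A sketch of such an instance, assuming an Express service and a hypothetical shared-store client (store.fetchNew): the handler holds each request for up to 30 seconds and responds as soon as data appears.

```typescript
import express from "express";

const app = express();

// Hypothetical shared store polled by every stateless instance.
declare const store: { fetchNew(clientId: string): Promise<string[]> };

app.get("/notifications", async (req, res) => {
  const deadline = Date.now() + 30_000; // hold the request for up to 30s
  while (Date.now() < deadline) {
    const messages = await store.fetchNew(String(req.query.clientId));
    if (messages.length > 0) {
      res.json(messages); // respond as soon as data arrives
      return;
    }
    await new Promise((resolve) => setTimeout(resolve, 1_000)); // brief pause before re-checking
  }
  res.json([]); // timed out with nothing new; the client simply polls again
});

app.listen(8080);
```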
Despite being more efficient than Short Polling, Long Polling remains resource-intensive, often generating many redundant requests before retrieving any actual data.
Furthermore, it doesn’t provide full real-time capability: since clients decide when to pull data, messages can’t be transmitted immediately after their creation.
WebSocket
This is a more modern technology than Long Polling. In short, a WebSocket server maintains long-lived connections, allowing both sides to actively exchange messages through these connections.
Basically, WebSocket offers better performance than Long Polling by exchanging messages only when necessary, resulting in lower latency and reduced bandwidth usage. It’s excellent for bidirectional and low-latency communication, e.g., gaming services and chat services.
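A browser-side sketch using the standard WebSocket API; the endpoint is an assumption:

```typescript
// One long-lived connection; both sides can send at any time.
const socket = new WebSocket("wss://chat.example.com/ws"); // assumed endpoint

socket.onopen = () => socket.send(JSON.stringify({ type: "hello" }));

// The server can push messages without being asked; no polling involved.
socket.onmessage = (event) => console.log("server says:", event.data);
```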
Some critical drawbacks of WebSocket include:
- Availability: long-lived connections couple the server side to the client side, worsening the server’s availability.
- Resource utilization: a WebSocket connection is long-lived and tied to a specific server, making it hard to balance load. For example, a client that relentlessly interacts with one fixed server can overload it while others sit idle, even though it would be better to distribute and share the load among them.
Stateful Misconception
Do you think maintaining long-lived connections makes a service stateful? The answer is no!
The communication protocol doesn’t determine this property; stateful or stateless depends on how we implement the service. Recall the chat example from the previous topic: we described it as a stateful service because it keeps user connections on different servers.
Let’s approach it from a different angle. Instead of sending messages directly between instances, we let each instance periodically pull from a shared store. Now it’s stateless! All instances behave the same; it doesn’t matter which one a client connects to.
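A sketch of one such stateless instance, using the ws library and a hypothetical shared-store call (store.drainNew); every instance runs the same loop, so clients may connect to any of them:

```typescript
import { WebSocketServer, WebSocket } from "ws";

// Hypothetical shared store; messages are written here by other instances.
declare const store: { drainNew(): Promise<string[]> };

const wss = new WebSocketServer({ port: 8080 });

// Instead of forwarding messages between instances, each instance
// periodically pulls from the shared store and fans out to its
// local connections.
setInterval(async () => {
  for (const message of await store.drainNew()) {
    wss.clients.forEach((client) => {
      if (client.readyState === WebSocket.OPEN) client.send(message);
    });
  }
}, 1_000);
```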
Use Cases
In fact, people tend to use WebSocket for real-time notification, where messages are delivered immediately after their creation. An indirect paradigm (e.g., polling) cannot meet this requirement, as it introduces brief delays; a direct and stateful model is mandatory.
Server-Sent Events
As the name suggests, Server-Sent Events (SSE) is a half-duplex protocol: it maintains long-lived connections yet only allows data to be sent from the server side.
Behind the scenes, Server-Sent Events is built on top of the HTTP protocol. Thus, developing and maintaining an SSE application is simpler than a WebSocket one, as it can leverage existing HTTP tooling, such as connection management and caching.
Additionally, a unidirectional connection incurs less overhead than a full-duplex one. Server-Sent Events is recommended if the application only needs to send data from the server side, e.g., live scores or news websites.
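On the client, the browser’s built-in EventSource API handles the connection; the endpoint is an assumption:

```typescript
// Server-Sent Events client: the connection is long-lived,
// but only the server ever sends data on it.
const source = new EventSource("https://scores.example.com/live"); // assumed endpoint

source.onmessage = (event) => console.log("score update:", event.data);
```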
Similar to WebSocket (maintaining long-lived connections), Server-Sent Events also introduces the same problems of coupling and resource balancing.
Google Remote Procedure Call (gRPC)
gRPC is a modern technology developed by Google, enabling both bidirectional and unidirectional communication over Remote Procedure Call (RPC) and the HTTP/2 protocol.
Remote Procedure Call (RPC)
Normally, to call an HTTP endpoint, an application must handle various details, such as the URI, headers, and parameters to construct a proper request string. While this approach offers flexibility, it can also be complex and prone to errors.
GET /docs?name=README&team=dev HTTP/1.1
In contrast, RPC is more structured, requiring both the client and server to agree on a shared contract representing exposed endpoints. This contract is usually built as a shared library, making the interaction convenient, like working with local functions.
For example, the Chat Service exposes a Chat function; this exposure is wrapped as a shared library for consumers.
// Exchange schema (Protocol Buffers)
message ChatRequest {
  string content = 1;
}
message ChatResponse {
  string messageId = 1;
}

// Service definition
service ChatService {
  rpc Chat (ChatRequest) returns (ChatResponse);
}

// The generated stub is called from the client side conveniently
var channel = GrpcChannel.ForAddress("https://chat.example.com");
var chatService = new ChatService.ChatServiceClient(channel);
var chatResponse = chatService.Chat(new ChatRequest { Content = "Hello Bro!" });
Another advantage of RPC is fast serialization. JSON and XML are commonly used to exchange data due to their versatility, but their serialization is slow because they are text-based and carry no predefined structure. With a prepared definition, RPC can pre-generate an efficient byte-based serializer, such as Protocol Buffers.
One drawback of RPC is the coupling it creates between the server and client sides. Any change in the contract requires redeployment on both ends. Therefore, gRPC is rarely used for public-facing applications, where a server may serve multiple types of clients.
HTTP/2
HTTP/1.1 establishes a single connection between the server and client, with all data transferred in order through this pipeline. To improve on this, HTTP/2 divides a connection into independent streams, allowing multiple requests and responses to be sent concurrently. For example:
- In the HTTP/1.1 context, dog.png is only downloaded after index.html has been fetched.
- In the HTTP/2 context, the requests are sent simultaneously through Stream 1 and Stream 2, and the resources can be downloaded together.
Behind the scenes, it still uses a single TCP connection, with each message tagged by a Stream ID. Messages with the same Stream ID are reassembled together, allowing multiple streams to run in parallel over one connection.
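A sketch with Node’s built-in http2 module, fetching the two resources from the example over one connection; the host is an assumption:

```typescript
import http2 from "node:http2";

// One TCP connection; each request opens an independent stream.
const session = http2.connect("https://example.com"); // assumed host

for (const path of ["/index.html", "/dog.png"]) {
  const stream = session.request({ ":path": path }); // Stream IDs are assigned automatically
  stream.on("data", () => { /* chunks of both resources interleave on the wire */ });
  stream.on("end", () => console.log("finished", path));
}
```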
Use Cases
Back to gRPC: it’s a protocol built on top of HTTP/2 and RPC, making it highly efficient for transmitting parallel requests.
Similar to WebSocket and Server-Sent Events, gRPC also maintains long-lived connections. However, the more complex the network connection is, the more resources it consumes. gRPC generally requires more computing power to manage parallel transmissions and assemble responses. If the service doesn’t require parallelism but instead values in-order actions, consider using WebSocket or Server-Sent Events instead.
Webhook
Webhook is an effective protocol for handling long-running requests. Its concept is similar to a function pointer in programming.
The client side registers callbacks (usually a URL) with the server, which are later invoked to deliver responses.
For example, a client registers the callback address site.com/callback; whenever the server needs to notify the client, it sends a request to site.com/callback.
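A sketch of both sides of this flow, assuming an Express service on the client side and a hypothetical /webhooks registration endpoint on the server:

```typescript
import express from "express";

const app = express();
app.use(express.json());

// The callback endpoint the client exposes; the server calls it
// only when the long-running task actually completes.
app.post("/callback", (req, res) => {
  console.log("task finished:", req.body);
  res.sendStatus(200);
});

app.listen(80);

// Registration: tell the server where to deliver the result.
// The /webhooks endpoint and payload shape are assumptions.
await fetch("https://payments.example.com/webhooks", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: "https://site.com/callback" }),
});
```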
This approach is ideal for tasks with unpredictable execution time, helping avoid resource waste due to long waits. For example, in payment processing, a client’s payment may pass through multiple banking systems (possibly in different countries), which can take a long time to complete.
Webhook is an elegant protocol. It’s highly efficient for real-time notification by reducing server load, as data is sent only when events occur, without the need for long-lived connections or polling mechanisms.
Use Cases
Unfortunately, this is impractical for serving end users, as they typically don’t have a public address for callbacks. Furthermore, in this model, the server side becomes the originator, and its availability is negatively impacted.
In practice, this protocol is often used to support external services, like Stripe Payments, where the system interacts with numerous uncontrolled clients. In such cases, solutions like a live WebSocket server or Long Polling would consume significant resources.