Event Streaming Platform

Messaging

We’ve previously used Messaging to illustrate system decoupling in the first topic.

Typically, Messaging refers to a shared message channel that facilitates communication between multiple processes or services.

[Diagram: Service 1 → Message Broker → Service 2]

Let’s look at two fundamental aspects of Messaging: Delivery and Retention.

Message Delivery

There are two common strategies for delivering messages to consumers:

Streaming

Streaming means messages are consumed one by one, immediately after being produced. This approach enables systems to react and process events as soon as they occur.

[Diagram: an event stream delivering Message 1 and Message 2 to a service, one by one]

Batching

In batching, messages are accumulated and processed together in groups. This can significantly reduce resource usage, since it removes the need for services to run continuously. Consumers can instead be lightweight, short-lived processes triggered on demand or at specific intervals.

[Diagram: an event stream grouping Message 1 and Message 2 into a batch, which a service then processes together]
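
To make the contrast concrete, here is a minimal Python sketch of the two delivery styles, with an in-memory queue standing in for the event stream and a hypothetical handle function:

import queue

def handle(message):                       # hypothetical processing step
    print("processed", message)

def stream_consumer(events: queue.Queue):
    # Streaming: handle each message immediately as it becomes available.
    while not events.empty():
        handle(events.get())

def batch_consumer(events: queue.Queue, batch_size: int = 3):
    # Batching: accumulate a group first, then process it in one pass.
    # A short-lived worker could run this on demand or on a schedule.
    batch = []
    while not events.empty() and len(batch) < batch_size:
        batch.append(events.get())
    for message in batch:
        handle(message)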

Message Retention

Another critical aspect is message retention, which describes how messages are stored and retained over time.

Message Queuing

The first approach uses a message queue, usually based on the classic first-in, first-out (FIFO) Queue Data Structure. Messages are temporarily stored and removed once they are consumed. For instance, messages can be consumed sequentially by services as shown below:

[Diagram: three snapshots of a message queue — Service 1 consumes Message 1, Service 2 consumes Message 2, then Service 1 consumes Message 3; each message disappears from the queue once consumed]

While this is efficient in terms of resource usage, it isn’t suitable for systems that require high reliability or audit trails. In those scenarios, messages are often considered valuable records of what occurred within the system.
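
A minimal sketch of this consume-and-remove behavior, using Python’s collections.deque as the queue:

from collections import deque

message_queue = deque(["Message 1", "Message 2", "Message 3"])

# Consuming removes each message from the queue permanently.
while message_queue:
    message = message_queue.popleft()   # FIFO: oldest message first
    print("consumed:", message)

print(len(message_queue))  # 0 -- no record remains of what was processed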

Message Durability

To address this, more robust solutions persist messages to physical storage, ensuring they are retained even after being consumed. A key feature is that a single message can be consumed many times by multiple consumers.

[Diagram: a message broker retains Message 1 and Message 2 even after consumption; Service 1 and Service 2 each consume both messages]
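
By contrast, here is a sketch of durable retention: consuming only reads, so every consumer can replay the full history.

# The log is persisted and never shrinks when it is read.
log = ["Message 1", "Message 2"]

def consume_all(consumer_name):
    for message in log:            # reading removes nothing
        print(consumer_name, "consumed:", message)

consume_all("Service 1")
consume_all("Service 2")           # the same messages, consumed again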

Event Streaming Platform

An Event Streaming Platform is a distributed implementation of Messaging, designed to offer high availability and fault tolerance.

Before diving deeper, let’s first clarify the concept of an Event.

Event

The term Message broadly refers to any piece of information exchanged within a system. Messages generally fall into two main categories:

  1. Command – A directive sent to the system, requesting it to perform a specific action.
  2. Event – A record of something that has already occurred.

Let’s consider a payment transaction as an example:

  • Command: The client begins the transaction by issuing a command such as InitiateTransaction.
  • Event: As the system processes the transaction, it generates events like AccountBalanceChanged, TransactionCompleted, or TransactionFailed.
[Diagram: a client sends InitiateTransaction to the Payment Service, which emits AccountBalanceChanged, TransactionCompleted, or TransactionFailed]
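
One common convention (an illustrative sketch, not prescribed by this example) is to name commands in the imperative and events in the past tense, for instance as plain data classes:

from dataclasses import dataclass

@dataclass
class InitiateTransaction:        # Command: imperative, requests an action
    account_id: str
    amount: int

@dataclass
class AccountBalanceChanged:      # Event: past tense, records what happened
    account_id: str
    new_balance: int

@dataclass
class TransactionCompleted:       # Event
    transaction_id: str
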
ℹ️
Many architectures prioritize durable storage of events and may even bypass persistent storage of commands entirely. This explains why the term Event is often preferred over Message. We’ll delve deeper into this in the Event-driven Architecture topic.

Streaming Platform

Briefly, an Event Streaming Platform is a messaging system that combines two key features:

  • Streaming: messages are delivered and consumed immediately after they’re produced, enabling near-real-time processing.
  • Durability: messages are durably stored in the underlying storage layer, allowing for replay.

Let’s explore how to build an Event Streaming Platform.

ℹ️
In the following section, we’ll focus on core concepts popularized by Apache Kafka, the industry’s most widely adopted solution today.

Event Streaming Cluster

An Event Streaming Platform operates as a decentralized cluster composed of multiple servers, commonly referred to as brokers.

[Diagram: a cluster of Broker 1, Broker 2, and Broker 3]

For example, Kafka prioritizes consistency over availability: one broker is elected as the Controller node using the Raft consensus algorithm (Kafka’s KRaft mode).

[Diagram: Broker 1 (Controller) replicating logs to Broker 2 and Broker 3 (Followers)]

Topic

A Topic is a logical grouping of events of the same type, simplifying event organization and management. For example, a topic named AccountCreated might contain events like the following:

AccountCreated:
- event1:
    accountId: acc1
    email: mylovelyemail@mail.com
    at: 00:01
- event2:
    accountId: acc2
    email: nottoday@mail.com
    at: 00:02

Append-only List

At a fundamental level, an Event Streaming Platform supports two operations: Produce (write) and Consume (read).

Events are stored as append-only files on brokers. Modifying an event in the middle of the log would require shifting all subsequent entries, which is inefficient and generally avoided. Thus, there are no update or delete operations.

[Diagram: a broker storing two append-only files, AccountCreated.events and BalanceChanged.events, each holding its existing events followed by space for new ones]
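
A minimal in-memory sketch of this model: the log exposes only an append (produce) and a positional read (consume), and nothing else.

class AppendOnlyLog:
    def __init__(self):
        self._entries = []

    def produce(self, event) -> int:
        # Write: events are only ever appended at the end.
        self._entries.append(event)
        return len(self._entries) - 1      # position of the new event

    def consume(self, offset: int) -> list:
        # Read: return all events from the given position onward.
        return self._entries[offset:]

    # No update or delete methods exist, by design.

log = AppendOnlyLog()
log.produce({"accountId": "acc1", "email": "mylovelyemail@mail.com"})
log.produce({"accountId": "acc2", "email": "nottoday@mail.com"})
print(log.consume(0))                      # both events, in append order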

Partition

Storing an entire topic on a single broker is inefficient. The storage and access load for each topic may vary significantly, potentially causing uneven load distribution across brokers.

To address this, topics are split into smaller units called partitions, which are distributed across brokers. This concept is similar to Sharding in a Peer-to-peer cluster.

[Diagram: a topic split into Partition 1, Partition 2, and Partition 3, distributed across Broker 1, Broker 2, and Broker 3]
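
A common way to route an event to a partition (a sketch in the spirit of Kafka’s default key-based partitioner) is to hash a key, so that related events — for example, all events for one account — always land on the same partition:

from zlib import crc32

NUM_PARTITIONS = 3

def partition_for(key: str) -> int:
    # Hash the event key deterministically, then map it to a partition.
    # Events with the same key always land on the same partition,
    # which preserves their relative order.
    return crc32(key.encode()) % NUM_PARTITIONS

print(partition_for("acc1"))   # always the same partition for acc1
print(partition_for("acc2"))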

Partition Replication

To ensure fault tolerance and prevent data loss, partitions must be replicated across multiple brokers.

For instance, Partition 1, Partition 2, and Partition 3 are each assigned to a primary broker and replicated to another broker:

[Diagram: each broker holds one primary partition plus the replica of another — Broker 1 has Partition 1 primary and Partition 2 replica, Broker 2 has Partition 2 primary and Partition 3 replica, Broker 3 has Partition 3 primary and Partition 1 replica]
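
A toy sketch of the placement above (hypothetical layout logic, not a real assignment protocol): each broker is primary for one partition and holds the replica of its neighbor’s.

NUM_BROKERS = 3
placement = {broker: [] for broker in range(1, NUM_BROKERS + 1)}

for p in range(1, NUM_BROKERS + 1):
    placement[p].append(f"Partition {p} primary")
    # The replica goes to the previous broker, wrapping around.
    replica_broker = p - 1 or NUM_BROKERS
    placement[replica_broker].append(f"Partition {p} replica")

for broker, partitions in placement.items():
    print(f"Broker {broker}:", partitions)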

Producing

Producing simply means appending events to the primary partition, which subsequently synchronizes them to the replicas.

[Diagram: a client produces an event to Broker 1 (Primary), which replicates it to Broker 2 (Replica)]

In-Sync Replica (ISR)

Replicas periodically fetch and compare data from the primary partition. This ensures that any newly added or previously corrupted replicas can catch up with the latest data.

In-Sync Replicas (ISR) are those replicas currently in sync with the primary partition. This is governed by a configurable time threshold. If a replica’s last fetch exceeds the threshold, it is considered out-of-sync.

For example, with an ISR threshold of 2 seconds, a replica that fetched data last at 00:02 while the primary is at 00:05 is out-of-sync.

[Diagram: Primary (Time = 00:05, ISR Threshold = 2s); Replica 1 (LastFetch = 00:04) is in sync; Replica 2 (LastFetch = 00:02) is out of sync]
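
The check itself is simple; here is a sketch of the example above, with clock times written as plain seconds:

ISR_THRESHOLD = 2  # seconds

def is_in_sync(primary_time: int, last_fetch: int) -> bool:
    # A replica is in sync if its last fetch is recent enough.
    return primary_time - last_fetch <= ISR_THRESHOLD

primary_time = 5                       # 00:05
print(is_in_sync(primary_time, 4))     # Replica 1 (00:04) -> True
print(is_in_sync(primary_time, 2))     # Replica 2 (00:02) -> False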

Acknowledgement (ACK) Levels

Similar to Quorum-based Consistency, a produce request includes an acknowledgement (ACK) setting. This determines how many in-sync replicas (including the primary partition) must successfully save the event before the producer receives a response.

There are three ACK levels:

  • ACK=0: No persistence is required. The primary partition responds instantly upon receiving the data from the producer, resulting in the lowest possible latency. However, this approach comes with the risk of data loss.
[Diagram: ACK = 0 — the primary responds immediately after receiving the event, and only then saves it]
  • ACK=1: Only the primary partition must save the data. If the partition fails before successfully replicating, data may be lost.
[Diagram: ACK = 1 — the primary saves the event and responds, then crashes before replicating, losing data]
  • ACK=ALL: All ISRs must save the data. This ensures no data loss even if the primary partition fails.
[Diagram: ACK = ALL — the primary saves, replicates to the in-sync replica, then responds; a crash afterwards causes no data loss]

Using ISRs (in-sync replicas) instead of all replicas is important because out-of-sync replicas might be slow or unavailable due to crashes or partitioning. Waiting for all replicas can degrade performance or block the partition entirely. Once a replica becomes in-sync again, it will fetch any missed events from the primary partition.
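
As a concrete illustration, this is how the ACK level is typically configured on the producer side with the confluent-kafka Python client (the broker address and topic name here are placeholders):

from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # placeholder address
    # "0" = fire-and-forget, "1" = primary only, "all" = every ISR
    "acks": "all",
})

def on_delivery(err, msg):
    # Invoked once the broker acknowledges (or rejects) the event.
    if err is not None:
        print("delivery failed:", err)
    else:
        print("stored at partition", msg.partition(), "offset", msg.offset())

producer.produce("AccountCreated", value=b'{"accountId": "acc1"}',
                 callback=on_delivery)
producer.flush()  # block until outstanding events are acknowledged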

ℹ️
Please keep the ACK settings in mind; this mechanism is crucial for understanding Delivery Semantics.

Consuming

Event Streaming typically uses Long Polling for event delivery. This approach decouples producers and consumers, improving system availability.

Commit Offset

In streaming systems, Offset refers to the sequential position of an event in an append-only log. Please note that event offsets are managed at the partition level, not globally across the entire topic.

[Diagram: Partition 1’s AccountCreated.events file holds event1 (offset 1, acc1) and event3 (offset 2, acc3); Partition 2’s file holds event2 (offset 1, acc2) and event4 (offset 2, acc4)]

To prevent duplicate processing, each partition keeps track of the last consumed offset for each consumer.

[Diagram: each partition stores an offsets table next to its events file — Partition 1 tracks consumer1 at lastOffset 1 and consumer2 at lastOffset 2, while Partition 2 tracks both consumers at lastOffset 1]

Consumers periodically fetch new events from partitions (those with offsets greater than the last one they processed). After handling these events, consumers commit or advance their offsets to ensure they do not reprocess the same data in future cycles.

[Diagram: a consumer consumes from a partition, receives an event, handles it, then commits the offset (offset = offset + 1)]
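
A sketch of this loop with the confluent-kafka Python client, committing offsets manually only after an event has been handled (handle_event, the group id, and the topic name are placeholders):

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # placeholder address
    "group.id": "consumer1",                # offsets are tracked per group
    "enable.auto.commit": False,            # commit only after handling
    "auto.offset.reset": "earliest",        # start from the beginning if new
})
consumer.subscribe(["AccountCreated"])

while True:
    msg = consumer.poll(timeout=1.0)        # 1-2. consume, receive an event
    if msg is None or msg.error():
        continue
    handle_event(msg.value())               # 3. handle (placeholder function)
    consumer.commit(message=msg)            # 4. advance the committed offset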

Consumer Group

A topic can have multiple partitions, making it inefficient for a single consumer to handle them all alone. Instead, a consumer group allows multiple consumers to read from different partitions in parallel.

All consumers in a group share the same group name and commit offsets collectively, ensuring each event is processed only once within the group.

For example:

  • The AccountCreated topic is divided into two partitions, and each partition keeps track of its own consumer offsets.
  • In Group A, each consumer is assigned to a different partition.
  • In Group B, there is only one consumer, so it processes all partitions.
  • In Group C, there are three consumers, which is more than the number of partitions, so one consumer remains idle.
[Diagram: the AccountCreated topic’s two partitions each track lastOffsets for Groups A, B, and C; Group A’s two consumers take one partition each, Group B’s single consumer reads both partitions, and the third consumer in Group C stays idle]
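
A simplified sketch of the assignment logic behind this example (real platforms run a rebalance protocol among group members; this just distributes partitions round-robin):

def assign(partitions, consumers):
    # Each partition goes to exactly one consumer in the group.
    # With more consumers than partitions, the extras stay idle.
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment

partitions = ["Partition 1", "Partition 2"]
print(assign(partitions, ["A1", "A2"]))        # Group A: one each
print(assign(partitions, ["B1"]))              # Group B: B1 reads both
print(assign(partitions, ["C1", "C2", "C3"]))  # Group C: C3 is idle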

Partition replicas serve solely for backup and recovery purposes; consumers must always read from the primary broker. Unlike traditional databases, where read operations do not alter the state of the system, in an Event Streaming Platform consumption is non-idempotent, because each read advances the consumer’s offset.
