Master-slave Architecture
Many systems experience heavy read workloads, where read operations significantly outnumber write operations. To meet this demand, we can design a database architecture with a single writer (Primary) and multiple readers (Replicas).
- The writer propagates changes to the replicas.
- The replicas can serve read requests independently, offloading the primary and improving read scalability.
This setup is commonly known as the Master-slave Architecture (good name 🧐).
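To make the flow concrete, here is a minimal, in-memory sketch in Python. The `Primary` and `Replica` classes are illustrative stand-ins for real database nodes, not an actual driver or replication protocol.

```python
import itertools

class Replica:
    """A read-only copy of the data, kept up to date by the primary."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        # Called by the primary when it propagates a change.
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)

class Primary:
    """The single writer; every change is propagated to all replicas."""
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:   # propagate the change to each replica
            replica.apply(key, value)

# Writes always go to the primary; reads are spread round-robin over replicas.
replicas = [Replica(), Replica(), Replica()]
primary = Primary(replicas)
read_cycle = itertools.cycle(replicas)

primary.write("user:1", "Alice")
print(next(read_cycle).read("user:1"))  # "Alice", served by a replica
```

Because each read touches only one replica, adding replicas adds read capacity without putting more load on the primary.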
Multi-master
Now, what happens if we allow multiple writers? Would that significantly improve write performance?
Perhaps surprisingly, no: this paradigm does not enhance write throughput. Every write operation must still be synchronized across all nodes in the cluster. This contrasts with read replicas, where each read request can be handled independently by a single replica.
The key advantage of a Multi-Master setup lies in higher availability. If one master fails, others can continue to process writes, avoiding downtime.
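A back-of-the-envelope model (with made-up numbers) shows why extra masters don't buy extra write throughput:

```python
# Illustrative numbers only: each node can apply C operations per second.
C = 10_000   # per-node capacity
N = 5        # number of masters / replicas

# Every write must be applied on all N masters, so the cluster performs
# N units of work per client write; the extra nodes buy no extra writes.
max_write_throughput = (N * C) // N    # still C, no matter how large N is

# Each read is served by a single replica, so read capacity scales with N.
max_read_throughput = N * C

print(max_write_throughput, max_read_throughput)   # 10000 50000
```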
SQL Paradox
The Master-slave model is most widely adopted by SQL databases, as a single writer makes it easier to maintain the strong consistency required for ACID transactions.
Because of this, Multi-Master setups are rarely used in practice. They don’t offer enough benefits to justify their complexity:
- If the masters collaborate to maintain ACID, they must compromise availability.
- If they replicate asynchronously, they risk violating ACID principles; for example, two masters may accept conflicting writes for the same record at the same time.
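The second failure mode is easy to reproduce in a toy sketch. Assume a deliberately simplified `Master` class whose replication to peers is deferred; two clients write to different masters at the same time, and after replication the masters disagree:

```python
class Master:
    """One of several masters; replication to peers is deferred (asynchronous)."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.pending = []                 # changes not yet sent to peers

    def write(self, key, value):
        self.data[key] = value            # acknowledged to the client immediately
        self.pending.append((key, value))

    def replicate_to(self, peer):
        for key, value in self.pending:   # later, changes are pushed to the peer
            peer.data[key] = value        # whichever write arrives last silently wins
        self.pending.clear()

a, b = Master("A"), Master("B")
a.write("balance", 100)                   # client 1 writes via master A
b.write("balance", 250)                   # client 2 writes via master B concurrently
a.replicate_to(b)
b.replicate_to(a)
print(a.data["balance"], b.data["balance"])  # 250 100: the masters have diverged
```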
Standby Promotion
Back to the Master-slave model: the master handles all updates, making it a Single Point of Failure that can affect system availability. To mitigate the impact of a master failure, we can introduce a Standby Server that is synchronously replicated from the master. In the event of a failure, we can quickly promote the standby to become the new master.
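Here is a minimal sketch of this failover, assuming a simplified `Cluster` wrapper: every write is applied to the standby before it is acknowledged, so the standby can take over without losing committed data.

```python
class Server:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

class Cluster:
    """A master plus a synchronously replicated standby."""
    def __init__(self, master, standby):
        self.master, self.standby = master, standby

    def write(self, key, value):
        # Synchronous replication: the change is applied on the standby
        # before the write is acknowledged, so no committed data is lost.
        self.master.data[key] = value
        self.standby.data[key] = value

    def promote_standby(self):
        """Failover: the standby takes over as the new master."""
        if not self.master.alive:
            self.master, self.standby = self.standby, None

cluster = Cluster(Server("master"), Server("standby"))
cluster.write("order:42", "paid")
cluster.master.alive = False              # simulate a master crash
cluster.promote_standby()
print(cluster.master.name, cluster.master.data)  # standby {'order:42': 'paid'}
```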
Centralized Cluster
The Master-slave model is often deployed as a centralized cluster, with a Coordinator that acts as the cluster’s entry point.
Since each server has a predefined role (master or replica), the Coordinator can:
- Route write requests to the master.
- Distribute read requests across replicas (aka load balancing).
Moreover, if the Master node becomes unresponsive, the Coordinator detects the failure and promptly promotes a server to take over its responsibilities.
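Putting the routing rules and the failover together, a Coordinator might look roughly like the sketch below. The `Node` and `Coordinator` classes and the string-based request type are simplifications for illustration, not a real implementation.

```python
import itertools

class Node:
    def __init__(self, name, role):
        self.name, self.role = name, role   # role: "master" or "replica"
        self.alive = True

class Coordinator:
    """Single entry point: routes requests and handles master failover."""
    def __init__(self, nodes):
        self.nodes = nodes
        self.read_cycle = itertools.cycle(n for n in nodes if n.role == "replica")

    @property
    def master(self):
        return next(n for n in self.nodes if n.role == "master")

    def route(self, request):
        if request == "write":
            self._check_master()
            return self.master            # all writes go to the master
        return next(self.read_cycle)      # reads are load-balanced over replicas

    def _check_master(self):
        if not self.master.alive:         # health check failed: promote a replica
            promoted = next(n for n in self.nodes if n.role == "replica" and n.alive)
            self.master.role, promoted.role = "replica", "master"

nodes = [Node("n1", "master"), Node("n2", "replica"), Node("n3", "replica")]
coord = Coordinator(nodes)
print(coord.route("write").name)          # n1
nodes[0].alive = False                    # the master goes down
print(coord.route("write").name)          # n2, promoted by the Coordinator
```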
Connection Pooling
Opening a new database connection is both slow and resource-intensive. If each user request triggers a new connection, it leads to performance issues.
Connection Pooling is a fundamental design pattern that enables the reuse of database connections. Instead of being terminated after each use, connections are kept in a pool for subsequent requests. A Pool Manager component serves as the central authority responsible for managing and coordinating these shared connections.
This functionality is integrated into the Coordinator to improve performance.
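A bare-bones pool can be sketched in a few lines of Python. Real pool managers (for example, PgBouncer or a driver's built-in pool) add timeouts, health checks, and sizing policies on top of the same idea; here a plain object stands in for a database connection so the sketch stays self-contained.

```python
import queue

class ConnectionPool:
    """Keeps a fixed set of connections open and hands them out on demand."""
    def __init__(self, create_connection, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):                 # connections are opened once, up front
            self._pool.put(create_connection())

    def acquire(self):
        return self._pool.get()               # blocks if every connection is in use

    def release(self, conn):
        self._pool.put(conn)                  # returned to the pool, not closed

# `object` stands in for a real connection factory in this sketch.
pool = ConnectionPool(create_connection=object, size=3)

conn = pool.acquire()
# ... run queries on `conn` ...
pool.release(conn)                            # reused by the next request
```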
Problems
The Master-slave model is simple and intuitive. Each component has a well-defined role, and the direct communication between nodes results in low latency and fast responses.
However, this simplicity conceals several critical issues, most of which stem from the centralized control of the master server:
- The master becomes the Single Point of Failure. Its failure halts all write operations; therefore, the Master-slave model does not guarantee High Availability.
- The master quickly becomes a performance bottleneck, especially in write-heavy applications.
In the next section, we’ll dig deeper into this challenge and explore a decentralized approach to building robust database clusters.