Scalability and Availability Basics

  • Author: Trần Trung
  • Published On: 21 Jun 2025

When building a software system, scalability and availability are two of the most important factors to consider. This article explains both concepts, the trade-offs they involve, and how they affect system design.

1. Understand Performance and Scalability

Performance is the ability of a system to complete a particular task quickly and efficiently. Typical measures include the time a web page takes to load and the number of database queries a system can process per second.

Scalability is the ability of a system to handle increasing workloads by adding resources. As the number of users or the volume of data grows, a scalable system maintains acceptable performance without changes to its underlying architecture.

For example:

  • A website performs well when it loads quickly for a small number of users.
  • A website is said to be scalable when it still loads quickly even when the number of users increases many times over.
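
To make "adding resources" concrete, here is a rough capacity estimate in Python. It is only a sketch: the function name `servers_needed`, the `headroom` parameter, and the figure of 500 requests per second per server are all made up for illustration, and it assumes capacity grows linearly with the number of servers, which real systems rarely achieve.

```python
import math


def servers_needed(target_rps: float, per_server_rps: float, headroom: float = 0.0) -> int:
    """Estimate how many identical servers a target load requires.

    Assumes capacity grows linearly with server count (an idealization).
    `headroom` is the fraction of each server's capacity kept in reserve.
    """
    if per_server_rps <= 0 or not 0 <= headroom < 1:
        raise ValueError("per_server_rps must be > 0 and headroom in [0, 1)")
    usable_rps = per_server_rps * (1 - headroom)
    return math.ceil(target_rps / usable_rps)


# Hypothetical numbers: each server handles 500 requests per second.
for load in (1_000, 10_000):
    print(load, "req/s ->", servers_needed(load, per_server_rps=500), "servers")
```

If the load grows tenfold and the answer is simply "ten times as many servers", the system scales well. If the answer is "redesign the database", it does not.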

2. Latency and Throughput

Latency is the time it takes to complete a single operation, for example the time between sending an API request and receiving its result. Low latency is important for user experience, especially in real-time interactive applications.

Throughput is the number of operations a system can process in a given amount of time, for example the number of HTTP requests a web server can handle per second. High throughput indicates that the system can handle a large amount of work efficiently.

For example:

  • A low-latency system can process a request in milliseconds.
  • A high-throughput system can handle thousands of requests per second.
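
The two metrics can be measured separately, as the Python sketch below shows. The helper names `measure` and `throughput_at` are invented for this example, and the workload is a stand-in for a real request. The second helper uses Little's law, rearranged as throughput = requests in flight / latency, which holds for a system in steady state.

```python
import time


def measure(operation, runs: int):
    """Run `operation` sequentially; return (mean latency in s, throughput in ops/s)."""
    latencies = []
    start = time.perf_counter()
    for _ in range(runs):
        began = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - began)
    elapsed = time.perf_counter() - start
    return sum(latencies) / runs, runs / elapsed


def throughput_at(latency_s: float, concurrency: int) -> float:
    """Little's law rearranged: throughput = requests in flight / latency."""
    return concurrency / latency_s


# Stand-in workload; a real test would call the API or query being measured.
mean_latency, ops_per_second = measure(lambda: sum(range(10_000)), runs=200)
print(f"mean latency: {mean_latency * 1000:.3f} ms, throughput: {ops_per_second:.0f} ops/s")

# Same 50 ms latency, very different throughput depending on concurrency.
print(throughput_at(0.050, concurrency=1))
print(throughput_at(0.050, concurrency=100))
```

The last two lines show why the metrics are related but not interchangeable: a system can keep the same latency per request and still raise its throughput by serving more requests in parallel.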

3. Availability and Consistency

Availability is the ability of a system to operate and serve users continuously. A highly available system rarely experiences interruptions or downtime. For example, a system designed for 99.99% availability ("four nines") can be down for at most about 52.6 minutes per year.
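
The downtime budget for any availability target is simple arithmetic. The Python sketch below assumes a 365-day year, and the function name `max_downtime_minutes` is chosen only for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {max_downtime_minutes(target):.1f} minutes per year")
```

Each additional nine cuts the allowed downtime by a factor of ten, which is why every extra nine is considerably harder to achieve than the one before it.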

Consistency means that all copies of the data agree, so that every read returns the most recent write. This is especially important in distributed systems, where data is stored on multiple servers.

For example:

  • A highly available system is always ready to serve user requests, even if some components fail.
  • A strongly consistent system ensures that when one user changes data, all other users see that change immediately.

4. CAP Theorem

The CAP theorem is one of the fundamental principles in distributed system design. It states that a distributed system can guarantee at most two of the following three properties at the same time:

  • Consistency: All nodes in the system see the same data at the same time.
  • Availability: The system always responds to requests, even if some nodes fail.
  • Partition Tolerance: The system continues to operate even if a network failure splits the nodes into groups that cannot communicate with each other.

In practice, network partitions cannot be ruled out in a distributed environment, so partition tolerance is effectively a prerequisite. The real choice, when a partition occurs, is between consistency and availability.

Figure: The CAP theorem, showing Availability (A), Consistency (C), and Partition Tolerance (P).

5. CP - Consistency and Partition Tolerance

A CP system prioritizes consistency and partition tolerance. This means that during a network partition, the system may reject some requests to ensure that the data remains consistent.

For example:

  • Traditional relational databases (RDBMS), when replicated synchronously, typically behave as CP systems.
  • In the event of a network failure, the system may lock some records or refuse writes to ensure consistency.
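
The CP behavior described above can be sketched with a toy in-memory store in Python. Everything here is hypothetical (the `CPStore` class, the `Unavailable` exception, the `partition` helper) and it is not how any particular database is implemented. It shows one common way to get the CP trade-off: requiring a majority quorum for every read and write.

```python
class Unavailable(Exception):
    """Raised when a request is rejected to protect consistency."""


class CPStore:
    """Toy replicated key-value store that requires a majority quorum."""

    def __init__(self, replica_count: int) -> None:
        # Each replica maps key -> (version, value).
        self.replicas = [{} for _ in range(replica_count)]
        self.reachable = set(range(replica_count))

    def partition(self, unreachable: set) -> None:
        """Simulate a network failure that cuts off the given replicas."""
        self.reachable = set(range(len(self.replicas))) - set(unreachable)

    def _quorum(self) -> list:
        needed = len(self.replicas) // 2 + 1
        if len(self.reachable) < needed:
            raise Unavailable("no majority reachable; request rejected")
        return [self.replicas[i] for i in sorted(self.reachable)]

    def write(self, key, value) -> None:
        replicas = self._quorum()
        version = max(r.get(key, (0, None))[0] for r in replicas) + 1
        for r in replicas:
            r[key] = (version, value)

    def read(self, key):
        replicas = self._quorum()
        # Any two majorities overlap, so the newest version is always visible.
        return max((r.get(key, (0, None)) for r in replicas), key=lambda e: e[0])[1]


store = CPStore(3)
store.write("balance", 100)
store.partition({1, 2})  # only replica 0 is reachable now
try:
    store.write("balance", 50)
except Unavailable as err:
    print("rejected:", err)
```

With only one of three replicas reachable, the write is refused rather than applied to a minority, so no client can later read a value that the rest of the cluster never saw.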

6. AP - Availability and Partition Tolerance

An AP system prioritizes availability and partition tolerance. This means that during a network partition, the system continues to serve requests, but may return stale or inconsistent data.

For example:

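  • NoSQL databases like Cassandra and Couchbase are often classified as AP, although the exact behavior depends on configuration (for example, Cassandra's per-request consistency level).
  • In the event of a network failure, the system still accepts write requests, but the data may not be synchronized immediately across all nodes.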
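
The AP behavior can be sketched the same way in Python. The `APReplica` class and `sync` function are hypothetical, and the timestamps are plain integers supplied by the caller. Conflicts are resolved with last-write-wins, which is only one possible reconciliation rule. The sketch also assumes no two writes to the same key share a timestamp.

```python
class APReplica:
    """Toy replica that always accepts writes, even when cut off from its peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp: int) -> None:
        # Never rejected: availability is preferred over consistency.
        self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

    def merge(self, other: "APReplica") -> None:
        """Copy over any entry that is newer on the other replica (last-write-wins)."""
        for key, entry in other.data.items():
            if key not in self.data or entry[0] > self.data[key][0]:
                self.data[key] = entry


def sync(a: APReplica, b: APReplica) -> None:
    """Run once the partition heals so both replicas converge."""
    a.merge(b)
    b.merge(a)


left, right = APReplica("left"), APReplica("right")

# During the partition, each side accepts a write the other cannot see.
left.write("cart", ["book"], timestamp=1)
right.write("cart", ["pen"], timestamp=2)
print(left.read("cart"), right.read("cart"))  # the replicas disagree

sync(left, right)
print(left.read("cart"), right.read("cart"))  # the replicas agree again
```

Both writes succeed during the partition, and the replicas converge once they can communicate again. The cost is that the older write is silently discarded, which is the kind of inconsistency an AP design has to accept or handle with a smarter merge rule.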

Conclusion

Understanding scalability, availability, and the CAP theorem is important when designing software systems, especially distributed ones. The choice between CP and AP depends on the specific requirements of the application: whether it is worse for users to see stale data or to have a request rejected.

To better understand how to apply these principles in practice, you can refer to the articles on microservices, database scaling, and asynchronous processing.