Abstract


Availability (可用性)


Scalability (可扩展性)


  • Refers to the capability of a system to handle a growing amount of work, or its potential to be enlarged to accommodate that growth
  • It lets the system absorb increased load efficiently, either by adding resources or by optimizing existing ones, so that it can serve a larger user base or data volume while remaining available
  • Can be achieved with cache servers, stateless compute servers, message queues (消息队列), and database scaling
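
As one illustration of the cache server idea above, here is a minimal cache-aside sketch. The names `SlowDatabase` and `CachedReader` are hypothetical stand-ins, and the in-process dictionary only approximates what a real cache server would do; the point is that repeated reads stop reaching the database.

```python
class SlowDatabase:
    """Hypothetical stand-in for a database; counts reads that reach it."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)


class CachedReader:
    """Cache-aside: look in the cache first, fall back to the database on a miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # assumption: a plain dict stands in for a cache server

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value


db = SlowDatabase({"user:1": "Ada"})
reader = CachedReader(db)
results = [reader.get("user:1") for _ in range(5)]
```

Five reads are served, but only the first one touches the database, which is how a cache frees database capacity as load grows.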

Vertical Scaling

  • Adding more CPU and main memory (RAM) to a single server
  • Simple to implement; a good option when traffic is low

Vertical Scaling Limitations

Hard Limit

No Failover

Expensive

  • Powerful servers are much more expensive

Horizontal Scaling

CAP Theorem in distributed systems

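A distributed system can guarantee at most two of the following three properties at the same time. Because network partitions cannot be ruled out in practice, the real choice is between consistency and availability while a partition lasts.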

Consistency

  • All nodes display identical data, guaranteeing that reads always reflect the most recent write.

Availability

  • Every request to a working node receives a non-error response, with no guarantee that it reflects the most recent write.

Partition Tolerance

  • The system continues to operate even when network failures drop or delay messages between nodes.

If a system prioritizes consistency (CP), it may become unavailable during a partition to ensure that data remains consistent across nodes.

If a system prioritizes availability (AP), it may sacrifice consistency during a partition, allowing nodes in different partitions to respond even though they might not have the latest data.
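
The CP and AP behaviors described above can be sketched with a toy two-replica model. Everything here (`Replica`, `make_pair`, `demo`, the `partitioned` flag) is a hypothetical illustration, not how any particular database implements replication.

```python
class Replica:
    """Toy replica that copies each write to its peer unless partitioned."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False  # True when the peer cannot be reached

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the replicas diverge.
                raise RuntimeError(f"{self.name}: unavailable during partition")
            self.value = value    # AP: accept locally; the peer is now stale
            return
        self.value = value
        self.peer.value = value   # no partition: replicate synchronously

    def read(self):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError(f"{self.name}: unavailable during partition")
        return self.value


def make_pair(mode):
    a, b = Replica("a", mode), Replica("b", mode)
    a.peer, b.peer = b, a
    return a, b


def demo(mode):
    """Write once, cut the link, then attempt a second write on replica a."""
    a, b = make_pair(mode)
    a.write("v1")
    a.partitioned = b.partitioned = True
    try:
        a.write("v2")
    except RuntimeError:
        return ("rejected", None, None)
    return ("accepted", a.read(), b.read())


ap_result = demo("AP")
cp_result = demo("CP")
```

In AP mode the write is accepted and the two replicas disagree (`"v2"` versus `"v1"`); in CP mode the write is rejected, so the data stays consistent at the cost of availability.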

Fault Tolerance (容错性)


Single Point of Failure

  • A part of a system that, if it fails, will stop the entire system from working
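
A minimal sketch of removing a single point of failure through redundancy: the caller tries each server in turn and only fails when all of them are down. The functions `call_with_failover`, `healthy`, and `down` are hypothetical names used for illustration.

```python
def call_with_failover(servers, request):
    """Try each server in order; fail only if every one of them fails."""
    last_error = None
    for server in servers:
        try:
            return server(request)
        except ConnectionError as err:
            last_error = err  # remember the failure and try the next server
    raise RuntimeError("all servers failed") from last_error


def healthy(request):
    return f"handled {request}"


def down(request):
    raise ConnectionError("server down")


# With a redundant replica, one failed server no longer stops the system.
response = call_with_failover([down, healthy], "GET /")

# With a single server, that server is a single point of failure.
try:
    call_with_failover([down], "GET /")
    single_server_failed = False
except RuntimeError:
    single_server_failed = True
```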

Reliability (可靠性)


  • Refers to the ability of a system to perform a specified function without failure over a specified period
  • It ensures consistent and predictable behavior of a system. It involves minimizing the chances of failures and, in case of failures, having mechanisms in place for quick recovery
  • Can be achieved with monitoring and automation, such as a CI/CD pipeline

Efficiency


Latency

The delay between sending a request and receiving the first response.

Throughput

The number of operations completed per unit of time.
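
The two metrics can be measured with a short timing loop. This is a rough sketch: `measure` is a hypothetical helper, the operation is simulated with `time.sleep`, and latency is taken as the time for one whole call to return, which matches "delay in first response" only when an operation produces a single response.

```python
import time


def measure(operation, count):
    """Run `operation` sequentially `count` times and report both metrics."""
    latencies = []
    start = time.perf_counter()
    for _ in range(count):
        t0 = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "mean_latency_s": sum(latencies) / count,
        "max_latency_s": max(latencies),
        "throughput_ops_per_s": count / elapsed,
    }


# Assumption: a 1 ms sleep stands in for a real request.
stats = measure(lambda: time.sleep(0.001), 50)
```

Because the calls run one after another, throughput here is roughly the inverse of mean latency. With concurrent requests the two diverge: throughput can rise while per-request latency stays the same or gets worse.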

References