CS Notes


Monitoring

Last updated on Nov 16, 2024

system_design
devops
binance

Abstract

Monitoring tracks the overall health of a system by observing predefined metrics and alerting you when a metric crosses an established threshold, so that issues can be addressed before they escalate.
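As a minimal sketch of threshold-based alerting, the snippet below compares observed metric values against predefined limits and returns an alert message for each breach. The metric names, the limits, and the `Threshold` and `check_thresholds` helpers are all hypothetical, chosen only for illustration; a real setup would normally rely on a monitoring tool's own alert rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str   # hypothetical metric name, e.g. "error_rate"
    limit: float  # an alert fires when the observed value exceeds this limit


def check_thresholds(observed: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return one alert message per metric whose observed value exceeds its limit."""
    alerts = []
    for t in thresholds:
        value = observed.get(t.metric)
        # Metrics with no reading are skipped rather than treated as a breach.
        if value is not None and value > t.limit:
            alerts.append(f"ALERT {t.metric}: {value} exceeds limit {t.limit}")
    return alerts


# Illustrative limits and readings (assumed values, not from any real system).
thresholds = [Threshold("error_rate", 0.05), Threshold("cpu_utilisation", 0.90)]
observed = {"error_rate": 0.08, "cpu_utilisation": 0.42}

alerts = check_thresholds(observed, thresholds)
print(alerts)
```

Here only `error_rate` is above its limit, so a single alert is produced. Skipping missing readings is a design choice for this sketch; in practice a missing metric may itself be worth alerting on.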

4 Golden Monitoring Signals

Latency: The time it takes for a request to travel from the client to the server and back
Traffic: The number of requests a system receives over a specific period
Error rate: The percentage of requests resulting in errors, such as 404 Not Found or 500 Internal Server Error
Saturation: A measure of resource utilisation, including CPU, memory, and disk space
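The four signals can be derived from fairly simple inputs. The sketch below assumes a hypothetical `Request` record (latency in milliseconds plus an HTTP status code), a fixed observation window, and a CPU utilisation reading supplied from elsewhere. It reports latency as a nearest-rank 95th percentile and counts any status of 400 or above as an error; both are assumptions made for illustration rather than a fixed definition.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    latency_ms: float  # time taken to serve the request
    status: int        # HTTP status code of the response


def golden_signals(requests: list[Request], window_s: float, cpu_utilisation: float) -> dict[str, float]:
    """Summarise the four golden signals for one observation window."""
    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank 95th percentile: the value at position ceil(0.95 * n).
    rank = max(1, math.ceil(0.95 * len(latencies)))
    p95 = latencies[rank - 1] if latencies else 0.0
    # Assumption: 4xx and 5xx responses both count as errors.
    errors = sum(1 for r in requests if r.status >= 400)
    return {
        "latency_p95_ms": p95,
        "traffic_rps": len(requests) / window_s,
        "error_rate": errors / len(requests) if requests else 0.0,
        "saturation": cpu_utilisation,
    }


# Four illustrative requests observed over a 2-second window.
sample = [Request(20, 200), Request(35, 200), Request(50, 404), Request(400, 500)]
signals = golden_signals(sample, window_s=2.0, cpu_utilisation=0.75)
print(signals)
```

Saturation is passed in directly because it comes from the host rather than from request logs. A fuller version might track memory and disk alongside CPU.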

Data points for optimisation

These data points make it easy to evaluate overall performance and application health, which supports informed decisions about optimisation and scaling.

References

Observability vs. Monitoring - YouTube



Created by Xinyang YU | © 2023, 2025 | Licensed under CC BY-NC 4.0
