Abstract
- A monitoring system tracks overall health by observing predefined metrics and alerts you when a metric crosses an established threshold, so that issues can be addressed before they escalate
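The threshold-based alerting described above can be sketched in a few lines. This is only an illustration, not how any particular monitoring tool works: the `Threshold` class, the `check_thresholds` function, the metric names, and the limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str   # hypothetical metric name, e.g. "cpu_pct"
    limit: float  # alert when the observed value exceeds this limit


def check_thresholds(observed, thresholds):
    """Return one alert message for every metric whose value exceeds its limit."""
    alerts = []
    for t in thresholds:
        value = observed.get(t.metric)
        # Metrics with no observation are skipped rather than treated as failures.
        if value is not None and value > t.limit:
            alerts.append(f"ALERT {t.metric}: {value} exceeds {t.limit}")
    return alerts


# Assumed example values: a 1% error-rate limit and a 90% CPU limit.
thresholds = [Threshold("error_rate_pct", 1.0), Threshold("cpu_pct", 90.0)]
observed = {"error_rate_pct": 2.5, "cpu_pct": 40.0}
alerts = check_thresholds(observed, thresholds)
```

Here only the error rate is over its limit, so `alerts` contains a single message. A real system would typically also require the condition to hold for some duration before alerting, to avoid firing on brief spikes.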
4 Golden Monitoring Signals
- Latency: The time it takes for a request to travel from the client to the server, be processed, and return as a response
- Traffic: The number of requests a system receives over a specific period
- Error rate: The percentage of requests resulting in errors, such as 404 Not Found or 500 Internal Server Error
- Saturation: A measure of resource utilisation, including CPU, memory, and disk space
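As a rough sketch of how the four signals above could be derived from raw request data, the example below computes each one over a fixed time window. The `Request` record, the `golden_signals` function, the sample values, and the choice of average latency and CPU-only saturation are all assumptions made for illustration; production systems usually track latency percentiles and several resources.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float  # time from request sent to response received
    status: int        # HTTP status code of the response


def golden_signals(requests, window_s, cpu_used, cpu_total):
    """Compute the four signals for the requests seen in one time window."""
    count = len(requests)
    # Following the notes above, both 4xx and 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 400)
    return {
        "latency_avg_ms": sum(r.latency_ms for r in requests) / count if count else 0.0,
        "traffic_rps": count / window_s,
        "error_rate_pct": 100.0 * errors / count if count else 0.0,
        "saturation_pct": 100.0 * cpu_used / cpu_total,
    }


# Assumed sample: four requests in a 2-second window, 3 of 4 CPU cores busy.
sample = [
    Request(120.0, 200),
    Request(80.0, 200),
    Request(300.0, 500),
    Request(100.0, 404),
]
signals = golden_signals(sample, window_s=2.0, cpu_used=3.0, cpu_total=4.0)
```

For this sample the average latency is 150 ms, traffic is 2 requests per second, the error rate is 50%, and CPU saturation is 75%.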
Data points for optimisation
These data points allow us to easily evaluate overall performance and application health, enabling informed decisions about optimisation and scaling.