Abstract


  • Traces help us track the flow of requests through various services and components of the system
  • A trace is made up of one or more Spans
  • Once we identify the span responsible for the latency, we can proceed with optimisation
  • We can use tools such as Datadog APM, Tempo, and Zipkin
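
To make the trace/span relationship concrete, here is a minimal sketch in plain Python. It is not the data model of Datadog, Tempo, or Zipkin; the `Span` class, its fields, and the sample values are hypothetical. It assumes child spans run one after another, so a span's own time is its duration minus the durations of its direct children.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    # Hypothetical, simplified span: real tracers carry many more fields.
    span_id: str
    name: str
    duration_ms: float
    parent_id: Optional[str] = None  # None marks the root span of the trace


def self_time_ms(span: Span, trace: List[Span]) -> float:
    """Time spent in the span itself, excluding its direct children.

    Assumes children run sequentially (no overlap).
    """
    children = sum(s.duration_ms for s in trace if s.parent_id == span.span_id)
    return span.duration_ms - children


def slowest_span(trace: List[Span]) -> Span:
    """Return the span with the largest self time, i.e. the latency suspect."""
    return max(trace, key=lambda s: self_time_ms(s, trace))


# One trace: a root request span with two child spans.
trace = [
    Span("a", "GET /checkout", 120.0),
    Span("b", "auth.verify", 15.0, parent_id="a"),
    Span("c", "db.query", 90.0, parent_id="a"),
]

print(slowest_span(trace).name)  # prints "db.query"
```

Ranking by self time rather than total duration avoids always blaming the root span, which by definition covers the whole request.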

Terminologies


Runtime Metrics

Instrumented

  • An application is instrumented when code or tools have been added to it to monitor, measure, or analyze its behavior during execution
  • Instrumentation provides insights into the application’s performance, functionality, and other operational characteristics
  • This is particularly useful for debugging, performance tuning, and monitoring purposes
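
As an illustration of what "instrumented" means in practice, the sketch below wraps a function so that every call records its duration and any error. It uses only the standard library and is not the API of any tracing library; `instrumented`, `RECORDED`, and `order_total` are hypothetical names, and a real tracer would send this data to an agent rather than keep it in a list.

```python
import functools
import time

# Hypothetical in-memory sink standing in for a tracing agent.
RECORDED = []


def instrumented(operation):
    """Decorator that records duration and error type for each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            error = None
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                error = type(exc).__name__
                raise  # the caller still sees the original exception
            finally:
                # Runs on both success and failure.
                RECORDED.append({
                    "operation": operation,
                    "duration_ms": (time.perf_counter() - start) * 1000,
                    "error": error,
                })
        return wrapper
    return decorator


@instrumented("orders.total")
def order_total(prices):
    return sum(prices)


order_total([5, 10])
print(RECORDED[-1]["operation"], RECORDED[-1]["error"])
```

The business logic in `order_total` is unchanged; the measurement is layered around it, which is why instrumentation is useful for debugging and performance tuning without rewriting the code under observation.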

Pipeline

Host-side

  • We can tune the Sampling
    • Library Sampling overrides Agent Sampling
  • Trace Metrics are Metrics computed directly from the Instrumented application, based on 100% of the app’s traffic regardless of Sampling
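
The two host-side points can be sketched as a toy model. This is a deliberate simplification, not Datadog's actual sampling algorithm: the function names are hypothetical, "overrides" is assumed to mean that a library rate, when set, is used instead of the agent rate, and the keep decision is a simple modulo on the trace id so the result is deterministic.

```python
from typing import Iterable, List, Optional, Tuple


def effective_sample_rate(library_rate: Optional[float], agent_rate: float) -> float:
    """Assumption: a library-level rate, when configured, wins over the agent rate."""
    return agent_rate if library_rate is None else library_rate


def keep_trace(trace_id: int, rate: float) -> bool:
    """Toy deterministic sampler: keep the lowest `rate` fraction of 100 buckets."""
    return (trace_id % 100) < rate * 100


def process(trace_ids: Iterable[int],
            library_rate: Optional[float],
            agent_rate: float) -> Tuple[int, List[int]]:
    rate = effective_sample_rate(library_rate, agent_rate)
    hits = 0   # trace metric: counts every request, before any sampling
    kept = []  # traces that survive sampling and would be sent onward
    for trace_id in trace_ids:
        hits += 1
        if keep_trace(trace_id, rate):
            kept.append(trace_id)
    return hits, kept


hits, kept = process(range(1000), library_rate=0.1, agent_rate=0.5)
print(hits, len(kept))  # prints "1000 100"
```

The hit count stays at 1000 whatever the sampling rate, which is the sense in which Trace Metrics reflect 100% of traffic while only a fraction of traces is kept.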

Datadog backend side

  • Live Search allows us to search Spans using any tag on any Span
  • We can generate Custom Metrics from Spans
  • Retention Filters: rules that decide which traces we want to retain
  • Dashboards give a visual representation of the app for optimisation and debugging
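
The backend-side features above can be pictured with a small in-memory model. This is only a sketch of the ideas (search by tag, a count metric derived from spans, tag-based retention); it does not call or mirror the Datadog API, and the span dictionaries, tag names, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical ingested spans, each with a free-form tag dictionary.
spans = [
    {"service": "checkout", "name": "db.query", "tags": {"env": "prod", "error": False}},
    {"service": "checkout", "name": "http.request", "tags": {"env": "prod", "error": True}},
    {"service": "auth", "name": "auth.verify", "tags": {"env": "staging", "error": False}},
]


def matches(span, tags):
    """True when the span carries every requested tag with the given value."""
    return all(span["tags"].get(key) == value for key, value in tags.items())


def search(all_spans, tags):
    """Search by tag: return the spans matching all given tags."""
    return [s for s in all_spans if matches(s, tags)]


def count_metric(all_spans, group_by):
    """Custom metric: number of spans per value of a top-level field."""
    return Counter(s[group_by] for s in all_spans)


def retain(all_spans, filters):
    """Retention: keep a span if at least one filter matches it."""
    return [s for s in all_spans if any(matches(s, f) for f in filters)]


print(len(search(spans, {"env": "prod"})))
print(count_metric(spans, "service"))
print([s["name"] for s in retain(spans, [{"error": True}])])
```

A dashboard would then chart values such as the output of `count_metric` over time.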