
Service Metrics

This tool gives you basic insights into the operational health of your ML service in production.


What is being tracked?

  • Traffic - the number of prediction requests the model receives over time
  • Latency - the average time the model takes to respond to prediction requests (in milliseconds)
  • Errors - the number of errors the model has made when serving predictions
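As a rough illustration of how these three metrics relate to raw request data, here is a minimal Python sketch that groups request records into one-minute buckets. It is not how this tool computes its charts: the `RequestRecord` type, its fields, and the `summarize` function are hypothetical names, and the `is_error` flag stands in for whatever your service counts as an error.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    """Hypothetical record of one prediction request (not part of this tool)."""
    timestamp_s: float  # Unix time at which the request arrived
    latency_ms: float   # time taken to respond, in milliseconds
    is_error: bool      # assumption: True if the request counts as an error


def summarize(records, bucket_s=60):
    """Group records into fixed-width time buckets and compute per-bucket metrics."""
    buckets = defaultdict(list)
    for record in records:
        # Round each timestamp down to the start of its bucket.
        bucket_start = int(record.timestamp_s // bucket_s) * bucket_s
        buckets[bucket_start].append(record)

    summary = {}
    for bucket_start, group in sorted(buckets.items()):
        summary[bucket_start] = {
            "traffic": len(group),
            "avg_latency_ms": sum(r.latency_ms for r in group) / len(group),
            "errors": sum(1 for r in group if r.is_error),
        }
    return summary


records = [
    RequestRecord(timestamp_s=0, latency_ms=10, is_error=False),
    RequestRecord(timestamp_s=30, latency_ms=30, is_error=True),
    RequestRecord(timestamp_s=61, latency_ms=50, is_error=False),
]
metrics = summarize(records)
```

Each key in `metrics` is the start of a time bucket, so plotting any one field against the keys gives a time series of the kind described above.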

Why is it being tracked?

  • These are the basic high-level metrics that inform us of the overall system health.

What steps should I take when I see an outlier?

  • A dip or spike in traffic needs to be investigated. For example, a dip could be due to a production model server going down; a spike could be an adversarial attack.
  • An increase in model latency also needs to be investigated. It could indicate that requests are queuing up because of a high number of queries per second (QPS).
  • An increase in error counts could, for example, point to data pipeline issues.
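If you export a metric as a list of per-interval values, one simple way to spot dips and spikes is to compare each value with the mean and standard deviation of the values just before it. The sketch below is only an illustration under that assumption: `flag_outliers`, `window`, and `threshold` are hypothetical names and defaults, and this is not the method the tool itself uses.

```python
from statistics import mean, stdev


def flag_outliers(values, window=5, threshold=3.0):
    """Return indices whose value deviates strongly from the preceding window.

    A point is flagged when it lies more than `threshold` standard deviations
    from the mean of the previous `window` points. Both parameters are
    illustrative defaults and would need tuning for real traffic.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history: treat any change as an outlier.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Requests per minute, with a sudden dip at index 6.
traffic = [100, 102, 98, 101, 99, 100, 10, 100]
suspicious = flag_outliers(traffic)
```

The same function can be applied to latency or error counts. Because the check is symmetric, it flags both dips and spikes, which matches the guidance above that either direction deserves a look.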


Join our community Slack to ask any questions.
