Domino monitoring

Monitoring Domino involves tracking several key application metrics. These metrics reveal the health of the application and can provide advance warning of any issues or failures of Domino components.




Metrics

Domino recommends tracking these metrics in priority order:

Metric | Suggested threshold | Notes
Latency to /health | 1000 ms | Measures the time taken to receive a response from the Domino API server. A high response time suggests that the system is unhealthy and that user experience might be impacted. Measure it by sending requests to the Domino application at the path /health.
Dispatcher pod availability from metrics server | nucleus-dispatcher pods available = 0 for > 10 minutes | If the number of available pods in the nucleus-dispatcher deployment is 0 for more than 10 minutes, it indicates critical issues that Domino will not automatically recover from, and functionality will be degraded.
Frontend pod availability from metrics server | nucleus-frontend pods available < 2 for > 10 minutes | If the number of available pods in the nucleus-frontend deployment is less than 2 for more than 10 minutes, it indicates critical issues that Domino will not automatically recover from, and functionality will be degraded.
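As an illustration of the first metric, the following is a minimal sketch of a latency probe against /health, written in Python with only the standard library. It is not a Domino-provided tool. The function names, the 5-second timeout, and the example hostname in the usage note are assumptions made for this sketch; only the /health path and the 1000 ms threshold come from the table above.

```python
import time
import urllib.request
from typing import Callable

# Suggested threshold from the metrics table, in milliseconds.
LATENCY_THRESHOLD_MS = 1000.0


def fetch_health(base_url: str, timeout_s: float = 5.0) -> int:
    """Send a GET request to <base_url>/health and return the HTTP status code.

    The timeout value is an assumption for this sketch, not a Domino default.
    """
    url = base_url.rstrip("/") + "/health"
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return resp.status


def measure_latency_ms(probe: Callable[[], object]) -> float:
    """Run the probe once and return the elapsed wall-clock time in milliseconds."""
    start = time.perf_counter()
    probe()
    return (time.perf_counter() - start) * 1000.0


def latency_is_healthy(latency_ms: float,
                       threshold_ms: float = LATENCY_THRESHOLD_MS) -> bool:
    """Return True when the measured latency is within the suggested threshold."""
    return latency_ms <= threshold_ms
```

A probe could then be run as `measure_latency_ms(lambda: fetch_health("https://domino.example.com"))`, where the hostname is a placeholder for your own deployment. Separating the timing from the HTTP call keeps the threshold logic easy to exercise without network access.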

Many application monitoring tools can be used to track these metrics.




Alerting

Configure alerts that notify your application administrators whenever a metric crosses one of the thresholds listed above. Such an alert indicates potential resourcing issues or unusual usage patterns that are worth investigating. Refer to the Domino application logs, the Domino administration UI, and the Domino Control Center to gather additional information.
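The pod availability thresholds only apply when the condition persists for more than 10 minutes, so an alert rule needs to track how long a breach has lasted. The following Python sketch shows one way to evaluate that logic over a series of samples. The `PodRule` class, the `should_alert` function, and the sample format are hypothetical names introduced for illustration. The sketch assumes that available pod counts are collected elsewhere, for example from your metrics server or from the `status.availableReplicas` field of each Kubernetes deployment.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple

# A breach must last longer than 10 minutes before an alert fires.
SUSTAIN_SECONDS = 10 * 60


@dataclass(frozen=True)
class PodRule:
    """Hypothetical rule: a deployment name plus a predicate on available pods."""
    deployment: str
    breached: Callable[[int], bool]


# Thresholds taken from the metrics table.
RULES = [
    PodRule("nucleus-dispatcher", lambda available: available == 0),
    PodRule("nucleus-frontend", lambda available: available < 2),
]


def should_alert(rule: PodRule,
                 samples: Iterable[Tuple[float, int]],
                 sustain_s: float = SUSTAIN_SECONDS) -> bool:
    """Return True if the latest sample ends a breach lasting more than sustain_s.

    samples: (unix_timestamp_seconds, available_pods) pairs in ascending time order.
    """
    breach_start = None
    fire = False
    for timestamp, available in samples:
        if rule.breached(available):
            if breach_start is None:
                breach_start = timestamp  # first sample of a new breach
            fire = (timestamp - breach_start) > sustain_s
        else:
            breach_start = None  # recovery resets the timer
            fire = False
    return fire
```

For example, `should_alert(RULES[1], [(0, 1), (300, 1), (601, 1)])` reports a sustained frontend breach, while a single recovered sample in the middle of the series resets the timer and suppresses the alert.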