service-fabric
Prometheus exporter for Monitoring SF cluster
Has the community developed an exporter through which we can monitor cluster health and the state of Service Fabric applications and services?
I am still not clear on how to monitor the cluster and its underlying infrastructure. I am considering the Windows exporter for node metrics and plan to write an exporter for cluster-side metrics based on the API.
Any suggestions and recommendations will be appreciated.
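A minimal sketch of the API-based idea, for illustration only. It assumes the cluster's HTTP gateway is reachable (commonly on port 19080) and that the `GetClusterHealth` REST endpoint with `api-version=6.0` is available; the function names `cluster_health_url` and `fetch_cluster_health` are hypothetical, and a secured cluster would additionally need a client certificate configured on the SSL context.

```python
import json
import urllib.request
from urllib.parse import urlencode


def cluster_health_url(endpoint, api_version="6.0"):
    """Build the cluster health query URL for a management endpoint.

    Assumption: the endpoint is the cluster's HTTP gateway address,
    e.g. http://localhost:19080 for a local development cluster.
    """
    query = urlencode({"api-version": api_version})
    return f"{endpoint.rstrip('/')}/$/GetClusterHealth?{query}"


def fetch_cluster_health(endpoint, ssl_context=None, timeout=10):
    """Fetch the cluster health payload and return it as a dict.

    ssl_context is only needed for HTTPS endpoints; for a secured
    cluster it would have to carry the client certificate.
    """
    url = cluster_health_url(endpoint)
    with urllib.request.urlopen(url, timeout=timeout, context=ssl_context) as resp:
        return json.load(resp)
```

Node-level CPU, memory, and disk metrics would still come from the Windows exporter; this call only covers the cluster-side health view.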
@naveenkumarsp Have you read https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-best-practices-monitoring? We offer guidance on how to monitor infrastructure, cluster, and apps in the above doc for Windows and Linux.
- How to set up the Windows Azure Diagnostics (WAD) agent: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-diagnostics-event-aggregation-wad
- How to consume events/data from logs and set up monitors: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-diagnostics-oms-setup
Please read the above doc and let us know if the information you are looking for is not available in the doc.
Thanks for replying.
As an ops guy, I have been looking for a way to monitor the applications and services hosted on the SF cluster. I want to be notified when any issue or error occurs in the cluster.
Would WAD and the diagnostics agent help me achieve that?
You should consider a watchdog service, as mentioned in the documentation above. FabricObserver (FO) will generate health warnings at the service level (as ApplicationHealthReports) and the node (VM) level (as NodeHealthReports) when things get into a bad state, where you define what the things are and what bad means. Today, out of the box, these things are machine resource metrics at the process and VM level. Unlike monitoring services, FO provides port use information, which is critical if you run services that eat TCP ports as part of their daily diet.
https://aka.ms/sf/FabricObserver
Any updates here guys?
I made a PoC of a Prometheus exporter in Python, in which metrics are fetched over the API and exposed in Prometheus format. If enough people are interested, we could develop an exporter.
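The PoC code is not shown in this thread, so the following is only a rough sketch of what the conversion step of such an exporter might look like. It assumes the health payload has the shape returned by the cluster health REST query (`AggregatedHealthState`, `NodeHealthStates`, `ApplicationHealthStates`); the metric names `sf_cluster_health`, `sf_node_health`, and `sf_application_health` and the numeric mapping are hypothetical choices, not an established convention.

```python
# Hypothetical numeric mapping for health states; anything else maps to 3.
HEALTH_VALUES = {"Ok": 0, "Warning": 1, "Error": 2}
OTHER = 3


def _escape(value):
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(health):
    """Render a cluster health payload (dict) as Prometheus text format."""
    cluster_value = HEALTH_VALUES.get(health.get("AggregatedHealthState"), OTHER)
    lines = [
        "# HELP sf_cluster_health Aggregated health (0=Ok, 1=Warning, 2=Error, 3=other).",
        "# TYPE sf_cluster_health gauge",
        f"sf_cluster_health {cluster_value}",
    ]
    # One gauge per node and per application, labeled with the entity name.
    groups = (
        ("sf_node_health", "NodeHealthStates", "node"),
        ("sf_application_health", "ApplicationHealthStates", "application"),
    )
    for metric, key, label in groups:
        lines.append(f"# HELP {metric} Aggregated health per {label}.")
        lines.append(f"# TYPE {metric} gauge")
        for item in health.get(key) or []:
            name = _escape(item["Name"])
            value = HEALTH_VALUES.get(item.get("AggregatedHealthState"), OTHER)
            lines.append(f'{metric}{{{label}="{name}"}} {value}')
    return "\n".join(lines) + "\n"
```

The returned text could be served from a `/metrics` HTTP endpoint for Prometheus to scrape, and an alert on any gauge value above 0 would cover the "notify me when something goes wrong" case raised earlier.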