promitor icon indicating copy to clipboard operation
promitor copied to clipboard

feat: SDK migration single resource scraping

Open hkfgo opened this issue 10 months ago • 13 comments

This PR implements the Azure Monitor SDK migration. It migrates away from the deprecated Azure Fluent SDK to Azure SDK for .NET. Specifically, it uses the Azure.Monitor.Query package to implement integration. Since the new SDK is essentially a different wrapper around the same REST API, I'd expect identical behavior in terms of

  • Billing(free through ARM)
  • Metric querying
  • Meta-level metrics to track ARM throttling
  • Meta-level metrics to track usage

A summary of how I did it:

  • Abstracted Azure Monitor integration through the IAzureMonitorClient interface.
  • Implemented Azure Monitor integration using the new SDK, under the IAzureMonitorClient interface
  • Implemented high-level control flow to use either the new client or the legacy client, depending on feature flag

Things I need help with:

  • If this looks like a good approach to you
  • I need some pointers on how to test this :( Specifically, how do you think end-to-end testing should be performed for a major PR like this? I was thinking either integration tests or building a custom branch image and deploying to our test environment. I'm not quite sure how to do either of them though. Some help would be much appreciated!

hkfgo avatar Apr 20 '24 02:04 hkfgo

Thank you for your contribution! 🙏 We will review it as soon as possible.

github-actions[bot] avatar Apr 20 '24 02:04 github-actions[bot]

I need some pointers on how to test this :( Specifically, how do you think end-to-end testing should be performed for a major PR like this? I was thinking either integration tests or building a custom branch image and deploying to our test environment. I'm not quite sure how to do either of them though. Some help would be much appreciated!

This should be covered by the testing sweet so no worries!

tomkerkhove avatar Apr 20 '24 12:04 tomkerkhove

I need some pointers on how to test this :( Specifically, how do you think end-to-end testing should be performed for a major PR like this? I was thinking either integration tests or building a custom branch image and deploying to our test environment. I'm not quite sure how to do either of them though. Some help would be much appreciated!

This should be covered by the testing sweet so no worries!

Are these pipeline steps something I can repeatedly trigger on my own? I read the README more carefully and it seems like I can find my branch build under :pr{pr-id}. If I can re-trigger the CI pipeline on my own then that'd definitely cover end-to-end testing!

hkfgo avatar Apr 21 '24 06:04 hkfgo

I need some pointers on how to test this :( Specifically, how do you think end-to-end testing should be performed for a major PR like this? I was thinking either integration tests or building a custom branch image and deploying to our test environment. I'm not quite sure how to do either of them though. Some help would be much appreciated!

This should be covered by the testing sweet so no worries!

Are these pipeline steps something I can repeatedly trigger on my own? I read the README more carefully and it seems like I can find my branch build under :pr{pr-id}. If I can re-trigger the CI pipeline on my own then that'd definitely cover end-to-end testing!

Hey, there are docs how to run things locally here: https://github.com/tomkerkhove/promitor/blob/master/CONTRIBUTING.md#net-development

The CI also runs all these integration tests automatically in case you were wondering

tomkerkhove avatar Apr 26 '24 08:04 tomkerkhove

I see you are actively working on this so I'll hold off on release!

Do let me know if you need to put this on hold

tomkerkhove avatar May 02 '24 07:05 tomkerkhove

Yes I am. Getting close

hkfgo avatar May 02 '24 08:05 hkfgo

Good news! This most recent build appears to be working across all of our resource types. We have a good coverage within Axon, with Redis, Storage, Load Balancer, Disk, Azure SQL, Power Functions, etc.

Let me clean up the code more(I've added so many log statements!!). Meanwhile, do you suggest anything for end-to-end testing?

hkfgo avatar May 04 '24 03:05 hkfgo

Good news!

The automated testing already happens and is reported below, the majority of the checks are passing except for a few 👌

tomkerkhove avatar May 04 '24 05:05 tomkerkhove

Looks like all tests are passing now! The failing CodeFactor was on the giant MetricScraperFactory method to find the matching scraper. There's not much we can do I think.

Do you mind taking another look? Please ignore the log statements and modification to GitHub action for now. We can remove those before pressing the merge button

hkfgo avatar May 09 '24 07:05 hkfgo

Also, any pointers on how to do remote debugging? I've found some online articles on remote debugging with VS Code + .NET + Kubernetes. Probably should have tried that to begin with instead of doing so many print statements..

hkfgo avatar May 09 '24 07:05 hkfgo

Also, any pointers on how to do remote debugging? I've found some online articles on remote debugging with VS Code + .NET + Kubernetes. Probably should have tried that to begin with instead of doing so many print statements..

No, I always use VS to run the container locally and troubleshoot. If a running instance does not provide the insights you need, then we may be missing some logs

tomkerkhove avatar May 16 '24 07:05 tomkerkhove

Also, I believe there should be two quick PRs to

  1. Update documentation
  2. Make useAzureMonitor flag available in the Promitor chart

Be on the look out :)

hkfgo avatar May 17 '24 20:05 hkfgo

Promitor documentation PR: https://github.com/promitor/docs/pull/62 Promitor chart PR: https://github.com/promitor/charts/pull/168

I believe they are dependencies of the next release but not this PR getting merged. I'm making this distinction because I'm waiting on merge to master to rebase and continue batch scraping work. No rush though. Thanks!

hkfgo avatar May 25 '24 05:05 hkfgo