vmware_exporter
vmware_exporter copied to clipboard
Change serialization of alarms and sensors for objects
This fixes the root cause of 'Fix for badly behaving super-micro sensor'
This is a breaking change to the return value of helpers.batch_fetch_properties.
I would like a little more about this change in the description and why the changes were made. This overall is some great changes, but I want to make sure I understand it all.
So there are a few changes in there ( they likely should have been seperate MR but my boss needed a solition asap ).
- We had a badly behaving supermicro server that reported badly. I honestly dont even remember but I think it crashed the collector. This was the origional reason for digging into this project. :sweat_smile:
- Instead of packing alarms/metrics into string and parsing them I am using objects. This removes some of the string issues and makes things easier to use moving forwards.
- The best practice at my job is to have one issue per alert. The alert metrics ( like
vmware_host_yellow_alarms) have been adjusted to show one issue per metric. They also have the s on the end removed to reflect this change. Merging of issues should be left to higher level abstractions like alertmanager. - While diving into the code I discovered that you could pass as a parameter any vmware URL. An attacker could then try to steal your credentials by directing an existing profile to a malicious server. If you want to enable the old exploitable behavior there is a flag since someone is likely relying on this behavior.
- The timeout stuff was experimental and so far has not caused issues. It did mitigate a slow vmsphere from locking up the whole process and making it unresponsive. There is still some kind of edge case where vmware could drop the connection and hang the process. Thus the systemd changes.
I wrote this up by looking over my commits. Sorry if I missed some things it was over a year ago. Feel free to ask for more details on things. It would be awesome to get this merged in.
I would like a little more about this change in the description and why the changes were made. This overall is some great changes, but I want to make sure I understand it all.
So there are a few changes in there ( they likely should have been seperate MR but my boss needed a solition asap ).
1. We had a badly behaving supermicro server that reported badly. I honestly dont even remember but I think it crashed the collector. This was the origional reason for digging into this project. 😅 2. Instead of packing alarms/metrics into string and parsing them I am using objects. This removes some of the string issues and makes things easier to use moving forwards. 3. The best practice at my job is to have one issue per alert. The alert metrics ( like `vmware_host_yellow_alarms` ) have been adjusted to show one _issue_ per metric. They also have the _s_ on the end removed to reflect this change. Merging of issues should be left to higher level abstractions like alertmanager. 4. While diving into the code I discovered that you could pass as a parameter any vmware URL. An attacker could then try to steal your credentials by directing an existing profile to a malicious server. If you want to enable the old exploitable behavior there is a flag since someone is likely relying on this behavior. 5. The timeout stuff was experimental and so far has not caused issues. It did mitigate a slow vmsphere from locking up the whole process and making it unresponsive. There is still some kind of edge case where vmware could drop the connection and hang the process. Thus the systemd changes.I wrote this up by looking over my commits. Sorry if I missed some things it was over a year ago. Feel free to ask for more details on things. It would be awesome to get this merged in.
- I think we might want to remove the allow target url param and make it so that the url is in the configuration rather then using any input. Let me know your thoughts on this?
If I understand correctly; You are proposing dropping vsphere_host and target entirely. If that were the case then the only way to select an instance would be via /metrics?section=some_string. That would definately fix the security issue.
I can likely get this done over the weekend.
@pryorda The url parameters vsphere_host and target have been removed. I have also updated the readme and merged in all the recent changes so it should be easy for you to move forward here.