telegraf icon indicating copy to clipboard operation
telegraf copied to clipboard

feat(inputs/modbus): optimise grouped requests

Open TimurDela opened this issue 2 years ago • 15 comments

Simple modification to introduce an optimise_requests flag in the configuration setup that, when set to true

  • minimises the number of requests when a large number of fields is omitted
  • removes the trailing omitted fields from every request so the request is not longer than needed

As a result, the total number of request is generally lower, also improving performance

Required for all PRs:

  • [x] Updated associated README.md.
  • [x] Wrote appropriate unit tests.
  • [x] Pull request title or commits are in conventional commit format
  • adds a new field optimise_requests in requestDefinition
  • when true, groupFieldsToRequests switches to groupFieldsToRequestsOptimised that
    • always starts requests with non-omitted fields
    • shortens the requests if their last fields are flagged as omitted

This results in fewer and shorter requests when the user specifies many (several hundred) fields, many of which are omitted, hence increasing performance.

TimurDela avatar May 17 '22 06:05 TimurDela

Hi @srebhan, I was not aware that some devices required the address 0 to be queried. However, my problem is that I have a device with around 1300 signals. My team is only interested in ~50 of them scattered over the entire address range. Sitting down to try to find the most efficient way to group my queries, taking into account the maximum size of one query (which depends on the data types) is a lot of work that would need to be repeated every time I am asked to add or remove a signal. I would prefer to simply provide the entire list of addresses to the Telegraf config file and set the few signals I am interested in to omit=false.

If I use the code I pushed, my configuration leads to 7 requests. If I use your suggested code and throw out only the empty requests, I end up with 21 requests, which becomes incompatible with how often I need to log the data.

Could we think of a way to have both behaviours and let the users decide which one is more appropriate for their needs?

TimurDela avatar May 21 '22 04:05 TimurDela

@TimurDela how do you get 21 requests? In my suggestions I'm dropping completely empty requests, so I cannot think of any configuration where 50 registers in 1300 entries cause 21 requests. Can you please provide an example?

The only change I request to your code is to not drop leading omits in requests. I can see that this makes sense, but it would need an additional config option for the user to explicitly enable it. I'm open to discuss this.

My suggestion is to split this PR. Please move fixing the bug with not omitting the first field to an own PR and keep the "empty-filtering" in this PR. Does that make sense to you?

srebhan avatar May 23 '22 08:05 srebhan

For the holding registerType, maxBatchSize is 125. Most of my entries are float32 and take 2 addresses each, so that is maximum 62 fields per request. 1300 / 62 = 21. But In fact when I follow your suggestion I get 12 requests which is better than 21 but still suboptimal.

I will split this pull request in two as you ask. Please have a look at #11202 . How should I proceed with this pull request? Wait for you to merge #11202 and rebase this one or add new changes directly taking the chance of having merge conflicts?

TimurDela avatar May 27 '22 05:05 TimurDela

My suggestion of the further procedure would be to first get #11202 ready. In the meantime, I want to ask you to fix the "omit first field" bug, maybe in this PR or you split it out to another.

If you furthermore, need the above optimization, you should do it in this PR. Please guard the "new" behavior with a flag (e.g. optimize_heading_omits or similar) or the other way, add a workaround option, e.g. use_strict_omit.

How does that sound to you?

srebhan avatar May 27 '22 09:05 srebhan

Hello @srebhan , is there something else you want me to modify in this pull request?

TimurDela avatar Jun 07 '22 05:06 TimurDela

@TimurDela after thinking a bit longer on your request I realized that there are solutions closer to the optimum and came up with #11273. Can you give it a try so we can discuss the further proceeding? Your test is integrated and the result matches your expectations. However, you can optimize further by setting optimization = "rearrange" in the request which tries to find optimal request boundaries wrt the total request size.

Looking forward to your comments.

srebhan avatar Jun 09 '22 07:06 srebhan

Thank you @srebhan I will have a look at it as soon as I can but it may take a few days.

TimurDela avatar Jun 09 '22 07:06 TimurDela

I have left a few fmt.Printf here and there for comparing with #11273 but of course they are meant to be removed

TimurDela avatar Jun 18 '22 06:06 TimurDela

Can you please use #11273 as a basis for your implementation by adding another strategy there? I intentionally kept my PR open for additional strategies to make it extensible.

srebhan avatar Jun 22 '22 13:06 srebhan

Just to be clear, do you want me to rebase this pull request on top of srebhan:modbus or do you want me to add my changes into pull request #11273 ?

TimurDela avatar Jun 22 '22 14:06 TimurDela

I want to add your optimization scheme on-top of #11273. Give it another name (e.g. optimization = "max_insert) and use the groups concept to implement it. Does that make sense to you?

srebhan avatar Jun 27 '22 11:06 srebhan

@TimurDela I've marked this ready for review per your request in, but note that CI is failing, so take a look at those failures please. Thanks!

powersj avatar Sep 14 '22 14:09 powersj

I am sorry but I do not understand the error message from the failing job. Locally, when I run make lint-branch none of the messages concern files in plugins/inputs/modbus. I am unsure how to proceed if I can't find where the linting "error" is.

TimurDela avatar Sep 15 '22 11:09 TimurDela

fwiw the windows CI issue I think had a fix in master, so you may need to update your branch

powersj avatar Sep 20 '22 15:09 powersj

Hello @srebhan, any chance you take a look at this PR?

TimurDela avatar Sep 27 '22 05:09 TimurDela

Hello @srebhan ,

did you have time to look at this: https://github.com/influxdata/telegraf/pull/11106#discussion_r982052477 ?

TimurDela avatar Oct 24 '22 05:10 TimurDela

@TimurDela sorry for the late reply!

srebhan avatar Nov 14 '22 16:11 srebhan

@TimurDela not that #11273 is merged, can you please rebase this PR on latest master and add your optimization?!?

srebhan avatar Nov 14 '22 19:11 srebhan

Thank you @srebhan. This PR is now rebased and the outputs have been moved to Init() and are only active with the -debug flag.

Tests fail for the Windows test suite but the error messages mention plugins I have not modified (google_cloud_storage and statsd) so I am not sure what to do about it.

TimurDela avatar Nov 15 '22 12:11 TimurDela

Thank you @srebhan I have now implemented your suggestions and cleaned up the test file

TimurDela avatar Nov 16 '22 08:11 TimurDela

Hello @srebhan I just solved new conflicts with master but I would really appreciate if we could merge this PR before the next change on modbus so I don't have to do that again.

TimurDela avatar Nov 23 '22 07:11 TimurDela

Thank you @srebhan for the very useful feedback. I am really happy this is going to be on the main branch.

TimurDela avatar Nov 30 '22 11:11 TimurDela