InfluxData.Net

Functionality to allow waiting for an IBatchWriter to finish

Open Zero3 opened this issue 7 years ago • 7 comments

Would it be possible to add functionality for waiting for an IBatchWriter to finish writing? There are many ways this could be made possible, for example:

  1. A blocking .stop() method.

  2. A .size() method for determining how many unsent entries the IBatchWriter has enqueued.

  3. Some async/await magic.

Or perhaps there is already some way of doing this that I am missing?

Zero3 avatar Apr 03 '17 22:04 Zero3

Hi,

  1. Do you mean .stop() as in pause, and not as in full stop? Because a full stop already exists.

  2. I'll add that method, it's a good idea. But I don't think it will help you with knowing if the previous write finished. It will only tell you if the current _pointCollection is empty or not. You might end up with a state where it popped all the items from the collection, but the current request is still being executed.

You probably already have, but if you haven't, you can check how it works.

There is currently no way to wait for the previous request to finish; requests work in a fire-and-forget way at the moment. I'll think about how what you want could be supported otherwise.
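The gap between "the collection is empty" and "everything has been written" can be illustrated with a minimal sketch. It is written in Python for brevity and is not InfluxData.Net code: `CountingBatchWriter`, `queued()`, `pending()` and `flush_once()` are hypothetical names, and `fake_send` merely stands in for the HTTP request.

```python
import threading
from collections import deque


class CountingBatchWriter:
    """Hypothetical writer that tracks queued AND in-flight points."""

    def __init__(self, send):
        self._send = send              # callable that performs the actual write
        self._lock = threading.Lock()
        self._queue = deque()
        self._in_flight = 0

    def add(self, point):
        with self._lock:
            self._queue.append(point)

    def queued(self):
        # What a naive size() would report: only points not yet popped.
        with self._lock:
            return len(self._queue)

    def pending(self):
        # Points not yet confirmed written: queued plus currently being sent.
        with self._lock:
            return len(self._queue) + self._in_flight

    def flush_once(self):
        # Pop everything, then send outside the lock.
        with self._lock:
            batch = list(self._queue)
            self._queue.clear()
            self._in_flight += len(batch)
        try:
            if batch:
                self._send(batch)
        finally:
            with self._lock:
                self._in_flight -= len(batch)


observed = {}


def fake_send(batch):
    # Runs while the request is "on the wire": the queue is already empty.
    observed["queued"] = writer.queued()
    observed["pending"] = writer.pending()


writer = CountingBatchWriter(fake_send)
for i in range(3):
    writer.add(i)
writer.flush_once()
print(observed)          # {'queued': 0, 'pending': 3}
print(writer.pending())  # 0
```

During the send, `queued()` already reports 0 while `pending()` still reports 3, which is why a plain collection size is not enough to decide that the writer is done.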

tihomir-kit avatar Apr 04 '17 08:04 tihomir-kit

Hi @pootzko

  1. I was thinking along the lines of .blockUntilAllPendingEntriesHaveBeenSentAndThenDontAcceptAnyNewEvents(). So not pausing the writer, but waiting until everything enqueued has been sent, and then returning :).

  2. Arh. My intention was indeed to check the number of events not yet sent (including those currently being sent).

My current use case is a one-off script that imports events into InfluxDB. It does so by parsing a bunch of logfiles with timestamped entries, making events for these, and sending them to InfluxDB using InfluxData.Net. The program terminates after completion, and herein lies the problem, since the enqueued events are lost on termination. If only there was a way to wait for the IBatchWriter to complete, I would just do that.
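A rough sketch of such a "block until everything has been sent, then accept nothing new" method, in Python rather than C#. Everything here is hypothetical (`DrainableWriter`, `stop_and_wait()`, the `send` callable); it only shows the shape of the idea for a single-producer import script and is not how InfluxData.Net works.

```python
import queue
import threading


class DrainableWriter:
    """Hypothetical batch writer with a blocking stop (single producer assumed)."""

    _STOP = object()  # sentinel marking the end of the stream

    def __init__(self, send, batch_size=100):
        self._send = send                  # callable taking a list of points
        self._batch_size = batch_size
        self._queue = queue.Queue()
        self._closed = False
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add(self, point):
        if self._closed:
            raise RuntimeError("writer is closed")
        self._queue.put(point)

    def _run(self):
        while True:
            item = self._queue.get()       # block until something arrives
            if item is self._STOP:
                return
            batch = [item]
            stop_seen = False
            while len(batch) < self._batch_size:
                try:
                    nxt = self._queue.get_nowait()
                except queue.Empty:
                    break
                if nxt is self._STOP:
                    stop_seen = True
                    break
                batch.append(nxt)
            self._send(batch)              # synchronous: returns once written
            if stop_seen:
                return

    def stop_and_wait(self):
        # Refuse new points, then block until the worker has sent
        # everything that was enqueued before the sentinel.
        self._closed = True
        self._queue.put(self._STOP)
        self._worker.join()


sent = []
writer = DrainableWriter(sent.extend, batch_size=4)
for i in range(10):
    writer.add(i)
writer.stop_and_wait()
print(len(sent))  # 10
```

Because the sentinel is enqueued last and the worker sends synchronously, `stop_and_wait()` returns only after every earlier point has been handed to `send`, so the script can exit without losing data.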

Zero3 avatar Apr 09 '17 13:04 Zero3

@Zero3 I had a similar use case to the one you're describing, and eventually I decided to just add a wait (as I know how fast I can shove the measurements into InfluxDB and I have an approximation of how much data I'm shoving). I've recently made some minor adjustments to the IBatchWriter, and at the time I decided against implementing the blocking wait and/or the indicator for how many messages are in the writer. Before my latest changes it wasn't even necessary to check the number of remaining messages, as ALL messages were sent to InfluxDB at once.

Now that I'm reading your request I believe that there is a use case for at least a "Busy" boolean property, but preferably a long property that reflects the number of lines left in the BatchWriter.

If @pootzko agrees, either you or I can add it and submit it as a pull request. It's a trivial change and it could help people who use the BatchWriter for its intended purpose (i.e. writing a boatload of data) but run their processes periodically.

DJFliX avatar Apr 09 '17 13:04 DJFliX

@DJFliX Nice to hear that I'm not the only one.

I don't mind the "busy" boolean, but that would require me to implement a waiting loop around a check on it. So I would personally prefer a method that simply blocks until the writer is done. These ideas are not mutually exclusive, of course.

I don't plan on spending any time contributing code to this project, so it will be up to you guys :).

Zero3 avatar Apr 09 '17 13:04 Zero3

I see your point, it's a good idea. The current batcher is time-based. Perhaps we could add an additional param that tells the batchWriter to wait for the previous write to finish; based on that condition, the WriteBatchedPointsAsync method would internally either work as fire-and-forget or await the previous call. I think that might do the trick.
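The proposed switch could look roughly like the following asyncio sketch. It is an illustration in Python, not the library's implementation: `TimedBatcher`, the `await_previous` parameter and `write_batched_points()` are hypothetical stand-ins for the batch writer, the suggested param and WriteBatchedPointsAsync.

```python
import asyncio


class TimedBatcher:
    """Hypothetical batcher whose tick can optionally await the previous write."""

    def __init__(self, send, await_previous=False):
        self._send = send                    # async callable taking a list of points
        self._await_previous = await_previous
        self._points = []
        self._previous = None                # task for the most recent write

    def add(self, point):
        self._points.append(point)

    async def write_batched_points(self):
        # Called once per interval by the timer loop (not shown).
        if self._await_previous and self._previous is not None:
            await self._previous             # wait for the earlier request
        batch, self._points = self._points, []
        if batch:
            # Fire-and-forget: schedule the write and return immediately.
            self._previous = asyncio.create_task(self._send(batch))


async def demo(await_previous):
    log = []

    async def slow_send(batch):
        log.append(("start", batch[0]))
        await asyncio.sleep(0.01)            # stands in for the HTTP request
        log.append(("end", batch[0]))

    batcher = TimedBatcher(slow_send, await_previous)
    for point in ("a", "b"):
        batcher.add(point)
        await batcher.write_batched_points()
        await asyncio.sleep(0)               # let the scheduled write start
    await asyncio.sleep(0.05)                # let outstanding writes finish
    return log


log_overlap = asyncio.run(demo(await_previous=False))
log_serial = asyncio.run(demo(await_previous=True))
print(log_serial)
```

With `await_previous=True` the second write starts only after the first has ended; with the default, the two writes overlap, which is the current fire-and-forget behaviour described earlier in the thread.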

tihomir-kit avatar Apr 10 '17 11:04 tihomir-kit

Hey there, I would like to bring up this idea again. I am currently working on my bachelor thesis, comparing load performance between different time series solutions. To avoid relying on the _internal statistics, I would need an event fired by the batchWriter when it is done so I can stop the timer. I don't think I would run into the problem mentioned earlier, since the collection is filled way faster than it can be processed. Although, to be fair, that is an assumption.

Or how about a counter that returns the last processed batch index, since I know how big my load is going to be?

MariaCobretti avatar Oct 28 '17 21:10 MariaCobretti

@MariaCobretti - there can't be a "done" event really because it's an infinite loop and there is no end, but an event when it reaches zero could potentially be added. Anyhow, I'm sorry but I don't believe I'll have the time to work on new stuff for the library in the next 2-3 months, so if you really need the functionality, you could just copy over the code for batching points and do something similar in your own code to get the effect you need.
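For anyone copying the batching code into their own project, an "event when it reaches zero" can be sketched as below. This is Python, with hypothetical names throughout (`DrainNotifier`, `enqueued()`, `written()`, `fake_writer_loop`); the sleep stands in for one write request and the numbers are arbitrary.

```python
import threading
import time


class DrainNotifier:
    """Hypothetical helper: counts pending points and signals when none remain."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = 0
        self.idle = threading.Event()
        self.idle.set()                  # nothing pending at start

    def enqueued(self, count=1):
        with self._lock:
            self._pending += count
            self.idle.clear()

    def written(self, count):
        # Call this only after the write request has completed.
        with self._lock:
            self._pending -= count
            if self._pending == 0:
                self.idle.set()


notifier = DrainNotifier()
total = 1000
notifier.enqueued(total)                 # the whole load is known up front


def fake_writer_loop():
    remaining = total
    while remaining:
        batch = min(250, remaining)
        time.sleep(0.01)                 # stands in for one write request
        notifier.written(batch)
        remaining -= batch


started = time.perf_counter()
threading.Thread(target=fake_writer_loop, daemon=True).start()
finished = notifier.idle.wait(timeout=5)  # True once pending hits zero
elapsed = time.perf_counter() - started
print(finished, f"{elapsed:.3f}s")
```

The signal is only meaningful as "the load is done" if the count cannot hit zero halfway through, for example because the whole load is enqueued before the first write completes; otherwise it fires each time the writer momentarily catches up.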

tihomir-kit avatar Nov 09 '17 21:11 tihomir-kit