influxdb-client-csharp

Out Of Memory when writing lots of datapoints

Open CordisSSegers opened this issue 2 years ago • 9 comments

Hi, we have an application that crashes with an OutOfMemory exception.

We are having trouble when writing data to Influx with the "WritePoint" method. We are writing ~2000 datapoints per 10 ms per thread (there are 2 threads writing data at this point in time). While these datapoints are being written, memory keeps piling up by 10 MB per minute and never goes down until we stop writing for a short while or eventually hit an OutOfMemory. Setting the FlushInterval higher seems to mitigate the problem, but memory still increases (albeit at a slower rate).
Using only one thread for writing data also mitigates the problem, but it then takes 3 days to run out of memory.

We are using:

  • Client version: 4.0.0.0
  • InfluxDB version: 2.1.0
  • Platform: Windows 10 (different systems with different specs)
  • FlushIntervals used: 10, 500, 1000, 2000

Would it be possible to improve the API so that it notices when it cannot handle the load?

[Image: InfluxOutOfMemory]

CordisSSegers, Mar 31 '22 11:03

Hi @CordisSSegers,

thanks for using our client.

Can you share a little bit more about the code that ingests data into InfluxDB? Does the InfluxDB server have sufficient hardware resources? The problem may also be caused by slow responses from the server.

Here is a document about hardware sizing: https://docs.influxdata.com/influxdb/v1.8/guides/hardware_sizing/#influxdb-oss-guidelines.

Regards

bednar, Mar 31 '22 11:03

Hi Bednar,

I've added two code examples which roughly reflect how we send data to Influx.

[Images: InfluxOutOfMemoryCodeExample1, InfluxOutOfMemoryCodeExample2]

We have deployed InfluxDB on several systems: three systems with 6 cores and 16 GB RAM, and one system with 8 cores and 16 GB RAM.

CordisSSegers, Mar 31 '22 12:03

Just after a quick look, I think you have everything set up right. For importing large amounts of data it would be better to set FlushInterval to a large value and set BatchSize to something like 10000. For example:

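// Assumes `Client` is an existing InfluxDBClient (namespace InfluxDB.Client).
var writeOptions = WriteOptions
    .CreateNew()
    .BatchSize(50000)       // number of points collected per batch
    .FlushInterval(10000)   // milliseconds before a partial batch is written
    .Build();

// `using` disposes the WriteApi when it goes out of scope.
using var writeApi = Client.GetWriteApi(writeOptions);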

These settings will put less stress on the server side.

bednar, Mar 31 '22 14:03

Hi team, great work so far! I appreciate the API.

I have a question regarding the writeApi. Maybe it's not related to this topic, but it could lead to memory issues if the API is not used correctly. I'm pushing a lot of points, more than 200 points every second. I'm not sending the timestamp because I would like the points to have the server timestamp, so I think I can't manually batch the points. In terms of performance and best practices, is it better to keep the writeApi cached in a field and dispose of it manually when we dispose the main class, or is it fine to call using var writeApi = Client.GetWriteApi(writeOptions); for every single push?

Thank you for your help

aboccag, Jun 29 '22 16:06

Hi @broadside74,

the writeApi should be cached in a field and disposed when your application ends.
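A minimal sketch of that caching pattern follows. It assumes the 4.x parameter order (point first, then bucket and org); the class name MetricsWriter, its members, and the field name "value" are hypothetical and only serve as illustration.

```csharp
using System;
using InfluxDB.Client;
using InfluxDB.Client.Writes;

// Hypothetical wrapper: creates the WriteApi once and reuses it for every write.
public sealed class MetricsWriter : IDisposable
{
    private readonly WriteApi _writeApi;
    private readonly string _bucket;
    private readonly string _org;

    // The InfluxDBClient is owned by the caller and is not disposed here.
    public MetricsWriter(InfluxDBClient client, WriteOptions options, string bucket, string org)
    {
        _writeApi = client.GetWriteApi(options);
        _bucket = bucket;
        _org = org;
    }

    public void Write(string measurement, double value)
    {
        // No timestamp is set, so the server assigns one on arrival.
        var point = PointData.Measurement(measurement).Field("value", value);
        _writeApi.WritePoint(point, _bucket, _org);
    }

    // Call once, when the application shuts down.
    public void Dispose() => _writeApi.Dispose();
}
```

Creating one MetricsWriter at startup and disposing it at shutdown avoids building a new WriteApi for every push.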

For your use case (~200 points per second) you can use WriteApiAsync and write all points in one batch with WritePointsAsync.

For more info see:

  • https://docs.influxdata.com/influxdb/v1.8/guides/hardware_sizing/#influxdb-oss-guidelines
  • https://influxdata.github.io/influxdb-client-csharp/api/InfluxDB.Client.WriteApiAsync.html#methods

Regards

bednar, Jun 30 '22 05:06

Thank you @bednar

One clarification: I'm using InfluxDB server version 1.8.

The system producing the points can have its clock desynchronized, so I do not use the system timestamp but the server timestamp (Timestamp = null). I must know "exactly" when each point was produced. The machine cycle time is 300 ms, so a batch flush time must not exceed 300 ms, right?

So which solution is better? Should I:

Use InfluxDBClient.GetWriteApiAsync() (which does not support WriteOptions) and

  • call WriteMeasurementAsync<TM>(TM) for every single point, or
  • build my own batching with a window shorter than 300 ms and call WriteMeasurementsAsync<TM>(TM[])

Or use InfluxDBClient.GetWriteApi(_influxWriteOptions) with a FlushInterval of less than 300 ms and

  • call WriteMeasurement<TM>(TM) for every single point, or
  • build my own batching with a window shorter than 300 ms and call WriteMeasurements<TM>(TM[])

aboccag, Jun 30 '22 09:06

@broadside74 For your use case it will be better to use InfluxDBClient.GetWriteApiAsync() together with WriteMeasurementsAsync(TM[]). Calling WriteMeasurementAsync(TM) for every point could be too stressful for the server.

bednar, Jun 30 '22 12:06

Thank you, but it is said that the async version does not support batching: "For writing data we use WriteApi or WriteApiAsync which is simplified version of WriteApi without batching support."

aboccag, Jun 30 '22 13:06

"I'm pushing a lot of points, more than 200 points every second. I'm not sending the timestamp because I would like the points to have the server timestamp, so I think I can't manually batch the points."

A rate of ~200 points per second is a suitable write rate for the server. You can simply use WriteMeasurementsAsync(TM[]) with 200 measurements.
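If you want to collect the measurements yourself before each such call, one possible approach is a small time-bounded batcher. The sketch below is not part of the client library: TimedBatcher, its parameters, and the writeBatch delegate are hypothetical names, and the delegate stands in for whatever write call you use. It relies only on System.Threading.Channels and assumes .NET Core 3.0 or later with C# 8.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical helper: groups items into batches that are handed to a
// caller-supplied delegate once they are full or a deadline has passed.
public sealed class TimedBatcher<T> : IAsyncDisposable
{
    private readonly Channel<T> _queue;
    private readonly Func<IReadOnlyList<T>, Task> _writeBatch;
    private readonly int _maxBatchSize;
    private readonly TimeSpan _maxDelay;
    private readonly Task _worker;

    public TimedBatcher(Func<IReadOnlyList<T>, Task> writeBatch,
                        int maxBatchSize, TimeSpan maxDelay, int capacity)
    {
        _writeBatch = writeBatch;
        _maxBatchSize = maxBatchSize;
        _maxDelay = maxDelay;
        // Bounded queue: when it is full, AddAsync waits instead of letting
        // pending items accumulate in memory without limit.
        _queue = Channel.CreateBounded<T>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = true
        });
        _worker = Task.Run(RunAsync);
    }

    public ValueTask AddAsync(T item) => _queue.Writer.WriteAsync(item);

    private async Task RunAsync()
    {
        var reader = _queue.Reader;
        var batch = new List<T>(_maxBatchSize);
        while (await reader.WaitToReadAsync())
        {
            // The deadline starts once the first item of a batch is available.
            using var deadline = new CancellationTokenSource(_maxDelay);
            try
            {
                while (batch.Count < _maxBatchSize)
                    batch.Add(await reader.ReadAsync(deadline.Token));
            }
            catch (OperationCanceledException) { /* deadline reached */ }
            catch (ChannelClosedException) { /* no more items will arrive */ }

            if (batch.Count > 0)
            {
                // An exception thrown here stops the worker and resurfaces
                // from DisposeAsync.
                await _writeBatch(batch.ToArray());
                batch.Clear();
            }
        }
    }

    // Stops accepting items and waits until the remaining ones are written.
    public async ValueTask DisposeAsync()
    {
        _queue.Writer.Complete();
        await _worker;
    }
}
```

For the 300 ms cycle discussed above, the batcher could be created with a maxDelay somewhat below 300 ms and a maxBatchSize of around 200, with the delegate forwarding each array to WriteMeasurementsAsync. The bounded capacity is a design choice: a slow server then slows the producer down rather than increasing memory use.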

bednar, Jun 30 '22 19:06