Possibly exceed MAX_SIZE_BYTES in Log Analytics Ingestion
Describe the bug We identified that the 30 MB limit on data ingestion can be exceeded when using the Logstash data connector ~~with "amount_resizing = false" settings.~~ (Added: this can also happen when document sizes vary widely.)
The 30 MB limit is defined at [1], but it is used in only one function [2]. When amount_resizing=false, the code at [2] is skipped by the check at [3]. Subsequent posts therefore ignore the 30 MB limit and can exceed it.
[1] https://github.com/Azure/Azure-Sentinel/blob/1ba90605ba41f662829b51c049ffd655859888c9/DataConnectors/microsoft-logstash-output-azure-loganalytics/lib/logstash/logAnalyticsClient/logstashLoganalyticsConfiguration.rb#L15
[2] https://github.com/Azure/Azure-Sentinel/blob/1ba90605ba41f662829b51c049ffd655859888c9/DataConnectors/microsoft-logstash-output-azure-loganalytics/lib/logstash/logAnalyticsClient/logStashAutoResizeBuffer.rb#L100
[3] https://github.com/Azure/Azure-Sentinel/blob/1ba90605ba41f662829b51c049ffd655859888c9/DataConnectors/microsoft-logstash-output-azure-loganalytics/lib/logstash/logAnalyticsClient/logStashAutoResizeBuffer.rb#L43
Expected behavior When the data size exceeds 30 MB, which is a fixed limit of Log Analytics (Azure Monitor), the request should be split into multiple requests. Ideally, the splitting should happen recursively until a batch contains only a single document.
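The recursive splitting described above could look roughly like the following. This is only a minimal sketch and not the connector's code: `split_to_fit` and `MAX_SIZE_BYTES_ASSUMED` are hypothetical names, and the sketch assumes the request body is simply the JSON serialization of the document array.

```ruby
require 'json'

# Assumed value for illustration; the plugin defines its own limit constant.
MAX_SIZE_BYTES_ASSUMED = 30 * 1024 * 1024

# Hypothetical helper: recursively halves the batch until each serialized
# batch fits within the limit, or until a batch holds a single document.
def split_to_fit(documents, limit = MAX_SIZE_BYTES_ASSUMED, &block)
  return if documents.empty?

  body_size = documents.to_json.bytesize
  if body_size <= limit || documents.length == 1
    # A single oversized document cannot be split further, so it is passed
    # through with its size and the caller decides whether to drop or log it.
    block.call(documents, body_size)
    return
  end

  mid = documents.length / 2
  split_to_fit(documents[0...mid], limit, &block)
  split_to_fit(documents[mid..-1], limit, &block)
end
```

A caller would pass a block that posts each batch, for example `split_to_fit(docs) { |batch, size| post(batch) }`, where `post` stands in for whatever sends the request.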
Screenshots N/A
Desktop (please complete the following information): N/A
Smartphone (please complete the following information): N/A
Thank you for submitting an Issue to the Azure Sentinel GitHub repo! You should expect an initial response to your Issue from the team within 5 business days. Note that this response may be delayed during holiday periods. For urgent, production-affecting issues please raise a support ticket via the Azure Portal.
Let me add one more note.
Because the new buffer size in change_message_limit_size is calculated based on the average document size, it can also overflow the 30 MB limit.
https://github.com/Azure/Azure-Sentinel/blob/1ba90605ba41f662829b51c049ffd655859888c9/DataConnectors/microsoft-logstash-output-azure-loganalytics/lib/logstash/logAnalyticsClient/logStashAutoResizeBuffer.rb#L102
Here is an example. Say we have one 20 MB document and 999 documents of 1 KB each. The average document size is then about 21.48 KB.
When @logstashLoganalyticsConfiguration.max_items == 1000, the condition on this line is true, because the left side is about 21.48 MB and the right side is about 30 MB:
https://github.com/Azure/Azure-Sentinel/blob/1ba90605ba41f662829b51c049ffd655859888c9/DataConnectors/microsoft-logstash-output-azure-loganalytics/lib/logstash/logAnalyticsClient/logStashAutoResizeBuffer.rb#L108
The next line then sets new_buffer_size to 2000. But if the following batch contains just two 20 MB documents (40 MB in total), it exceeds the API limit.
This shows that the average-based calculation assumes that almost all documents are about the same size, which does not hold in our use case.
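To make the arithmetic concrete, here is a small sketch of the reasoning. It is a simplified model of the check as described in this comment, not the plugin's actual code: `next_buffer_size` is a hypothetical function, the 30 MB limit is assumed to be 30 * 1024 * 1024 bytes, and the sizes are illustrative (one 20 MB document plus 999 documents of 1 KB).

```ruby
KIB = 1024
MIB = 1024 * 1024

# Hypothetical model: double the item window when the average document size
# times the current window still looks like it fits under the byte limit.
def next_buffer_size(sizes, max_items, limit_bytes)
  average = sizes.sum / sizes.length   # integer average in bytes
  projected = average * max_items      # the "left side" of the check
  projected < limit_bytes ? max_items * 2 : max_items
end

limit_bytes = 30 * MIB                 # assumed API limit
sizes = [20 * MIB] + [1 * KIB] * 999   # one large document, many small ones

new_size = next_buffer_size(sizes, 1000, limit_bytes)
puts "new buffer size: #{new_size}"

# The average hides the outlier: two large documents alone already exceed
# the limit, although the window now allows up to new_size items.
worst_case_bytes = 2 * 20 * MIB
puts "two large documents exceed the limit: #{worst_case_bytes > limit_bytes}"
```

The item-count window says nothing about bytes, so a byte-size check on the actual batch is still needed before posting.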
Hi @mitsuo0114, thank you for flagging this. Apologies for the delayed response. If you still need assistance, please reply here within 5 business days.
Since we have not received a response in the last 5 days, we are closing your issue #5353 as per our standard operating procedures. If you still need support for this issue, feel free to re-open at any time. Thank you for your co-operation.