elastic
[Feature] Make `estimateSizeInBytes(r BulkableRequest)` externally accessible
To be able to batch bulk loads to AWS, you need to keep an eye on the size of the bulk request and keep it under the limits AWS sets.
While you can get the size of the bulk request after you have added items, it would be better to also get the size of each request before it is added to the BulkService, so you can judge whether adding the next one will take you over the limit.
If the internal function *BulkService.estimateSizeInBytes were made available outside the BulkService object, it would be easy to tell the size of each request.
Of course, we might instead want to add an EstimatedSizeInBytes method to each request type.
Which version of Elastic are you using?
[ x ] elastic.v5 (for Elasticsearch 5.x)
// estimateSizeInBytes returns the estimated size of the given
// bulkable request, i.e. BulkIndexRequest, BulkUpdateRequest, and
// BulkDeleteRequest.
func (s *BulkService) estimateSizeInBytes(r BulkableRequest) int64 {
	lines, _ := r.Source()
	size := 0
	for _, line := range lines {
		// +1 for the \n
		size += len(line) + 1
	}
	return int64(size)
}
I see what you're after. I'm a bit hesitant to open it up because using it without thinking twice may result in a performance problem. That estimateSizeInBytes is quite slow, you know.
Let me let that sink in. We don't need to rush because there's an easy workaround: you can create a helper for that function in your own code, since it uses no internals.
One idea is to have each bulk request action compute its own size on creation, so the bulk service could just tally that metadata. Each action could then have an accessor for that metadata.
That could definitely work. Will have to tinker with that idea.
@ArcticSnowman I finally found some time to tinker around with this issue. Do you mind reviewing the bulk.estimated-size.issue-931 branch? It currently only has this commit.
Certainly looks ok to me...
Go question... Each of the BulkableRequest subtypes has the same estimatedSize int64 field. Is there a reason not to put that into the BulkableRequest interface? Or can an interface only contain functions?
An interface type is defined as a set of method signatures.
https://tour.golang.org/methods/9
Thanks for clarifying...