
Documentation of the buffer_size parameter in MaxTokenBucketizer is wrong

Open ling0322 opened this issue 2 years ago • 1 comment

According to the documentation for MaxTokenBucketizer: "buffer_size – This restricts how many tokens are taken from prior DataPipe to bucketize"

However, in the code (bucketbatcher.py#L277), the unit of buffer_size is samples, not tokens.
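To illustrate the difference between the two units, here is a minimal sketch in plain Python. It is not the torchdata implementation: `fill_buffer_by_samples` and `fill_buffer_by_tokens` are hypothetical helpers made up for this example, and the token count of a sample is assumed to be `len(sample)`.

```python
from typing import List, Sequence


def fill_buffer_by_samples(samples: Sequence[List[str]], buffer_size: int) -> List[List[str]]:
    """Hypothetical helper: stop once the buffer holds `buffer_size` samples."""
    buffer: List[List[str]] = []
    for sample in samples:
        if len(buffer) == buffer_size:
            break
        buffer.append(sample)
    return buffer


def fill_buffer_by_tokens(samples: Sequence[List[str]], buffer_size: int) -> List[List[str]]:
    """Hypothetical helper: stop before the buffer exceeds `buffer_size` tokens."""
    buffer: List[List[str]] = []
    total_tokens = 0
    for sample in samples:
        if total_tokens + len(sample) > buffer_size:
            break
        buffer.append(sample)
        total_tokens += len(sample)
    return buffer


# Four samples with 3, 5, 2 and 4 tokens respectively.
samples = [["a"] * 3, ["b"] * 5, ["c"] * 2, ["d"] * 4]

by_samples = fill_buffer_by_samples(samples, buffer_size=3)
by_tokens = fill_buffer_by_tokens(samples, buffer_size=3)

print(len(by_samples), sum(len(s) for s in by_samples))
print(len(by_tokens), sum(len(s) for s in by_tokens))
```

With the same `buffer_size=3`, the sample-based reading buffers three samples (ten tokens in total), while the token-based reading described in the docstring would buffer only the first sample.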

ling0322, Oct 14 '22 08:10

Thanks for reporting it. Feel free to open a PR to fix the inline doc.

ejguan, Oct 14 '22 12:10