NeMo-Curator
Make `max_text_bytes_per_part` configurable
Is your feature request related to a problem? Please describe.
#77 adds support for longer strings. As part of that discussion, it makes sense to also expose some advanced config options to users: https://github.com/NVIDIA/NeMo-Curator/pull/77#discussion_r1747727708.
Describe the solution you'd like
Expose these hardcoded values, such as `max_text_bytes_per_part`, as config options that can be tuned if needed.
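
A rough sketch of what this could look like, not the actual NeMo-Curator code: the class name `ShuffleConfig`, the helper `split_by_text_bytes`, and the default value are all hypothetical and only illustrate moving a hardcoded byte cap into a user-facing option.

```python
from dataclasses import dataclass
from typing import List

# Placeholder default for illustration only; this is NOT the value
# currently hardcoded in NeMo-Curator.
DEFAULT_MAX_TEXT_BYTES_PER_PART = 2**31 - 1


@dataclass
class ShuffleConfig:
    """Hypothetical config object exposing the formerly hardcoded cap."""

    max_text_bytes_per_part: int = DEFAULT_MAX_TEXT_BYTES_PER_PART

    def __post_init__(self) -> None:
        # Reject values that could never hold any text.
        if self.max_text_bytes_per_part <= 0:
            raise ValueError("max_text_bytes_per_part must be positive")


def split_by_text_bytes(doc_sizes: List[int], config: ShuffleConfig) -> List[List[int]]:
    """Greedily group document sizes (in bytes) so each part stays under the cap.

    A single document larger than the cap is placed in a part of its own.
    """
    parts: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for size in doc_sizes:
        # Start a new part when adding this document would exceed the cap.
        if current and current_bytes + size > config.max_text_bytes_per_part:
            parts.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        parts.append(current)
    return parts
```

With a cap of 100 bytes, `split_by_text_bytes([40, 50, 30, 80], ShuffleConfig(max_text_bytes_per_part=100))` returns `[[40, 50], [30], [80]]`. Users who never set the option keep the default, so existing behavior would be unchanged.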