
Multi-bin dataset loading [Feature Request]

Open Coriana opened this issue 2 years ago • 0 comments

Sorry to raise this as an issue, as it's really a request, but are there any plans to support multiple training files? I keep running into memory limits when processing 100 GB (book3.tar.gz) into a single file. Being able to split the data into smaller chunks that are then available for training would really help.
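
To illustrate the kind of thing I mean, here is a rough sketch, not a proposal for the actual implementation. It assumes each shard is a flat file of uint16 token IDs, like the existing train.bin. The `train_*.bin` naming pattern and the `open_shards` / `sample_batch` helpers are made up for this example and are not part of nanoGPT.

```python
import glob
import os

import numpy as np


def open_shards(data_dir, pattern="train_*.bin"):
    """Memory-map every shard matching the (hypothetical) naming pattern."""
    paths = sorted(glob.glob(os.path.join(data_dir, pattern)))
    if not paths:
        raise FileNotFoundError(f"no shards matching {pattern!r} in {data_dir!r}")
    # np.memmap keeps the tokens on disk, so no shard is read fully into RAM.
    return [np.memmap(p, dtype=np.uint16, mode="r") for p in paths]


def sample_batch(shards, batch_size, block_size, rng):
    """Draw (x, y) next-token pairs, picking shards in proportion to their size."""
    # A shard needs at least block_size + 1 tokens to yield one full pair.
    usable = [s for s in shards if len(s) > block_size]
    if not usable:
        raise ValueError("every shard is shorter than block_size + 1 tokens")

    # Weight each shard by its number of valid start positions.
    weights = np.array([len(s) - block_size for s in usable], dtype=np.float64)
    weights /= weights.sum()
    picks = rng.choice(len(usable), size=batch_size, p=weights)

    x = np.empty((batch_size, block_size), dtype=np.int64)
    y = np.empty_like(x)
    for row, k in enumerate(picks):
        shard = usable[k]
        start = rng.integers(0, len(shard) - block_size)  # upper bound is exclusive
        x[row] = shard[start : start + block_size]
        y[row] = shard[start + 1 : start + 1 + block_size]
    return x, y
```

With something like this, each chunk could be tokenized and written out separately, and training would only ever touch the slices it samples. Sequences never cross a shard boundary in this sketch, which is a simplification.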

Thank you and regards.

Coriana, Feb 14 '23 00:02