nanoGPT
Multi-bin dataset loading [Feature Request]
Sorry to raise this as an issue, since it's really a request, but are there any plans to support multiple training files? I keep running into memory limits when processing 100 GB (book3.tar.gz) into a single file. Being able to split the data into smaller chunks that are then available for training would really help.
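To make the request concrete, here is a rough sketch of the kind of loader I have in mind. It is not existing nanoGPT code. It assumes each shard is a flat file of `uint16` token ids (the same layout as the current single `train.bin`), and the `train_*.bin` naming pattern and the `open_shards` and `get_batch` helpers are hypothetical names used only for illustration.

```python
import glob
import os

import numpy as np


def open_shards(data_dir, pattern="train_*.bin", dtype=np.uint16):
    """Memory-map every shard in data_dir that matches the (hypothetical) pattern."""
    paths = sorted(glob.glob(os.path.join(data_dir, pattern)))
    if not paths:
        raise FileNotFoundError(f"no shards matching {pattern!r} in {data_dir!r}")
    # np.memmap maps the file lazily, so opening many shards costs little RAM.
    return [np.memmap(p, dtype=dtype, mode="r") for p in paths]


def get_batch(shards, batch_size, block_size, rng):
    """Sample (x, y) next-token pairs, picking shards in proportion to their size."""
    # A shard needs at least block_size + 1 tokens to yield one (x, y) pair.
    usable = [s for s in shards if len(s) > block_size]
    if not usable:
        raise ValueError("every shard is shorter than block_size + 1 tokens")

    # Weight each shard by its number of valid start offsets, so that a token
    # window is equally likely to be drawn regardless of which shard holds it.
    weights = np.array([len(s) - block_size for s in usable], dtype=np.float64)
    shard_ids = rng.choice(len(usable), size=batch_size, p=weights / weights.sum())

    x = np.empty((batch_size, block_size), dtype=np.int64)
    y = np.empty((batch_size, block_size), dtype=np.int64)
    for row, sid in enumerate(shard_ids):
        data = usable[sid]
        start = rng.integers(0, len(data) - block_size)  # upper bound is exclusive
        x[row] = data[start : start + block_size]
        y[row] = data[start + 1 : start + 1 + block_size]  # targets shifted by one
    return x, y
```

Usage would look roughly like `shards = open_shards("data/mydataset")` followed by `x, y = get_batch(shards, 12, 1024, np.random.default_rng(0))`. Sequences never cross a shard boundary in this sketch, which seems acceptable if the shards are large relative to `block_size`.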
Thank you and regards.