Sebastian Raschka

818 comments by Sebastian Raschka

I think that's currently not possible, but I'm in favor of adding it as a dataset config, maybe defaulting to `num_workers="auto"`. Since you have much more experience with...
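
Roughly what I have in mind, as a sketch with hypothetical names (`DataConfig` and `resolved_num_workers` are placeholders, not the actual litgpt config):

```python
import os
from dataclasses import dataclass
from typing import Literal, Union


# Hypothetical sketch, not litgpt's actual API: a dataset config where
# num_workers="auto" resolves to the machine's CPU count.
@dataclass
class DataConfig:
    num_workers: Union[int, Literal["auto"]] = "auto"

    def resolved_num_workers(self) -> int:
        if self.num_workers == "auto":
            # Fall back to the number of available CPU cores
            return os.cpu_count() or 1
        return self.num_workers
```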

@ebektas Does the issue occur when you use `litgpt pretrain ...`, or is it an issue you encounter when preparing the dataset, e.g., `python litgpt/data/prepare_slimpajama.py ...`?

I am mainly asking because it looks like we already expose the number of workers for the pretraining itself:

```
⚡ main ~/litgpt litgpt pretrain --data.help litgpt.data.TinyLlama
usage: litgpt [--data.init_args.data_path...
```

> It looks like it is taking issue with two lines in test_association_rules.py. It is formatting just those two lines rather than the entire block. We could go back and...

Thanks for reporting. Yes, this is weird. My first thought would also be that it's something with the KV cache. I assume the maximum number of new tokens is the same,...

Good observation, this could be related. We can debug this by first making these two consistent.
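
To illustrate what I mean by making these two consistent, here's a rough sketch of that kind of check (the names `max_new_tokens` and `kv_cache_length` are placeholders, not the actual litgpt internals):

```python
# Hypothetical consistency check between the generation budget and the
# preallocated KV cache; all argument names here are placeholders.
def check_generation_setup(
    prompt_length: int, max_new_tokens: int, kv_cache_length: int
) -> None:
    max_returned_tokens = prompt_length + max_new_tokens
    if max_returned_tokens > kv_cache_length:
        raise ValueError(
            f"Generation may need {max_returned_tokens} positions, but the "
            f"KV cache only holds {kv_cache_length}; outputs past the cache "
            "boundary can silently degrade."
        )
```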

This reminds me, at some point we discussed an option like `optimize="compute" | "memory"` for less advanced users, and this could be a good trade-off to expose through such a setting.
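
To sketch what I mean (the settings on the right are illustrative, not existing litgpt flags):

```python
from typing import Literal


# Hypothetical mapping from one user-facing switch to lower-level
# settings; the dictionary keys are illustrative placeholders.
def resolve_optimize(optimize: Literal["compute", "memory"]) -> dict:
    if optimize == "compute":
        # Favor speed: larger micro-batches, no activation checkpointing
        return {"micro_batch_size": 4, "activation_checkpointing": False}
    # Favor memory: smaller micro-batches, recompute activations instead
    return {"micro_batch_size": 1, "activation_checkpointing": True}
```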

The reason auto-download sits so far down there (compared to the other places) is that I had to move it below the LoRA merging, because otherwise it will download...

I also didn't notice the missing CPU warning because I was running it on GPU 😅. The following reorg might work ...
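
For illustration, a rough sketch of that kind of reorg (all helper names below are hypothetical stand-ins, not the actual functions):

```python
# Hypothetical ordering sketch; every helper here is a placeholder.
def warn_cpu_is_slow() -> None:
    print("Warning: running on CPU will be very slow.")


def merge_lora_if_needed(checkpoint_dir: str) -> None:
    ...  # merge LoRA weights into the base model if a LoRA checkpoint exists


def auto_download_if_missing(checkpoint_dir: str) -> None:
    ...  # download the checkpoint only if it still doesn't exist locally


def setup(checkpoint_dir: str, device_type: str) -> None:
    # 1) Surface the CPU warning early, before any long-running step
    if device_type == "cpu":
        warn_cpu_is_slow()
    # 2) Merge LoRA first, since it may already produce the needed weights
    merge_lora_if_needed(checkpoint_dir)
    # 3) Keep auto-download below the merge so we don't fetch weights
    #    the merge step would have produced anyway
    auto_download_if_missing(checkpoint_dir)
```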

I think this should be good now. Feel free to merge if you agree.