Jiawei Ou

10 comments by Jiawei Ou

What a bummer. I was planning to start using Hydra to manage the config files. There are just so many irritating things around SageMaker; I should just look for an...

I just installed Zed this morning and love the style and speed, but after finding that notebooks are not supported... I have to go back to VSCode.

I am going to try what @tchaton suggested to reload the dataloader each epoch. I am also going to try to increase the on-disk cache by a lot to see...

I figured out what the issue was. In some cases, I accidentally set num_worker=0. Setting num_worker=1 solves the problem.
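A minimal sketch of a guard against that mistake, assuming the value ends up in the `num_workers` argument of a PyTorch-style `DataLoader`. The helper name `resolve_num_workers` is hypothetical and not part of litdata or PyTorch; it only clamps the requested count so it never drops to 0.

```python
def resolve_num_workers(requested, minimum=1):
    """Return a worker count that is never below `minimum`.

    Hypothetical helper: it exists only to illustrate clamping the value
    before it is handed to a data loader.
    """
    return max(int(requested), minimum)


# An accidental 0 is bumped up to 1; larger values pass through unchanged.
print(resolve_num_workers(0))  # 1
print(resolve_num_workers(4))  # 4
```

The result would then be passed along as something like `DataLoader(dataset, num_workers=resolve_num_workers(cfg_value))`, where `cfg_value` stands in for wherever the setting comes from.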

I have observed the same issue for some of my datasets. In one case, over about 4 days, the training time grew from about 50 mins per epoch to almost...

![Screenshot 2024-05-21 at 7 38 55 PM](https://github.com/Lightning-AI/litdata/assets/1028148/4f8b4c59-cfa2-47ce-8f18-8f971f8d9007) Another thing I noticed is that litdata, compared to the streaming dataset from MosaicML, underutilizes memory. The slowdown is potentially coming from heavily...

@tchaton, yes. I misread this issue, which referred to the slowdown in preparing the data. I will file a new issue, but I don't know if I can provide a...

Filed a new issue: https://github.com/Lightning-AI/litdata/issues/138

Looks like the `setup()` method on `NoHeaderTensorSerializer` and `NoHeaderNumpySerializer` wasn't called before `deserialize` was called.

Okay... found a workaround. The problem is that the NumPy array is a 1D array. The fix is to reshape it to a 2D array to create a "header"? 🤯...
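A rough sketch of the reshape described above, using plain NumPy. The comment does not say which 2D shape was used, so the `(1, N)` layout here is an assumption, and the helper name `as_2d` is hypothetical.

```python
import numpy as np


def as_2d(arr):
    """Return `arr` with at least two dimensions.

    A 1D array of length N becomes shape (1, N); this layout is an
    assumption, since the original comment does not specify it.
    """
    arr = np.asarray(arr)
    if arr.ndim == 1:
        return arr.reshape(1, -1)
    return arr


sample = np.arange(6, dtype=np.int32)  # 1D array, shape (6,)
print(as_2d(sample).shape)  # (1, 6)
```

Arrays that already have two or more dimensions are returned unchanged, so the helper can be applied to every sample before it is written out.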