Fartash Faghri comments

Results: 6 comments by Fartash Faghri

Unable to reproduce the accuracy of WRN-28-10 on Cifar-100

The problem is probably with using self.dropout the same way for both train and eval. Typically people use F.dropout in the forward function and pass self.training as an argument. I...
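The point about passing the training flag to dropout can be illustrated with a minimal, dependency-free sketch. It is a stand-in for the PyTorch pattern `F.dropout(x, p, training=self.training)`, not the code from the issue: `dropout`, `TinyBlock`, and the default rate are hypothetical names and values chosen for illustration.

```python
import random


def dropout(values, p, training, rng=random):
    """Inverted dropout on a list of floats.

    In training mode each value is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p). In eval mode the input is
    returned unchanged.
    """
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]


class TinyBlock:
    """Hypothetical block that mimics a module with a `training` flag."""

    def __init__(self, p=0.3):
        self.p = p
        self.training = True  # modules start in training mode

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def forward(self, values):
        # The flag is passed explicitly, so eval mode disables dropout.
        return dropout(values, self.p, training=self.training)
```

Calling `TinyBlock().eval().forward(xs)` returns `xs` unchanged, while the same call in training mode randomly zeroes entries and rescales the rest.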

Instructions to download from Hugging Face Hub

MobileCLIP is now supported in timm and OpenCLIP, with models on Hugging Face. Thanks to all the maintainers and the support team for facilitating this. Please see the updated README in this repo...

Memory leak at the end of an epoch of training with OpenCLIP on a WebDataset

Thanks @tmbdev for the suggestions. Even without `wds.shuffle`, the memory leak is unchanged. Is there a memory tracing tool you'd recommend?

Memory leak at the end of an epoch of training with OpenCLIP on a WebDataset

I made a small script that reproduces the leak. I tried `tracemalloc` and `memray` but they don't show what's taking up the memory. One can monitor the memory usage going...

Memory leak at the end of an epoch of training with OpenCLIP on a WebDataset

I investigated this a bit more. Here are some observations:

- The leak in the above code and similarly in [OpenCLIP](https://github.com/mlfoundations/open_clip/blob/fc5a37b72d705f760ebbc7915b84729816ed471f/src/open_clip_train/data.py#L328) happens because in each epoch a new iterator is...
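To make the per-epoch iterator observation concrete, here is a small sketch that only counts how often an iterator is requested from a loader. It assumes the truncated sentence refers to a fresh iterator being created each epoch; `CountingLoader` and both loop functions are hypothetical and do not use WebDataset or OpenCLIP code. It shows the access pattern, not the leak itself.

```python
import itertools


class CountingLoader:
    """Hypothetical loader that records how many iterators were created."""

    def __init__(self, num_batches=None):
        # num_batches=None models an endless (resampled) stream.
        self.num_batches = num_batches
        self.iterators_created = 0

    def __iter__(self):
        self.iterators_created += 1
        if self.num_batches is None:
            return itertools.count()
        return iter(range(self.num_batches))


def run_fresh_iterator_per_epoch(loader, epochs):
    """A plain `for` loop asks the loader for a new iterator every epoch."""
    seen = 0
    for _ in range(epochs):
        for _batch in loader:
            seen += 1
    return seen


def run_single_iterator(loader, epochs, steps_per_epoch):
    """Create one iterator up front and slice a fixed number of steps per epoch."""
    stream = iter(loader)
    seen = 0
    for _ in range(epochs):
        for _batch in itertools.islice(stream, steps_per_epoch):
            seen += 1
    return seen
```

With 3 epochs of 4 batches, the first loop creates 3 iterators and the second creates 1, while both process 12 batches. Whether a single long-lived iterator is practical depends on the data pipeline being an endless stream.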

Memory leak at the end of an epoch of training with OpenCLIP on a WebDataset

The simplest workaround in OpenCLIP is to reduce the number of epochs so that the dataloader doesn't reset too frequently. Specifically for OpenCLIP, adjust `--train-num-samples` (number of samples in dataset)...
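The arithmetic behind this workaround can be sketched as follows, assuming an epoch's length is set by `--train-num-samples` and the number of epochs by `--epochs`, so that total samples seen is their product. `rescale_schedule` and the numbers are hypothetical, and other epoch-based settings (such as checkpoint or evaluation frequency) may also need adjusting.

```python
def rescale_schedule(train_num_samples, epochs, factor):
    """Return (samples_per_epoch, epochs) with `factor` times fewer epochs.

    The total number of samples seen stays the same, so the dataloader
    is reset `factor` times less often over the whole run.
    """
    if factor < 1 or epochs % factor != 0:
        raise ValueError("factor must be a positive divisor of epochs")
    return train_num_samples * factor, epochs // factor


# Example with made-up numbers: 32 epochs of 1000 samples each
# becomes 8 epochs of 4000 samples each.
new_samples, new_epochs = rescale_schedule(1000, 32, 4)
```

Here the run still covers 32,000 samples in total, but the dataloader iterator is recreated 8 times instead of 32.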