Miguel Angel Alba Acosta

Results: 8 comments by Miguel Angel Alba Acosta

@travellingsasa You can use some sort of anti-text or placeholder text to do multi-label classification, e.g. if your objective is checking whether "red" is present in an image...
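One possible reading of the anti-text idea, as a minimal sketch: for each label, compare the image against the label prompt and against a placeholder prompt, then take a softmax over just those two similarities. Everything here is an assumption for illustration: the function name `presence_probability`, the toy 3-dimensional embeddings, and the temperature of 100 (standing in for CLIP's learned logit scale). In practice the vectors would come from CLIP's image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def presence_probability(image_emb, label_emb, placeholder_emb, temperature=100.0):
    """Two-way softmax: label prompt vs. placeholder (anti-text) prompt.

    `temperature` is an assumed scale, not a value taken from the thread.
    """
    s_label = temperature * cosine(image_emb, label_emb)
    s_placeholder = temperature * cosine(image_emb, placeholder_emb)
    # Subtract the max before exponentiating for numerical stability.
    m = max(s_label, s_placeholder)
    e_label = math.exp(s_label - m)
    e_placeholder = math.exp(s_placeholder - m)
    return e_label / (e_label + e_placeholder)


# Hypothetical toy embeddings (real ones would come from the CLIP encoders).
image = [0.9, 0.1, 0.0]        # image embedding
red = [1.0, 0.0, 0.0]          # text embedding for a prompt mentioning "red"
placeholder = [0.0, 1.0, 0.0]  # text embedding for a neutral placeholder prompt

p_red = presence_probability(image, red, placeholder)
is_red = p_red > 0.5  # each label gets its own independent yes/no decision
```

Running this once per label gives independent per-label decisions, which is what makes the setup multi-label rather than a single softmax over all classes.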

Did you try training with the standard CLIP loss?

I did some research on CPU memory leaks, and people say most of the time memory leaks appear when tensors are accumulated without being detached (as they carry with them...
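A minimal sketch of the accumulation pattern described above, assuming PyTorch. The tiny linear model, the random data, and the list names `leaky_history` and `safe_history` are hypothetical and only there to show the difference between storing a loss tensor as-is and storing a detached copy.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy model and optimizer, just to produce a loss with a graph.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

leaky_history = []  # stores tensors that still reference the autograd graph
safe_history = []   # stores detached tensors with no graph attached

for step in range(3):
    x = torch.randn(8, 4)
    y = torch.randn(8, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Problematic: each stored loss keeps its grad_fn (and the graph behind it) alive.
    leaky_history.append(loss)

    # Safer: detach() returns a tensor that is cut off from the graph.
    # loss.item() would work as well if a plain Python float is enough.
    safe_history.append(loss.detach())

mean_loss = torch.stack(safe_history).mean().item()
```

The same applies to any tensor collected for logging or metrics across steps: detach it (or convert it to a Python number) before appending it to a long-lived container.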

I'm trying to fine-tune the HF implementation of the 336px model (so 440M params) with LoRA (4M additional parameters are trained, the rest are frozen). I'm training using Lightning Fabric as it...

Same utilisation issue here. I train on GCP using 4 nodes (n1-highmem-16), each with 2 V100 GPUs. ![image](https://github.com/mosaicml/streaming/assets/34891351/8d857edf-f7c7-417e-bf3c-7bbbbc24c716) The first 2 nodes are busy at 98-99% utilisation...

> @miguelalba96 In the past we've seen that treating GCSFuse as "local" can be slow. Have you tried treating it as remote, or moving your data to local disk? I...

I'm experiencing similar issues when loading image/text pairs (local). The RAM usage keeps increasing non-stop (GPU is stable). I managed to partially "solve" it by decreasing the number of workers...

I tested again, training for a longer period. My implementation: ```python from ast import literal_eval import torch from PIL import Image from streaming import StreamingDataset, StreamingDataLoader import utils.visual_attribution # None: super().__init__(...