Ross Wightman

Results: 497 comments by Ross Wightman

@penfever it should not revert to sampling with replacement; it wraps. I am not aware of a sane way to achieve what you want with WDS and deal with the...
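A toy illustration of the distinction (plain Python, not actual WDS internals): wrapping cycles through the fixed shard list so every shard is seen once per pass, while sampling with replacement draws independently and can repeat shards before others are ever seen.

```python
import itertools
import random

shards = ["shard-0000.tar", "shard-0001.tar", "shard-0002.tar"]

# Wrapping: exhaust the list, then start over from the beginning.
wrapped = itertools.cycle(shards)
print([next(wrapped) for _ in range(5)])
# -> shards 0, 1, 2, then 0, 1 again

# Sampling with replacement: independent draws, duplicates possible
# before every shard has been seen.
print([random.choice(shards) for _ in range(5)])
```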

@zerovl thanks, couldn't this logic be placed in a dataset wrapper so we don't have to repeat the train loop and incur more long-term maintenance? Either one that covers both...
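Roughly the kind of wrapper meant here (a sketch; the names are hypothetical): the extra per-sample logic lives in a dataset wrapper, so the existing train loop is reused unchanged.

```python
from torch.utils.data import Dataset

class WrappedDataset(Dataset):
    """Hypothetical wrapper: applies extra logic around an existing
    dataset so the shared train loop doesn't have to be duplicated."""

    def __init__(self, base: Dataset, extra_fn):
        self.base = base
        self.extra_fn = extra_fn  # logic that would otherwise fork the train loop

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        return self.extra_fn(self.base[idx])
```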

@zerovl thanks for updating this and your other PR, I'll try to find some time to take a closer look next week.

could make it an arg so it can be changed for different setups; that's common. Measuring time in the loop and having a varying # of steps per log is pretty...
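For example, a minimal sketch of a configurable log interval with timing measured at a fixed step count (the flag name is hypothetical):

```python
import argparse
import time

parser = argparse.ArgumentParser()
parser.add_argument("--log-every-n-steps", type=int, default=100)
args = parser.parse_args()

last = time.monotonic()
for step in range(1, 1001):      # stand-in for the train loader
    time.sleep(0.001)            # stand-in for forward/backward
    if step % args.log_every_n_steps == 0:
        now = time.monotonic()
        print(f"step {step}: {args.log_every_n_steps / (now - last):.1f} it/s")
        last = now
```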

I can't speak to the training runs for the graph as I didn't do them, @mitchellnw would have a better idea... but it looks like it could be a shuffling issue...

CsvDataset should be shuffled every epoch, so pre-shuffling isn't really relevant. Might be worth checking that https://github.com/mlfoundations/open_clip/blob/d9ee4aa0431b9ea5c99c8617e9b3e0f3f12c458f/src/training/train.py#L62 `set_epoch` is def being called in the distributed case...
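For reference, the pattern being checked for (standard PyTorch `DistributedSampler` usage, not the open_clip code itself): without `set_epoch`, the sampler reuses the same shuffle seed, so each rank sees an identical sample order every epoch.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(8))
# num_replicas/rank passed explicitly so this runs without init_process_group
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True)
loader = DataLoader(dataset, sampler=sampler, batch_size=2)

for epoch in range(3):
    sampler.set_epoch(epoch)  # reseeds the shuffle; skip this and the order repeats
    for (batch,) in loader:
        pass
```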

@mitchellnw I've noticed that the scale param has an interesting relationship with the LR/loss, I wonder if it's almost behaving in a slightly oscillatory control-systems fashion. The scale is strongly...
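One common guard against the scale running away (a sketch of the CLIP-style clamp, typically applied after each optimizer step):

```python
import math
import torch

# Learnable temperature as in CLIP: logits = logit_scale.exp() * similarity
logit_scale = torch.nn.Parameter(torch.ones([]) * math.log(1 / 0.07))

# Clamp so exp(logit_scale) stays within [1, 100], limiting how far the
# scale/LR/loss feedback can push it in any one step.
with torch.no_grad():
    logit_scale.clamp_(0, math.log(100))
```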

@Zasder3 that could be useful, although as far as pretrained is concerned, the LiT paper suggests it's much more useful to use pretrained + (optionally) frozen vision backbones while keeping...
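A minimal sketch of that LiT-style setup, assuming the open_clip layout where the image tower is `model.visual`:

```python
import torch.nn as nn

def lock_image_tower(model: nn.Module) -> None:
    """Freeze the pretrained image tower so only the text tower trains."""
    for param in model.visual.parameters():
        param.requires_grad = False
    model.visual.eval()  # keep norm layers in inference mode too
```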

Some additional thoughts on this...

* `TimmModel` and related imports should move from open_clip/model.py to open_clip/timm_model.py
* create a huggingface_model.py with maybe a `HuggingFaceTransformer` (or `HfTextTransformer`?, sketched below). This adapter module will...
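A sketch of what such an adapter might look like (class name and pooling choice are placeholders from the discussion, not a finalized design):

```python
import torch.nn as nn
from transformers import AutoModel

class HfTextTransformer(nn.Module):
    """Wraps a Hugging Face text model behind the same text-tower
    interface as the built-in transformer."""

    def __init__(self, model_name: str, embed_dim: int):
        super().__init__()
        self.transformer = AutoModel.from_pretrained(model_name)
        width = self.transformer.config.hidden_size
        self.proj = nn.Linear(width, embed_dim, bias=False)

    def forward(self, input_ids, attention_mask=None):
        out = self.transformer(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # CLS-token pooling as one option
        return self.proj(pooled)
```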

Oh yeah, and there was something annoying getting in the way of the text tower encapsulation... Some of the logic and params for the text tower are directly in the...