Ross Wightman
@1220deeplearner See #8
@lucidrains I've also been doing some not very scientific comparisons (restart train with same seed) and seeing what happens in the case of one network (a vit-cnn hybrid), one...
@lucidrains so, two network archs now, running through the variations, all zeros with no special case init definitely appears to be the winner in these tests of limited scope. Hmm...
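A minimal sketch of the protocol being described, assuming the parameter under test is something like a ViT-style position embedding (the thread doesn't show which layers were actually varied): restart from the identical seed and change only the init.

```python
# Hedged sketch: same seed every run, only the init of the parameter under
# test changes. The parameter shown (a position embedding) is an assumption,
# not the thread's exact target.
import torch
import torch.nn as nn

torch.manual_seed(42)  # identical seed so runs differ only in the init

# Variant A: "all zeros with no special case init"
pos_embed = nn.Parameter(torch.zeros(1, 197, 768))

# Variant B: the common truncated-normal alternative
# nn.init.trunc_normal_(pos_embed, std=0.02)
```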
Does it crash with OOM at some point? If it doesn't, it isn't a leak, just aggressive caching or a custom allocator that doesn't like to give memory back (not...
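One way to tell the two apart (a sketch, not anyone's exact test; `suspect_op` is a hypothetical stand-in for the operation under suspicion): loop it and watch resident memory. A plateau points at caching; unbounded growth toward OOM points at a real leak.

```python
import os
import psutil  # pip install psutil

proc = psutil.Process(os.getpid())

def suspect_op():
    ...  # hypothetical stand-in for the operation being tested

for i in range(1_000):
    suspect_op()
    if i % 100 == 0:
        rss = proc.memory_info().rss / 1024 ** 2
        print(f"iter {i}: rss = {rss:.1f} MiB")  # plateau => caching; growth => leak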
@lhoestq that does indeed increase in memory, but if you iterate over the array again after the first time, or re-open and remap the same file (repeat `table = memory_mapped_arrow_table_from_file(ARROW_PATH)`) before...
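For reference, a pyarrow sketch of what `memory_mapped_arrow_table_from_file` amounts to (an assumption reconstructed from the name, not the thread's exact code): the table is backed by mapped pages, so the first pass faults pages in and RSS climbs, while a second pass or a re-map touches the same pages and adds nothing.

```python
import pyarrow as pa

def memory_mapped_arrow_table_from_file(path):
    # map the file instead of reading it onto the heap
    source = pa.memory_map(path)
    return pa.ipc.open_stream(source).read_all()

table = memory_mapped_arrow_table_from_file("data.arrow")  # stand-in for ARROW_PATH
for _ in table.to_batches():  # first pass: pages fault in, RSS climbs
    pass
for _ in table.to_batches():  # second pass: same pages, RSS stays flat
    pass
```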
@stas00 my point was, I'm not convinced @lhoestq's last example illustrates the leak, but rather the differences between memory-mapped and in-memory usage patterns. If you destroy arr, memory...
FWIW, I revisited some code I had in the works to use HF datasets w/ timm train & val scripts. There is no leak there across multiple epochs. It uses...
@stas00 my 2 cents from having looked at a LOT of memory leaks over the years, esp in Python: a 0.3% memory increase over that many iterations of something is difficult...
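One way to separate genuine Python-level allocation from allocator noise at that scale (a sketch, assuming the loop body is pure Python; `suspect_op` is hypothetical): diff tracemalloc snapshots across many iterations.

```python
import tracemalloc

def suspect_op():
    ...  # hypothetical stand-in for the loop body being measured

tracemalloc.start()
before = tracemalloc.take_snapshot()
for _ in range(10_000):
    suspect_op()
after = tracemalloc.take_snapshot()

# a genuine Python leak shows steady growth attributed to a specific line;
# allocator fragmentation and caching do not
for stat in after.compare_to(before, "lineno")[:10]:
    print(stat)
```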
@stas00 if you aren't using memory maps, you should be able to clearly see the increase in the virtual mem for the process as well. Even then, it could still...
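E.g. with psutil (a sketch; which counter matters depends on the allocator in play), comparing the two counters makes the distinction visible:

```python
import os
import psutil

m = psutil.Process(os.getpid()).memory_info()
print(f"rss = {m.rss / 1024**2:.1f} MiB")  # resident: includes touched mmap pages
print(f"vms = {m.vms / 1024**2:.1f} MiB")  # virtual: a true heap leak grows this too
```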
Please see https://github.com/mlfoundations/open_clip/pull/93 ... there is definitely interest in doing this, but it's not so straightforward to get good results