Bagheera

447 comments of Bagheera

those are pretty minimal, and e.g. it doesn't implement cosmap/logit-norm or any of the SD3 training details; it's just about the same as the cloneofsimo/minRF implementation. in fact it's basically identical -...
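for context, logit-norm refers to the SD3 paper's logit-normal timestep sampling: logit(t) is drawn from a normal, so t = sigmoid(n). a minimal sketch of what that looks like (the m/s defaults here are illustrative, not pulled from any particular trainer):

```python
import torch

# logit-normal timestep sampling per the SD3 paper: logit(t) ~ Normal(m, s),
# so t = sigmoid(n) with n drawn from a normal. m=0.0, s=1.0 are illustrative
# defaults, not values from any specific implementation.
def sample_logit_normal_t(batch_size: int, m: float = 0.0, s: float = 1.0) -> torch.Tensor:
    n = torch.randn(batch_size) * s + m
    return torch.sigmoid(n)  # timesteps in (0, 1), denser around 0.5
```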

you don't need an A100 for flux. imo kohya should release sooner rather than keep trying to add a million features. you can train on 16G VRAM without any quantisation at...

it's not like that at all though. fp8 is fine, especially in pytorch 2.4. you can read back through the comments in this issue to see.
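for the curious, the usual fp8 scheme is storage-only: weights live in float8 and get upcast just-in-time for compute. a minimal sketch (`torch.float8_e4m3fn` is a real pytorch dtype in recent releases; the rest is illustrative, not any specific library's API):

```python
import torch

# store a weight in fp8 to roughly halve memory vs bf16, then upcast
# at compute time. plain matmul doesn't run on fp8 tensors directly,
# hence the dequantize-before-matmul step.
w = torch.randn(4096, 4096, dtype=torch.bfloat16)
w_fp8 = w.to(torch.float8_e4m3fn)          # ~half the memory of bf16

x = torch.randn(1, 4096, dtype=torch.bfloat16)
y = x @ w_fp8.to(torch.bfloat16).T          # upcast just-in-time for the matmul
```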

also, NF4 is definitely not "on par with Pro" 🤪

mps has correctness issues and can't be relied on for training a model. MLX and Tinygrad, however, don't rely on MPS and produce correct results. i've never seen good...

the problem is probably an overflow inside pytorch's MPS code that has yet to be discovered. if you go to the pytorch issue tracker and search for `label:mps is:open` you...
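a quick way to check whether a given op is affected is to compare MPS against CPU on identical inputs (a sketch; softmax and the tolerances are just examples, not tied to the issue above):

```python
import torch

# run the same op on MPS and CPU; a large discrepancy on identical
# inputs points at an MPS correctness bug worth reporting upstream.
if torch.backends.mps.is_available():
    x = torch.randn(1024, 1024)
    cpu_out = torch.nn.functional.softmax(x * 100.0, dim=-1)
    mps_out = torch.nn.functional.softmax((x * 100.0).to("mps"), dim=-1).cpu()
    torch.testing.assert_close(mps_out, cpu_out, rtol=1e-4, atol=1e-4)
```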

actually the MLX project created an example trainer and it outperforms pretty much anything we can currently do in pytorch. i think this can be closed in favour of...

i updated it to use file extensions instead of str_pattern to search. it still works. @williamzhuk can you follow up with the other changes?
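roughly what the extension-based search looks like, as a sketch (the extension set and helper name are illustrative, not the repo's actual code):

```python
from pathlib import Path

# match files by suffix instead of a regex/str pattern: simpler and
# less error-prone. the extension list is illustrative.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}

def find_images(root: str) -> list[Path]:
    return sorted(
        p for p in Path(root).rglob("*")
        if p.suffix.lower() in IMAGE_EXTENSIONS
    )
```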

everything except atomic rename should work. in fact, we had atomic rename in the past. but it can also happen that the instance is stopped **in the middle of...

probably `--delete_bad_checkpoints` would be better? it would account for filesystems where rename isn't atomic-on-crash (basically everywhere):

> This trick doesn't work. People seem to think that this is safe becaus...
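for reference, the crash-safe version of the rename trick needs fsync on both the temp file and its directory, which is exactly what the quoted caveat is about. a POSIX-only sketch, not the project's actual code:

```python
import os

# write-to-temp-then-rename, with the fsyncs that make it crash-safe:
# fsync the temp file before os.replace, then fsync the directory so
# the rename itself is durable. without these, a crash can still leave
# a truncated or missing checkpoint behind.
def atomic_write(path: str, data: bytes) -> None:
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())        # persist the temp file's contents
    os.replace(tmp, path)           # atomic rename on POSIX
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
    try:
        os.fsync(dir_fd)            # persist the directory entry itself
    finally:
        os.close(dir_fd)
```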