Thomas-MMJ
### Describe the bug

Running pytest, if I suppress `test_attention_block_default`, two additional unet tests pass:

```
test_layers_utils.py .................
test_models_unet.py ...............FF..............F.........s....s..

vs

test_layers_utils.py .............s...
test_models_unet.py ...............................F.........s....s..
```

can test by either...
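For reference, a sketch of one way to run the comparison from Python; the node id used for `test_attention_block_default` below is a guess, not the confirmed location of the test, so adjust it to wherever the test actually lives:

```python
import pytest

tests = ["tests/test_layers_utils.py", "tests/test_models_unet.py"]

# Full run, attention-block test included
pytest.main(tests)

# Same selection again with the single attention-block test deselected
# (the class name in this node id is hypothetical)
pytest.main(tests + [
    "--deselect",
    "tests/test_layers_utils.py::AttentionBlockTests::test_attention_block_default",
])
```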
This fixes the PIL.Image Resampling warnings, closing https://github.com/huggingface/diffusers/issues/784
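For context, a minimal sketch of the kind of change involved, assuming these are the warnings Pillow 9.1 started emitting for the top-level resampling constants (the exact call sites in diffusers may differ):

```python
from PIL import Image

img = Image.new("RGB", (512, 512))

# Older spelling that triggered DeprecationWarnings on the Pillow releases current at the time:
# resized = img.resize((256, 256), resample=Image.LANCZOS)

# Enum spelling that avoids the warning:
resized = img.resize((256, 256), resample=Image.Resampling.LANCZOS)
```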
Would LLM.int8() allow this to be trained on low-end cards? https://huggingface.co/blog/hf-bitsandbytes-integration https://github.com/TimDettmers/bitsandbytes
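As a reference point (not an answer): LLM.int8() itself targets inference matmuls, while the bitsandbytes feature usually used to cut training memory is its 8-bit optimizers. A minimal sketch of swapping one in; the model and hyperparameters here are placeholders, not the actual training script:

```python
import torch
import bitsandbytes as bnb

# Placeholder model; in practice this would be the network being fine-tuned.
model = torch.nn.Linear(768, 768).cuda()

# 8-bit Adam keeps the optimizer state in int8, which is where most of the
# memory saving on low-end cards comes from.
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-5)

x = torch.randn(4, 768, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```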
# 🐛 Bug

Numerous tests in test_mem_eff_attention.py fail due to assertion errors; here is the first one:

```
pytest ./tests/test_mem_eff_attention.py::test_backward[cutlass-cuda-torch.float32-1,32,32,1,128,128-False-None-BMHK]
E AssertionError: qkv: out=0.0 and ref=25.226354598999023 (diff=25.177289962768555 > 0)/ atol=0.04654095001184158,...
```
# 🐛 Bug

The parameter static_argnums is passed to memory_efficient_fusion in the following three files:

```
xformers/components/nvfuser/bias_act_dropout.py:           aot_fn = memory_efficient_fusion(_fn, static_argnums=(2, 3))
xformers/components/nvfuser/bias_dropout_res.py:           aot_fn = memory_efficient_fusion(fn=_fn, static_argnums=(2))
xformers/components/nvfuser/bias_dropout_res_layernorm.py: aot_fn =...
```
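For illustration only, a possible workaround sketch, assuming the problem is that newer functorch builds no longer accept static_argnums and that the non-tensor arguments are fixed when the module is built: bind them with functools.partial before fusing. The function body and names below are placeholders, not the actual xformers code or fix.

```python
from functools import partial

import torch
from functorch.compile import memory_efficient_fusion


def _fn(x: torch.Tensor, bias: torch.Tensor, activation, prob: float) -> torch.Tensor:
    # Placeholder for the bias + activation + dropout pattern.
    return torch.nn.functional.dropout(activation(x + bias), p=prob)


# Instead of memory_efficient_fusion(_fn, static_argnums=(2, 3)),
# bind the non-tensor arguments up front so the fused function only sees tensors.
fused = memory_efficient_fusion(partial(_fn, activation=torch.nn.functional.gelu, prob=0.1))

x = torch.randn(8, 128, device="cuda", requires_grad=True)
bias = torch.randn(128, device="cuda", requires_grad=True)
out = fused(x, bias)
```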
# 🐛 Bug

In test_core_attention the tests test_switch_blocksparse_dropout[0.0-True-cuda], test_switch_blocksparse_dropout[0.0-False-cuda], test_switch_blocksparse_dims[cuda], test_switch_blocksparse_dropout[0.3-False-cuda], test_switch_blocksparse[data_type1-cuda], and test_switch_blocksparse_dropout[0.3-True-cuda] all fail. Here is the output:

```
pytest tests/test_core_attention.py
=================================================================== test session starts ====================================================================
platform linux...
```
# 🐛 Bug

The test `pytest tests/test_pytorch_transformer_parity.py::test_pytorch_encoder_parity` fails with the result:

```
# Catch a broken training
> assert fit_ratio_xformer > 120
E assert 85.07492282178951 > 120
```

#...
# 🐛 Bug

Running the test `pytest -v tests/test_swiglu.py`, 168 of the tests (test_forward_backward) fail with:

```
E torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 66.00 MiB...
```
Your calculation of time is based on steps, but you need to multiply steps by the number of gradient accumulations to get the correct ETA estimate. So currently, with accumulation 2 and steps 1000, it...
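In other words, a rough sketch of the intended arithmetic (the variable names here are hypothetical, not the script's actual ones):

```python
# With gradient accumulation, each optimizer step runs `accumulation` forward/backward
# passes, so remaining wall-clock time scales with steps * accumulation.
def eta_seconds(remaining_steps: int, accumulation: int, seconds_per_pass: float) -> float:
    return remaining_steps * accumulation * seconds_per_pass

# Example: 1000 steps left, accumulation 2, ~1.5 s per pass
print(eta_seconds(1000, 2, 1.5))  # 3000.0 s, twice what a steps-only estimate would report
```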
It would be nice to be able to specify that a checkpoint is generated every 100 or 200 (or whatever) steps.
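A minimal sketch of what such an option might look like inside a training loop; the names and the `save_steps` knob below are hypothetical, not existing flags of the script:

```python
import torch

# Hypothetical stand-ins for the real training objects and options.
model = torch.nn.Linear(8, 8)
max_train_steps = 1000
save_steps = 200  # the "every 100 or 200 steps" knob being requested


def train_one_step() -> None:
    # Placeholder for the existing forward/backward/optimizer step.
    pass


for global_step in range(1, max_train_steps + 1):
    train_one_step()
    if global_step % save_steps == 0:
        # Write an intermediate checkpoint alongside the final one.
        torch.save(model.state_dict(), f"checkpoint-{global_step}.pt")
```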