Leo Jiang
# What does this PR do?

- Implement Flash Attention for NPU machines.
- Add a memory-cleaning function for NPU machines.

## Before submitting

- [ ] This PR fixes a typo...
### What does this PR do?

1. Qwen3-VL with FSDP requires transformers > 4.57.1; with older versions a bug occurs inside transformers.
2. Add an NPU patch to Qwen3-VL to increase the...
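For reviewers who want to guard against the version issue in item 1, here is a minimal sketch of a startup check. It is not code from this PR: the helper names (`release_tuple`, `is_newer_than`, `check_transformers`) are hypothetical, and the strict `>` comparison simply mirrors the wording "transformers > 4.57.1" above. It compares only the numeric release segment, so pre-release suffixes such as `.dev0` are ignored.

```python
import re
from importlib import metadata

# Threshold quoted in the PR description; the comparison is strict (">").
MIN_TRANSFORMERS = "4.57.1"


def release_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '4.58.0.dev0' -> (4, 58, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer_than(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """True if `installed` is strictly newer than `minimum` (release segment only)."""
    a, b = release_tuple(installed), release_tuple(minimum)
    # Pad the shorter tuple with zeros so that '4.57' compares as '4.57.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a > b


def check_transformers() -> None:
    """Hypothetical guard: fail early if the installed transformers is too old."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError as exc:
        raise RuntimeError("transformers is not installed") from exc
    if not is_newer_than(installed):
        raise RuntimeError(
            f"Qwen3-VL with FSDP needs transformers > {MIN_TRANSFORMERS}, "
            f"found {installed}"
        )
```

Calling `check_transformers()` once before building the FSDP model would surface the mismatch as a clear error instead of a failure inside transformers.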
# What does this PR do?

With this change, compute time on NPU drops by about 3%-5%.

## Before submitting

- [ ] This PR fixes a typo...
# What does this PR do?

I'm training FLUX.2 img2img on 8 GPUs, using the launch script from examples/dreambooth/README_flux2.md:

```bash
accelerate launch train_dreambooth_lora_flux2_img2img.py \
  --pretrained_model_name_or_path=black-forest-labs/FLUX.2-dev...
```