Ayush Mangal

Results 19 comments of Ayush Mangal

Is this thread still active? I also can't seem to download the weights :cry:

Yes @lucasgris, I am using accelerate and have played around with `num_workers`. Even in the graph you shared, the utilization hits very low points (

@borrero-c thanks for looking into this. I didn't observe anything changing after 1 epoch; it stays low for me. Also, the `accelerate.backward()` call might be taking time since its...

Added PR #37 to avoid using Flash Attention.

Okay, I've got all the needed code into two files and used existing diffusers primitives in some easy-to-catch places. Now I will work on integrating it in the...

### Attention!

Seems like we can use the diffusers `Attention` class directly, but we need to add a new processor to support RoPE embeds on selective heads, as in F5.
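To make the "RoPE on selective heads" idea concrete, here is a rough pure-Python sketch of just the rotation logic, not an actual diffusers processor. The names `rotate_pairs` and `apply_rope_to_heads` are hypothetical, and the adjacent-pair layout and base of 10000 are assumptions that may not match what F5 does.

```python
import math

def rotate_pairs(vec, position, base=10000.0):
    """Apply a rotary position embedding to one head vector of even length.

    Assumption: adjacent pairs (vec[i], vec[i + 1]) are rotated together.
    """
    dim = len(vec)
    out = list(vec)
    for i in range(0, dim, 2):
        # Pair i // 2 gets frequency base ** (-i / dim).
        angle = position * base ** (-i / dim)
        c, s = math.cos(angle), math.sin(angle)
        out[i] = vec[i] * c - vec[i + 1] * s
        out[i + 1] = vec[i] * s + vec[i + 1] * c
    return out

def apply_rope_to_heads(x, rope_heads):
    """x[position][head] is a vector; rotate only the heads in rope_heads."""
    return [
        [rotate_pairs(h, pos) if idx in rope_heads else list(h)
         for idx, h in enumerate(heads)]
        for pos, heads in enumerate(x)
    ]

# Two positions, two heads, head dimension 2; only head 0 gets RoPE.
x = [[[1.0, 0.0], [1.0, 0.0]],
     [[1.0, 0.0], [1.0, 0.0]]]
rotated = apply_rope_to_heads(x, rope_heads={0})
```

In a real processor the same selection would presumably happen on the query and key tensors after they are split into heads, before the attention scores are computed.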

### Tokenization

F5 uses a character-level tokenizer for the text, so we might want to write a simple tokenizer class for it. Might just be fine to keep it in a...
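For reference, a simple character-level tokenizer could look something like the sketch below. `CharTokenizer` and the `<unk>` token are hypothetical names for illustration; the vocabulary here is built from sample texts and is not F5's actual vocab file.

```python
class CharTokenizer:
    """Minimal character-level tokenizer (illustrative, not F5's own)."""

    def __init__(self, texts, unk_token="<unk>"):
        # Collect every distinct character and sort for a stable id order.
        chars = sorted({ch for text in texts for ch in text})
        self.unk_token = unk_token
        # Id 0 is reserved for unknown characters.
        self.id_to_token = [unk_token] + chars
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}

    def encode(self, text):
        unk = self.token_to_id[self.unk_token]
        return [self.token_to_id.get(ch, unk) for ch in text]

    def decode(self, ids):
        return "".join(self.id_to_token[i] for i in ids)

tok = CharTokenizer(["hello"])
ids = tok.encode("hello")
```

Characters outside the vocabulary map to id 0, so `decode` will not round-trip them.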

### Tests

Basic structure looks good now, let's add some tests, and then make it more diffusers friendly! Adding tests would also force me to follow the structure more strongly...

### Flow matching/Schedulers

Will also need to use one of the schedulers from diffusers. I think they use a simple Euler method only, but the sway sampling step needs to be...
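As a rough illustration of how Euler steps and sway sampling fit together, here is a small pure-Python sketch. It assumes sway sampling warps uniform times as `t = u + s * (cos(pi/2 * u) - 1 + u)`, which is my understanding of the F5-TTS formula, with `s = -1` as an assumed default. The names `sway_timesteps` and `euler_integrate` are hypothetical and not part of diffusers.

```python
import math

def sway_timesteps(num_steps, s=-1.0):
    """Return num_steps + 1 time points in [0, 1], warped by sway sampling.

    Assumed formula: t = u + s * (cos(pi / 2 * u) - 1 + u), u uniform in [0, 1].
    With s < 0 the points are denser near t = 0; s = 0 keeps them uniform.
    """
    ts = []
    for i in range(num_steps + 1):
        u = i / num_steps
        ts.append(u + s * (math.cos(math.pi / 2 * u) - 1 + u))
    return ts

def euler_integrate(velocity, x0, ts):
    """Integrate dx/dt = velocity(x, t) with explicit Euler over the grid ts."""
    x = x0
    for t_cur, t_next in zip(ts, ts[1:]):
        x = x + (t_next - t_cur) * velocity(x, t_cur)
    return x

ts = sway_timesteps(8)
# A constant velocity field stands in for the model's prediction here.
x_final = euler_integrate(lambda x, t: 2.0, 0.0, ts)
```

Since the warp only changes the time grid, one option is to pass the warped times to an existing Euler-style scheduler instead of writing a new one, if the scheduler accepts custom timesteps.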