Julia Turc

11 comments by Julia Turc

Thanks @BenjaminBossan for looking into this.

> Let's first focus on the GPU usage:

What looks strange to me is that in your first graph, it continually increases, but then...

Here's a minimal script that reproduces the issue. Note that I'm setting `num_inference_steps=1` for the sake of speed, so the absolute numbers will not be comparable to the ones above....
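The script itself is truncated in this excerpt. As an illustrative sketch only (the `fake_step` stub is a placeholder standing in for the real "load LoRA, run the pipeline with `num_inference_steps=1`, unload" call, which is not shown above), a per-iteration memory harness of this shape can produce the kind of numbers being compared:

```python
import resource


def measure_per_iteration(step_fn, iterations):
    """Run step_fn repeatedly, recording peak RSS after each call.

    ru_maxrss is the process's high-water mark (KiB on Linux), so a healthy
    run flattens out, while a leak keeps pushing the peak upward.
    """
    readings = []
    for i in range(iterations):
        step_fn(i)
        readings.append(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    return readings


# Placeholder for the actual pipeline call in the repro script.
def fake_step(i):
    _ = [0] * 10_000  # transient allocation, freed on return


readings = measure_per_iteration(fake_step, 5)
print(readings)
```

Note that `resource` is Unix-only; on the actual GPU issue, `torch.cuda.memory_allocated()` would be the more relevant probe.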

Hi there, just wanted to check in and see if there are any new developments.

Sorry, I didn't understand your plot above. Is it suggesting that the memory consumption is actually constant around 700? And what is the unit of the Y axis / memory...

I managed to trace where the memory leak is coming from: it's the `pipe.load_lora_weights` method. As a reminder, the high-level algorithm here is:

```
for n in num_loras:
    load lora...
```

Here's a full repro in a Colab notebook (just added some print statements to the code snippet above): https://colab.research.google.com/drive/1u-DTQFZHGSiR-287CS3ELUYOC6jfhMqU?usp=sharing You can see that, on each iteration, *two* LoRAs are loaded...
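The mechanism behind "two LoRAs loaded per iteration" can be illustrated with a toy registry; this is a hypothetical sketch of the leak pattern, not the actual diffusers internals (`LeakyLoraRegistry` and its methods are invented names for illustration):

```python
class LeakyLoraRegistry:
    """Illustrative only: if each load registers weights under a fresh
    adapter name, old entries stay referenced and are never freed."""

    def __init__(self):
        self.adapters = {}
        self._counter = 0

    def load(self, weights):
        # A new key on every call means previous adapters accumulate,
        # so memory grows linearly with the number of loads.
        name = f"default_{self._counter}"
        self._counter += 1
        self.adapters[name] = weights

    def unload_all(self):
        self.adapters.clear()


reg = LeakyLoraRegistry()
for _ in range(3):
    reg.load([0.0] * 4)
print(len(reg.adapters))  # one retained adapter per load
```

In this toy model, the fix corresponds to reusing (or clearing) the adapter entry on each iteration instead of minting a new key.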

Thanks so much for the fix and sorry for the delayed response. I will try it out in the next day or two.

Thanks again @BenjaminBossan for the PR! I've rerun the notebook above with a higher number of MAX_LORAS. The good news is that, indeed, I'm not seeing 2 LoRAs being loaded...

With this fix, I'm seeing that memory is no longer going up linearly. There's a spike on each call, but overall memory consumption is uniform across calls. That's great! As mentioned...
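The "spike per call, flat across calls" shape can also be checked programmatically. A stdlib sketch using `tracemalloc`, with `transient_step` as an assumed stand-in for the pipeline call (not the actual repro code):

```python
import tracemalloc


def transient_step():
    # Allocates during the call (the spike), frees before returning.
    buf = [0.0] * 50_000
    return len(buf)


tracemalloc.start()
baselines = []
for _ in range(3):
    transient_step()
    current, _peak = tracemalloc.get_traced_memory()
    baselines.append(current)  # post-call memory, after the spike subsides
tracemalloc.stop()

# Healthy behavior: post-call readings stay roughly constant across
# iterations; a leak would make each baseline higher than the last.
print(baselines)
```

For the GPU-side equivalent, the same comparison would be done on `torch.cuda.memory_allocated()` readings taken between calls.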