Jianwei Li
Results: 2 issues
To save GPU memory, I want to load the multilingual model in 4-bit mode. My code is as follows:

```python
import torch
from transformers import AutoTokenizer
from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration
...
```
I have trained several LoRA models using the Flux model, and I want to switch between these LoRA models dynamically without reloading the base Flux model. I saw in #1185...