Vladimir Albrekht

Results: 12 comments by Vladimir Albrekht

Yeah, I want to do it today. If it works, I will respond here. > Maybe you can try training using v2 and see if you still face the same...

This model works in Google Colab but not locally. I have a 1060, so the graphics card might be the problem.

@LDLINGLINGLING Thanks for the response. So it's not possible to connect the MiniCPM vision part with the 70B quantized model, for example? Because previously I connected it to an 8B fine-tuned version of the LLM part...

Thanks. We are planning to fine-tune on task-specific datasets for our language once we are able to connect the vision encoder and the 70B LLM part. What do...

We will check. If that does not work, we will just try another approach; data might be the key. Maybe you can give me some directions on how to connect...

> ### Reminder
> * [x] I have read the above rules and searched the existing issues.
>
> ### System Info
> ValueError: Some keys are not used by...

It works with the latest ms-swift in this environment.

```bash
uv venv --python 3.11 --seed .venv
source .venv/bin/activate
git clone https://github.com/modelscope/ms-swift.git # 3.9.0.dev0
cd ms-swift
uv pip install -e .
cd...
```

When I run full training with DeepSpeed stage 3, for some reason it gets stuck at step 0. I tested the same thing with a randomly initialized 5B thinker model and it...

@shuoyinn Hi shuoyinn. If you manage to solve this, could you let me know? As I understand it, it's something related to Qwen3MOE on DeepSpeed; with FSDP it might...