Qsingle
> Does anyone have insight into why transformers >=4.52 throws this error? I'm working with a finetuned 3B Qwen model that is only compatible with transformers 4.52 and higher, so...
We do not use the mask decoder; instead, we fine-tune the model with a single segmentation head. So we must tell the task head how many classes it needs to...
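For illustration, here is a minimal sketch of what such a task head could look like. The class name, the 1x1-conv design, and the `num_classes` argument are assumptions for this example, not this repo's actual API:

```python
import torch
import torch.nn as nn

class SegmentationHead(nn.Module):
    """Illustrative per-pixel classification head; num_classes must match the dataset."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        # A 1x1 conv maps encoder features to one logit map per class
        self.classifier = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        logits = self.classifier(features)
        # Upsample the logits back toward the input resolution (factor is illustrative)
        return nn.functional.interpolate(
            logits, scale_factor=16, mode="bilinear", align_corners=False
        )

# e.g. binary foreground/background segmentation -> num_classes=2
head = SegmentationHead(in_channels=256, num_classes=2)
```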
> @Qsingle Thank you for your reply. In my case, each image has about 10-14 structures of the same class (the exact number is unknown unless I run some object...
This repo does not support multi-modal training. You can use [this repo](https://github.com/Qsingle/verl) to run Gemma3 with higher performance.
> hello, I still encounter the following error; how do you solve it? I replaced this `raise` with `continue` to get past it. > > File "/verl-main-Qsingle/verl/utils/fsdp_utils.py", line 123, in get_fsdp_wrap_policy > raise...
> hello, I can run GRPO with gemma3-1b-it on the gsm8k dataset. During training, after some steps, the grad norm becomes NaN. This is my start...
> 1. I have modified the attention implementation at line 214, but I still encounter the same error: the grad_norm is NaN. > 2. When I run gemma3-1b-it, I confirm...
The PR [#2327](https://github.com/volcengine/verl/pull/2327) can be used for the RL training, and [trl](https://huggingface.co/docs/trl/index) is better suited for the SFT process. I've checked it on a Blackwell-series GPU like the Pro 6000, so it...
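As a rough starting point for the SFT side, here is a minimal trl sketch. The model id, the dataset flattening, and the eager-attention setting are assumptions for this example, not a confirmed recipe:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# gsm8k from the thread; flatten each example into a single "text" field for SFT
dataset = load_dataset("openai/gsm8k", "main", split="train")
dataset = dataset.map(lambda ex: {"text": ex["question"] + "\n" + ex["answer"]})

config = SFTConfig(
    output_dir="gemma3-1b-it-sft",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    bf16=True,          # bf16 tends to be more numerically stable than fp16
    max_grad_norm=1.0,  # gradient clipping as a guard against NaN grad norms
    model_init_kwargs={"attn_implementation": "eager"},  # per the attention discussion above
)

trainer = SFTTrainer(
    model="google/gemma-3-1b-it",
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

When the model is passed as a string, trl instantiates it with the `model_init_kwargs` given in the config.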
> I'll try. I am researching the open domain, especially the medical domain, and I find your work excellent. Can we discuss more via other platforms? Email me at [[email protected]](mailto:[email protected])...
Good question. 24 GB of memory is enough for all of the models. If you use vit-b, 12 GB of memory can run the model.
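If you want to sanity-check this on your own setup, a quick way to estimate the weight footprint is sketched below; everything here is illustrative, and real training memory also includes gradients, optimizer states, and activations:

```python
import torch

def param_memory_gb(model: torch.nn.Module) -> float:
    # Bytes occupied by the weights alone
    n_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    return n_bytes / 1024 ** 3

# Example: ViT-B has roughly 86M parameters, so its fp32 weights take ~0.33 GB;
# gradients, optimizer states, and activations multiply the training footprint several-fold.
```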