Jack BAI
Same problem here: I'm using multi-image, multi-turn generation with Hugging Face, and would appreciate any help! All model sizes (4b, 12b, 27b) have this problem, but it occurs randomly and intermittently...
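For reference, the setup looks roughly like this. (Sketch only: the model id, image files, prompts, and generation settings below are placeholders, not my exact script.)

```
# Sketch: two images in the first user turn, then a text-only follow-up turn.
import torch
from PIL import Image
from transformers import AutoProcessor, Gemma3ForConditionalGeneration

model_id = "google/gemma-3-4b-it"  # same behaviour reported on 12b / 27b
processor = AutoProcessor.from_pretrained(model_id)
model = Gemma3ForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
).eval()

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": Image.open("img_1.png")},  # placeholder images
        {"type": "image", "image": Image.open("img_2.png")},
        {"type": "text", "text": "Compare these two images."},
    ],
}]

def generate_reply(messages):
    inputs = processor.apply_chat_template(
        messages, add_generation_prompt=True, tokenize=True,
        return_dict=True, return_tensors="pt",
    ).to(model.device, dtype=torch.bfloat16)
    with torch.inference_mode():
        out = model.generate(**inputs, max_new_tokens=128)
    return processor.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

# Turn 1, then append the reply and ask a follow-up question in turn 2.
reply = generate_reply(messages)
messages.append({"role": "assistant", "content": [{"type": "text", "text": reply}]})
messages.append({"role": "user", "content": [{"type": "text", "text": "Which one is brighter?"}]})
reply_2 = generate_reply(messages)
```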
@FredrikNoren did you try using the gemma repo instead of HF? Does the problem also occur with the original repo, or only with HF?
I just tried this:

```
# force the eager attention implementation instead of the default
self.model = Gemma3ForConditionalGeneration.from_pretrained(
    ...,  # model id and other kwargs unchanged
    attn_implementation="eager",
).eval()
```

Maybe you can also try setting the attention implementation manually (`eager`) first.
I also encountered this error. The saved checkpoint is suspiciously small and not usable at all.
Dear @tjruwase, thanks, I will try using the destroy method. I meant exactly what you describe: basically, we launch the `.py` file with `deepspeed`, and within this `.py` file I...
Dear @tjruwase, it seems that the CPU memory is not freed after destroying the model engine under ZeRO-3, although the GPU memory is freed. Below is a minimal reproduction...
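(Sketch of the shape of the reproduction, with a toy `nn.Linear` standing in for the real model and a minimal hypothetical ZeRO-3 config; launched with `deepspeed repro.py`.)

```
# Build a ZeRO-3 engine, destroy it, and check whether host RSS drops.
import gc
import psutil
import torch
import deepspeed

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "zero_optimization": {"stage": 3},
    "optimizer": {"type": "Adam", "params": {"lr": 1e-5}},
    "bf16": {"enabled": True},
}

def rss_gb():
    return psutil.Process().memory_info().rss / 1e9

for epoch in range(3):
    model = torch.nn.Linear(8192, 8192)  # toy stand-in for the real model
    engine, _, _, _ = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config
    )
    print(f"epoch {epoch}: RSS after init    = {rss_gb():.2f} GB")

    engine.destroy()           # GPU memory is released as expected
    del engine, model
    gc.collect()
    torch.cuda.empty_cache()
    print(f"epoch {epoch}: RSS after destroy = {rss_gb():.2f} GB")  # CPU RSS does not drop
```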
For a bit more context: I have to destroy the engine in each epoch because I need to run vllm after each epoch, which is omitted in the...
Edit: the problem happens on 12b and 27b gemma-3, but not on 4b. **Also, when running inference with the updated 4b model, generation becomes extremely slow once the batch size exceeds 32.**...
Sure, e.g. this is the saved 12b config:

```
{
  "architectures": [
    "Gemma3ForConditionalGeneration"
  ],
  "boi_token_index": 255999,
  "eoi_token_index": 256000,
  "eos_token_id": [
    1,
    106
  ],
  "image_token_index": 262144,
  "initializer_range": 0.02,
  "mm_tokens_per_image": 256,
  "model_type": ...
```
Yes, this is 12b. The 12b model also does not work with save/reload.