Daniel Han

Results 1134 comments of Daniel Han

Yep I auto removed it!

Oh wait this is a new arch - I'll check and get back to you

@mmathew23 So Gemma 3 is OK as well with this change?

The main issue is Gemma 3 requires token_type_ids since it's utilized for bidirectional attention
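A rough sketch of why the token types matter, not the actual Gemma 3 or Unsloth code. It assumes `token_type_ids` uses 1 for image tokens and 0 for text tokens, and that each contiguous run of image tokens is one image. The helper names `image_blocks` and `build_attention_mask` are made up for illustration. Text tokens stay causal, while tokens inside the same image can attend to each other in both directions:

```python
def image_blocks(token_type_ids):
    """Give each contiguous run of image tokens (type 1) its own block id.

    Text tokens (type 0) get None. The 0/1 convention is an assumption.
    """
    blocks = []
    current = -1
    prev = 0
    for t in token_type_ids:
        if t == 1 and prev != 1:
            current += 1  # a new run of image tokens starts here
        blocks.append(current if t == 1 else None)
        prev = t
    return blocks


def build_attention_mask(token_type_ids):
    """Return mask[i][j] == True when position i may attend to position j."""
    n = len(token_type_ids)
    blocks = image_blocks(token_type_ids)
    mask = []
    for i in range(n):
        row = []
        for j in range(n):
            causal = j <= i  # ordinary left-to-right attention
            same_image = blocks[i] is not None and blocks[i] == blocks[j]
            row.append(causal or same_image)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # text, image, image, text
    for row in build_attention_mask([0, 1, 1, 0]):
        print(["x" if allowed else "." for allowed in row])
```

Without `token_type_ids` there is no way to tell which positions belong to an image, so the mask would fall back to plain causal attention everywhere.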

It should work I think! Most cloud services should function fine

Do you know what version of TRL you are using?
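If it helps, here is one way to check from Python. This is a small sketch using the standard library's `importlib.metadata`. The helper name `installed_version` is just for illustration, and it assumes the package is installed under the distribution name `trl`:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "trl" is assumed to be the distribution name used by pip
    print("trl", installed_version("trl"))
```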

Not using vLLM, correct? Hmmm I'll verify batched inference - maybe something broke

Oh no I don't think that's correct - better wait for my fix!

@xudou3 @kings-crown @StarLight1212 Apologies just fixed the gibberish output! For Colab / Kaggle, please restart and run all. For local machines, please do: ``` pip install --force-reinstall --upgrade --no-cache-dir --no-deps...