Raushan Turganbay
Hey! I think you can get identical logits without double precision if you disable caching with `use_cache=False` 🤔
@lowlypalace I don't think there's anything else to make them identical. As @younesbelkada said, there will always be some small numerical precision errors. Disabling cache and recalculating keys/values every time...
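For context, here is a minimal sketch of the comparison being discussed, assuming a generic causal LM (`gpt2` and the prompt are placeholders, not from the thread):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello world", return_tensors="pt")

with torch.no_grad():
    # Greedy decoding once with the KV cache and once without it
    out_cached = model.generate(
        **inputs, max_new_tokens=20, do_sample=False,
        use_cache=True, output_scores=True, return_dict_in_generate=True,
    )
    out_uncached = model.generate(
        **inputs, max_new_tokens=20, do_sample=False,
        use_cache=False, output_scores=True, return_dict_in_generate=True,
    )

# Even with the cache disabled, tiny float32 rounding differences can remain
max_diff = max(
    (a - b).abs().max().item()
    for a, b in zip(out_cached.scores, out_uncached.scores)
)
print(f"max logit difference: {max_diff:.2e}")
```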
Hey! CogVLM uses custom code from the hub when you set `trust_remote_code=True` and the model is not yet added to transformers. There is an [open PR here](https://github.com/huggingface/transformers/pull/28196) to port the...
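A quick sketch of the loading path described above; the checkpoint id below is the public CogVLM hub repo, used here only as an illustration:

```python
from transformers import AutoModelForCausalLM

# trust_remote_code=True executes the custom modeling code bundled
# in the hub repo, since the model is not yet ported into transformers
model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-chat-hf",
    trust_remote_code=True,
)
```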
The modeling/processing code is done and passes all the tests with dummy weights. I looked through transformers for similar models to replace the VQEncoder with a simple call to the vision backbone,...
Ready for review! The model conversion is fixed, thanks to Arthur for spotting the bug. Now we have to convert and upload the weights to the Meta org on the hub, so...
1. Yes, maybe we don't need the assertion then. It's a bit weird that the outputs are completely different though; I will check it out and change it. 2. Hmm, that's weird, I will...
The PR is ready. The only remaining step is uploading the weights to the hub (after we find out what the issue is with the 30b model's image module with...
Hmm, we probably need to manually move the residual to the same device as the hidden states after the attn module. Btw, I was running on a single A100 GPU and it fits perfectly...
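A minimal sketch of that fix, assuming the usual pattern where sharded inference (e.g. `device_map="auto"`) can leave the saved residual on a different GPU than the attention output; the helper name is illustrative, not the actual modeling code:

```python
import torch

def add_residual(residual: torch.Tensor, hidden_states: torch.Tensor) -> torch.Tensor:
    # Under sharded inference, the residual saved before the attention block
    # may live on another device than the attention output, so align it
    # explicitly before the addition.
    if residual.device != hidden_states.device:
        residual = residual.to(hidden_states.device)
    return residual + hidden_states
```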
@EwoutH almost there, just need to apply the changes for sharded inference in the 30b model. I was off for a week and will work on it tomorrow.
Pushed changes for qk layernorm and tested that it works for both checkpoints. Locally, all tests pass except the slow ones. So the last step now is to run...
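For reference, a hedged sketch of what qk layernorm does inside attention: normalize the projected queries and keys per head before computing attention scores. Names and shapes below are illustrative, not the PR's exact modeling code:

```python
import torch
from torch import nn

class QKNormAttention(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size)
        self.k_proj = nn.Linear(hidden_size, hidden_size)
        self.v_proj = nn.Linear(hidden_size, hidden_size)
        self.o_proj = nn.Linear(hidden_size, hidden_size)
        # LayerNorm applied per head over the head dimension
        self.q_norm = nn.LayerNorm(self.head_dim)
        self.k_norm = nn.LayerNorm(self.head_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, s, _ = x.shape
        q = self.q_proj(x).view(b, s, self.num_heads, self.head_dim)
        k = self.k_proj(x).view(b, s, self.num_heads, self.head_dim)
        v = self.v_proj(x).view(b, s, self.num_heads, self.head_dim)
        q, k = self.q_norm(q), self.k_norm(k)  # the qk-layernorm step
        q, k, v = (t.transpose(1, 2) for t in (q, k, v))
        out = torch.nn.functional.scaled_dot_product_attention(q, k, v)
        out = out.transpose(1, 2).reshape(b, s, -1)
        return self.o_proj(out)
```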