Raushan Turganbay

117 comments by Raushan Turganbay

Hi @andysingal. To use Kosmos-2 for image grounding, you have to add the special `<grounding>` token before the prompt, as they do in the [paper](https://arxiv.org/abs/2306.14824). Also you can use ``...
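For reference, here is a minimal sketch of grounded generation with Kosmos-2 in `transformers` (the checkpoint, example image, and generation settings are illustrative):

```python
import requests
from PIL import Image
from transformers import AutoProcessor, Kosmos2ForConditionalGeneration

ckpt = "microsoft/kosmos-2-patch14-224"  # illustrative checkpoint
processor = AutoProcessor.from_pretrained(ckpt)
model = Kosmos2ForConditionalGeneration.from_pretrained(ckpt)

url = "http://images.cocodataset.org/val2017/000000039769.jpg"  # example image
image = Image.open(requests.get(url, stream=True).raw)

# The `<grounding>` token before the prompt switches the model into grounding mode
prompt = "<grounding> An image of"
inputs = processor(text=prompt, images=image, return_tensors="pt")

generated_ids = model.generate(**inputs, max_new_tokens=64)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Post-processing splits the output into a caption and the grounded entities
# (phrases with their bounding boxes)
caption, entities = processor.post_process_generation(generated_text)
print(caption)
print(entities)
```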

As we discussed, the quantized cache can start being integrated into the library, given the results we got so far. All the possible speed optimizations/pre-fill stage optimizations can be...
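For context, a sketch of the user-facing shape such an integration could take — assuming a `cache_implementation="quantized"` flag on `generate` plus a small `cache_config` dict (names here illustrate the direction, not a settled API):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # any decoder-only model works
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)

# Keep the KV cache in int4 instead of fp16 during decoding;
# `backend` picks the quantization library, `nbits` the bit width
out = model.generate(
    **inputs,
    max_new_tokens=20,
    cache_implementation="quantized",
    cache_config={"backend": "quanto", "nbits": 4},
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```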

Thanks for the comments!

> except for guarding quanto imports (also I would say safer to make local imports whenever possible - e.g. at QuantCache init)

Okay, noted!

> You...
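A minimal sketch of the guarded, local-import pattern being suggested (the class name is illustrative, not necessarily the PR's final one; `is_quanto_available` is the availability check in `transformers.utils`):

```python
from transformers.utils import is_quanto_available


class QuantizedCache:  # illustrative name
    """Guard the optional `quanto` dependency and import it locally."""

    def __init__(self, nbits: int = 4):
        # Fail with an actionable message if the backend is missing
        if not is_quanto_available():
            raise ImportError(
                "Using the quantized cache requires `quanto`: `pip install quanto`."
            )
        # Local import: `quanto` is only pulled in when the cache is instantiated,
        # so merely importing transformers never requires it
        import quanto  # noqa: F401

        self.nbits = nbits
```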

@gante added benchmark results to the PR description. Right now int4 has almost the same performance as fp16, sometimes a bit better. I also added a comparison with the KIVI paper.

I made the KV cache work with HQQ as a backend. It can simply be plugged in if a user writes their own `CacheClass`. I am not planning to add...
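Roughly, plugging in a custom backend means subclassing the cache and quantizing inside `update`. A sketch, with hypothetical `_quantize`/`_dequantize` helpers standing in for the actual HQQ calls (a real implementation would avoid re-quantizing the full past on every step):

```python
from typing import Any, Dict, List, Optional, Tuple

import torch
from transformers.cache_utils import Cache


class HQQCache(Cache):  # illustrative custom "CacheClass"
    """Keep past keys/values quantized; dequantize them on read."""

    def __init__(self, nbits: int = 4):
        super().__init__()
        self.nbits = nbits
        self._q_keys: List[Any] = []    # one quantized blob per layer
        self._q_values: List[Any] = []

    def _quantize(self, tensor: torch.Tensor) -> Any:
        return tensor  # placeholder for an HQQ quantize call

    def _dequantize(self, blob: Any) -> torch.Tensor:
        return blob  # placeholder for the inverse HQQ call

    def update(
        self,
        key_states: torch.Tensor,
        value_states: torch.Tensor,
        layer_idx: int,
        cache_kwargs: Optional[Dict[str, Any]] = None,
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        if layer_idx == len(self._q_keys):
            # First forward pass for this layer: nothing cached yet
            keys, values = key_states, value_states
            self._q_keys.append(self._quantize(keys))
            self._q_values.append(self._quantize(values))
        else:
            # Dequantize the stored past, append the new states, re-quantize
            keys = torch.cat([self._dequantize(self._q_keys[layer_idx]), key_states], dim=-2)
            values = torch.cat([self._dequantize(self._q_values[layer_idx]), value_states], dim=-2)
            self._q_keys[layer_idx] = self._quantize(keys)
            self._q_values[layer_idx] = self._quantize(values)
        return keys, values

    def get_seq_length(self, layer_idx: int = 0) -> int:
        if len(self._q_keys) <= layer_idx:
            return 0
        return self._dequantize(self._q_keys[layer_idx]).shape[-2]
```

The full-precision `keys`/`values` returned from `update` are what attention consumes; only the stored copy stays quantized.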

@gante

> This is with static cache AND compile, correct? Without compile it has no problems, correct? (I haven't seen them yet, if it happens without compile a reproduction example...

@gante as we discussed, I will not dig into the gibberish generation for fp32. In that case the PR should be ready to merge once we get the slow tests passing....

I think the cache problem should be fixed by converting `DynamicCache` back to legacy_cache in Idefics2's backbone language model, like it's already [done in llama](https://github.com/huggingface/transformers/blob/91d155ea92da372b319a79dd4eef69533ee15170/src/transformers/models/llama/modeling_llama.py#L1025-L1029). These changes are partially related...
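Simplified from the linked llama code, the conversion pattern looks like this inside the backbone's `forward` (`past_key_values`, `use_cache`, and `next_decoder_cache` come from the surrounding method):

```python
from transformers.cache_utils import Cache, DynamicCache

# On the way in: accept either a `Cache` object or the legacy tuple-of-tuples
use_legacy_cache = not isinstance(past_key_values, Cache)
if use_legacy_cache:
    past_key_values = DynamicCache.from_legacy_cache(past_key_values)

# ... decoder layers run here, producing `next_decoder_cache` ...

# On the way out: hand back the legacy format when that's what the caller used
next_cache = None
if use_cache:
    next_cache = (
        next_decoder_cache.to_legacy_cache() if use_legacy_cache else next_decoder_cache
    )
```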

@gante and I discussed the cache input-output format yesterday. Maybe a llama-format cache is not what we need, but anyway @gante will take care of it 😄

@amyeroberts I am not sure what the correct format is for the cache objects we return from language models, since right now we do not have consistency, so I wanted @gante...