Raushan Turganbay

Results: 117 comments by Raushan Turganbay

I ran the datasets for each of the tasks. From the HHH alignment dataset, I took only the "other" subset, and for MT-Bench the first 100 questions. The...
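
For context, a minimal sketch of that sampling (the Hub dataset ids and split names here are my assumptions, not taken from the comment):

```python
from datasets import load_dataset

# HHH alignment: keep only the "other" subset.
hhh_other = load_dataset("HuggingFaceH4/hhh_alignment", "other", split="test")

# MT-Bench: keep only the first 100 questions.
mt_bench = load_dataset("HuggingFaceH4/mt_bench_prompts", split="train")
mt_bench = mt_bench.select(range(100))
```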

Yes, I opened PR [131](https://github.com/argilla-io/distilabel/pull/131)

@hxhcreate @NielsRogge yes, that is a known issue and I merged a fix a few days ago. Unfortunately the refactoring broke some things; let me know if updating to the latest `main`...

Hey! This should be solvable by popping `cache_position` from `inputs` in [this method](https://github.com/haotian-liu/LLaVA/blob/c121f0432da27facab705978f83c4ada465e46fd/llava/model/language_model/llava_llama.py#L144) with `inputs.pop("cache_position")`. The error is raised because calling `super()` returns kwargs that are not used in the...
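
A minimal sketch of the suggested fix inside LLaVA's `prepare_inputs_for_generation` override (the signature is paraphrased from the linked file, not copied verbatim):

```python
def prepare_inputs_for_generation(self, input_ids, images=None, **kwargs):
    # super() builds the standard inputs dict but may include kwargs
    # (such as cache_position) that this model's forward() does not accept.
    inputs = super().prepare_inputs_for_generation(input_ids, **kwargs)
    inputs.pop("cache_position", None)  # the suggested fix
    if images is not None:
        inputs["images"] = images
    return inputs
```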

@ArthurZucker yes, making a versatile cache class will go in another PR. In that case we can leave `quanto` as the only choice available, and the rest can be implemented...

@ArthurZucker @gante I made a few changes since the last review: 1. We now support HQQ and quanto (quanto by default, as it is a bit faster; we'll work on...
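
For reference, a short usage sketch of the quantized cache via `generate` (the model id is a placeholder, and the exact `cache_config` keys should be treated as assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=20,
    cache_implementation="quantized",
    # "quanto" is the default backend; HQQ is the other supported one.
    cache_config={"backend": "quanto", "nbits": 4},
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```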

Cool, merging 🤞🏻 Ran the slow quantization and generation tests locally; everything is passing.

@ydshieh This PR actually results in a slowdown because of quantization 😅 But we can probably check the memory usage. Here is a [script](https://gist.github.com/zucchini-nlp/56ce57276d7b1ee666e957912d8d36ca) I used, but you'd have to replace...
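
Independent of the gist, a hedged sketch of how peak memory could be compared (`model` and `inputs` are placeholders, e.g. from the sketch above):

```python
import torch

def peak_memory_gib(extra_kwargs):
    """Peak GPU memory for one generate() call, in GiB."""
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    model.generate(**inputs, max_new_tokens=256, **extra_kwargs)
    return torch.cuda.max_memory_allocated() / 1024**3

print("fp16 cache:     ", peak_memory_gib({}))
print("quantized cache:", peak_memory_gib(
    {"cache_implementation": "quantized",
     "cache_config": {"backend": "quanto", "nbits": 4}}
))
```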

@Cyrilvallez Right, QuantizedCache stores most of the past key/values in a private list, so I think these methods would not work even before your changes. Thanks for noticing! I will...
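
To illustrate the point, a simplified toy sketch (attribute names are assumptions, not the library's exact internals):

```python
class ToyQuantizedCache:
    """Toy illustration: compressed history lives in private lists, while
    the public key_cache/value_cache hold only a short residual window."""

    def __init__(self):
        self._quantized_key_cache = []    # compressed past keys (private)
        self._quantized_value_cache = []  # compressed past values (private)
        self.key_cache = []               # unquantized residual only (public)
        self.value_cache = []

    def get_seq_length(self, layer_idx: int = 0) -> int:
        # A helper that inspects only the public lists undercounts the true
        # length, since most tokens sit in the private quantized lists.
        if len(self.key_cache) <= layer_idx:
            return 0
        return self.key_cache[layer_idx].shape[-2]
```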

@hegderavin sure, we will be porting models one by one (#28981). Right now I am waiting for this PR to be merged, so that we can work on other models...