Xuan-Phi Nguyen

25 comments by Xuan-Phi Nguyen

I added this quick solution below for the llama-hf model. The steps are: 1. Load the original llama into vLLM with `llm = LLM("llama-7b")` ... 2. Load the LoRA state dict `lora_state_dict =...
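The excerpt above is truncated; here is a minimal sketch of those two steps, assuming a locally available `llama-7b` checkpoint and a PEFT-style adapter file `adapter_model.bin` (both paths are placeholders, not the exact names from the original solution):

```python
# Minimal sketch of steps 1-2 only, not the full original solution.
# "llama-7b" and "adapter_model.bin" are illustrative paths.
import torch
from vllm import LLM

# 1. Load the original llama into vLLM.
llm = LLM("llama-7b")

# 2. Load the LoRA state dict. PEFT-style adapters store pairs of
#    lora_A / lora_B matrices keyed by module name.
lora_state_dict = torch.load("adapter_model.bin", map_location="cpu")
for name, weight in lora_state_dict.items():
    print(name, tuple(weight.shape))
```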

@nivibilla I don't use Ray, so I'm not sure. But you need to locate the [vLLM LlamaForCausalLM model](https://github.com/vllm-project/vllm/blob/aa39e42c5a8a2359363529571cb553cc30e26d58/vllm/model_executor/models/llama.py#L189) and apply the `reassign_weights` function to it, wherever it lives inside Ray; a sketch of how one might reach that module is below.
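For reference, on early vLLM builds (around the commit linked above) the underlying model object was typically reached through engine internals roughly like this. The attribute path is version-specific and may not exist in newer releases, and `reassign_weights` is the user-defined merge function from the earlier comment, not a vLLM API:

```python
# Hypothetical access path for an early vLLM build; attribute names
# changed across versions, so treat this as a sketch only.
model = llm.llm_engine.workers[0].model   # vLLM's LlamaForCausalLM
reassign_weights(model, lora_state_dict)  # user-defined merge function
```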

@zuxinqi Sorry, I forgot. Here is `transpose`:

```python
def transpose(weight, fan_in_fan_out):
    return weight.T if fan_in_fan_out else weight
```
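In context, `transpose` undoes the fan_in_fan_out weight layout before the LoRA delta is added. A self-contained example of how it enters the standard LoRA merge, with placeholder tensors and an illustrative `scaling` value:

```python
import torch

rank, fan_in, fan_out = 8, 4096, 4096
base_weight = torch.randn(fan_out, fan_in)  # W
lora_A = torch.randn(rank, fan_in)          # A
lora_B = torch.randn(fan_out, rank)         # B
scaling = 2.0                               # lora_alpha / rank

# Standard LoRA merge: W' = W + scaling * (B @ A). The transpose is
# applied only for fan_in_fan_out layers (e.g. GPT-2-style Conv1D);
# llama's projections are ordinary nn.Linear, so it is False here.
merged = base_weight + transpose(lora_B @ lora_A, False) * scaling
```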

> > Hey, I tried to do this, but when the model is loaded using Ray it doesn't work. I get this error
> >
> > ```
> > ---------------------------------------------------------------------------
> ...

Hi, thank you for your interest in the paper. There are a few possible reasons. 1. Many dependencies, such as the BLEU calculation (which is not sacrebleu but a BLEU with special tokenization...

Hi, sorry, I haven't had time to revise the code. I will check it later. In the meantime, can you try using fairseq 0.8.0 with `--user-dir` and parse the...

Hi, very sorry we did not have time to clean up the code. As shown in the instructions, please follow the configuration `dwnstack_merge2seq_node_iwslt_onvalue_base_upmean_mean_mlesubenc_allcross_hier` to find its implementation in the...

@NielsRogge Thanks. Let me check it out. I thought batched generation requires left-padding, unless the two samples have exactly the same number of tokens, because otherwise pad tokens will...

@NielsRogge I have added batched generation with left padding in the latest commit. Try it here:

```python
import torch
from huggingface_hub import hf_hub_download
import requests
from PIL import Image
from ...
```
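The snippet above is cut off. As a generic illustration of the left-padding setup it relies on, here is a minimal sketch with transformers, using GPT-2 purely as a stand-in model rather than the one from the commit:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Left padding keeps the real tokens adjacent to the newly generated
# ones: shorter sequences in the batch are padded on the left.
tokenizer = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token

model = AutoModelForCausalLM.from_pretrained("gpt2")

prompts = ["Hello, my name is", "The capital of France"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20,
                         pad_token_id=tokenizer.pad_token_id)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```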