Shuo Yang

21 comments by Shuo Yang

@merrymercy My PR didn't fix the problem. How can we solve it?

@kennymckormick Please download the v1.1 weights [here](https://huggingface.co/lmsys/vicuna-7b-delta-v1.1). The old weights had no eos_token.

> I have the same question even though I download v1.1 weight.

The v1.1 weights have changed several times. You can remove your local copy and download them again.

It might be caused by CUDA OOM. Try:

~~~bash
export WORKER_API_EMBEDDING_BATCH_SIZE=1
~~~

and restart the server & API?
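As a rough illustration of why a smaller embedding batch size can help with CUDA OOM: fewer texts are sent to the model at once, so each forward pass needs less memory. The sketch below is not the project's actual code. It assumes the variable is read once at startup (which would explain why a restart is needed), and `read_batch_size`, `iter_batches`, and the default of 4 are hypothetical.

```python
import os
from typing import Iterator, List


def read_batch_size(default: int = 4) -> int:
    """Read the batch size from the environment (the default of 4 is an assumption)."""
    return int(os.environ.get("WORKER_API_EMBEDDING_BATCH_SIZE", default))


def iter_batches(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# Equivalent to `export WORKER_API_EMBEDDING_BATCH_SIZE=1` before starting the process.
os.environ["WORKER_API_EMBEDDING_BATCH_SIZE"] = "1"

# With a batch size of 1, each text would be embedded in its own request.
batches = list(iter_batches(["a", "b", "c"], read_batch_size()))
```

With a batch size of 1 the three inputs are processed one at a time, trading throughput for a lower peak memory footprint.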

I have found where the problem lies. The max_seq_length of the fake model we specified differs from the actual deployed model. Therefore, langchain did not call 'get safe len' when...

@hyunkelw You are right. You can change the `CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)` to `CharacterTextSplitter(chunk_size=400, chunk_overlap=0)`
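To see what lowering `chunk_size` from 1000 to 400 does, here is a minimal sketch using a hypothetical fixed-width splitter, `split_fixed`. It is not langchain's `CharacterTextSplitter`, which splits on a separator and then merges pieces; it only shows that a smaller limit yields more, shorter chunks, each of which is more likely to fit within the model's maximum sequence length.

```python
from typing import List


def split_fixed(text: str, chunk_size: int) -> List[str]:
    """Cut text into consecutive pieces of at most chunk_size characters, no overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


document = "x" * 2500  # stand-in for a real document

big = split_fixed(document, 1000)   # chunk lengths: 1000, 1000, 500
small = split_fixed(document, 400)  # six chunks of 400, then one of 100
```

Note that `chunk_size` here counts characters, not tokens, so the right value still depends on the tokenizer of the deployed model.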

@hyunkelw Can you deploy the latest version? The new error message can help me to debug it.

I see, it is caused by CUDA OOM; your GPU memory is limited 😿 Try:

~~~bash
export WORKER_API_EMBEDDING_BATCH_SIZE=1
~~~

and restart the API, controller, and model worker. If it still doesn't...

Nice work, @RedmiS22018! I encountered one issue when running your code that I would like to bring to your attention:

- On Hugging Face, there are two versions of llama...

Hi @RedmiS22018, thank you for your prompt response! I wanted to provide you with the link to the other version of the [llama weights](https://huggingface.co/decapoda-research/llama-13b-hf).

> After the delta files are...