Wei-Lin Chiang

Results 111 comments of Wei-Lin Chiang

This is expected behavior, as @surak explained. We've been serving models for a long time on [chat.lmsys.org](chat.lmsys.org) and did not find this issue. However, if you find evidence of a memory leak, let...

@nd7141 @WGB0304 You may add `--share` to get a public URL, which may bypass this issue, although the link only lives for 72 hours.

@Ejafa Feel free to reopen if you have any other questions.

Thanks for reporting this issue. We looked into it and found that it comes from the gradient norm being too small at the beginning of training, which leads to an infinite loop...
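A minimal sketch of the failure mode described above (illustrative only, not FastChat's actual training code — the function name and the clamping fix are assumptions): dividing by a near-zero gradient norm produces huge or non-finite updates, and a naive "retry until the update looks sane" loop then never makes progress. Clamping the denominator avoids both problems.

```python
def rescale(grads, eps=1e-8):
    """Normalize a gradient vector by its norm.

    A near-zero norm would blow up the scale factor; clamping the
    denominator with max(norm, eps) keeps the result finite, so a
    sanity-check loop around the update can always terminate.
    """
    norm = sum(g * g for g in grads) ** 0.5
    scale = 1.0 / max(norm, eps)  # clamp: never divide by ~0
    return [g * scale for g in grads]

print(rescale([3.0, 4.0]))    # norm 5 -> unit vector [0.6, 0.8]
print(rescale([1e-20, 0.0]))  # tiny norm: stays finite thanks to eps
```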

Could you run the following code in your Python environment?

```python
import sys
print(sys.platform)
```

I think that will identify the problem.

Hi, I have tested Octave 4.0.0 on Debian 9. It works well with 8-thread parallelism. Did you add the option `-lgomp` in make.m?

You're welcome. Actually, you can check the FAQ here; there is a step-by-step guide for parallelizing libsvm with MATLAB/Octave: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#f8032

Hi @huseyinatahaninan Sorry for the confusion. Let me try to clarify this. In our [paper](https://arxiv.org/abs/2306.05685) we study the reference-based judge, in which the LLM judge first generates a reference answer independently and...

Strong +1. Reka has been offering great language + vision models. Would love to see litellm support them.

This is unexpected. Can you try adding `--debug`:

```shell
python3 -m fastchat.serve.cli --model-path qwen/qwen-72b-chat --debug
```

and check what exactly the loaded chat template is?
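For reference, Qwen-Chat models expect the ChatML prompt format, so the loaded template should render something like the output below. This is an illustrative sketch with a hypothetical helper function, not FastChat's actual template code:

```python
def chatml_prompt(system, user):
    """Render a ChatML-style prompt (the format Qwen-Chat expects).

    Illustrative only: FastChat builds this through its conversation
    templates; this just shows what a correct render looks like.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("You are a helpful assistant.", "Hello!"))
```

If the `--debug` output shows a template without the `<|im_start|>`/`<|im_end|>` markers, the wrong template was matched for the model path.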