Aflah
@rtaori @davides Did you manage to replicate the code for the eval metrics? I'm also looking for those!
Update: I just noticed the missing part and fixed it, which resolves the old issue, but now I get a new error - ``` Exception in ModelRpcClient: Traceback (most recent...
I did some further testing. It runs perfectly for https://github.com/aflah02/sglang/blob/main/examples/usage/choices_logprob.py but fails for https://github.com/aflah02/sglang/blob/main/examples/quick_start/srt_example_chat.py with the error above. It seems like the issue might be elsewhere.
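For reference, the failing chat example is roughly this pattern (paraphrased from memory, so the exact script and model path may differ); the choices example goes through `sgl.select` instead of the chat roles, if I remember right, which is part of why I suspect something template-related:

```python
import sglang as sgl

# Rough paraphrase of examples/quick_start/srt_example_chat.py -
# the model path below is illustrative, not necessarily the one I used.
@sgl.function
def multi_turn_question(s, question_1, question_2):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question_1)
    s += sgl.assistant(sgl.gen("answer_1", max_tokens=256))
    s += sgl.user(question_2)
    s += sgl.assistant(sgl.gen("answer_2", max_tokens=256))

runtime = sgl.Runtime(model_path="meta-llama/Llama-2-7b-chat-hf")
sgl.set_default_backend(runtime)

state = multi_turn_question.run(
    question_1="What is the capital of the United States?",
    question_2="List two local attractions there.",
)
print(state["answer_1"])
print(state["answer_2"])

runtime.shutdown()
```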
@merrymercy Any thoughts? Not sure why one tutorial works while the other doesn't
@merrymercy For Part 2, it seems that the error mainly occurs in the last layer or last few layers. Some of the logs for the chat example are here - [logs.txt](https://gist.githubusercontent.com/aflah02/e5b949096572c78269ddf73856268933/raw/9606aa845dbd750770859f004bb0f28bc0229770/logs.txt) The...
@merrymercy Any thoughts on what might be going wrong here? I don't know whether a template could cause breaking issues like this.
@merrymercy Sorry for being inactive, life got really busy over the past few months. I don't have the bandwidth to take this on nowadays, and if you want to then feel...
Thanks! I'll try this out. How long did the model loading take for you btw?
@merrymercy So I ran this command and it seems the loading does complete, but it's stuck here -

```
....
Loading model.layers.25.block_sparse_moe.experts.3.w3.weight
Loading model.layers.25.block_sparse_moe.experts.4.w1.weight
Loading model.layers.25.block_sparse_moe.experts.4.w2.weight
Loading model.layers.25.block_sparse_moe.experts.4.w3.weight
Loading model.layers.25.block_sparse_moe.experts.5.w1.weight...
```
Something similar happens for llama-65b, while llama-2-chat-70b loads just fine.
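In case it helps with reproducing the hang, the setup was roughly of this form (the model path and `tp_size` below are illustrative assumptions, not my exact command):

```python
import sglang as sgl

# Rough repro sketch - model path and tp_size are assumptions, not the exact
# settings from the run above (the block_sparse_moe layers in the log suggest
# a Mixtral-style MoE checkpoint).
runtime = sgl.Runtime(
    model_path="mistralai/Mixtral-8x7B-Instruct-v0.1",
    tp_size=2,
)
sgl.set_default_backend(runtime)

# A trivial generation helps tell "still loading / hung" apart from
# "loaded but waiting for requests".
@sgl.function
def ping(s):
    s += "Hello, " + sgl.gen("out", max_tokens=8)

print(ping.run()["out"])

runtime.shutdown()
```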