Murdad Esmaeeli

Results 14 comments of Murdad Esmaeeli

@nullscc, what quantization level are you using (FP64, FP32, FP16, BFLOAT16)? Typically you would need 4x the parameter count in bytes for 32-bit and 2x the parameter count in bytes for 16-bit...
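As a rough illustration of that rule of thumb, here is a minimal sketch that estimates the memory needed for the weights alone. The function name `weight_memory_gb` and the 7-billion-parameter model are hypothetical, and the estimate ignores activations, optimizer state, and any cache, so real usage will be higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp64": 8, "fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Estimate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Hypothetical 7-billion-parameter model, used only for illustration.
for dtype in ("fp32", "fp16"):
    print(dtype, weight_memory_gb(7_000_000_000, dtype), "GB")
```

If the estimate is already close to the card's capacity, there is little headroom left for everything else the run needs.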

@nullscc I've had that before. Could you try using a slightly bigger GPU? It should not hang if utilization is 33GB/40GB.

@joshuasundance-swca, @weiji14, if I'm understanding this correctly, the code below wouldn't be recommended due to dependency headaches? If that's the case, what solution would there be to see the...

@9throok, any update on the issue that you mentioned?

Hi @Flossertoday, any update on the problem that you mentioned?

@PeterTF656, from my reading of the code, this shouldn't happen. Could you check if it is still happening?

@AyushExel, were you able to get your question answered?

Hi @cirezd, you might be right that there is no specific reason for this hard coding. @mkozakov, @lfayoux, can you confirm? If there is no specific...

Could you paste the completion object, _conversation, and one iteration of what chunk is? It seems to be a data type problem: `'async for' requires an object with __aiter__ method,...
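To show where this kind of error generally comes from, here is a minimal sketch that is not based on the code in the issue: `sync_chunks`, `async_chunks`, `collect`, and `try_collect` are hypothetical names. `async for` only accepts objects that implement `__aiter__`, so passing it a regular generator raises a `TypeError`, while an async generator works.

```python
import asyncio


def sync_chunks():
    # A regular generator: it has __iter__ but no __aiter__.
    yield from ["Hel", "lo"]


async def async_chunks():
    # An async generator: it has __aiter__, so `async for` accepts it.
    for piece in ["Hel", "lo"]:
        yield piece


async def collect(stream):
    parts = []
    async for chunk in stream:  # raises TypeError if stream lacks __aiter__
        parts.append(chunk)
    return "".join(parts)


def try_collect(stream):
    # Run the coroutine and report a TypeError as text instead of crashing.
    try:
        return asyncio.run(collect(stream))
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_collect(async_chunks()))
print(try_collect(sync_chunks()))
```

Printing `type(...)` of the object being iterated is usually the quickest way to tell which of the two cases applies.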

Hi @van51, are you still facing the issue?