jaywongs
> @jaywongs , did upgrading deepspeed work for you?

It did not work for me; I'm using deepspeed 0.14.2.
> > > @jaywongs , did upgrading deepspeed work for you?
> > >
> > > It did not work for me; I'm using deepspeed 0.14.2.
> >
> > Hello, have you...
> I recall that you may be able to with deepspeed 3 and CPU offload

Apologies for the confusion. I attempted to use deepspeed 3 with CPU offload, but the...
Setting the batch size to 1 does not work. I haven't tried the 8-bit optimization. Will using 8-bit affect the quality of the trained model?
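On the quality question: 8-bit optimizers keep the model weights in full precision and only quantize the optimizer statistics, so the quality impact is usually small. A minimal stdlib-only sketch of the underlying idea (absmax quantization with a per-block scale, as used by libraries like bitsandbytes; the function names here are illustrative, not the actual library API):

```python
# Illustrative sketch: how 8-bit optimizer states round-trip through
# int8 storage. Names are hypothetical, not a real library's API.

def quantize_int8(values):
    """Absmax-quantize a list of floats to int8 codes plus one scale."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / 127.0                      # map [-absmax, absmax] -> [-127, 127]
    codes = [round(v / scale) for v in values]
    return codes, scale

def dequantize_int8(codes, scale):
    """Recover approximate float values from the int8 codes."""
    return [c * scale for c in codes]

state = [0.013, -0.275, 0.991, -0.004]          # toy optimizer-state entries
codes, scale = quantize_int8(state)
restored = dequantize_int8(codes, scale)

# Rounding to the nearest code bounds the error by half a quantization
# step, which is why 8-bit state usually costs little final quality.
max_err = max(abs(a - b) for a, b in zip(state, restored))
assert max_err <= scale / 2
```

The trade-off is this bounded per-step quantization error against a roughly 4x memory saving on optimizer state, which is often what makes a large model fit at all.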
Any update on this problem?
I'm also confused about this. The issues with the build process and version compatibility are driving me crazy.
@sleepwalker2017 Hi, did you solve this? I'm facing the same problem as you and have no idea what happened.
```
I0328 09:48:48.039005 1588 server.cc:677]
+------------------+---------+--------+
| Model            | Version | Status |
+------------------+---------+--------+
| ensemble         | 1       | READY  |
| postprocessing   | 1       | READY  |
| preprocessing...
```
> Hi @plt12138, it is a known bug in the v0.8.0 release. It has been fixed in the recent main branch. Could you, please, try it?

TensorRT-LLM: 0.9.0.dev2024031900. Confirmed, it didn't...
> Hello, would you mind spending some time testing the parameter length_penalty?

In my case, the length_penalty parameter has no effect with Mistral. I'm not sure whether the bug is...
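For anyone reproducing this: a quick way to check whether length_penalty is being applied is to verify that beam ranking changes as the penalty changes. A stdlib-only sketch of the common convention (as in Hugging Face / GNMT-style beam search, where score = sum of log-probs divided by length ** length_penalty; this is an assumption about the intended behavior, not TensorRT-LLM internals):

```python
# Hypothetical check of length_penalty semantics: the beam score is
# the hypothesis log-probability normalized by length ** penalty.

def beam_score(sum_logprobs, length, length_penalty=1.0):
    return sum_logprobs / (length ** length_penalty)

short = (-2.0, 4)    # (sum of log-probs, token count)
long = (-7.0, 10)

# With a neutral penalty the shorter hypothesis scores higher here.
assert beam_score(*short, length_penalty=1.0) > beam_score(*long, length_penalty=1.0)

# A penalty > 1 divides the long hypothesis's negative score by a much
# larger factor, pulling it toward zero and flipping the ranking.
assert beam_score(*long, length_penalty=2.0) > beam_score(*short, length_penalty=2.0)
```

If varying length_penalty never changes which beam wins (or never changes output length), that is consistent with the parameter being ignored, as reported above.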