jaywongs

21 comments by jaywongs

> @jaywongs, did upgrading deepspeed work for you?

It did not work for me; I am using deepspeed 0.14.2.

> > > @jaywongs, did upgrading deepspeed work for you?
> > >
> > > It did not work for me; I am using deepspeed 0.14.2.
> >
> > Hello, have you...

> I recall that you may be able to with deepspeed 3 and cpu offload

Apologies for the confusion. I attempted to use deepspeed 3 with CPU offload, but the...

Setting the batch size to 1 does not work. I haven't tried the 8-bit optimization. Will using 8-bit affect the quality of the trained model?

I'm also confused about this. The issues with the build process and version compatibility are driving me crazy.

@sleepwalker2017 Hi, did you solve this? I'm facing the same problem as you and have no idea what happened.

```
I0328 09:48:48.039005 1588 server.cc:677]
+------------------+---------+--------+
| Model            | Version | Status |
+------------------+---------+--------+
| ensemble         | 1       | READY  |
| postprocessing   | 1       | READY  |
| preprocessing...
```

> Hi @plt12138, it is a known bug in the v0.8.0 release. It has been fixed in the recent main branch. Could you please try it?

TensorRT-LLM: 0.9.0.dev2024031900

Confirmed, it didn't...

> Hello, would you mind spending some time testing the parameter length_penalty? In my case, the parameter length_penalty doesn't make sense in Mistral. I'm not sure if the bug is...