Gaz Iqbal
> @glenn-jocher , @AyushExel - here is a PR against the yolov5 repo. Please let me know if you need anything more here.
@glenn-jocher - the Triton server detection broke because it was using the `Path.name` property for matching, which strips out any `http://` or `grpc://` prefix. I also needed to change...
Good point. That's fairly straightforward to do for `TritonRemoteModel`. Are you invoking it via `detect.py`? If so, we'll need a way to relay that.
My concern with the latter is that it would be a contrived URI scheme that doesn't match canonical Triton URIs, which could be confusing. That said, the approach is worth...
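To make the failure mode above concrete, here is a minimal sketch (the `grpc://localhost:8001` URL is an illustrative example, not taken from the PR): `Path.name` keeps only the final path component, so the scheme is silently lost, whereas `urllib.parse.urlparse` preserves it.

```python
from pathlib import Path
from urllib.parse import urlparse

# Illustrative endpoint only; not the actual URL from the PR.
url = "grpc://localhost:8001"

# Path.name keeps just the final component, dropping the scheme,
# so http:// and grpc:// endpoints become indistinguishable.
print(Path(url).name)  # "localhost:8001" - scheme stripped

# urlparse keeps the scheme, so the transport can still be detected.
parsed = urlparse(url)
print(parsed.scheme, parsed.netloc)  # "grpc localhost:8001"
```

This is why matching on `Path.name` can never distinguish an HTTP endpoint from a gRPC one.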
@tianleiwu What `element_type` are you specifying in `bind_input`? It appears that a valid numpy dtype is expected; however, numpy does not currently support bfloat16 to my knowledge. https://github.com/microsoft/onnxruntime/blob/0c6037b5abe571fc43a55ef7a9d2f846820fbe5d/onnxruntime/python/onnxruntime_pybind_iobinding.cc#L67
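A quick way to confirm the dtype constraint in plain numpy (this only demonstrates that stock numpy exposes no bfloat16 dtype; the onnxruntime `bind_input` call itself is not reproduced here):

```python
import numpy as np

# Stock numpy has no bfloat16 attribute, so there is no numpy dtype
# that could be passed as the element_type of an IOBinding input.
print(hasattr(np, "bfloat16"))  # False

# Asking numpy for the dtype by name fails as well.
try:
    np.dtype("bfloat16")
except TypeError as exc:
    print("unsupported:", exc)
```

(Third-party packages such as `ml_dtypes` add a bfloat16 numpy extension type, but that is not the same as native numpy support.)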
I ended up doing this - https://github.com/electron/forge/pull/1370#issuecomment-1474397250
I am running into the same issue (using diffusers==0.14.0). @sjkoo1989 which version of DeepSpeed were you able to run? On my end 0.8.1 fails with ``` File "/home/ubuntu/DeepSpeed-MII/.venv/lib/python3.8/site-packages/deepspeed/module_inject/auto_tp.py", line 35,...
There is a bit more progress after reverting to diffusers 0.11.1 and deepspeed 0.8.0. The server loads now but crashes at inference in `_fwd_kernel`: ``` qk += tl.dot(q, k, trans_b=True)...
Likewise, as @BogdanDarius reported - if I explicitly set `config.replace_with_kernel_inject = True` in `InferenceEngine.__init__`, then the model (**CompVis/stable-diffusion-v1-4**) loads but still crashes on inference. diffusers 0.14.0 and 0.13.0 crash with the...