Ankit Mathur

Results: 23 comments by Ankit Mathur

@LukasLohm just to confirm - is this screenshot from the Databricks Serving product?

@LukasLohm thanks for pointing this out - I've created a task to fix these and it should go out with the next release

Is there any update on support for startupProbe here?

@mmaybeno is there a better way to do this now?

https://github.com/mosaicml/llm-foundry/pull/169

@moyix the conversion logic does save the weights in FP16 to disk though, so should we perhaps modify that logic to not do that?

Yeah, this is a weird behavior of Triton where it fails to load weights but does not fail to launch the server - what this tells you is that either...

@SupreethRao99 by the way, I'd love to hear what the performance difference between TensorRT and FasterTransformer is. My read of that response by NVIDIA is that TensorRT is tested better,...

As an update, I don't see this issue when running `./setup.sh` with 2 GPUs on the same model