Iman Tabrizian
/bot run --only-multi-gpu-test --disable-fail-fast
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-Others-1,DGX_H200-8_GPUs-PyTorch-[Post-Merge]"
/bot run --stage-list "DGX_H100-4_GPUs-PyTorch-Others-1,DGX_H200-8_GPUs-PyTorch-[Post-Merge]" --disable-fail-fast
/bot reuse-pipeline
/bot run --stage-list "DGX_H200-8_GPUs-PyTorch-[Post-Merge]"
/bot reuse-pipeline
@GuanLuo I think the `response_iterator` was relying on a queue inside the request object, which might be the root cause of this issue.
@nsealati It looks like the number of tokens is larger than expected; could you please double-check that the client is sending exactly `3500` tokens to the server? It looks...
Closing since this feature has been completed.