Iman Tabrizian
@oandreeva-nv I think we did this for the Python backend, but the feedback was that it makes it harder to determine which part of the test is flaky since they are...
Unfortunately, Triton has its own serialization/deserialization for BYTES tensors, which is likely why you're observing the slowdown. Is it possible to use TYPE_UINT8 if you just want to transfer...
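A minimal client-side sketch of that workaround, assuming the model's config declares an input as `TYPE_UINT8` with variable dims (the model name `my_model`, input name `INPUT0`, and file `data.bin` are placeholders, not from the original thread):

```python
import numpy as np
import tritonclient.http as httpclient

# Raw payload that would otherwise be sent as a BYTES tensor.
payload = open("data.bin", "rb").read()

# View the bytes as a flat uint8 array; fixed-size types avoid
# Triton's per-element string serialization.
data = np.frombuffer(payload, dtype=np.uint8)

client = httpclient.InferenceServerClient(url="localhost:8000")

# The wire-level datatype string is "UINT8" ("TYPE_UINT8" in config.pbtxt).
inp = httpclient.InferInput("INPUT0", list(data.shape), "UINT8")
inp.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[inp])
```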
@VirginieBfd Can you share the full logs? It appears that the error is coming from your Python model. Also, are you able to reproduce the same error using the NGC...
You can add another output with the same name as the output state if you want to return it to the client. https://github.com/triton-inference-server/server/blob/main/docs/user_guide/architecture.md#implicit-state-management

> For debugging purposes, the client can...
Currently, yes, that's the only way. Let us know if you have ideas for other ways to extract states as well.
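For reference, once an extra output shadowing the output state is declared as described above, a client can request it like any other output. A sketch under those assumptions (the model name, tensor names, shapes, and sequence ID here are illustrative only):

```python
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")

inp = grpcclient.InferInput("INPUT", [1, 4], "FP32")
inp.set_data_from_numpy(np.zeros((1, 4), dtype=np.float32))

# Request the extra output that shadows the output state.
outputs = [grpcclient.InferRequestedOutput("OUTPUT_STATE")]

result = client.infer(
    model_name="my_sequence_model",
    inputs=[inp],
    outputs=outputs,
    sequence_id=42,
    sequence_start=True,
    sequence_end=False,
)

# The state Triton will feed into the next request of this sequence.
print(result.as_numpy("OUTPUT_STATE"))
```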
Thanks for proposing a fix @HennerM and filing a detailed GitHub issue @jamied157. We'll take a look at this and get back to you.
Thanks for reporting this issue. I have filed an internal issue for further investigation.
Unfortunately, we cannot install these libraries, as doing so can increase the container size significantly, and there are many other customers asking for different libraries to be included. If we accommodate...
Can you share the max batch size in your model configuration? From the model configuration docs:

> Input and output shapes are specified by a combination of max_batch_size and the dimensions specified...
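To make that relationship concrete, here is a hedged sketch: assuming a config with `max_batch_size: 8` and `dims: [16]`, the client sends a tensor with an extra leading batch dimension (the model and tensor names are placeholders):

```python
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# With max_batch_size: 8 and dims: [16] in config.pbtxt, the full
# client-visible shape is [batch, 16], where batch <= 8.
batch = np.random.rand(4, 16).astype(np.float32)

inp = httpclient.InferInput("INPUT0", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)

result = client.infer(model_name="my_model", inputs=[inp])
```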
Hi @nathanjacobiOXOS, I'm sorry about the delay. Triton also supports auto-completing the model configuration for TRT models. Can you try running the model without any model configuration and with `--log-verbose=1`? It...
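After the server starts, one way to inspect what Triton auto-completed is to query the generated config through the client API; a small sketch, assuming the server is reachable on localhost and the model is named `my_trt_model` (both are assumptions here):

```python
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Fetch the configuration Triton generated for the config-less model;
# the auto-completed max_batch_size, inputs, and outputs show up here.
config = client.get_model_config("my_trt_model")
print(config)
```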