Why does TF Serving use one CUDA compute stream?

Open ndeep27 opened this issue 1 year ago • 1 comment

I'm trying to understand why TF uses a single CUDA compute stream. Is there a metric that shows whether ops are waiting to be scheduled on that one compute stream? I want to find out whether ops are queuing up in high-QPS scenarios.

ndeep27 avatar May 06 '24 22:05 ndeep27

@ndeep27, this does not look like an issue on the TensorFlow Serving side. The question is better asked on the TensorFlow Forum, since it is not a bug or feature request, and a larger community reads questions there. Thank you!

singhniraj08 avatar May 08 '24 08:05 singhniraj08

This issue has been marked stale because it has had no activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar May 16 '24 01:05 github-actions[bot]

This issue was closed due to lack of activity after being marked stale for the past 7 days.

github-actions[bot] avatar May 24 '24 01:05 github-actions[bot]
