Memory type of sampling params

Open akhoroshev opened this issue 10 months ago • 4 comments

This document describes the tensor datatypes for the GptManager InferenceRequest.

My question is: what kind of memory is needed for these tensors? Pinned, pageable, or device? I can't find any information about this.

akhoroshev, Apr 03 '24 19:04