host_runtime_perf_knobs usage issue: [TRT] [E] IExecutionContext::enqueueV3: Error Code 3: API Usage Error
I'm trying to write a unit test for flash attention using TensorRT-LLM version 0.14.0.dev2024100100.
I noticed that host_runtime_perf_knobs is a new feature in recent versions. Here is how I use it, followed by the error it reports:
```python
with tensorrt_llm.net_guard(net):
    input_dim_range = OrderedDict([
        ('num_tokens', [batch_size * 1, batch_size * max_seq_len]),
        ('hidden_size', [hidden_size, hidden_size]),
    ])
    trt_hidden_states = Tensor(
        name='hidden_states',
        shape=[-1, hidden_size],
        dtype=tensorrt_llm.str_dtype_to_trt(dtype),
        dim_range=input_dim_range)
    runtime_perf_knobs = Tensor(
        name='host_runtime_perf_knobs',
        shape=[max_seq_len],
        dtype=tensorrt_llm.str_dtype_to_trt('int64'),
        dim_range=OrderedDict([('perf_knob_size', [max_seq_len, max_seq_len])]))

    attention_params = AttentionParams(
        sequence_length=sequence_length_tensor,
        context_lengths=context_lengths_tensor,
        host_request_types=host_request_types_tensor,
        max_context_length=context_length,
        host_context_lengths=host_context_lengths_tensor,
        host_runtime_perf_knobs=runtime_perf_knobs)
```
The error is:
[10/09/2024-06:22:33] [TRT] [E] IExecutionContext::enqueueV3: Error Code 3: API Usage Error (Parameter check failed, condition: mContext.profileObliviousBindings.at(profileObliviousIndex) != nullptr. Address is not set for input tensor host_runtime_perf_knobs. Call setInputTensorAddress or setTensorAddress before enqueue/execute.)
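The message suggests that `host_runtime_perf_knobs` is declared as a network input but no buffer address is bound to it when the engine runs. A minimal sketch of a pre-enqueue check for that condition follows. The helper `find_unbound_inputs`, the name list, and the `feed` dictionary are hypothetical and use only plain Python; in a real test the names would come from the engine's I/O tensors and the buffers from whatever the test passes to the runtime.

```python
def find_unbound_inputs(input_names, buffers):
    """Return the declared input names that have no buffer to bind.

    input_names: names of tensors the network declares as inputs.
    buffers: mapping from tensor name to the buffer supplied at runtime.
    """
    return [name for name in input_names if buffers.get(name) is None]


# Hypothetical example data mirroring the names used in this issue.
declared_inputs = ["hidden_states", "sequence_length", "host_runtime_perf_knobs"]
feed = {
    "hidden_states": object(),    # stand-in for a device tensor
    "sequence_length": object(),  # stand-in for a device tensor
}

missing = find_unbound_inputs(declared_inputs, feed)
if missing:
    print("Inputs with no address set:", missing)
```

If the check reports `host_runtime_perf_knobs`, the fix would be on the runtime side of the test (supplying a buffer for that tensor), not in the network definition.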
Any ideas why?