BEVFormer_tensorrt INT8 accuracy issue for Multiscale deformable attention

Open pianogGG opened this issue 1 year ago • 3 comments

Hello, I'm curious about the accuracy of the function ms_deformable_im2col_cuda_int8():

    channels /= 4;
    const int value_step = num_heads * spatial_size * channels;
    const int output_step = num_heads * num_query * channels;
    const int points_step = num_query * points_per_group;
    const int weight_step = num_heads * num_query * num_levels * num_point;
    const int offset_step = weight_step * 2;

    for (int batch_index = 0; batch_index < batch_size; batch_index++) {
      ms_deformable_im2col_gpu_kernel_int8<__half2>
          <<<GET_BLOCKS(num_kernels), THREADS_PER_BLOCK, 0, stream>>>(
              num_kernels, data_value, scale_value, data_spatial_shapes,
              data_reference_points, data_sampling_offsets, scale_offset,
              data_attn_weight, scale_weight, 1, spatial_size, num_heads,
              channels, num_levels, num_query, num_point, points_per_group,
              data_col, scale_out);
      data_value += value_step;  //
      data_col += output_step;
      data_reference_points += points_step;
      data_sampling_offsets += offset_step;
      data_attn_weight += weight_step;
    }

Regarding data_value += value_step; and const int output_step = num_heads * num_query * channels;: channels has already been divided by 4 at this point, so data_value does not end up pointing at the next batch's data, right?
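One way to check the arithmetic is sketched below. The snippet above does not show the element type of data_value, so this is only a sketch under an assumption: if data_value points to a 4-byte type that packs four int8 channels (modeled here by a hypothetical Packed4 struct), then a step computed with channels / 4 still skips one full batch. If data_value were a plain int8 pointer, the same step would skip only a quarter of a batch. The helper names packed_value_step, int8_values_per_batch and bytes_advanced are made up for this illustration and are not part of BEVFormer_tensorrt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a packed element holding four int8 channels.
// The real element type of data_value is NOT shown in the issue; this is
// an assumption made only to illustrate the pointer arithmetic.
struct Packed4 {
  std::int8_t x, y, z, w;
};
static_assert(sizeof(Packed4) == 4, "Packed4 must hold exactly four int8 values");

// Per-batch step in elements, mirroring the quoted code:
//   channels /= 4;
//   value_step = num_heads * spatial_size * channels;
std::size_t packed_value_step(int num_heads, int spatial_size, int channels) {
  assert(channels % 4 == 0);  // packing four channels needs a multiple of 4
  return static_cast<std::size_t>(num_heads) * spatial_size * (channels / 4);
}

// Number of int8 values (one byte each) in one batch of the value tensor.
std::size_t int8_values_per_batch(int num_heads, int spatial_size, int channels) {
  return static_cast<std::size_t>(num_heads) * spatial_size * channels;
}

// Bytes skipped when a pointer to T is advanced by `step` elements.
template <typename T>
std::size_t bytes_advanced(std::size_t step) {
  return step * sizeof(T);
}
```

With example sizes num_heads = 8, spatial_size = 100 and channels = 32, packed_value_step gives 6400 elements. Advancing a Packed4 pointer by that step skips 25600 bytes, which matches the 25600 int8 values in one batch. Advancing an int8 pointer by the same step skips only 6400 bytes. So whether the step reaches the next batch depends on the declared type of data_value.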

Mar 28 '23 15:03 pianogGG
