CenterPoint
PostProcess time is too long
Hello, I tested your postprocessGPU function. It takes 60+ ms on an NVIDIA A6000. I then timed the CUDA functions "sort_by_key" and "_raw_nms_gpu": "sort_by_key" averages 7.5 ms per taskidx and "_raw_nms_gpu" takes 1~7 ms. So why does the whole postprocess take so long?
Sorry for the late reply. It may be related to the difference in GPU architectures; all my samples were tested on an RTX 3080. Also, the first sample generally takes extra time, so you may want to measure from the second sample onward.
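For anyone who wants to exclude the first sample when measuring, here is a minimal sketch of one way to do it. The helper names `time_runs` and `mean_after_warmup` are hypothetical and not part of this repo. It uses host-side `std::chrono` timing, so for GPU work it assumes the timed callable synchronizes (for example with `cudaDeviceSynchronize()`) before it returns.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: call `fn` `total_runs` times and record the
// wall-clock duration of each call in milliseconds.
// Assumption: for GPU work, `fn` synchronizes before returning, otherwise
// only the asynchronous launch overhead is measured.
std::vector<double> time_runs(const std::function<void()>& fn, int total_runs) {
    std::vector<double> ms;
    ms.reserve(static_cast<std::size_t>(total_runs));
    for (int i = 0; i < total_runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        fn();
        const auto t1 = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return ms;
}

// Hypothetical helper: average the samples after skipping the first
// `warmup` runs (e.g. warmup = 1 to drop the first sample).
double mean_after_warmup(const std::vector<double>& ms, std::size_t warmup) {
    assert(ms.size() > warmup);  // need at least one sample after warm-up
    const double sum = std::accumulate(ms.begin() + static_cast<std::ptrdiff_t>(warmup),
                                       ms.end(), 0.0);
    return sum / static_cast<double>(ms.size() - warmup);
}
```

A call such as `mean_after_warmup(time_runs(run_postprocess, 20), 1)` would then report the average over runs 2 to 20, where `run_postprocess` stands in for whatever wrapper you write around the postprocess call.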
Haven't received responses for a long time
I have the same question! To be more specific, the question here is not about how postprocess time differs across GPU architectures. It is about the inconsistency between the total postprocess time (60+ ms) and the sum of the times of the individual components (_find_valid_score_num + sort_by_key + _raw_nms_gpu + _gather_all).
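One way to make this comparison concrete is to accumulate every per-stage measurement (across all taskidx iterations, since the numbers above are per task) and subtract the sum from the end-to-end time. The sketch below is only an illustration: `StageBreakdown` is a hypothetical helper, not something in this repo, and it assumes each value passed to `add` was measured after a device synchronization. Whatever remains in `unaccounted` would be time spent outside the timed stages, which could include allocations, copies, or host-side work.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical accumulator for per-stage timings in milliseconds.
struct StageBreakdown {
    std::map<std::string, double> stage_ms;  // stage name -> accumulated ms

    // Add one measurement; repeated calls for the same stage
    // (e.g. once per taskidx) are summed.
    void add(const std::string& stage, double ms) {
        assert(ms >= 0.0);
        stage_ms[stage] += ms;
    }

    // Total time covered by all recorded stages.
    double sum() const {
        double s = 0.0;
        for (const auto& kv : stage_ms) {
            s += kv.second;
        }
        return s;
    }

    // Part of the end-to-end time not explained by the recorded stages.
    double unaccounted(double total_ms) const {
        return total_ms - sum();
    }
};
```

If `unaccounted` is close to zero once every task iteration is included, the per-stage numbers already explain the total; if it is large, the missing time is somewhere between the stages.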