LDLS
Inference time and optimization
@brian-h-wang Hi, I have a few queries on the inference time.
Q1. In your notebook we get the timing "363 ms ± 21.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)" - what does this statement mean?
Q2. I want to run LDLS in real time - which optimizations do I need to concentrate on? On a 2080 Ti, a given frame (a sparse point cloud) takes about 160 ms.
Q3. By reducing the distance along the x axis (which leaves fewer points in the cloud), will I be able to obtain faster inference?
Q4. In Segmentation.py, line 385, you mention "TODO: Check if omitting this is faster later." - what is supposed to be done here?
Q5. Can we use create_graph() for different object classes, which might reduce the timing?
Thanks in advance
- This comes from the timeit function in the iPython notebook, see here (a short sketch of what that measurement means is included after this list).
- What kind of inference time are you aiming to achieve? For really fast performance, my guess is that implementing the algorithm in C++ and using NVIDIA cuSPARSE for the sparse matrix multiplies would be ideal - unfortunately I don't have any plans for that at the moment :( (see the sparse-multiply sketch below)
- Down-scaling the point cloud uniformly shouldn't have any effect on inference time; it's the number of points that matters, not the physical size of the cloud (see the cropping/subsampling sketch below). Is that what you meant?
- You can ignore that. Sorry for the mistake!
- I am not sure what you mean by this. Can you clarify?
Thanks!
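On Q1, that line is the standard output of IPython's `%timeit` magic: the cell was executed for 7 separate runs (one loop per run, since each call is slow), and the figure is the mean ± standard deviation of those per-run times. A minimal sketch that reproduces the same style of measurement with the plain `timeit` module - `run_ldls_on_frame` is a hypothetical stand-in for one LDLS inference call, not a function from the repo:

```python
import timeit
import numpy as np

def run_ldls_on_frame():
    """Hypothetical stand-in for one full LDLS inference call on a single frame."""
    pass  # replace with your own call, e.g. segmenting one point cloud

# Equivalent of %timeit with 7 runs of 1 loop each: each entry in `times` is the
# wall-clock time of one run; the reported figure is the mean +/- std over runs.
times = timeit.repeat(run_ldls_on_frame, repeat=7, number=1)
print(f"{np.mean(times) * 1e3:.1f} ms ± {np.std(times) * 1e3:.1f} ms per loop "
      f"(mean ± std. dev. of {len(times)} runs, 1 loop each)")
```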
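As an intermediate step short of a full C++ port, the sparse matrix multiplies can also be pushed to the GPU from Python, since CuPy's `cupyx.scipy.sparse` module wraps cuSPARSE. This is only a sketch under the assumption that the diffusion step reduces to repeated sparse-matrix × dense-matrix products; the matrix names and sizes below are illustrative, not the LDLS API:

```python
import numpy as np
import scipy.sparse as sp
import cupy as cp
import cupyx.scipy.sparse as cusparse

# Illustrative sizes: a graph over ~100k LiDAR points and a handful of labels.
n_points, n_labels = 100_000, 10

# Hypothetical diffusion matrix (e.g. a row-normalized adjacency) built on the CPU.
G_cpu = sp.random(n_points, n_points, density=1e-4, format="csr", dtype=np.float32)
labels_cpu = np.random.rand(n_points, n_labels).astype(np.float32)

# Move both operands to the GPU; the sparse multiply below runs through cuSPARSE.
G_gpu = cusparse.csr_matrix(G_cpu)
labels_gpu = cp.asarray(labels_cpu)

for _ in range(50):              # fixed number of diffusion iterations (assumed)
    labels_gpu = G_gpu @ labels_gpu

labels = cp.asnumpy(labels_gpu)  # copy the diffused labels back to the host
print(labels.shape)
```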
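To make the point-count argument concrete (and to connect it to the cropping idea in Q3): cutting the cloud by forward distance helps only because it drops points, and uniform subsampling does the same thing directly. A small NumPy sketch with made-up array names:

```python
import numpy as np

# points: (N, 3) LiDAR point cloud in the sensor frame (hypothetical array).
points = np.random.uniform(-50, 50, size=(100_000, 3))

# Cropping by forward distance (x) reduces runtime only because it reduces N,
# the number of points fed into the diffusion - not because the cloud is "smaller".
max_range = 30.0  # metres, assumed cutoff
cropped = points[points[:, 0] < max_range]

# Uniform random subsampling achieves the same kind of speedup directly.
keep = np.random.choice(len(points), size=len(points) // 2, replace=False)
subsampled = points[keep]

print(len(points), len(cropped), len(subsampled))
```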
@brian-h-wang Thanks for the response.
Q5: We are applying label diffusion on all the classes. Is it possible to set different max iterations and other hyperparameters based on the object class, e.g. cars, pedestrians, and so on?
Q2: Is there a C++ implementation available for LDLS, or any plans for one? I am looking at it as a real-time process, approximately 40-50 ms per frame.
- Yes, diffusion is treated separately for each object class, so you could use different hyperparams for different classes. The code as-is doesn't currently support this, however (a sketch of what such a change could look like is below this list).
- No current plans for a C++ port, sorry about that :( Another consideration is that at a 50 ms budget you'd also run into the image instance segmentation as a bottleneck, so you would probably need to swap Mask R-CNN for something faster - I'm not 100% sure what the current state of the art for instance segmentation is.
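For what it's worth, one way per-class settings could be wired up is with a small lookup table consulted when diffusion runs for each detection. This is only a sketch of a possible modification - the class names, parameter names, and the `run_diffusion` helper are hypothetical, not part of the current LDLS code:

```python
# Hypothetical per-class hyperparameters - LDLS does not currently expose this,
# so the class names and parameter names below are illustrative only.
CLASS_PARAMS = {
    "car":        {"num_iters": 50},
    "pedestrian": {"num_iters": 20},
    "cyclist":    {"num_iters": 30},
}
DEFAULT_PARAMS = {"num_iters": 50}

def run_diffusion(graph, detection, num_iters):
    """Placeholder for one label-diffusion pass (not the real LDLS API)."""
    raise NotImplementedError

def diffuse_per_class(graph, detections):
    """Run diffusion per detection, looking up class-specific iteration counts."""
    results = []
    for det in detections:
        # Assumes each detection carries a class name, e.g. det.class_name == "car".
        params = CLASS_PARAMS.get(det.class_name, DEFAULT_PARAMS)
        results.append(run_diffusion(graph, det, num_iters=params["num_iters"]))
    return results
```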
How did you implement it on custom data? I currently have a RealSense L515 depth camera, but the author's code uses KITTI - how can I do it?