ros3djs
Large PointCloud Jumping and Latency
I'm visualizing a Velodyne point cloud scan; however, the scan points look like they are jumping back and forth between two scans. There is also a long delay between when the velodyne_points topic stops publishing and when the scene stops updating.
It seems like there is a growing buffer that can't be visualized fast enough (verified with htop). Unfortunately, the PointCloud2 object doesn't expose a rate parameter.
I've set the max points everywhere from very low to very high and it has no effect.
The point clouds are large, but I'd at least expect playback to be smooth, if delayed.
Here is the relevant part of the code:
```js
// Create the main viewer.
var viewer = new ROS3D.Viewer({
  divID : 'viewer',
  width : 800,
  height : 600,
  antialias : true
});

// Setup a client to listen to TFs.
var tfClient = new ROSLIB.TFClient({
  ros : ros,
  angularThres : 0.1,
  transThres : 0.1,
  rate : 10.0,
  fixedFrame : '/velodyne'
});

// Setup PointCloud2 stream
pointCloud = new ROS3D.PointCloud2({
  ros : ros,
  topic : '/velodyne_points',
  tfClient : tfClient,
  //texture: image_url,
  rootObject : viewer.scene,
  size: 0.02,
  max_pts: 10000
});
```
My thoughts are that:
- The data rate is just too high and it's causing both jumping and a growing queue.
- There is an issue with the tf2_web_republisher (again because of long processing time) that is causing the jumping.
- I need to make a callback with no queue that only updates after the pointcloud has been processed.
- Or I could modify pointcloud.js to take advantage of the rate parameter.
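The "callback with no queue" idea in the list above can be sketched in plain JavaScript: while one message is being processed, newer arrivals overwrite each other so that only the newest survives. (`LatestOnly` is a hypothetical helper, not part of ros3djs.)

```javascript
// Latest-message-wins buffer: drops stale messages instead of queuing them.
// `LatestOnly` is a hypothetical name, not a ros3djs class.
class LatestOnly {
  constructor(process) {
    this.process = process;   // async work done per message
    this.pending = null;      // newest unprocessed message
    this.busy = false;
    this.dropped = 0;         // messages overwritten before processing
  }

  push(msg) {
    if (this.pending !== null) this.dropped++;
    this.pending = msg;
    if (!this.busy) this.drain();
  }

  async drain() {
    this.busy = true;
    while (this.pending !== null) {
      const msg = this.pending;
      this.pending = null;
      await this.process(msg);  // e.g. decode + upload points to the scene
    }
    this.busy = false;
  }
}
```

If three clouds arrive while the first is still rendering, the middle one is silently dropped and only the newest is drawn, which bounds latency at the cost of skipped frames.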
Relevant info: I'm running this on Ubuntu 14.04 with Apache2 on an Intel NUC i5. I've followed https://github.com/RobotWebTools/rosbridge_suite/pull/192 and https://github.com/RobotWebTools/rosbridge_suite/issues/212, which seem relevant.
I added queue size and rate to the config of the ROS3D.PointCloud2, and after some fiddling that seemed to help with the jumpiness. The issue is that the client side doesn't seem to be able to keep up with the large data size (even with a reasonable queue size on the server). The client also subscribes to another topic, and the websocket seems to be saturated: that topic becomes extremely laggy once the point cloud is published. I don't know whether this is a problem with the viewer rendering or latency in the websocket, but any input would be appreciated.
It certainly isn't just high rate point cloud data that is an issue.
I was recently playing with moving markers using the script from the second answer here: http://answers.ros.org/question/11135/plotting-a-markerarray-of-spheres-with-rviz/
While it's pretty to look at, the latency and buffering become apparent when you stop the script and realise that the markers continue to move for quite some time.
I likely won't have the chance to do so this week, but I think it'd be important (for someone, anyone) to set up a test in which we log the send time of the data, the time each item is received at rosbridge and the time it is received in ros3djs. Unless someone can think of a better way, this will at least pinpoint if it's a rosbridge issue or something inside ros3djs. I feel like it's possibly a rosbridge thing...
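A first cut at the client-side leg of that test is simple: compare each message's header stamp against the local receive time. This assumes roughly synchronized clocks between the robot and the browser machine; the helper names here are illustrative, not ros3djs API.

```javascript
// Convert a ROS time stamp { secs, nsecs } to milliseconds.
function stampToMs(stamp) {
  return stamp.secs * 1000 + stamp.nsecs / 1e6;
}

// End-to-end lag: local receive time minus the message's stamped send time.
function latencyMs(stamp, receivedAtMs) {
  return receivedAtMs - stampToMs(stamp);
}

// In a roslibjs subscriber callback this would look something like:
//   topic.subscribe(function(msg) {
//     console.log('lag:', latencyMs(msg.header.stamp, Date.now()), 'ms');
//   });
```

Logging the same stamp as it passes through rosbridge would then show which hop the time is lost in.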
Interesting that the marker visualizer is latent with only 100 markers. I'm also suspicious that it is on the rosbridge side of things. I might have some time this weekend to look at logging the timestamps. I'll let you know if I confirm our suspicions.
I found this issue for rosbridge, sounds like it confirms our problem. https://github.com/RobotWebTools/rosbridge_suite/issues/69
I've done a fair amount of debugging and I think I know where the overhead is, but not quite sure what to do about it...
My test case was discussed above in which I had 100 markers in a marker array being sent at 100Hz. I didn't actually expect rosbridge to be able to keep up, but figured this would help exacerbate the issues.
Using a combination of PyCharm's profiling tool and line_profiler, I found that the queue of messages coming in from the subscriber inside rosbridge grows. In particular, in rosbridge_library/internal/subscribers.py, the line `json = message_conversion.extract_values(msg)`
stood out as taking a huge amount of time.
I dug a little deeper and found that the parsing of my messages was simply taking a long time (most of which happens in message_conversion.py under rosbridge_library/internal). This is to be expected given that my MarkerArray contains 100 Marker objects, and a Marker object is made up of quite a few other ROS message types.
There wasn't a lot that stood out as particularly inefficient code. I made a few small improvements to the main parser components in `_from_inst(inst, rostype)`, but it wasn't enough to make a significant impact given the deeply nested message types and high data rate.
Your velodyne point clouds are much simpler message types, but contain large binary blobs. So perhaps it is the memory copying that hurts.
Do you have a test data set I could experiment with, or could you let me know how much data you have and what rate you're running at?
Nice work! I've uploaded a bagfile here for testing. I took a cursory look at the file you mentioned, but, like you, nothing stood out to me. Should we throw out messages that arrive before the parser has finished? I'm not sure that just making the parser faster will be the fix.
Is this problem solved? I have come across a similar situation. @FuzzAU @skeel3r
On a related topic, I have added message and point subsampling options to PointCloud2 and LaserScan and optimized them somewhat in #218. A new example (examples/kitti.html) shows that subsampling helps keep up with the incoming flow of points when visualizing the KITTI Velodyne data, but that the optimizations are not sufficient (at least on my desktop) to visualize all points of all messages. @FuzzAU @skeel3r @yuyuyu00, this example may give you a use case to track down where things could be optimized further.
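The point-subsampling idea can be illustrated on a raw PointCloud2-style byte buffer: keep every Nth point by copying fixed-size strides. This is a standalone sketch, not the actual ros3djs implementation; `subsamplePoints` is a name I made up.

```javascript
// Keep every `step`-th point from a packed point buffer.
// `data` is a PointCloud2-style byte array, `pointStep` the bytes per point.
function subsamplePoints(data, pointStep, step) {
  const nPoints = Math.floor(data.length / pointStep);
  const nKept = Math.ceil(nPoints / step);
  const out = new Uint8Array(nKept * pointStep);
  let w = 0;
  for (let i = 0; i < nPoints; i += step) {
    // copy the whole record for point i (x, y, z, intensity, ...)
    out.set(data.subarray(i * pointStep, (i + 1) * pointStep), w);
    w += pointStep;
  }
  return out;
}
```

With `step = 4` this cuts both the parse cost and the vertex count to a quarter, which is often enough to keep a dense Velodyne stream interactive.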