streetscape.gl

WebSocket latency is large in live mode

Open didibaba opened this issue 5 years ago • 9 comments

We are trying to provide the XVIZ server with live data, including async data from different sources. However, the latency in the streetscape view is up to 2 seconds.

We are wondering if this is a normal situation?

Thanks a lot.

didibaba avatar Jul 18 '19 08:07 didibaba

Can you describe how much data you are sending?

Some tips to give you the best possible performance when operating live:

  • Send data in the GLB format; it parses the quickest
  • Make sure you have web workers turned on, and try starting with 4
  • Adjust the time window of the buffer to ensure you don't run out of memory (check in the Chrome task manager); see the loader sketch after this list
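
For reference, here is a minimal sketch of how those tips map onto the client-side loader options (the option names worker, maxConcurrency and bufferLength are the ones used in the get-started config quoted later in this thread; the values are just starting points to tune):

import {XVIZStreamLoader} from 'streetscape.gl';

// Sketch only: start from the get-started config and tune these knobs.
export default new XVIZStreamLoader({
  logGuid: 'live',            // whatever log id your live server expects
  bufferLength: 10,           // a shorter rolling buffer bounds memory in live mode
  serverConfig: {
    serverUrl: 'ws://localhost:8081'
  },
  worker: true,               // parse XVIZ messages off the main thread
  maxConcurrency: 4           // start with 4 workers, as suggested above
});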

Things we are looking at to improve this:

  • Better dispatching of data to web workers
  • Improved parsing performance in the workers
  • Group types for polygons, circles, etc. which can be parsed and transformed faster
  • Various ideas to improve performance and reduce memory usage of the buffer (which is still used in live mode)

We will fold this into better docs about using live mode and will probably add some of this to the roadmap.

jlisee avatar Jul 22 '19 14:07 jlisee

@jlisee I use the default configuration from the get-started example. Is this correct? I have 16 GB of memory, so I think that's enough.

export default new XVIZStreamLoader({
  logGuid: 'mock',
  // bufferLength: 15,
  serverConfig: {
    defaultLogLength: 30,
    serverUrl: 'ws://localhost:8081'
  },
  worker: true,
  maxConcurrency: 4
});

On the server, I found that encodeBinaryXVIZ is very time-consuming. The latency is even smaller when I comment out the relevant encodeBinaryXVIZ code and send JSON instead, as follows.

    setInterval(() => {
        const xvizBuilder = new XVIZBuilder();
        xvizBuilder
            .pose('/vehicle_pose')
            .timestamp(time_stamp)
            .mapOrigin(map_origin_lon, map_origin_lat, 4)
            .orientation(0, 0, (-vehiclePos.heading + 90) / 180 * Math.PI)
            .position(vehiclePos.x, vehiclePos.y, 0);

        xvizBuilder
            .timeSeries('/vehicle/velocity')
            .timestamp(time_stamp)
            .value(canMsg.speed);

        xvizBuilder
            .timeSeries('/vehicle/wheel_angle')
            .timestamp(time_stamp)
            .value(canMsg.steer / 16);

        xvizBuilder
            .primitive('/vehicle/trajectory')
            .polyline(vehicle_trajectory);

        xvizBuilder
            .primitive('/Mobileye/lane')
            .polyline(lane1Data);

//.........................................

        xvizBuilder
            .primitive('/camera/image_right')
            .image(rightCameraMsg.data, 'jpg')
            .dimensions(rightCameraMsg.width, rightCameraMsg.height);

        if (hasVehiclePose) {
            const message = xvizBuilder.getMessage();
            message.update_type = 'INCREMENTAL';
            // console.log(JSON.stringify(message));
            // const glbFileBuffer = encodeBinaryXVIZ(message, encodingOptions);
            // context.ws.send(glbFileBuffer);
            context.ws.send(JSON.stringify(message));
        }
    }, send_interval);
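
To quantify that server-side cost, one simple approach (just a sketch; console.time is a standard Node API, and message, encodingOptions and context.ws are the same objects as in the snippet above) is to time the two encode paths:

// Sketch: compare the cost of binary (GLB) encoding vs. plain JSON
// for the same XVIZ message before sending it.
console.time('encodeBinaryXVIZ');
const glbFileBuffer = encodeBinaryXVIZ(message, encodingOptions);
console.timeEnd('encodeBinaryXVIZ');

console.time('JSON.stringify');
const jsonPayload = JSON.stringify(message);
console.timeEnd('JSON.stringify');

// Send whichever payload you are currently testing and compare the
// timings against the end-to-end latency observed in the client.
context.ws.send(glbFileBuffer);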

In addition, I found the lag was much larger when I provided the mapOrigin data, which triggers a connection to Mapbox. I am also wondering whether the lag would be larger without the object ID information.

Thanks a lot!!

didibaba avatar Jul 22 '19 14:07 didibaba

@didibaba Possibly related: we have noticed cases where WebSocket message processing is starved by the animation loop. We are working on a solution. I'll reach out so we can verify whether it is related to your problem.

twojtasz avatar Aug 12 '19 21:08 twojtasz

Related: https://github.com/uber/xviz/issues/502

twojtasz avatar Aug 16 '19 18:08 twojtasz

@twojtasz Any progress made so far?

didibaba avatar Dec 18 '19 15:12 didibaba

@didibaba hi Didibaba, did you figure it out?

vivian940425 avatar Jul 06 '20 00:07 vivian940425

@xstmjh @twojtasz I am not sure if this project is still under maintenance?

didibaba avatar Jul 06 '20 01:07 didibaba

It is still being worked on. Regarding this, the requestAnimationFrame issue should have been fixed, but I'm not sure we ever had a representative repro case.

GLB is slower than Protobuf on both ends, I believe (server and client); Protobuf will eventually be the default format. The other part is that processing the data on a web worker should be faster, but at times the message rate and the overhead of sending data from the main thread to the worker can lag. This points to how the data is managed and what is optimal for a "live" case versus our default logic. Specifically, in a live case, if your data is complete, there is no reason to process old messages once a new one has arrived, so you may want a message queue of length 1 where only the latest message is processed.
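
As an illustration of that last point (not streetscape.gl's actual internals; the processXVIZMessage handler and the wiring are hypothetical), a "queue of 1" on the client could look roughly like this:

// Sketch: keep only the most recent WebSocket message and process it
// on the next animation frame, dropping anything that has gone stale.
let latestMessage = null;
let scheduled = false;

function onWebSocketMessage(event) {
  latestMessage = event.data;        // overwrite; older unprocessed data is dropped
  if (!scheduled) {
    scheduled = true;
    requestAnimationFrame(() => {
      scheduled = false;
      const message = latestMessage;
      latestMessage = null;
      processXVIZMessage(message);   // hypothetical parse/dispatch step
    });
  }
}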

So there are lots of little issues in the end-to-end flow that can contribute to this, and without a bit more specifics it's unclear whether any of the work so far has addressed your issues.

If you have time we can try to move this forward.

  1. re-test and see if the issue is the same
  2. try using protobuf encoding
  3. we can look at message rates and client processing to see if the latency is in the backend (BE) or the frontend (FE)

I would love to get a benchmarking tool for all of this. We have done one-offs in the past, and I think we have some specific tools, but not one that can provide a holistic BE & FE breakdown; a rough manual approach is sketched below.
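
In the meantime, one simple way to get a rough BE vs. FE split (a sketch, not an existing XVIZ feature; the sent_at field is made up for illustration, and it assumes the server and client clocks are in sync, e.g. on the same machine) is to stamp each message with the server's send time:

// Server side (Node): tag the outgoing XVIZ message with a send timestamp.
const message = xvizBuilder.getMessage();
message.sent_at = Date.now();                     // hypothetical extra field
context.ws.send(JSON.stringify(message));

// Client side (browser): measure transport and parse latency per message.
ws.onmessage = event => {
  const received = Date.now();
  const message = JSON.parse(event.data);
  const transportMs = received - message.sent_at; // server encode + network time
  const parseMs = Date.now() - received;          // client-side JSON parse time
  console.log(`transport ${transportMs} ms, parse ${parseMs} ms`);
};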

twojtasz avatar Jul 06 '20 17:07 twojtasz

@twojtasz thanks for the write-up. I somehow still think the requestAnimationFrame issue persists. With our C++ live server, the pattern we see is that the server cannot send too much data too fast. Originally we had the server sending at over 60 FPS, and the client was completely overwhelmed: rendering dropped below 5 FPS because it was busy parsing and rendering too much stuff. We then tweaked the server's sending rate to about 10 FPS, and now the client is able to run at ~8-10 FPS.

The client also shows a delay of about 2~3 seconds, depending on how many streams we enable. Toggling a stream that has a lot of data also spikes the delay a bit, although after a while it drops back to 2~3 seconds. We are already running the setup on a fairly modern Linux laptop, in the Chrome browser, with 8 web workers. The data sent from the live server is BINARY_PBE. One more question: would switching to JSON help, given that the live server is hardwired to the laptop, so there is no bandwidth limitation?
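
For the server-side throttling described above, one common pattern (a sketch, not part of XVIZ; the same idea applies regardless of server language, the JavaScript mirrors the Node example earlier in the thread, and buildMessage stands in for whatever XVIZBuilder code assembles your frame) is to let the data sources update shared state as fast as they like, but serialize and send only the latest state at a fixed rate:

// Sketch: decouple data ingestion from the WebSocket send rate.
// Sources overwrite `latestState` whenever they have new data; the sender
// loop serializes only the most recent state at roughly 10 Hz.
const SEND_INTERVAL_MS = 100;        // ~10 FPS, as tuned above
let latestState = null;

function onSensorUpdate(state) {
  latestState = state;               // intermediate states are coalesced away
}

setInterval(() => {
  if (!latestState) {
    return;                          // nothing new since the last send
  }
  const message = buildMessage(latestState);  // hypothetical XVIZBuilder wrapper
  context.ws.send(JSON.stringify(message));
  latestState = null;
}, SEND_INTERVAL_MS);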

vivian940425 avatar Jul 09 '20 05:07 vivian940425