safeplaces-dct-app icon indicating copy to clipboard operation
safeplaces-dct-app copied to clipboard

2 x problems observed in intersect.js algorithm

Open diarmidmackenzie opened this issue 4 years ago • 22 comments

https://github.com/tripleblindmarket/covid-safe-paths/blob/develop/app/helpers/Intersect.js

2 problems spotted when testing a Python version of the same algorithm.

// Close enough to do a detailed calculation. Using the the Spherical Law of Cosines method. // https://www.movable-type.co.uk/scripts/latlong.html // https://en.wikipedia.org/wiki/Spherical_law_of_cosines //

let d = Math.acos( Math.sin(p1) * Math.sin(p2) + Math.cos(p1) * Math.cos(p2) * Math.cos(deltaLambda), ) * R;

Problem 1: Based on what Wikipedia says, I think the formula should be:

let d = Math.acos( Math.cos(p1) * Math.cos(p2) + Math.sin(p1) * Math.sin(p2) * Math.cos(deltaLambda), ) * R;

(sins & cos's swapped over).

image

Problem 2: In my Python code I have seen a very weird occurrence where the following values led to the value supplied to the acos being > 1 (1.0000000000000002)

Specifically I saw this when comparing distances between identical points, with specific values:

p1 = 0.0017454475853176147 p2 = 0.0017454475853176147 deltaLambda = 0.0

Presumably some rounding error in the floating point arithmetic, ends up with an impossible value > 1 for the cosine, which then causes an exception when you feed it into acos.

I don't know if the same thing can happen with Javascript, but it seems prudent to put some defensive code in: check the value to be fed into acos, and it it is > 1, set it to 1.0. This fixed the crash in my Python code.

diarmidmackenzie avatar May 03 '20 21:05 diarmidmackenzie

OKn further reading, point 1 looks wrong.

This link is clear that the formula should be sin * sin + cos * cos * cos. https://www.movable-type.co.uk/scripts/latlong.html

I haven't really understood this (given the law of cosines has the form cos * cos + sin * sin * cos but I think it may relate to this point in the article:

"(Note that the geodetic form of the law of cosines is rearranged from the canonical one so that the latitude can be used directly, rather than the colatitude)"

So, I'm going to assume #1 is in fact correct.

#2 continues to be weird, and may be Python-only. I'm going to try to run a test in JS with these numbers & see what I find.

diarmidmackenzie avatar May 03 '20 22:05 diarmidmackenzie

Point #2 also reproduces in Javascript with the same values for p1, p2.

There is no exception, but d is computed as NaN, when it should presumably be zero - so we will miss a match that we should get.

The folllowing code seems to fix this issue:

let cos_d = Math.sin(p1) * Math.sin(p2) + Math.cos(p1) * Math.cos(p2) * Math.cos(deltaLambda); console.log(cos_d);

// Rounding errors in floating point airthmetic can result in rare // cases where cos_d > 1, which leads to d computed as NaN, when it should be zero. if (cos_d > 1) { cos_d = 1.0 } let d = Math.acos(cos_d) * R;

diarmidmackenzie avatar May 03 '20 22:05 diarmidmackenzie

ALso hit with p1, p2 as follows:

0.0349072063771003 0.0349072063771003

diarmidmackenzie avatar May 03 '20 22:05 diarmidmackenzie

FYI, my Javascript repro was using node.js 12.6.3.

I don't know if that's representative of what would be running on an Android or iPhone - I guess there is a chance that a different Javascript interpreter might give different results.

diarmidmackenzie avatar May 03 '20 22:05 diarmidmackenzie

I started playing around with performance a bit here. Since we are only measuring short distances, the curvature of the earth is irrelevant, so we could use pythagoras (with suitable conversions into metres) rather than trig.

I thought that might perform better, but it made no difference.

What did seem make a big difference was removing the "optimizations" that filter based on latitude and longitude that are a long way off.

Here's my test code - tests in 1 square degree, at a random point of long/lat. Run multiple tests to check no major impact from different gloabl locations (there don't seem to be).

var lat1 = {} var lat2 = {} var long1 = {} var long2 = {}

base_lat = (Math.random() * 90) base_long = (Math.random() * 180) console.log(base_lat, base_long);

for (i = 0; i < 10000 ; i++) { lat1[i] = base_lat + Math.random(); lat2[i] = base_lat + Math.random(); long1[i] = base_long + Math.random(); long2[i] = base_long + Math.random(); }

start_time = Date.now() var i for (j = 0; j < 10000 ; j++) { for (i = 0; i < 10000 ; i++) { areLocationsNearby(lat1[i], long1[i], lat2[i], long2[i]); } }
end_time = Date.now() console.log(end_time - start_time); console.log("done");

Running over the current code takes ~1200 msecs to complete (100M comparisons)

Commenting out all these lines, and the same 100M comparisons complete in ~400 msecs.

//if (Math.abs(lat2 - lat1) > notNearbyInLatitude) return false; //if (Math.abs(lat1) < 23) { // if (Math.abs(deltaLon) > notNearbyInLongitude_23Lat) return false; //} else if (Math.abs(lat1) < 45) { // if (Math.abs(deltaLon) > notNearbyInLongitude_45Lat) return false; //} else if (Math.abs(lat1) < 67) { // if (Math.abs(deltaLon) > notNearbyInLongitude_67Lat) return false; //}

So these "optimizations" are in fact anything but, and are slowing the code down by a factor of 3.

I suggest we remove them.

I also perf tested the cos_d < 1 check above - it has no negligible perf impact.

None of the above was done with a proper profiler, just raw elapsed time - but the results seem pretty clear nevertheless.

Also, note that testing was done on Windows - performance on other platforms may vary, so if we can repeat in an Android or iOS environment that might be worth doing before we commit to changes.

diarmidmackenzie avatar May 04 '20 08:05 diarmidmackenzie

Interesting observation, I agree that taking curvature into account might be overkill for a contact tracing application. @kenpugsley thoughts?

tremblerz avatar May 04 '20 09:05 tremblerz

We don't need to take curvature into account, but from all the testing I have done, the "law of cosines" calculation is the most performant.

The thing that is dragging down performance are the "simple" calculations we do to avoid the "complicated" law of cosines calculations.

Just remove all the up-front screening calculations, and go straight for the law of cosines, and that delivers the optimal performance (by a factor of 3.

I tried a flat pythagoras type solution, and can't get anything that performs as well as the law of cosines.

Key caveat this is all on Windows (node.js) - I don't know how performance on a real smartphone will compare.

diarmidmackenzie avatar May 04 '20 09:05 diarmidmackenzie

I tried a flat Pythagoras type solution, and can't get anything that performs as well as the law of cosines.

Does that mean errors are big enough to bring the wrong solution to proximity detection task?

tremblerz avatar May 04 '20 09:05 tremblerz

Sorry, I wasn't very clear. All the solutions are accurate enough.

The "performance" I am talking about is how many CPU cycles are needed to complete the calcuation.

The algorithm that uses the fewest CPU cycles appears to be "law of cosines" with no pre-screening of any kind.

diarmidmackenzie avatar May 04 '20 09:05 diarmidmackenzie

oh I see, my bad. And this includes euclidean distance also?

tremblerz avatar May 04 '20 09:05 tremblerz

Yes, I tried these lines of code. Without even getting the adjustment of degrees to metres for latitude correct, this is about half the speed of the law of cosines computation (800 msecs for 100M vs. 400 msecs).

lonm = Math.floor((lon2 - lon1) * 102470) latm = Math.floor((lat2 - lat1) * 111320) let dsq = lonm * lonm + latm * latm

// closer than the "nearby" distance? if (dsq < nearbyDistancesq) return true;

If you don't use the Math.floor, it slows down by another factor of 5 (> 4000msecs for 100M).

diarmidmackenzie avatar May 04 '20 09:05 diarmidmackenzie

The one thing I have not tried yet is teh "Harvesine" formula, here: https://www.movable-type.co.uk/scripts/latlong.html

That article says that despite being more complex, it usually computes faster than "law of cosines". I will try that too...

diarmidmackenzie avatar May 04 '20 09:05 diarmidmackenzie

Broadly I see the same performance between Haversine & law of consines.

But it also turns out that rewriting the pythagoras code like this gives much better performance (the additions are the "const" values) const lonm = (lon2 - lon1) * 102470 const latm = (lat2 - lat1) * 111320 const dsq = lonmlonm + latmlatm if (dsq < nearbyDistancesq) return true;

Written like this, I'd say it's a touch faster than the other algorithms (maybe 10% faster but that's about the noise levels I get from one run to the next.

diarmidmackenzie avatar May 04 '20 10:05 diarmidmackenzie

awesome, thanks for testing out all variants! My preference would be to go with simply pythagoras you have written above. Waiting for others to chime in. Another simplest and faster way could be to just quantize it appropriately, like 44.4219983 becomes 44.421998 and we just compare the two values. However, this one is doomed to miss the boundary case always.

tremblerz avatar May 04 '20 10:05 tremblerz

Pythagoras requires that you know how many metres there are in a degree of longitude your current latitude. I can't add that in without slowing down the performance considerably.

Even with this, it performs no bettter than cosines of haversine.

Here's the data I have (msecs computation for 100M entries, with my current best attempt at Pythagoras).

J:\other\node>node scream.js 55.528043612051434 110.40209151984699 Pythagoras 481 Haversine 354 Cosines 357 Cosines w/ pre-screening 607 Pre-screening only 612

The cosines code includes the fix for problem #2 above.

Based on this I think we should stick with cosines, with a few tweaks (e.g. "const" not "let" to boost performance).

But we should find a way to run my tests on the target devices first. I will post the test code somewhere.

diarmidmackenzie avatar May 04 '20 10:05 diarmidmackenzie

My test code is checked in here (wasn't sure of a good place to put it in the main Safe Paths repository, but please move it there if you want).

https://github.com/diarmidmackenzie/safepathstools/tree/master/perf-test

This runs in node.js on Windows.

I'm not at all familiar with JS dev environments, so no idea how best to get this running on an Android or iOS device, but we should do so in case the hardware behaves very differently.

As per above, I think we should stick with the cosines algorithm. No evidence that Haversine performs any better, and cosines has had more exposure.

Haversine looks like it ought not to be exposed to rounding errors such as my original point #2 above since it uses arctan, not arccosine.

Nevertheless, moving from Cosines to Haversine seems like change for no particular reason.

Let's see what perf looks like on Android & iOS before taking any final decisions, though.

diarmidmackenzie avatar May 04 '20 11:05 diarmidmackenzie

Also: if we are able to do proper performance profiling, rather than just measuring elapsed execution time, that might be a good idea too!

diarmidmackenzie avatar May 04 '20 11:05 diarmidmackenzie

I had a worry we were re-inventing the wheel here, so I decided to see what perf we got from this: https://www.npmjs.com/package/geolocation-utils

It is 20x slower than our current algorithm, so > 40x worse than what we can now achieve. It might be useful for checking results, though.

diarmidmackenzie avatar May 04 '20 11:05 diarmidmackenzie

Yeah, I have also used and tested all those formulas and "packages" and the raw trig formulas always perform better than any available package...for obvious reasons.

E3V3A avatar May 04 '20 19:05 E3V3A

A lot to digest here ... great research @diarmidmackenzie. I did some very very basic testing the last time I reworked this, but only for accuracy and only very basic work on performance. I know from what I found at the time, the haversine formula was no faster than the law of cosines (as you noted). Simple pythagorus has some significant errors than can come up at high/low latitudes and so I think we wan't rely on it for accurate distance calcs. I did not see (or look for) issue 2 above ... good catch.

Sounds like we need someone to take the time to replicate your results and profile on actual device hardware, and for repeatable locations (equator, mid-lat, high-lat). We also have to take into account the fact that the most common case is not nearby as we do this (as in fact your test seems to do that if I'm reading it properly).

If it turns out the shortcuts aren't actually needed (which seems odd to me, but profiling should tell for sure), then we should optimize appropriately.

kenpugsley avatar May 04 '20 20:05 kenpugsley

Hi Ken, few quick comments:

  • Each time the test runs, it picks a random long/lat, so by running multiple times you get exposute to different regions of the globe. Would be easy to make this a user-supplied parameter, but it actually makes little difference as far as I can tell.
  • The test randomly distributed 10,000 pairs of points across a 1 degree by 1 degree (approx 100km x 100km) area. (and then does the calculations for each pair 10,000 times). So most points are not close together in the detection sense, but they are geoographically concentrated, as we would expect data in a local HA to be.
  • The problem with the "shortcuts" is that they take twice as long to execute as the law of cosines calculation itself - see "pre-screening only" data point. So they are not only unecessary - they are actively harmful.
  • FYI, the reason "Cosines w/ prescreening" is faster than "pre-screeing only" is due to noise. Logically there is more work in the former, but not much more work (only in the small % of cases thatactually have close-together points). Overall the noise from one test to the next varies by more than this amount.

Agree we need data from target hardware. I need to hand off to someone else to do that.

diarmidmackenzie avatar May 05 '20 07:05 diarmidmackenzie

@diarmidmackenzie can we close this issue out as we are tracking internally?

Patrick-Erichsen avatar May 22 '20 18:05 Patrick-Erichsen