perl5
perl5 copied to clipboard
Math-Complex: improve numerical accuracy
-
Compute the great circle distance with a formula that has better numerical properties. The formula used now is accurate for all distances.
-
Add a few tests to verify.
-
This fixes CPAN RT #78938
requires https://github.com/Perl/perl5/pull/20215
Compute the great circle distance with a formula that has better numerical properties. The formula used now is accurate for all distances.
Add a few tests to verify.
This fixes CPAN RT #78938
Can we have someone handle the content of this pull request?
(I or someone else can handle the rebasing, etc., separately once the content is approved.)
I have a (possibly dumb) question about the great_circle_distance documentation in Trig.pm. It states (in part):
.... NOTE: this formula thinks in mathematics, not
geographically: the *phi* zero is at the North Pole, not at the Equator
on the west coast of Africa (Bay of Guinea). You need to subtract your
geographical coordinates from *pi/2* (also known as 90 degrees).
$distance = great_circle_distance($lon0, pi/2 - $lat0,
$lon1, pi/2 - $lat1, $rho);
On looking at the current great_circle_distance() implementation, I see that the second and third args are already being subtracted from pi/2, and that all strikes me as being odd.
Was that documentation correct ? This new proposed implementation doesn't subtract any of the arguments from pi/2, but still returns substantively the same result. So I guess if that documentation is correct wrt the current implementation, then it's also correct wrt the new implementation.
(I'm known to be a one-dimensional thinker ... and dealing with 3 is really pushing the boundaries.)
Cheers, Rob
On looking at the current great_circle_distance() implementation, I see that the second and third args are already being subtracted from pi/2, and that all strikes me as being odd.
I assume that the author found a formula for the great circle distance given geocentric coordinates. (In geodesy, you don't use spherical coordinates, but geocentric or geodetic coordinates.) And instead of adjusting the formula so it works with spherical coordinates directly, the author chose to mention in the documentation how to do the necessary conversion from spherical to geocentric coordinates, which is required by the formula in the function. Now, doing the conversion both in the input and inside the function is clearly not correct. The conversion should only be done once.
Was that documentation correct ?
My understanding is that all the functions for spherical trigonometry in Math::Trig assume spherical coordinates. So I chose to use an implementation which assumes spherical coordinates and doesn't need to do any conversion to geocentric coordinates. This is less prone to errors, especially when the latitude is close to pi/2, since subtracting to almost identical numbers leads to loss of accuracy.
"Converting" the formula from assuming geocentric coordinates to assuming spherical coordinates was simply a matter of applying the following relations
sin(pi/2 - x) = cos(x) cos(pi/2 - x) = sin(x) atan2(-x, y) = -atan2(x, y) atan2(x, -y) = pi - atan2(x, y) atan2(-x, -y) = atan2(x, y) - pi
And instead of adjusting the formula so it works with spherical coordinates directly, the author chose to mention in the documentation how to do the necessary conversion from spherical to geocentric coordinates, which is required by the formula in the function.
Ok ... I'm still reading it as advising me to call great_circle_distance() as:
$distance = great_circle_distance($lon0, pi/2 - $lat0, $lon1, pi/2 - $lat1, $rho);
but if it's just telling me that this was just what the great_circle_distance() implementation was doing internally, then I guess that's ok. However, while the new implementation is (in effect) doing the same thing, it's not doing it exactly in that way - or so it seems to me. I'm thinking it therefore makes better sense (and removes ambiguity) if that piece of info is removed from the documentation. Of course, I can be a bit of a blockhead, and I'm not very strong on frigginjometry - so I'll defer to @pjacklam and/or others as to what (if anything) should be done with the documentation.
The rest of it looks fine to me. On perl-5.36.0 (nvtype of double) I tested against the mpfr library, with 2000-bit precision Math::MPFR objects and found good correlation with this new proposed implementation.
Whereas the current implementation in Math::Trig started returning zero for (2, 2, 2, 2+1e-7), the new implementation kept on returning non-zero values that were verified by the Math::MPFR checks all the way to (2, 2, 2, 2+1e-15). At (2, 2, 2, 2+1e-16), Math::MPFR also confirmed that the correct result at double precision was 0.
The script I used to check on specific values is attached. Nice work, @pjacklam !!
Cheers, Rob
I believe this illustrates that the wording in the documentation should be improved. I'll try to explain.
- If your input is in spherical coordinates (using radians), then you call great_circle_distance() with no conversion.
- If your input is latitude and longitude (using radians), then you need to convert the input before calling great_circle_distance().
The reason why you need to convert is that the "latitude" $phi in spherical coordinates is zero at the north pole and increases to pi at the south pole. However, the latitude $lat in geodesy is zero at the equator and pi/2 (equivalent to 90 degrees) at the north pole and -pi/2 radians (equivalent to -90 degrees) at the south pole. The conversion from $lat to $phi is
$phi = pi/2 - $lat
The formula in great_circle_distance() is optimized for spherical coordinates. If given geodetic coordinates – i.e., latitude and longitude – it would be better to use a different formula to avoid the conversion of the latitude. In addition, latitude and longitude are normally given in degrees, not radians, so you probably need to do some conversion anyhow.
It might be better to have a function called great_circle_distance_geo() which assumes geodetic coordinates given in degrees, which I assume is the by far most common case when anyone wants to compute a great circle distance. For each of the great circle functions in Math::Trig, I wrote an equivalent function assuming geodetic coordinates using degrees. Here is the one for the great circle distance:
sub great_circle_distance_geo {
my ( $lat0, $lon0, $lat1, $lon1, $rho ) = @_;
$rho = 1 unless defined $rho; # default to the unit sphere
my $theta0 = deg2rad($lon0);
my $phi0 = deg2rad($lat0);
my $theta1 = deg2rad($lon1);
my $phi1 = deg2rad($lat1);
my $dphi = $phi1 - $phi0;
my $dtheta = $theta1 - $theta0;
my $c1 = cos($phi1)*sin($dtheta);
my $c2 = cos($phi1)*cos($dtheta);
my $c3 = cos($phi0)*sin($phi1) - sin($phi0)*$c2;
my $c4 = sin($phi0)*sin($phi1) + cos($phi0)*$c2;
return $rho * atan2(sqrt($c1*$c1 + $c3*$c3), $c4);
}
Perhaps it would be less confusing to add this function? Then the documentation for great_circle_distance() could just say something like: If you have geodetic coordinates using degrees, see great_circle_distance_geo().
Perhaps it would be less confusing to add this function? Then the documentation for great_circle_distance() could just say something like: If you have geodetic coordinates using degrees, see great_circle_distance_geo().
Yes, I think that would be a good thing.
However, I'm thinking that great_circle_distance_geo() should insist on receiving its arguments in radians - mainly for consistency with the plethora of other trig functions that we encounter in a computing environment. It would probably confuse me to have both functions take different types of arguments. (But maybe that's just me.) Arguments given in degrees might also include minutes and seconds - which requires some converting, anyway. It seems simpler to me that we just let people convert their "degree/min/sec" arguments to radians, like they're probably used to doing ... unless there's some convention that functions which receive geodetic arguments should expect their arguments to be in degrees ??
Anyway, I do like the idea of separate functions for the different coordinate systems, along with documentation that explains which variant to use.
Regarding the usual geodetic coords provided for points on the surface of the Earth, can we assume that users will be aware that while North and East bearings are treated as positive values, bearings for South and West need to be negated ? If they're initially blind to that, I'm sure they'll soon work it out. (Maybe there's already documentation that mentions this. I haven't checked.)
Cheers, Rob
Perhaps it would be less confusing to add this function? Then the documentation for great_circle_distance() could just say something like: If you have geodetic coordinates using degrees, see great_circle_distance_geo().
Yes, I think that would be a good thing.
I'm getting the feeling that the scope of this pull request is expanding considerably. I feel that adding new functionality should be excluded from this p.r. and proposed in additional ones. This is a situation where, going forward, the code has to be maintained by Perl 5 Porters both for the next production release and, presumably, for release to CPAN to work with older versions of perl. Adding new functionality in this p.r. would complicate matters.
Well, it's not really new functionality - it's just using existing functionality to deal with different arguments ;-) If it's too much for the dual-life process to handle, then I think merge it as is. Though it might be polite to wait and see whether @pjacklam wants to make any modification at all to the great_circle_distance documentation (if such a modification would be allowable).
@jkeenan, I understand and respect your concerns, but this just demonstrates the severe hobbling effect that dual-lifing modules into the 'dist' directory generally has. AIUI, if this module was located in the 'cpan' directory, then development of it would be free to proceed at its own pace under the care of its CPAN maintainer(s), and the perl source could catch up on it whenever p5p liked.
Cheers, Rob
On Sat, 5 Nov 2022 at 03:50, sisyphus @.***> wrote:
Well, it's not really new functionality - it's just using existing functionality to deal with different arguments ;-) If it's too much for the dual-life process to handle, then I think merge it as is. Though it might be polite to wait and see whether @pjacklam https://github.com/pjacklam wants to make any modification at all to the great_circle_distance documentation (if such a modification would be allowable).
Yes it would be allowable. I would even say encouraged. I have used the code you are talking about once or twice and I remember struggling with it a bit.
@jkeenan https://github.com/jkeenan, I understand and respect your concerns, but this just demonstrates the severe hobbling effect that dual-lifing modules into the 'dist' directory generally has. AIUI, if this module was located in the 'cpan' directory, then development of it would be free to proceed at its own pace under the care of its CPAN maintainer(s), and the perl source could catch up on it whenever p5p liked.
I think @jkeenan just wants to keep the "fix" part of this separate from the "new functionality" bit in terms of PR management.
FWIW, regarding dist vs cpan, personally i dont think it should matter. If you are the main dev for it, but need the wider community to backstop you from time to time, then leaving it in dist is fine. I think you just need to exercise a bit more development craft discipline than you might if it were your own repo, eg, using distinct PR's for fixes as for feature enhancements, but being in the core repo should be no more impediment to you than the necessary extra discipline required when working in a large active project with multiple devs, eg no pushing broken code before you go on holiday. But beyond that it should be no more burdensome than having the code in your own repo.
On the other hand, if you want to move this out of the repo, take ownership and change this modules status from dist to cpan I am sure nobody would object.
Personally I see you as an important contributor and I would prefer to remove whatever unnecessary impediments there are for you doing your thing in the main repo. For instance I personally would have no problem with you getting a commit bit if that helped. Would you like me to nominate you for one if you dont have one already?
cheers, Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
I'm getting the feeling that the scope of this pull request is expanding considerably. I feel that adding new functionality should be excluded from this p.r. and proposed in additional ones.
I fully agree. I have tried to tighten up the language in the documentation for great_circle_distance() and added it to this PR.
I have also improved the documentation regarding the various coordinate systems (Cartesian, spherical, and cylindrical), but I am not sure whether that belongs in this PR, so I haven't added it (yet).
One of the CI testing environments is now failing t/porting/cmp_version.t. There having been no complaints when the Trig.pm code changed, I don't know why that test should fail when the Trig.pm documentation was altered. (Maybe something to do with #20311, which was not merged into blead until a few days ago.) And I don't know why the other testing environments haven't also failed that test.
Going by what I was asked to do in #20201, I gather that Math::Complex and Math::Trig, being part of the same CPAN distro, should be at the same version number - which would mean bumping Math::Trig's version to 1.59_03.
Cheers, Rob
@sisyphus, I wasn’t sure whether to use 1.59_03 or 1.59_04, but I ended up bumping both Math::Complex and Math::Trig to 1.59_04 to signify that this is a new version of the distribution. I’m not sure if that was the correct thing to do. I’m still trying to get my head around pull requests, the CI testing environment, etc. But at least the version conflict is gone.
I wasn’t sure whether to use 1.59_03 or 1.59_04
@pjacklam,I wondered the same thing ... but, after further reflection, I think you made the right choice in electing for 1.59_04. Otherwise, if the Math-Complex distro on CPAN were to be updated to mirror this new Math-Complex distro, then the CPAN distro would still be at 1.59_03, and that wouldn't seem right.
Cheers, Rob
Merge conflicts manually resolved; merged to blead in commit 2e35afaa9f. Closing ticket; thanks.