On locales and locale-specific behavior of formatting functions

Open ForNeVeR opened this issue 11 months ago • 5 comments

My general idea on locales in Cesium was the following:

  1. C's "locale" stuff should be tied to .NET's CultureInfo.
  2. We are okay to use the default .NET behavior (unwanted in most other programs, but not here) across the standard library, and rely on the global culture info being set up properly.

Note that I would really like to just drop this all into the trash bin and use the invariant culture universally. However, keeping in mind the C standard and programs written for alternative C implementations, I think that the best decision for the Cesium users is to make this user-controllable. That is, give the users the C API for locales.

@BadRyuner comments on the current behavior in #559: there are problems in our own tests that now implicitly rely on this locale stuff, and the locale treatment varies between implementations. In particular, Microsoft's C implementation doesn't seem to agree with… uhm, Microsoft's .NET implementation on which decimal separator to use?

It's more or less easy to fix the tests by setting DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true when running them, but I no longer believe that Cesium's current behavior is acceptable.

First of all, let me quote paragraph 2 from section 7.1.1 Definition of terms of the C standard we use:

The decimal-point character is the character used by functions that convert floating-point numbers to or from character sequences to denote the beginning of the fractional part of such character sequences. It is represented in the text and examples by a period, but may be changed by the setlocale function.

So, we are right in that printf should consult the locale.

But then, let's also go to 7.11.1.1 The setlocale function and read paragraph 4:

At program startup, the equivalent of

setlocale(LC_ALL, "C");

is executed.

This means that every Cesium program that adheres to the C standard should have the equivalent of the "C" locale set by default. I believe that one is most commonly associated with the culture-invariant environment in other C implementations; I haven't looked it up in the standard yet, but that's the common practice, so let's stick with that for now.

So, should we just do something like CultureInfo.CurrentCulture = InvariantCulture in the main function by default? I think we should.

ForNeVeR, Mar 24 '24 16:03

While we are at it, let's also review the current number formatting code and whatnot (and either fix in scope of this issue or extract into a separate one), and check whether it follows the standard.

In particular, the standard (see section 7.23.6.1 The fprintf function, paragraph 8) requires us to use the format [-]ddd.ddd (where the . character is the decimal-point character as outlined above). While .NET's formatting is able to do richer stuff than that (like a custom - sign or custom thousands separators, gosh, even custom digits are supported in a convoluted way), we should limit it to only doing what's permitted by the standard.

ForNeVeR, Mar 24 '24 16:03

What's an easy way to test Cesium on Linux if I don't have it? QEMU, VirtualBox or WSL? I have an idea how to solve this problem, but there's no point in embarking on it without being able to quickly test this on Linux.

BadRyuner, Mar 25 '24 18:03

The easiest way (supposing you are on Windows) is WSL, but why do you need that exactly? I'd say that it should be enough to just implement the proposal, and let the tests do the rest.

ForNeVeR, Mar 25 '24 21:03

Is the Cesium.Runtime build that targets .NET Standard 2.0 only used for .NET Framework 4 builds?

BadRyuner, Mar 26 '24 07:03

I think that in theory it's also used for Mono, but why are you asking?

ForNeVeR, Mar 26 '24 12:03