Cesium
On locales and locale-specific behavior of formatting functions
My general idea on locales in Cesium was the following:
- C's "locale" stuff should be tied to .NET's `CultureInfo`.
- We are okay to use the default .NET behavior (unwanted in most other programs, but not here) across the standard library, and rely on the global culture info being set up properly.
Note that I would really like to just drop this all into the trash bin and use the invariant culture universally. However, keeping in mind the C standard and programs written for alternative C implementations, I think that the best decision for the Cesium users is to make this user-controllable. That is, give the users the C API for locales.
@BadRyuner comments on the current behavior in #559: there are problems in our own tests that now implicitly rely on this locale stuff, and the locale treatment varies between implementations. In particular, Microsoft's C implementation doesn't seem to agree with… uhm, Microsoft's own .NET implementation on which decimal separator to use?
It's more or less easy to fix the tests by setting `DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true` when running them, but I no longer believe that Cesium's current behavior is acceptable.
First of all, let me quote paragraph 2 from section 7.1.1 Definition of terms of the C standard we use:
> The decimal-point character is the character used by functions that convert floating-point numbers to or from character sequences to denote the beginning of the fractional part of such character sequences. It is represented in the text and examples by a period, but may be changed by the `setlocale` function.
So, we are right that `printf` should consult the locale.
But then, let's also go to 7.11.1.1 The `setlocale` function and read paragraph 4:
> At program startup, the equivalent of `setlocale(LC_ALL, "C");` is executed.
This means that every Cesium program that adheres to the C standard should have an equivalent of the default "C" locale set by default. I believe that locale is most commonly associated with the culture-invariant environment in other C implementations; I haven't looked it up in the standard yet, but that's the common practice, so let's stick with it for now.
So, should we just do something like `CultureInfo.CurrentCulture = CultureInfo.InvariantCulture` in the `main` function by default? I think we should.
While we are at it, let's also review the current number formatting code and whatnot (and either fix in scope of this issue or extract into a separate one), and check whether it follows the standard.
In particular, the standard (see section 7.23.6.1 The `fprintf` function, paragraph 8) requires us to use the format `[-]ddd.ddd` (where the `.` character is the decimal-point character as outlined above). While .NET's formatting can do much richer stuff than that (like a custom `-` sign or custom thousand separators, gosh, even custom digits are supported in a convoluted way), we should limit it to only doing what's permitted by the standard.
What's an easy way to test Cesium on Linux if I don't have it? QEMU, VirtualBox, or WSL? I have an idea how to solve this problem, but there's no point in embarking on it without being able to quickly test it on Linux.
The easiest way (supposing you are on Windows) is WSL, but why do you need that, exactly? I'd say it should be enough to just implement the proposal and let the tests do the rest.
Is Cesium.Runtime targeting .NET Standard 2.0 only used for .NET Framework 4 builds?
I think that, in theory, it's also used for Mono, but why are you asking?