coreutils

seq: Seq doesn't use system locale for output numbers

Open Maximkaaa opened this issue 5 months ago • 6 comments

GNU seq uses printf-style formatting, which chooses the decimal separator based on the locale set in the LC_NUMERIC environment variable. uutils seq is not locale-aware:

$ LC_NUMERIC="ru_RU.UTF-8" seq 0 1.5 3
0,0
1,5
3,0

$ LC_NUMERIC="ru_RU.UTF-8" cargo run seq 0 1.5 3
0.0
1.5
3.0

Maximkaaa, Jun 19 '25 06:06

There also seems to be a bug in the GNU seq:

$ LC_NUMERIC="ru_RU.UTF-8" seq 0 1,5 3
0
1
2
3

# BUT

$ LC_NUMERIC="ru_RU.UTF-8" seq -f "%g" 0 1,5 3
0
1,5
3

# This is what is expected:
$ LC_NUMERIC="ru_RU.UTF-8" seq -f "%.1f" 0 1,5 3
0,0
1,5
3,0

According to the GNU seq man page:

       FORMAT must be suitable for printing one argument of type
       'double'; it defaults to %.PRECf if FIRST, INCREMENT, and LAST are
       all fixed point decimal numbers with maximum precision PREC, and
       to %g otherwise.

So it seems that when a number is given in locale-specific notation, the default format is chosen incorrectly and the number is parsed incorrectly. I think it is reasonable for uutils to produce the correct output (the last of the three) in this case, even without the -f argument.

Maximkaaa, Jun 19 '25 06:06
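To make the man page rule concrete, here is a minimal sketch of how the default precision could be chosen so that an argument like 1,5 counts as a fixed-point decimal. The function names and the `sep` parameter are hypothetical and not taken from uutils or GNU; accepting both '.' and the locale separator is an assumption, and a real implementation would look the separator up from LC_NUMERIC.

```rust
/// Digits after the decimal separator if `arg` is a plain fixed-point
/// decimal (optional sign, digits, optional separator and digits).
/// Returns None for anything else, e.g. exponent notation.
/// `sep` is the locale's decimal separator (hypothetical parameter);
/// '.' is accepted as well, which is an assumption of this sketch.
fn fixed_point_precision(arg: &str, sep: char) -> Option<usize> {
    let unsigned = arg
        .strip_prefix(|c: char| c == '-' || c == '+')
        .unwrap_or(arg);
    let (int_part, frac_part) = match unsigned.split_once(|c: char| c == '.' || c == sep) {
        Some((i, f)) => (i, f),
        None => (unsigned, ""),
    };
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    let all_digits = |s: &str| s.chars().all(|c| c.is_ascii_digit());
    if all_digits(int_part) && all_digits(frac_part) {
        Some(frac_part.len())
    } else {
        None
    }
}

/// Some(PREC) means "use %.PRECf"; None means "fall back to %g".
fn default_precision(args: &[&str], sep: char) -> Option<usize> {
    let precisions: Option<Vec<usize>> = args
        .iter()
        .map(|a| fixed_point_precision(a, sep))
        .collect();
    precisions.map(|v| v.into_iter().max().unwrap_or(0))
}
```

With a ',' separator, the arguments 0, 1,5 and 3 yield a precision of 1, which corresponds to the %.1f output shown as expected above.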

@RenjiSann has been working on this class of issues lately

sylvestre, Jun 19 '25 08:06

After some investigation, it looks like fixing this will require rewriting big_decimal's Display trait implementation. OR we can go the shitty way by replacing . with the locale's decimal separation character as a post-processing step.

I don't really know which way to go here.

RenjiSann, Jun 30 '25 16:06
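For illustration, the post-processing option could look roughly like the sketch below. It is not the uutils implementation: `localize_decimal` and `seq_lines` are hypothetical names, f64 stands in for the big decimal type the real code uses, and the separator is passed in rather than read from LC_NUMERIC.

```rust
/// Replace the '.' emitted by Rust's formatter with the locale's
/// decimal separator. Only the first '.' is replaced, since a
/// formatted number contains at most one.
fn localize_decimal(formatted: &str, sep: &str) -> String {
    formatted.replacen('.', sep, 1)
}

/// Hypothetical seq-like helper: format each value with a fixed
/// precision, then localize it. Only positive increments are handled.
fn seq_lines(first: f64, incr: f64, last: f64, prec: usize, sep: &str) -> Vec<String> {
    let mut lines = Vec::new();
    let mut step = 0u32;
    loop {
        // Multiply instead of accumulating to limit rounding drift.
        let value = first + incr * f64::from(step);
        if incr <= 0.0 || value > last {
            break;
        }
        lines.push(localize_decimal(&format!("{:.*}", prec, value), sep));
        step += 1;
    }
    lines
}
```

Calling `seq_lines(0.0, 1.5, 3.0, 1, ",")` produces the lines 0,0 then 1,5 then 3,0, matching the GNU output in the original report. The replacement only touches the decimal separator, so it does nothing for any other locale-dependent formatting.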

> OR we can go the shitty way by replacing . with the locale's decimal separation

I do not think this works in general. E.g., consider the Indian locale, which puts a decimal separator every TWO (not three) digits starting from the thousands (see the discussion here: https://stackoverflow.com/questions/36400452/converting-number-to-india-locale-format). Though I am not sure whether GNU seq supports this formatting.

VladimirMarkelov, Jun 30 '25 16:06

> I do not think this works in general. E.g., consider the Indian locale, which puts a decimal separator every TWO (not three) digits starting from the thousands (see the discussion here: https://stackoverflow.com/questions/36400452/converting-number-to-india-locale-format). Though I am not sure whether GNU seq supports this formatting.

You're referring to the grouping_separator, which is different from the decimal_separator, and oh boy I haven't even started to look into this one 😅

From what I can see, GNU seq does not seem to handle it, but I agree the shitty solution is probably not robust enough to be worth it

RenjiSann, Jun 30 '25 18:06
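To show why grouping is a separate problem from the decimal separator, here is a small sketch of digit grouping with variable group sizes. `group_digits` and its `sizes` parameter are hypothetical: the sizes are listed from the rightmost group, and the last size repeats, so `[3, 2]` gives Indian-style grouping and `[3]` gives the usual thousands grouping. None of this is something GNU seq or uutils seq is known to do.

```rust
/// Group the ASCII digits of an integer part from the right.
/// `sizes` lists group sizes starting at the rightmost group; the
/// last entry repeats for all remaining groups. An empty `sizes`
/// means no grouping at all.
fn group_digits(digits: &str, sep: &str, sizes: &[usize]) -> String {
    let mut groups: Vec<&str> = Vec::new();
    let mut end = digits.len();
    let mut idx = 0;
    while end > 0 {
        let size = sizes
            .get(idx)
            .or(sizes.last())
            .copied()
            .unwrap_or(end)
            .max(1); // guard against a zero size looping forever
        let start = end.saturating_sub(size);
        groups.push(&digits[start..end]);
        end = start;
        idx += 1;
    }
    groups.reverse();
    groups.join(sep)
}
```

For example, `group_digits("1234567", ",", &[3, 2])` gives 12,34,567, while `&[3]` gives 1,234,567. A plain replacement of '.' cannot produce either form, which is why it would only cover the decimal separator.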

> You're referring to the grouping_separator, which is different from the decimal_separator, and oh boy I haven't even started to look into this one 😅

Yes, my bad 😆 When I saw "locale" I thought of everything, including the thousands separator. If only the decimal separator needs to be supported, the dirty search-and-replace trick would probably be good enough.

VladimirMarkelov, Jun 30 '25 21:06