POSIX::setlocale(0, "zh_CN.GB18030") mbstring fail on darwin

Open p5pRT opened this issue 11 years ago • 14 comments

Migrated from rt.perl.org#122296 (status was 'open')

Searchable as RT122296$

p5pRT avatar Jul 14 '14 23:07 p5pRT

From @rurban

./perl -Ilib -MPOSIX -e'setlocale(0, "zh_CN.GB18030")' crashes under asan (-fsanitize=address) with a global-buffer-overflow, READ of size 4, at /usr/lib/system/libsystem_c.dylib+0x32d63. Reproduced with 3c1e67ac on darwin.

Root cause: an upstream darwin bug, a wrong symlink from /usr/share/locale/zh_CN.GB18030/LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE

A similar multi-byte string bug existed on solaris 2.8, with the simple reproducer LC_ALL=zh_CN.GB18030 bash -c true => SEGV (see the gnulib locale-zh.m4 probe). Unfortunately I haven't found a simpler reproducer on darwin yet, but checking the symlink should be enough.

I suggest we add a probe to Configure for this wrong symlink on darwin, and disable it somehow in POSIX::setlocale. -- Reini Urban
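A minimal sketch of what a read-only version of such a probe could look like, assuming only the symlink target described above. The function name `collate_link_is_broken` and the `probe_path` variable are hypothetical and not part of Configure; unlike a fix-up, this only reports and never modifies the system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Configure): succeeds when the given
# LC_COLLATE path is a symlink whose target is the single-byte
# la_LN.US-ASCII collation table.
collate_link_is_broken() {
    link=$1
    # Not a symlink (or missing): nothing to report.
    [ -L "$link" ] || return 1
    target=`readlink "$link"`
    case "$target" in
        */la_LN.US-ASCII/LC_COLLATE) return 0 ;;
        *) return 1 ;;
    esac
}

# Default to the path named in this report; pass another path to override.
probe_path=${1:-/usr/share/locale/zh_CN.GB18030/LC_COLLATE}
if collate_link_is_broken "$probe_path"; then
    echo "WARNING: $probe_path points at single-byte la_LN.US-ASCII" >&2
    echo "WARNING: avoid POSIX::setlocale(0, \"zh_CN.GB18030\")" >&2
fi
```

On a system without the link the script prints nothing. How a positive result would be fed back into POSIX::setlocale is left open here.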

p5pRT avatar Jul 14 '14 23:07 p5pRT

From @rurban

it fails in POSIX::setlocale(), which calls Perl_new_collate()

full trace produced with 5.18.2

=================================================================
==85200==ERROR: AddressSanitizer: global-buffer-overflow on address 0x000102e72862 at pc 0x10001c7e8 bp 0x7fff5fbf88d0 sp 0x7fff5fbf88a8
READ of size 4 at 0x000102e72862 thread T0
#0 0x10001c7e7 in wrap_memcpy (/usr/src/llvm/r193924/build/lib/clang/3.4/lib/darwin/libclang_rt.asan_osx_dynamic.dylib+0x187e7)
#1 0x7fff92e23d63 in _GB18030_mbrtowc (/usr/lib/system/libsystem_c.dylib+0x32d63)
#2 0x7fff92e256f0 in __mbsnrtowcs_std (/usr/lib/system/libsystem_c.dylib+0x346f0)
#3 0x7fff92e22c29 in __collate_mbstowcs (/usr/lib/system/libsystem_c.dylib+0x31c29)
#4 0x7fff92e4c6ce in strxfrm_l (/usr/lib/system/libsystem_c.dylib+0x5b6ce)
#5 0x102cb3466 in Perl_new_collate (/usr/src/perl/build-5.18.2d-nt-asan/libperl.dylib+0x1811466)
#6 0x103514c79 in XS_POSIX_setlocale (/usr/src/perl/build-5.18.2d-nt-asan/lib/auto/POSIX/POSIX.bundle+0x53c79)
#7 0x102100d26 in Perl_pp_entersub (/usr/src/perl/build-5.18.2d-nt-asan/libperl.dylib+0xc5ed26)
#8 0x101d8e50d in Perl_runops_debug (/usr/src/perl/build-5.18.2d-nt-asan/libperl.dylib+0x8ec50d)
#9 0x1016b4091 in S_run_body (/usr/src/perl/build-5.18.2d-nt-asan/libperl.dylib+0x212091)
#10 0x1016af1d8 in perl_run (/usr/src/perl/build-5.18.2d-nt-asan/libperl.dylib+0x20d1d8)
#11 0x1000017db in main (/usr/src/perl/build-5.18.2d-nt-asan/./perl+0x1000017db)
#12 0x7fff907e85fc in start (/usr/lib/system/libdyld.dylib+0x35fc)
#13 0x3 (/usr/src/perl/build-5.18.2d-nt-asan/./perl+0x3)
--

Reini Urban

p5pRT avatar Jul 15 '14 04:07 p5pRT

From @khwilliamson

On 07/14/2014 10:42 PM, Reini Urban via RT wrote:

it fails in POSIX::setlocale(), which calls Perl_new_collate()

full trace produced with 5.18.2

I thought that this fault was just recently introduced; but you're indicating it failed in 5.18.2?

p5pRT avatar Jul 15 '14 04:07 p5pRT

The RT System itself - Status changed from 'new' to 'open'

p5pRT avatar Jul 15 '14 04:07 p5pRT

From @rurban

On Mon Jul 14 21:56:57 2014, public@khwilliamson.com wrote:

On 07/14/2014 10:42 PM, Reini Urban via RT wrote:

it fails in POSIX::setlocale(), which calls Perl_new_collate()

full trace produced with 5.18.2

I thought that this fault was just recently introduced; but you're indicating it failed in 5.18.2?

I also thought at first that it was a new error. But I compared linux, where it worked, against a newer darwin build, and the error only appears on darwin.

I've checked all my darwin perls which I compiled with ASAN, and all of them fail in the same way, because the crash is in libc and is caused by the wrong symlink.

-- Reini Urban

p5pRT avatar Jul 15 '14 19:07 p5pRT

From @rurban

attached patch fixes this issue by trying to fix the wrong symlink in Configure and fix or skip the broken locale in t/loc_tools.

p5pRT avatar Sep 24 '14 17:09 p5pRT

From @rurban

0001-Fix-wrong-zh_CN.GB18030-locale-on-darwin-RT-122296.patch
From e478ab4c23bb0f69645b7a7fd57793211bd33957 Mon Sep 17 00:00:00 2001
From: Reini Urban <[email protected]>
Date: Wed, 10 Sep 2014 14:04:55 -0500
Subject: [PATCH] Fix wrong zh_CN.GB18030 locale on darwin [RT#122296]

accessing the system-provided single-byte locale on darwin
in a multi-byte context leads to a global-buffer-overflow with
READ of size 4, detected with asan.
Try to fix the wrong symlink on the system, and skip it in
the t/loc_tools helper function if broken.
---
 Configure      | 25 +++++++++++++++++++++++++
 t/loc_tools.pl | 15 +++++++++++++++
 2 files changed, 40 insertions(+)

diff --git Configure Configure
index 95909f2..68f3e77 100755
--- Configure
+++ Configure
@@ -22862,6 +22862,31 @@ case "$d_vfork" in
 	;;
 esac
 
+: Check for broken zh_CN.GB18030 locale on darwin
+case "${osname}X${osvers}" in
+darwin*X13.[34]*)
+    wrong=/usr/share/locale/zh_CN.GB18030/LC_COLLATE
+    if [ -e $wrong ]; then
+        lnk=`readlink $wrong`
+        if [ x$lnk = x../la_LN.US-ASCII/LC_COLLATE ]; then
+            echo "WARNING: Broken $wrong symlink to single-byte la_LN.US-ASCII" >& 4
+            echo "WARNING: Trying to fix (will require sudo)" >& 4
+	    tmppwd=`pwd`
+            cd /usr/share/locale/zh_CN.GB18030
+            sudo ln -sf ../zh_CN.eucCN/LC_COLLATE LC_COLLATE
+            cd $tmppwd
+        fi
+        lnk=`readlink $wrong`
+        if [ x$lnk = x../la_LN.US-ASCII/LC_COLLATE ]; then
+            echo "WARNING: Could not fix $wrong symlink to ../la_LN.US-ASCII" >& 4
+            echo "WARNING: Avoid POSIX::setlocale(0, \"zh_CN.GB18030\")" >& 4
+        else
+            echo "Fixed $wrong symlink to ../zh_CN.eucCN" >& 4
+        fi
+    fi
+    ;;
+esac
+
 : Check extensions
 echo " "
 echo "Looking for extensions..." >&4
diff --git t/loc_tools.pl t/loc_tools.pl
index fccbeeb..3801be8 100644
--- t/loc_tools.pl
+++ t/loc_tools.pl
@@ -20,6 +20,21 @@ sub _trylocale {    # Adds the locale given by the first parameter to the list
 
     $categories = [ $categories ] unless ref $categories;
 
+    # setlocale(0, "zh_CN.GB18030") crashes on darwin 13.3.0 with ASAN, wrong symlink RT#122296
+    if ($^O eq 'darwin' and grep /zh_CN\.GB18030/, @$categories) {
+        my $wrong_link = '/usr/share/locale/zh_CN.GB18030/LC_COLLATE';
+        if (-e $wrong_link and readlink($wrong_link) eq '../la_LN.US-ASCII/LC_COLLATE') {
+            warn "# Trying fix a wrong zh_CN.GB18030 LC_COLLATE link (requires sudo)\n";
+            system("sudo ln -sf -t /usr/share/locale/zh_CN.GB18030 LC_COLLATE ../zh_CN.eucCN/LC_COLLATE");
+        }
+        if (readlink($wrong_link) eq '../la_LN.US-ASCII/LC_COLLATE') {
+            my @categories = grep { !/zh_CN\.GB18030/ } @$categories;
+            $categories = \@categories;
+            warn "# skipped wrong zh_CN.GB18030 LC_COLLATE link\n";
+        } else {
+          warn "# fixed wrong zh_CN.GB18030 LC_COLLATE link\n";
+        }
+    }
     foreach my $category (@$categories) {
         return unless setlocale($category, $locale);
     }
-- 
2.1.0


p5pRT avatar Sep 24 '14 17:09 p5pRT

From @tux

On Wed, 24 Sep 2014 10:38:36 -0700, "Reini Urban via RT" <perlbug-followup@perl.org> wrote:

attached patch fixes this issue by trying to fix the wrong symlink in Configure and fix or skip the broken locale in t/loc_tools.

Reini, can this be done in hints instead? This looks *very* OS specific

-- H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/ using perl5.00307 .. 5.19 porting perl5 on HP-UX, AIX, and openSUSE http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/ http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/

p5pRT avatar Sep 24 '14 21:09 p5pRT

From @jhi

(sorry if this had been already discussed) also the locale brokenness should be reported upstream to the os vendor.

On Wednesday, September 24, 2014, H.Merijn Brand <h.m.brand@xs4all.nl> wrote:

On Wed, 24 Sep 2014 10:38:36 -0700, "Reini Urban via RT" <perlbug-followup@perl.org> wrote:

attached patch fixes this issue by trying to fix the wrong symlink in Configure and fix or skip the broken locale in t/loc_tools.

Reini, can this be done in hints instead? This looks *very* OS specific

-- H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/ using perl5.00307 .. 5.19 porting perl5 on HP-UX, AIX, and openSUSE http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/ http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/

-- There is this special biologist word we use for 'stable'. It is 'dead'. -- Jack Cohen

p5pRT avatar Sep 24 '14 21:09 p5pRT

From @khwilliamson

On 09/24/2014 03:53 PM, Jarkko Hietaniemi via RT wrote:

(sorry if this had been already discussed) also the locale brokenness should be reported upstream to the os vendor.

I believe it was reported to the vendor.

p5pRT avatar Sep 24 '14 21:09 p5pRT

From @karenetheridge

confirmed this is still an issue in 10.10.5 (Darwin Kernel Version 14.5.0: Tue Sep 1 21:23:09 PDT 2015; root:xnu-2782.50.1~1/RELEASE_X86_64 x86_64):

$ ls -l /usr/share/locale/zh_CN.GB18030/LC_COLLATE
lrwxr-xr-x 1 root wheel 28 12 Nov 2014 /usr/share/locale/zh_CN.GB18030/LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE

p5pRT avatar Mar 23 '16 21:03 p5pRT

@karenetheridge what about now?

toddr avatar Jan 31 '20 06:01 toddr

@khwilliamson can you say who reported it to the vendor?

toddr avatar Feb 03 '20 01:02 toddr

: [ether@advocaat]$; ls -l /usr/share/locale/zh_CN.GB18030/LC_COLLATE
lrwxr-xr-x  1 root  wheel  28 18 Jul 20:08 /usr/share/locale/zh_CN.GB18030/LC_COLLATE -> ../la_LN.US-ASCII/LC_COLLATE
: [ether@advocaat]$; uname -a
Darwin advocaat 23.6.0 Darwin Kernel Version 23.6.0: Fri Jul  5 18:01:46 PDT 2024; root:xnu-10063.141.1~2/RELEASE_ARM64_T8112 arm64

What else should I be looking for? (I can't remember how to reproduce the original issue described here.)
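For reference, the reproducer given at the top of this ticket was ./perl -Ilib -MPOSIX -e'setlocale(0, "zh_CN.GB18030")' on a perl built with -fsanitize=address. One further thing that could be checked is which other locale directories carry the same LC_COLLATE link. This is a sketch only, not something established in this thread. The function name `list_ascii_collate_links` is hypothetical, and a locale appearing in the output is not by itself evidence of a crash.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each locale directory under a
# locale root whose LC_COLLATE is a symlink into la_LN.US-ASCII.
list_ascii_collate_links() {
    root=$1
    for link in "$root"/*/LC_COLLATE; do
        # Skip regular files and the unexpanded glob when nothing matches.
        [ -L "$link" ] || continue
        case `readlink "$link"` in
            */la_LN.US-ASCII/LC_COLLATE)
                # Drop the trailing /LC_COLLATE, then the leading directories.
                locale_dir=${link%/LC_COLLATE}
                echo "${locale_dir##*/}"
                ;;
        esac
    done
}

# Default to the system locale root used in this thread.
list_ascii_collate_links "${1:-/usr/share/locale}"
```

Any multi-byte locale in the output would be a candidate for the same ASAN run as the original reproducer.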

karenetheridge avatar Sep 15 '24 00:09 karenetheridge