elixir icon indicating copy to clipboard operation
elixir copied to clipboard

Test failure in Mix.Tasks.TestTest (test --listen-on-stdin does not exit on compilation failure)

Open awilfox opened this issue 1 year ago • 8 comments

Elixir and Erlang/OTP versions

Erlang/OTP 24 [erts-12.3.1] [source] [smp:8:8] [ds:8:8:10] [async-threads:1]

Elixir 1.16.3 (compiled with Erlang/OTP 24)

Also seen on:

Erlang/OTP 24 [erts-12.3.1] [source] [64-bit] [smp:64:64] [ds:64:64:10] [async-threads:1]

Elixir 1.15.7 (compiled with Erlang/OTP 24)

Operating system

Adélie Linux 1.0-beta5

Current behavior

We at Adélie Linux are rebuilding our entire package tree against GCC 13. While building Elixir, we ran into this test failure:

==> mix (ex_unit)
Excluding tags: [windows: true]

...............................................................................................................................................................................................................................................................................

  1) test --listen-on-stdin does not exit on compilation failure (Mix.Tasks.TestTest)
     /usr/src/packages/user/elixir/src/elixir-1.16.3/lib/mix/test/mix/tasks/test_test.exs:277
     ** (ExUnit.TimeoutError) test timed out after 60000ms. You can change the timeout:

       1. per test by setting "@tag timeout: x" (accepts :infinity)
       2. per test module by setting "@moduletag timeout: x" (accepts :infinity)
       3. globally via "ExUnit.start(timeout: x)" configuration
       4. by running "mix test --timeout x" which sets timeout
       5. or by running "mix test --trace" which sets timeout to infinity
          (useful when using IEx.pry/0)

     where "x" is the timeout given as integer in milliseconds (defaults to 60_000).

     stacktrace:
       test/mix/tasks/test_test.exs:607: Mix.Tasks.TestTest.receive_until_match/3
       test/mix/tasks/test_test.exs:312: anonymous fn/0 in Mix.Tasks.TestTest."test --listen-on-stdin does not exit on compilation failure"/1
       (elixir 1.16.3) lib/file.ex:1650: File.cd!/2
       test/test_helper.exs:146: MixTest.Case.in_fixture/3
       test/mix/tasks/test_test.exs:278: (test)
       (ex_unit 1.16.3) lib/ex_unit/runner.ex:472: ExUnit.Runner.exec_test/2
       (stdlib 3.17.1) timer.erl:166: :timer.tc/1
       (ex_unit 1.16.3) lib/ex_unit/runner.ex:394: anonymous fn/6 in ExUnit.Runner.spawn_test_monitor/4

I never saw this reported for GCC 8 builds. It happens both with 1.15.7 and 1.16.3, and I can reproduce it every time building for 32-bit x86 on an Intel Skylake using an i586 chroot. We also saw it transiently (but not every time) on an Ampere arm64 builder. Downstream tracker is adelie/packages#1182.

Expected behavior

Tests to pass, as they did when Erlang and Elixir were compiled against GCC 8.5.0.

awilfox avatar Jun 13 '24 02:06 awilfox

Further details, if relevant:

Hardware information (CPU, memory)
# lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         39 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  BIOS Vendor ID:        Intel(R) Corporation
  Model name:            Intel(R) Core(TM) i7-6700T CPU @ 2.80GHz
    BIOS Model name:     Intel(R) Core(TM) i7-6700T CPU @ 2.80GHz To Be Filled By O.E.M. CPU @ 2.7GHz
    BIOS CPU family:     198
    CPU family:          6
    Model:               94
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            3
    CPU(s) scaling MHz:  29%
    CPU max MHz:         3600.0000
    CPU min MHz:         800.0000
    BogoMIPS:            5599.85
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse
                         2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopolog
                         y nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xt
                         pr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3
                         dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpi
                         d ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsa
                         veopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_
                         l1d
Virtualization features: 
  Virtualization:        VT-x
Caches (sum of all):     
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
Vulnerabilities:         
  Gather data sampling:  Vulnerable: No microcode
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Mitigation; Clear CPU buffers; SMT vulnerable
  Retbleed:              Mitigation; IBRS
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; IBRS, IBPB conditional, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Mitigation; Microcode
  Tsx async abort:       Mitigation; TSX disabled
# free -m
               total        used        free      shared  buff/cache   available
Mem:           31980         276        8249          42       23454       31207
Swap:              0           0           0
Operating environment information (kernel, libc)
# uname -a
Linux lab-x86_64-lin-2.tpcl.atwilcox.tech 5.15.132-mc6-easy #1 SMP Sun Nov 12 04:16:02 UTC 2023 i686 GNU/Linux
# /lib/ld-musl-i386.so.1
musl libc (i386)
Version 1.2.3
Dynamic Program Loader
Usage: /lib/ld-musl-i386.so.1 [options] [--] pathname [args]
Toolchain information (GCC, binutils, Perl)
# gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i586-foxkit-linux-musl/13.3.0/lto-wrapper
Target: i586-foxkit-linux-musl
Configured with: /usr/src/packages/system/gcc/src/gcc-13.3.0/configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --build=i586-foxkit-linux-musl --host=i586-foxkit-linux-musl --target=i586-foxkit-linux-musl --with-pkgversion='Adelie 13.3.0' --with-bugurl=https://bts.adelielinux.org/ --enable-checking=release --disable-fixed-point --disable-libstdcxx-pch --disable-multiarch --disable-multilib --disable-werror --disable-symvers --enable-__cxa_atexit --enable-default-pie --enable-default-ssp --enable-cloog-backend --enable-languages=c,c++,objc,go,fortran,ada --enable-linker-build-id --with-arch=i586 --with-tune=pentium2 --enable-cld --enable-mmx --disable-libsanitizer --enable-shared --enable-threads --enable-tls --with-matchpd-partitions=32 --with-system-zlib --with-linker-hash-style=gnu
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.3.0 (Adelie 13.3.0)
# ld --version
GNU ld (GNU Binutils) 2.41
Copyright (C) 2023 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
# perl -V
Summary of my perl5 (revision 5 version 34 subversion 3) configuration:
   
  Platform:
    osname=linux
    osvers=5.15.132-mc6-easy
    archname=i686-linux-thread-multi
    uname='linux lab-x86_64-lin-2.tpcl.atwilcox.tech 5.15.132-mc6-easy #1 smp sun nov 12 04:16:02 utc 2023 i686 gnulinux '
    config_args='-des -Dcccdlflags=-fPIC -Dcccdlflags=-fPIC -Dccdlflags=-rdynamic -Dprefix=/usr -Dprivlib=/usr/share/perl5/core_perl -Darchlib=/usr/lib/perl5/core_perl -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5/vendor_perl -Dvendorarch=/usr/lib/perl5/vendor_perl -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5/site_perl -Dsitearch=/usr/local/lib/perl5/site_perl -Dlocincpth=  -Doptimize=-O2 -fno-omit-frame-pointer -ggdb -march=pentium-mmx -mtune=pentium-m -g -Duselargefiles -Dusethreads -Duseshrplib -Dd_semctl_semun -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dinstallman1dir=/usr/share/man/man1 -Dinstallman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3pm -Dcf_by=Adelie -Ud_csh -Dusenm'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=undef
    use64bitall=undef
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2'
    optimize='-O2 -fno-omit-frame-pointer -ggdb -march=pentium-mmx -mtune=pentium-m -g'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong'
    ccversion=''
    gccversion='13.3.0'
    gccosandvers=''
    intsize=4
    longsize=4
    ptrsize=4
    doublesize=8
    byteorder=1234
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=12
    longdblkind=3
    ivtype='long'
    ivsize=4
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=4
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib /lib
    libs=-lpthread -ldl -lm -lcrypt -lutil -lc
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/usr/lib/libc.a
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-rdynamic -Wl,-rpath,/usr/lib/perl5/core_perl/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -O2 -fno-omit-frame-pointer -ggdb -march=pentium-mmx -mtune=pentium-m -g -L/usr/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_IMPLICIT_CONTEXT
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under linux
  Compiled at Jun 10 2024 02:19:00
  @INC:
    /usr/local/lib/perl5/site_perl
    /usr/local/share/perl5/site_perl
    /usr/lib/perl5/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib/perl5/core_perl
    /usr/share/perl5/core_perl

awilfox avatar Jun 13 '24 02:06 awilfox

Hi @awilfox, we don’t ship with any native code ij Elixir. So if the GCC version matters, then it is most likely an Erlang bug. But the test case is only checking I/O channels, so it may even be something happening at the OS level. In other words, my bet is that we are surfacing a problem downstream somewhere with I/O channels.

josevalim avatar Jun 13 '24 06:06 josevalim

For now, I pushed a commit to main that will help us understand more the timeout cases. So we know if something has been written to the pipe or not at all.

josevalim avatar Jun 13 '24 14:06 josevalim

Thanks for the quick work. Applied tip of main to 1.16.3 and received:

==> mix (ex_unit)
Excluding tags: [windows: true]

....................................................................................................................................................................................................................................................................

  1) test --listen-on-stdin does not exit on compilation failure (Mix.Tasks.TestTest)
     test/mix/tasks/test_test.exs:277
     ** (RuntimeError) nothing received from port after 15s.
     Expected: "undefined variable \"error_not_a_var\""
     Got: "Restarting...\n"

     code: in_fixture("test_stale", fn ->
     stacktrace:
       test/mix/tasks/test_test.exs:619: Mix.Tasks.TestTest.receive_until_match/3
       test/mix/tasks/test_test.exs:312: anonymous fn/0 in Mix.Tasks.TestTest."test --listen-on-stdin does not exit on compilation failure"/1
       (elixir 1.16.3) lib/file.ex:1650: File.cd!/2
       test/test_helper.exs:146: MixTest.Case.in_fixture/3
       test/mix/tasks/test_test.exs:278: (test)

I'm not very familiar with Erlang, unfortunately; is there a guide somewhere on how I should proceed debugging this issue?

awilfox avatar Jun 13 '24 17:06 awilfox

This is interesting! The receives the I/O signal but never finishes restarting. I would try reproducing the issue first outside of tests. Do this:

  1. Run mix new foo
  2. cd foo
  3. mix test --listen-on-stdin (this will block the terminal)
  4. Now open up test/foo_test.exs and change line 6 to be simply unknown_var
  5. Go back to the blocked terminal and press enter (newline)

Does it work?

josevalim avatar Jun 13 '24 17:06 josevalim

Yes:

builder on lab-x86_64-lin-2 ~/packages/user/elixir/src/elixir-1.16.3/test/foo $ PATH="$(pwd)/../../bin:$PATH" mix test --listen-on-stdin
Compiling 1 file (.ex)
Generated foo app
..
Finished in 0.02 seconds (0.00s async, 0.02s sync)
1 doctest, 1 test, 0 failures

Randomized with seed 725321

[ file changed, then Return pressed ]

Restarting...
    warning: variable unknown_var in code block has no effect as it is never returned (remove the variable or assign it to _ to avoid warnings)
    │
  6 │     unknown_var
    │     ~
    │
    └─ test/foo_test.exs:6:5: FooTest."test greets the world"/1

    error: undefined variable "unknown_var"
    │
  6 │     unknown_var
    │     ^^^^^^^^^^^
    │
    └─ test/foo_test.exs:6:5: FooTest."test greets the world"/1


== Compilation error in file test/foo_test.exs ==
** (CompileError) test/foo_test.exs: cannot compile module FooTest (errors have been logged)

awilfox avatar Jun 13 '24 18:06 awilfox

Thank you. Unfortunately I do not know the root cause, although everything indicates to be an issue elsewhere. Note Erlang/OTP 24 is no longer officially supported, so I'd recommend migrating to a more recent version and see if the issue persists.

Btw, this pull request was recently merged to Erlang/OTP: https://github.com/erlang/otp/pull/8553 - in case you want to try latest GCC too.

josevalim avatar Jun 18 '24 11:06 josevalim

Hi @awilfox, do you have updates here on more recent Erlang/OTP on the latest GCC? Thank you!

josevalim avatar Jul 02 '24 13:07 josevalim

Upgrading to Erlang/OTP 27 and then Elixir 1.17.2 passes tests on all our environments now. Thank you!

awilfox avatar Jul 10 '24 19:07 awilfox