
fuzz-cse failure

Open steven-johnson opened this issue 1 year ago • 8 comments

In internal testing, the attached fuzzer test case makes fuzz_cse fail with SIGABRT and the following stack trace:

=================================================================
*** SIGABRT received by PID 2090517 (TID 2090517) on cpu 22 from PID 2090517; stack trace: ***
PC: @     0x7f3667d9d981  (unknown)  gsignal
    @     0x5609b165ecd3        288  base/process_state.cc:1237 FailureSignalHandler()
    @     0x5609b09f7b29        192  third_party/googlefuzztest/internal/runtime.cc:361 fuzztest::internal::HandleCrash()
    @     0x7f3667f10e80  1395162080  (unknown)
    @     0x5609b0a0c55b        176  third_party/googlefuzztest/internal/coverage.cc:170 fuzztest::internal::ExecutionCoverage::UpdateMaxStack()
    @     0x5609b0a0d8f9         48  third_party/googlefuzztest/internal/coverage.cc:389 __sanitizer_cov_trace_const_cmp4
    @     0x5609a98ff28b        128  third_party/halide/halide/src/Type.cpp:18 Halide::Type::can_represent()
    @     0x5609a98ff15f         32  third_party/halide/halide/src/Type.cpp:131 Halide::Type::can_represent()
    @     0x5609a8554d4e        160  third_party/halide/halide/src/ConstantInterval.cpp:207 Halide::Internal::ConstantInterval::cast_to()
    @     0x5609a8ed3081        160  third_party/halide/halide/src/Simplify_Internal.h:112 Halide::Internal::Simplify::ExprInfo::cast_to()
    @     0x5609a907dedd        160  third_party/halide/halide/src/Simplify_Exprs.cpp:14 Halide::Internal::Simplify::visit()
    @     0x5609a8e9632b        160  third_party/halide/halide/src/IRVisitor.h:170 Halide::Internal::VariadicVisitor<>::dispatch_expr<>()

The injection point is apparently a Halide merge that includes the following changes:

7ca95d865 Expose BFloat in Python bindings (#8255)
7cf2951b0 Remove max size assert from Anderson2021 (#8253)
a9b8fbf7c Rework the simplifier to use ConstantInterval for bounds (#8222)
35143d206 Mark host_dirty() and device_dirty() with no_discard. (#8248)
711dc88a3 Add HVX_v68 target to support Hexagon HVX v68. (#8232)
3ea47475e [xtensa] added support for sqrt_f16 (#8247)
33d5ba953 Fix saturating add matching in associativity checking (#8220)
b5f5065c8 Add some EVAL_IN_LAMBDAs to Simplify_Sub.cpp (#8230)
8a316d1df [xtensa] Added vector load for two vectors for f16 and f32 (#8226)

testcase-5210573843529728.zip

steven-johnson avatar Jun 05 '24 22:06 steven-johnson

Does this one repro outside of Google? The last six months or so of fuzzer failures found inside Google don't repro upstream, so I'm hesitant to even try this one. I think I'll just run fuzz_cse overnight instead.

abadams avatar Jun 05 '24 22:06 abadams

(The failing assert was added in #8222)

abadams avatar Jun 05 '24 22:06 abadams

> Does this one repro outside of Google?

Have not tried.

steven-johnson avatar Jun 05 '24 22:06 steven-johnson

I have set up 8 processes to run fuzz_cse against the open-source build overnight. We'll see if they can find an equivalent failure.
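For anyone who wants to try the same thing, here is a minimal sketch of one way to launch several fuzzer processes in parallel. It is not the setup actually used above. It assumes fuzz_cse was built as a libFuzzer binary, so that the standard libFuzzer flags `-seed`, `-max_total_time` and `-artifact_prefix` apply; the binary path, the output directory and the helper names `build_fuzz_commands` and `run_all` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_fuzz_commands(binary, num_procs, seconds, out_dir):
    """Return one command line per process, each with a distinct seed.

    Assumes `binary` is a libFuzzer executable (an assumption, not
    something confirmed in this thread).
    """
    commands = []
    for i in range(num_procs):
        artifact_dir = Path(out_dir) / f"proc{i}"
        commands.append([
            str(binary),
            # Seeds start at 1: libFuzzer treats seed 0 as "pick a random seed".
            f"-seed={i + 1}",
            # Stop each process after this many seconds.
            f"-max_total_time={seconds}",
            # Crash inputs are written under a per-process directory.
            f"-artifact_prefix={artifact_dir}/",
        ])
    return commands


def run_all(commands):
    """Create the artifact directories, run all commands in parallel,
    and return their exit codes in order."""
    for cmd in commands:
        prefix = cmd[-1].split("=", 1)[1]
        Path(prefix).mkdir(parents=True, exist_ok=True)
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

Something like `run_all(build_fuzz_commands("./fuzz_cse", 8, 12 * 60 * 60, "fuzz_out"))` would then run 8 processes for 12 hours each. A nonzero exit code from any process means it stopped early, usually on a crash, and the offending input should be under that process's artifact directory.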

abadams avatar Jun 05 '24 22:06 abadams

Any luck?

steven-johnson avatar Jun 06 '24 20:06 steven-johnson

Yes, but the luck was amazingly bad. First there was a power spike + outage that killed the process, and now my work machine has a dead motherboard. When it boots (which is rare), the CPUs run at 250 MHz and dmesg spews errors.

It didn't find any failures before the outage either.

abadams avatar Jun 06 '24 21:06 abadams

It doesn't repro with that .zip file on linux-bot-4. I'll leave fuzz_cse running on linux-bot-4 for a while just to see if it finds anything.

abadams avatar Jun 06 '24 22:06 abadams

No failures found overnight with 24 threads. Unless the fuzzing inside Google dedicates a lot more cycles to this, I don't think this bug exists in main.

abadams avatar Jun 07 '24 16:06 abadams