On macOS w/M1, Rayon tests segfault on crossbeam-deque 0.8, fixed by reverting to 0.7

Open HackerFoo opened this issue 3 years ago • 4 comments

I found that Rayon segfaults in my app after recently updating rustc, so I ran cargo test on that project. I ran git bisect and narrowed it down to the commit that changed crossbeam-deque from 0.7.2 to 0.8.0. I suspected the compiler, but found that the tests fail on every stable rustc version back to 1.58.0 on my machine (macOS 12.4, M1/aarch64).

Changing crossbeam-deque to 0.7.4 fixes the tests. So I have evidence that it may need to be fixed here, although I haven't narrowed down the failure yet.

https://github.com/rayon-rs/rayon/issues/956

HackerFoo avatar Jul 16 '22 00:07 HackerFoo

This seems similar to the case mentioned in https://github.com/crossbeam-rs/crossbeam/issues/860.

I said in https://github.com/crossbeam-rs/crossbeam/issues/860#issuecomment-1178511100:

Reducing MAX_OBJECTS makes it more likely to trigger any potential data races. So reverting https://github.com/crossbeam-rs/crossbeam/pull/552 may reduce the occurrence of SIGSEGV. (Of course, that does not mean that the underlying bug is fixed.)

Could you try to revert https://github.com/crossbeam-rs/crossbeam/pull/552 and test it?
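
To illustrate why a smaller MAX_OBJECTS would exercise the reclamation path more often, here is a toy model in plain Rust. It is only a sketch: `ToyBag`, `defer`, and `flushes_for` are hypothetical names, and this is not crossbeam-epoch's actual code. It assumes only that a full bag of deferred objects gets flushed before the next object is added, and counts how often that happens for a fixed workload.

```rust
/// Toy model of a fixed-capacity bag of deferred objects.
/// Hypothetical illustration only; NOT crossbeam-epoch's implementation.
struct ToyBag {
    capacity: usize, // stands in for MAX_OBJECTS
    len: usize,      // objects currently held
    flushes: usize,  // how many times the full bag was handed off
}

impl ToyBag {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        ToyBag { capacity, len: 0, flushes: 0 }
    }

    /// Record one deferred object. If the bag is already full, flush it
    /// first (assumption: a real collector would hand the bag off for
    /// reclamation at this point).
    fn defer(&mut self) {
        if self.len == self.capacity {
            self.len = 0;
            self.flushes += 1;
        }
        self.len += 1;
    }
}

/// Count the flushes caused by deferring `deferred` objects.
fn flushes_for(capacity: usize, deferred: usize) -> usize {
    let mut bag = ToyBag::new(capacity);
    for _ in 0..deferred {
        bag.defer();
    }
    bag.flushes
}

fn main() {
    let deferred = 10_000;
    for capacity in [8, 64, 512] {
        println!(
            "capacity {:>3}: {} flushes",
            capacity,
            flushes_for(capacity, deferred)
        );
    }
}
```

In this model the flush count grows roughly in inverse proportion to the capacity, so a smaller capacity reaches the hand-off path many more times for the same workload. That gives a latent race more chances to show up, which is also why reverting the change can hide the symptom without fixing the bug.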

taiki-e avatar Jul 16 '22 03:07 taiki-e

Reverting #552 fixes Rayon's tests and works with my app.

Add this to Rayon's Cargo.toml to try it:

[patch.crates-io]
crossbeam-deque = { git = "https://github.com/HackerFoo/crossbeam.git", branch = "revert-552" }

https://github.com/HackerFoo/crossbeam/tree/revert-552

HackerFoo avatar Jul 16 '22 21:07 HackerFoo

Thanks for confirming! I've reverted #552 as part of #879.

It is difficult for me to investigate this right now, because I could not reproduce the issue in my environment (Mac M1, but without that many cores) even after reducing MAX_OBJECTS further. Still, I guess the underlying problem is a deque bug.

taiki-e avatar Jul 22 '22 18:07 taiki-e

I cannot reproduce the issue either. I tried the following environments:

  • macOS 12.4 arm64 on Apple M1 chip (4 × P-cores + 4 × E-cores)
  • Linux x86_64 on Intel Core i7 12700F (20 × threads = 8 × P-cores + 4 × E-cores)

but all tests passed.

I ran cargo test more than 10 times in each environment. I used Rust 1.62.1 and 1.62.0. Also, in order to use exactly the same crate versions as in the original GH issue (https://github.com/rayon-rs/rayon/issues/956), I ran cargo update -p <crate> --precise <version> several times to modify the Cargo.lock.

@HackerFoo, which exact M1 chip do you use? (M1, M1 Pro, M1 Max, M1 Ultra)

You can try sysctl -a | grep machdep.cpu:

$ sysctl -a | grep machdep.cpu
machdep.cpu.cores_per_package: 8
machdep.cpu.core_count: 8
machdep.cpu.logical_per_package: 8
machdep.cpu.thread_count: 8
machdep.cpu.brand_string: Apple M1

$ sw_vers
ProductName:	macOS
ProductVersion:	12.4
BuildVersion:	21F79
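
As a cross-platform complement to sysctl, the thread count can also be queried from Rust itself. This is a minimal sketch using `std::thread::available_parallelism` from the standard library (stable since Rust 1.59); `usable_threads` is a hypothetical helper name. Note that the rayon-core version in the dependency tree below depends on num_cpus, so the value printed here is only an approximation of what Rayon uses for its default pool size.

```rust
use std::thread;

/// Number of threads the standard library reports as usable,
/// falling back to 1 if the query fails.
fn usable_threads() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

fn main() {
    println!("available parallelism: {}", usable_threads());
}
```

On a base M1 this would be expected to match the machdep.cpu.thread_count value above, while an M1 Pro, Max, or Ultra should report more.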

Also, just to be sure, could you please test it again on your Mac as follows?

  1. Revert the Cargo.toml of rayon.
  2. Use this Cargo.lock:
    • Cargo.lock.zip
    • (Please unzip it. GitHub Issues does not allow attaching a .lock file directly.)

FYI, I did the following:

$ git clone git@github.com:rayon-rs/rayon.git
$ cd $_
$ git checkout a92f91b
$ git rev-parse HEAD
a92f91bf43aa3fd7f37f57bf603122a315255b9e
$ cargo update -p crossbeam-deque --precise 0.8.1
$ cargo update -p crossbeam-epoch --precise 0.9.8
## Continued running `cargo update` on different crates.
...

$ cargo tree
rayon v1.5.3 (/Volumes/data2/git-repos/rayon)
├── crossbeam-deque v0.8.1
│   ├── cfg-if v1.0.0
│   ├── crossbeam-epoch v0.9.8
│   │   ├── cfg-if v1.0.0
│   │   ├── crossbeam-utils v0.8.8
│   │   │   ├── cfg-if v1.0.0
│   │   │   └── lazy_static v1.4.0
│   │   ├── lazy_static v1.4.0
│   │   ├── memoffset v0.6.5
│   │   │   [build-dependencies]
│   │   │   └── autocfg v1.1.0
│   │   └── scopeguard v1.1.0
│   │   [build-dependencies]
│   │   └── autocfg v1.1.0
│   └── crossbeam-utils v0.8.8 (*)
├── either v1.6.1
└── rayon-core v1.9.3 (/Volumes/data2/git-repos/rayon/rayon-core)
    ├── crossbeam-channel v0.5.4
    │   ├── cfg-if v1.0.0
    │   └── crossbeam-utils v0.8.8 (*)
    ├── crossbeam-deque v0.8.1 (*)
    ├── crossbeam-utils v0.8.8 (*)
    └── num_cpus v1.13.1
        └── libc v0.2.126
[build-dependencies]
└── autocfg v1.1.0
[dev-dependencies]
├── lazy_static v1.4.0
├── rand v0.8.5
│   ├── libc v0.2.126
│   ├── rand_chacha v0.3.1
│   │   ├── ppv-lite86 v0.2.16
│   │   └── rand_core v0.6.3
│   │       └── getrandom v0.2.7
│   │           ├── cfg-if v1.0.0
│   │           └── libc v0.2.126
│   └── rand_core v0.6.3 (*)
└── rand_xorshift v0.3.0
    └── rand_core v0.6.3 (*)

## Disable sccache
$ unset RUSTC_WRAPPER

## Run tests
$ cargo test
$ cargo +1.62.0 test

Thanks!

tatsuya6502 avatar Jul 23 '22 03:07 tatsuya6502