Add support for native extensions on Linux aarch64
This is based upon the work in #330 and extended to also work on aarch64 (I briefly tested this on an NVIDIA GH200). There are still some minor cleanups needed, hope I'll find a spare minute soon.
Thanks for the PR!
> I briefly tested this on an NVIDIA GH200
That's a pretty nice machine to be testing on =)
> There are still some minor cleanups needed, hope I'll find a spare minute soon.
The changes here look pretty good to me. What other changes were you thinking of?
I built this branch on Asahi Linux (M1 Mac mini), but when I run it I get this message:
./py-spy record -n -d 5 -- python -c "import numpy as np"
Collecting stack traces from native extensions (`--native`) is not supported on your platform.
Is there something I am missing?
@SimonHeim did you build with 'cargo build --release --features unwind'?
You can also get the binaries from the GitHub Actions artifacts, which should have precompiled wheels with this support built in.
> @SimonHeim did you build with 'cargo build --release --features unwind'?
> You can also get the binaries from the GitHub Actions artifacts, which should have precompiled wheels with this support built in.
@benfred
This was the problem. Thanks. I am running my workload now. I may take a stab at adding support for --native on macOS arm64.
When I run this on my Asahi M1 mini, py-spy begins to fall behind when sampling at 100 Hz. Is this expected? I have been able to profile the same workload with perf trampoline support at 10 kHz.
...
py-spy> 36.81s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 37.32s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 37.90s behind in sampling, results may be inaccurate. Try reducing the sampling rate
py-spy> 38.49s behind in sampling, results may be inaccurate. Try reducing the sampling rate
...
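As a rough illustration of why the lag in warnings like these keeps growing, here is a minimal sketch of a fixed-rate sampler whose per-sample cost exceeds the sampling interval. The function name, the model, and the numbers are assumptions for illustration only. This is not py-spy's actual implementation, and the 15 ms cost is not a measurement from this machine.

```python
def sampling_lag(rate_hz: float, cost_per_sample_s: float, n_samples: int) -> float:
    """Seconds a fixed-rate sampler is behind schedule after n_samples.

    Hypothetical model: samples are scheduled every 1/rate_hz seconds and
    each one takes cost_per_sample_s to collect. Not py-spy's own code.
    """
    interval = 1.0 / rate_hz
    # If a sample costs no more than the interval, the sampler keeps up.
    overrun = max(0.0, cost_per_sample_s - interval)
    # Otherwise every sample adds the same overrun to the accumulated lag.
    return n_samples * overrun


if __name__ == "__main__":
    # Assumed numbers: 100 Hz target (10 ms interval), 15 ms per sample.
    for n in (1_000, 5_000, 10_000):
        print(f"after {n} samples: {sampling_lag(100, 0.015, n):.2f}s behind")
```

In this model the lag grows without bound once a single sample takes longer than the interval, which is why the warning suggests reducing the sampling rate.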
I'm also seeing py-spy fall behind when sampling at 100 Hz, tested on an Orin NX and a Raspberry Pi 4. I tried running py-spy under perf on the Pi, but soon realized that I didn't really know what I was looking at. I've attached the compressed perf.data (212 MB unzipped) in case it's helpful to anyone: perf.zip