add stack sampling profiler for flamegraphs
This code is a bit rough in the sense that I have taken shortcuts to iterate quickly. It is not ready for merging yet, but I think it's already useful for people who want to generate flamegraphs for things as large as NixOS.
Usage:
$ NIX_PROFILE_FILE=/tmp/nixos-trace nix run github:mic92/nix-1/sampling-profiler eval -v --no-eval-cache github:mic92/dotfiles#nixosConfigurations.turingmachine.config.system.build.toplevel
In this example the result is stored in /tmp/nixos-trace. It can be imported into tools that support folded stacks, e.g. https://www.speedscope.app/ or the original flamegraph script (https://github.com/brendangregg/FlameGraph).
The profiler records a stack trace of the Nix evaluation every 10 ms (100 Hz).
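Because samples are taken at a fixed interval, sample counts translate roughly into wall-clock time. A minimal sketch of that conversion, assuming the trace uses the usual folded-stack layout (`frame;frame;... count` per line, with a missing count treated as one sample). The function names are illustrative and not part of this PR:

```python
from collections import Counter

SAMPLE_INTERVAL_S = 0.010  # 10 ms per sample (100 Hz), as stated above


def parse_folded_line(line):
    """Split one folded-stack line into (frames, count).

    Assumes the layout "frame1;frame2;...;frameN count". A line without a
    trailing integer is treated as a single sample.
    """
    line = line.strip()
    stack, sep, tail = line.rpartition(" ")
    if sep and tail.isdigit():
        return stack.split(";"), int(tail)
    return line.split(";"), 1


def self_time_seconds(lines):
    """Estimate the seconds spent in each leaf frame from sample counts."""
    samples = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        frames, count = parse_folded_line(line)
        samples[frames[-1]] += count  # the last frame is the one executing
    return {frame: n * SAMPLE_INTERVAL_S for frame, n in samples.items()}
```

Passing an open file object, for example `self_time_seconds(open("/tmp/nixos-trace"))`, streams the trace line by line. The numbers are estimates: anything shorter than the sampling interval may not show up at all.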
The resulting file compresses well with zstd:
/tmp/nixos-trace : 0.27% ( 2.15 GiB => 5.95 MiB, /tmp/nixos-trace.zst)
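The compression ratio suggests the trace is highly repetitive. If that is because identical stacks are written once per sample (an assumption, I have not checked this against the output format), merging them before import should shrink the file considerably. A rough sketch, with `collapse_stacks` being a hypothetical helper and not something this PR provides:

```python
from collections import Counter


def collapse_stacks(lines):
    """Merge identical folded stacks and sum their sample counts.

    Assumes each line is "frame1;frame2;... count"; a line without a
    trailing integer counts as one sample.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        stack, sep, tail = line.rpartition(" ")
        if sep and tail.isdigit():
            counts[stack] += int(tail)
        else:
            counts[line] += 1
    # One output line per unique stack, sorted for stable output.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]
```

Only the unique stacks are kept in memory, so this can be fed an open file object even for a multi-gigabyte trace.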
This is a screenshot from evaluating my NixOS machine:
Here is an example trace: https://github.com/Mic92/nix-1/releases/download/assets/nixos-trace.zst
You can download and decompress it with zstd:
zstd -d /tmp/nixos-trace.zst
Then visit https://www.speedscope.app/ and import it.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/flamegraphs-for-nixos/51183/1
cc @picnoir @Atemu who were involved in the Tracy-based profiler: https://github.com/NixOS/nix/pull/9967
Thanks a bunch for this! I tried going through your flame graph and have some feedback:
- File + line isn't very useful info at a glance. To actually get an idea of what's going on, you'd need the name of the binding on that line in addition to the file+line number
- There's a bunch of `«none»`s in the output, what do those represent?
Thanks a bunch for this! I tried going through your flame graph and have some feedback:
1. File + line isn't very useful info at a glance. To actually get an idea of what's going on, you'd need the name of the binding on that line in addition to the file+line number
The issue is that functions in Nix don't really have names. Often enough they are assigned to variables, but not always, e.g. when they are passed to other functions.
2. There's a bunch of `«none»`s in the output, what do those represent?
For example, if a builtin calls a function, then we don't have a position at the moment. I might be able to provide the string name of builtins, but I don't know if I can get the position in this case as well.
The issue is that functions in Nix don't really have names.
True, but ExprLambda has a name that's based on its context, as it is often part of a binding. It's imperfect information, but it works well.
- There's a bunch of `«none»`s in the output, what do those represent?
This may be the position of the call instead of the position of the function. ExprLambda has its own position which I believe is always available.
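To get a feel for how much of a trace is affected, one could measure the share of samples whose stack contains such a frame. This is only a sketch: it assumes the folded-stack layout `frame;frame;... count` and that the missing positions appear as a frame that is literally `«none»`. The helper name is made up for illustration:

```python
def none_share(lines, marker="«none»"):
    """Return the fraction of samples whose stack contains `marker`."""
    total = 0
    with_marker = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        stack, sep, tail = line.rpartition(" ")
        if sep and tail.isdigit():
            count = int(tail)
        else:
            # No trailing count: treat the whole line as one sample.
            stack, count = line, 1
        total += count
        if marker in stack.split(";"):
            with_marker += count
    return with_marker / total if total else 0.0
```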
Rebased + macOS fix. I haven't addressed any of the comments yet.
@Mic92 sorry to bump, but since it's been a while since there's been any activity on this, do you have any advice for people interested in pushing this feature forward? I personally would appreciate any recommendations you have for what should be done next.
@ConnorBaker hopefully this https://github.com/NixOS/nix/pull/13219 can get the ball rolling once again. @Mic92 please check that the API I provided in that PR is enough to accomplish what is necessary here.
Indeed. I needed something like that. Thanks for looking into it.
I've also taken the liberty to rebase this patch on top of the proposed EvalProfiler (as a technical POC) and added more information (primop name and lambda name) https://github.com/NixOS/nix/pull/13220. Feel free to cherry-pick changes from that branch.
Let's go with your pull request in https://github.com/NixOS/nix/pull/13220