rustc-perf
Add a page to show the diff in symbols
Tracking down regressions caused by LLVM's optimizations changing in response to subtle code changes can be hard. One useful tool for this is dumping the symbols in librustc_driver.so and comparing them at different commits. I think it could be helpful to add a page that shows this diff for each perf run.
My current command for dumping the symbols is:
nm build/x86_64-unknown-linux-gnu/stage1/lib/librustc_driver-a150cfa449c4561a.so | grep -oP '^[^ ]* . \K.*$' | c++filt
What do you think about this? Is it out of scope for the perf pages? Should I rather create a small command that downloads these binaries from try builds and compares them locally?
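For the "compare them locally" option, a minimal sketch of what such a command could look like is below. It assumes the two librustc_driver.so files are already on disk (downloading them from try builds is not shown), and the helper names `dump_symbols` and `diff_symbols` are made up for illustration. `dump_symbols` reuses the nm pipeline from above, so it needs GNU grep for `-P`.

```shell
# Print the demangled symbol names of a shared object, sorted and de-duplicated.
# The sort uses the C locale so that comm below sees a consistent ordering.
dump_symbols() {
    nm "$1" | grep -oP '^[^ ]* . \K.*$' | c++filt | LC_ALL=C sort -u
}

# Compare two symbol lists produced by dump_symbols (one symbol per line).
# Symbols only in the first list are printed with a "-" prefix,
# symbols only in the second list with a "+" prefix.
diff_symbols() {
    LC_ALL=C comm -23 "$1" "$2" | sed 's/^/-/'
    LC_ALL=C comm -13 "$1" "$2" | sed 's/^/+/'
}

# Hypothetical usage, with OLD_SO and NEW_SO pointing at the two builds:
#   dump_symbols "$OLD_SO" > old.syms
#   dump_symbols "$NEW_SO" > new.syms
#   diff_symbols old.syms new.syms
```

Since only the sorted symbol lists are needed for the comparison, these text files are also what a perf run would have to store per commit.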
Hm, I think this should be reasonable to add, but it'll need some work as we only have numeric statistics right now. Would you want to do anything other than a raw text file though for the symbols? If it's just the text file we could put that in the perf S3 bucket (alongside the self profile data), but that would mean that querying many commits at the same time is likely too slow (as each would need an s3 request to load up the data, vs. a database query if the symbols were stored that way).
> If it's just the text file we could put that in the perf S3 bucket (alongside the self profile data), but that would mean that querying many commits at the same time is likely too slow (as each would need an s3 request to load up the data, vs. a database query if the symbols were stored that way).
I didn't even know there's an option to store things differently. I don't know enough to say anything about downsides to throwing the symbol lists into a database.
> Would you want to do anything other than a raw text file though for the symbols?
The list of symbols can be quite large (250k entries), and a main use case is going through the diff, potentially filtered by some substring (module names are really practical here). So querying the relevant entries from the database and then diffing them may be the most convenient way to use this information.
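To illustrate the filtered diff, here is a rough sketch that works on two plain symbol list files rather than a database. The function name `filtered_symbol_diff` and the `NEEDLE` environment variable are hypothetical. It keeps only symbols containing the given substring and then reports which of them were removed or added.

```shell
# filtered_symbol_diff OLD NEW SUBSTRING
# OLD and NEW are text files with one demangled symbol per line.
# Prints "-symbol" for symbols only in OLD and "+symbol" for symbols only
# in NEW, restricted to symbols that contain SUBSTRING (plain text, not a
# regex). An empty SUBSTRING keeps every symbol.
filtered_symbol_diff() {
    NEEDLE="$3" awk '
        BEGIN { needle = ENVIRON["NEEDLE"] }
        # Skip symbols that do not contain the substring.
        needle != "" && index($0, needle) == 0 { next }
        # Lines from the first file go into old, all others into new.
        FILENAME == ARGV[1] { old[$0] = 1; next }
        { new[$0] = 1 }
        END {
            for (s in old) if (!(s in new)) print "-" s
            for (s in new) if (!(s in old)) print "+" s
        }
    ' "$1" "$2" | LC_ALL=C sort -k1.2   # order by symbol name, ignoring the +/- prefix
}

# Hypothetical usage:
#   filtered_symbol_diff old.syms new.syms rustc_middle
```

With around 250k entries per list, holding both filtered sets in memory like this should be unproblematic, which is part of why plain text files may be enough for the two-commit case.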
Hm, I guess if you wanted, for example, a graph of the number of symbols per commit over time, plain text file storage would likely be a bit suboptimal, as we'd need to download a bunch of files (which would be slow). So if that's something we care about, we'll want to store the count (or perhaps some other statistics) directly in the database.
But if you're mostly thinking that we'll be diffing two, then it should be easy to just store them in the S3 bucket; two get requests are relatively cheap.
> But if you're mostly thinking that we'll be diffing two, then it should be easy to just store them in the S3 bucket; two get requests are relatively cheap.
Yea that. I don't think having a graph tracking the total number of symbols is interesting or useful, but if that changes we can revisit. So the best thing is indeed to just have these two files ready to have a diff run on.