codon
codon copied to clipboard
C++ benchmarks are poorly optimized
This project is fantastic and wonderful for the python community. However, the C++ benchmark comparisons are poorly optimized and therefore don't give very honest comparisons. I'm submitting pull requests for simple improvements that significantly improve speed. I'm not trying to bash your project, I think it is amazing, but I also think coders should be given a healthy comparison when it comes to performance.
Hi @stephenberry, thanks for your contributions. I saw your PRs but didn't have a chance to comment on them yet. There are a few important things to note here:
-
Some of benchmarks (e.g.
nbody-- see thebench/README for details) are taken from the pyperformance benchmark suite. The Codon versions use the exact same code/algorithms as from this suite, even if it may not be optimal. The C++ implementations also use the same algorithm. We don't want to change these algorithmically because we want to remain consistent with pyperformance (meaning the C++ versions should use the same, possibly sub-optimal, algorithm that Codon/Python use). -
The Codon versions are themselves actually suboptimal in a lot of cases, but the aim of these benchmarks is to represent a balance between performance and idiomatic code. You can make the Codon versions much faster in several cases by restructuring the code, using Codon-specific features, etc. In general, comparing languages on performance is very hard for this reason.
-
We can certainly accept small changes/optimizations to the C++ implementations but I don't think it makes sense to incorporate large-scale / algorithmic changes because of these reasons.
Happy to answer any other questions / give more info on anything.
One other thing I forgot to mention: we would definitely welcome C++ implementations of the benchmarks that don't have one yet. (The Codon-parallelized ones can use OpenMP in C++, which is what we use in Codon.)
Thanks for your feedback.
- I'm just making very simple optimizations that would take a normal programmer only a few minutes to spot and change. Nothing fancy. And, I think they make sense when the performance is multiple times faster.
- I guess I could submit a pull request for the pyperformance benchmark. But, this was just changing two lines of code that are are simply not smart to use. Using std::pow with 0.5 instead of std::sqrt isn't smart and not the first function a programmer would choose.
- I won't change any algorithm logic. I'm just fixing bad programming that allocates everywhere or cases like removing std::pow when it isn't needed.
- I'll look at adding benchmarks for the other cases. I think it's great that your compiled python is running so fast! I just don't want the professional world to scoff at you because your benchmarks are poor.
Thanks -- that all makes perfect sense and again very much appreciate your contributions. Let's take them on a case by case basis; there's definitely reasonable room for improvement in a few places, and if C++ turns out to be faster than we thought, then that's something we want to know anyway :)
I won't change any algorithm logic. I'm just fixing bad programming that allocates everywhere or cases like removing std::pow when it isn't needed.
Simple trivial changes (resize etc.) should be OK. Changing std::pow is changing the logic itself, and is outside of the scope. Codon uses the same std::pow AFAIK, so then we'd have to change that, and so on, and so on. This ideally should be left to compiler.
It is pretty funny to claim it is faster than C++ and then not allow people to change std::pow to sqrt in benchmarks. Because everyone knows, that is how you write C++ /s