Programming-Language-Benchmarks
Problems requiring heavy library usage might not be good for benchmarking.
Some of the current problems being used for benchmarking usually result in program authors using a library to do much of the work. Examples are the edigits and pidigits problems, which usually require an arbitrary-precision math library like GMP; the regex-redux problem, which usually requires a regular expression library like PCRE or RE2; and the secp256k1 problem, which usually requires a crypto or arbitrary-precision math library. The goal of these benchmarks seems to be to benchmark programming languages and their implementations, so it might not be a good idea to have problems that will typically be heavily dependent on libraries to do much of the work.
Many of the libraries will be implemented in a different programming language (like C, C++, or assembly) than the one the program is written in, and libraries can also have greatly different performance from one another. This makes these problems more of a benchmark of the libraries than of the programming languages and their implementations.
Also, if there is one highly dominant library (like GMP for arbitrary-precision math), this can result in many ties. This was demonstrated about a year ago on the pidigits benchmark of the Computer Language Benchmarks Game, where there was roughly a 10-way tie for best performance. This is highly indicative of just how much program performance for these problems depends on the libraries being used.
Many people probably won't even be aware of this library usage, but those who are probably won't find benchmarking of libraries quite as interesting as benchmarking programming languages and their implementations. I know at least a couple of other people share these thoughts. I would suggest that the edigits, pidigits, regex-redux, and secp256k1 problems (as well as any others I may be missing) should be removed and future problems should try to avoid the use of libraries.
Many of the libraries will be implemented in a different programming language
FFI bindings are allowed only when a program is using its standard library (e.g. bigint in Julia; Crystal uses GMP under the hood). The third-party libraries I choose are all lang-native implementations (please feel free to point out exceptions, I might have misread some of them). P.S. 0.go for secp256k1 is a known exception that uses FFI bindings to C libsecp256k1, which I've commented at the top of the code.
it might not be a good idea to have problems that will typically be heavily dependent on libraries to do much of the work.
Bigint, HTTP server/client, and coroutines are the exceptions, as it's impossible to implement these in ~100 LOC. For other cases, third-party libs are avoided as much as possible; e.g. for secp256k1, only 0.go uses cgo bindings to C libsecp256k1, 0.rs uses a pure Rust implementation, and all the others are implemented with only the bigint library.
I would suggest that the edigits, pidigits, regex-redux, and secp256k1 problems (as well as any others I may be missing) should be removed
Again, FFI bindings (e.g. Crystal bigint) / assembly (e.g. Go bigint) are accepted here only when they're part of the language's stdlib.
it might not be a good idea to have problems that will typically be heavily dependent on libraries to do much of the work.
Please note that Common Lisp was standardized in 1994; all newer language extensions/libraries are therefore external in that sense. Should it be frozen in its core implementation? I myself beg to differ.
Programmers and engineers develop tools to be reused and to make our lives less miserable. Library developers develop optimized libraries for the same reason reusable tools are developed in any other field of engineering.
I would also add that it is one of the core responsibilities of an engineer to use the best tool for the job, and that includes the selection of the language and the libraries used for the specific problem at hand.
Not every language is optimal for every problem, and if anyone feels that a language falls short in any way, then they can just opt not to use it. Other people might still like it because it has other advantages apart from being the fastest.
And how about a language that translates itself to C and uses the native C compiler? Should it be allowed to use only native libraries, or C libraries as well?
In my humble opinion, one of the site maintainer's core responsibilities is to make sure that no code is cheating by reducing the problem space in any way, and also to give clear instructions about what is allowed and what is not.
This site maintainer is definitely doing the first one very vigorously! (maybe even too vigorously:)
FFI bindings are allowed only when a program is using its standard library. The third-party libraries I choose are all lang-native implementations.
I hadn't noticed that nearly all the third-party libraries in use are also programmed in the same language as the program. That seems like a decent improvement over the Computer Language Benchmarks Game: third-party libraries programmed in different languages generally are not being allowed here. However, it still seems unfair to have problems that will typically rely on a good library to achieve good performance, allow some programs to use good libraries programmed in a different language as long as they are part of the standard library, and then disallow other programs from using those same libraries. This is a valid reason for people to question the legitimacy of the Computer Language Benchmarks Game, and I'm hoping these benchmarks might still address it.
0.go for secp256k1 is a known exception that uses FFI bindings to C libsecp256k1
Shouldn't that 0.go program be removed since it's using a C library that isn't a part of Go's standard library? I can't see any reason why an exception to your rules should be allowed for that Go program. If that Go program is allowed to use a C library, it seems only fair that every other language should get to use C libraries.
Lastly, I also just want to put a little more emphasis on how much libraries can affect performance; libraries can have vastly different performance from each other. Regex libraries can be implemented using NFAs, DFAs, backtracking, etc., and they can also make use of assembly code, JIT compilation, and so on. Arbitrary-precision math libraries may use assembly code and different algorithms for various operations (multiplication alone can be done using long multiplication, Karatsuba multiplication, Toom-Cook multiplication, or even Fourier transforms). Hash table libraries can choose from a wide variety of hash functions, plus multiple ways to deal with collisions like chaining or open addressing, and with open addressing there is still more variation such as linear probing, quadratic probing, double hashing, etc. All these different choices give people another valid reason to question the legitimacy of benchmarks, which is why I think it's a good idea to try to avoid heavy library usage.
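To make the multiplication point concrete, here is a minimal sketch of Karatsuba multiplication in Python. This is only an illustration of the kind of algorithmic choice that lives inside a bigint library, not code from GMP or from any benchmark program:

```python
def karatsuba(x, y):
    """Multiply non-negative integers x and y with Karatsuba's algorithm."""
    # Base case: fall back to ordinary (long) multiplication for small inputs.
    if x < 10 or y < 10:
        return x * y
    # Split both operands into high and low halves around 2**n.
    n = max(x.bit_length(), y.bit_length()) // 2
    xh, xl = x >> n, x & ((1 << n) - 1)
    yh, yl = y >> n, y & ((1 << n) - 1)
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    # Three recursive multiplications instead of four: that's the whole trick,
    # and it's why the cost drops from O(n^2) toward O(n^1.585).
    c = karatsuba(xh + xl, yh + yl) - a - b
    return (a << (2 * n)) + (c << n) + b
```

A library that switches between long multiplication, Karatsuba, Toom-Cook, and FFT-based multiplication at well-tuned thresholds (as GMP does) will easily outperform one that doesn't, regardless of the language either library is written in.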
On a side note, I'm kind of wondering whether it's a good idea to have benchmark problems whose programs are likely to use crypto libraries that have legitimate concerns about side-channel timing attacks. I was looking at some of the source code of libsecp256k1 and it makes a couple of mentions of avoiding side-channel timing attacks (which often means intentionally making the code slower). This is just another example of how libraries can differ in performance, in this case because a library was designed to be more secure.
it might not be a good idea to have problems that will typically be heavily dependent on libraries to do much of the work.
Please note that Common Lisp was standardized in 1994; all newer language extensions/libraries are therefore external in that sense. Should it be frozen in its core implementation? I myself beg to differ.
I don't think my suggestion applies to what you mentioned. My suggestion is to try to avoid having problems where the libraries do much of the work. I'm OK with allowing libraries that enable primitive/basic things (say, support for multithreading or SIMD), which may be important for performance but ultimately are not responsible for doing most of the work. What I'm not OK with is problems like the pidigits problem where often 90+% of the time is spent in arbitrary precision math libraries or the regex-redux problem where often 60+% of the time is spent in regular expression libraries.
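For reference, the pidigits workload really is almost pure bigint arithmetic. Here is a minimal sketch of a streaming pi-digit spigot (Gibbons' algorithm), leaning on Python's built-in arbitrary-precision int rather than GMP; this is an illustration, not one of the benchmark programs:

```python
def pi_digits(n):
    # Gibbons' spigot: every step below is exact big-integer arithmetic,
    # so almost all of the runtime lands inside the bigint implementation.
    digits = []
    q, r, t, i = 1, 180, 60, 2
    while len(digits) < n:
        u = 3 * (3 * i + 1) * (3 * i + 2)
        y = (q * (27 * i - 12) + 5 * r) // (5 * t)
        digits.append(y)
        q, r, t, i = (10 * q * i * (2 * i - 1),
                      10 * u * (q * (5 * i - 2) + r - y * t),
                      t * u,
                      i + 1)
    return digits

print("".join(map(str, pi_digits(15))))
```

Profile something like this and the multiply/divide operations on ever-growing integers dominate, which is exactly the 90+% figure mentioned above.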
What I'm not OK with is problems like the pidigits problem where often 90+% of the time is spent in arbitrary precision math libraries or the regex-redux problem where often 60+% of the time is spent in regular expression libraries.
And if that arbitrary-precision math is part of the language and the regex is written in the same language?
To be frank, I don't get your point. Using languages without libraries is like building a house without tools.
Though, I kind of agree that the absolute best would be to use libraries written in the same language (it would be quite mad to do it in python and ….), however, why should all the existing libraries be rewritten for all the languages? Each and every language has FFI to tap into existing libraries. Your job is to pick the one best suited for the job.
Nevertheless, even after my last paragraph, for this particular benchmark I do tend to agree that only libraries written in the same language should be allowed, without exception.
What I'm not OK with is problems like the pidigits problem where often 90+% of the time is spent in arbitrary precision math libraries or the regex-redux problem where often 60+% of the time is spent in regular expression libraries.
And if that arbitrary-precision math is part of the language and the regex is written in the same language?
Even if the libraries are written in the same language, they can still have very different performance from one another because one library might be using a much better algorithm (see my previous comment for examples of why this happens). This project's website calls itself the Programming Language and compiler Benchmarks, not the Programming Language, Library, Algorithm, and compiler Benchmarks. These benchmarks turn into library and algorithm benchmarks if libraries are doing most of the work instead of code written in the programming language.
Using languages without libraries are like building a house without tools.
Yes, in most cases doing programming without using libraries is a bad idea, but that's not what I'm arguing. What I'm arguing is that in this case it is a bad idea to benchmark programming languages and their implementations with programs that make heavy usage of libraries that may be written in different programming languages and use different algorithms.
To continue with your analogy, let's say that you and I wanted to determine who is faster at digging a hole for a house's basement. I start using a bucket to move the dirt somewhere else. You take this a step further and use two buckets (kind of like using multithreading in a program) to move dirt even faster. Fair enough, you're still doing the work, just doing it more efficiently. Then you take things a step further and get a wheelbarrow (maybe kind of like using SIMD in a program) so you can move yet more dirt faster. OK, still fair enough because you're still doing the work but even more efficiently. I get fed up with this and then hire some large company to move the dirt for me (kind of like using a library in a program) and now we're going way faster than you. In this case could I now legitimately say that I'm faster than you or would it be more appropriate to say that the large company I hired is faster than you? I think it would be the latter because they were the ones that did most of the work.
What I'm arguing is that in this case it is a bad idea to benchmark programming languages and their implementations with programs that make heavy usage of libraries that may be written in different programming languages and use different algorithms.
Let's make the topic more realistic: when ppl are making choices from many langs, do they take standard or most popular libraries into consideration? Let's say you're choosing a language for your new program which will heavily depend on bigint / regex, do you think pidigits / edigits / regex-redux numbers are helpful for you to make the decision or not? Do you agree that the performance of a language should be evaluated in a library-agnostic way or not? Besides these, there are also numbers for the same source code with different compilers in the same language; do you think using libraries makes that comparison invalid?
At the end of the day, if you write your real-life program with those libraries, do the benchmarks of those libraries reflect an important aspect of the language's performance or not?
the libraries can still have very different performance from other libraries because one library might be using a much better algorithm
How about if your program is using a better algorithm and running much faster? That only means you are a better programmer, and the same applies if you are using a better library. Though, as I mentioned, I tend to agree in this case that the libraries should be written in the same language.
programs that make heavy usage of libraries that may be written in different programming languages and use different algorithms.
A different language can be questionable, but a different algorithm? Just use the better one, as a good engineer would do.
In this case could I now legitimately say that I'm faster than you or would it be more appropriate to say that the large company I hired is faster than you? I think it would be the latter because they were the ones that did most of the work.
Were the conditions about the allowable use of “tools” agreed upon beforehand? If yes, and even subcontractors were allowed, and you picked a better (in this case faster) one, then what is the problem? If not, then it is called cheating and you are out of the competition.
One more note: if the library is written in the same language, doesn't that automatically imply that you are benchmarking that particular language? Though I do agree that not all libraries are of the same quality. But then you can ask the library developers to improve it, or help them out by providing better/faster code.
when ppl are making choices from many langs, do they take standard or most popular libraries into consideration?
It depends. I know many Python and Java developers love how much functionality those languages include in their standard libraries, which usually allows faster and easier development. On the other hand, developers of high-performance software (OS kernels, AAA 3D games, multimedia software, etc.) often won't care too much and will use whatever is the fastest thing available to them (or just write their own if they can't find anything that meets their needs).
Let's say you're choosing a language for your new program which will heavily depend on bigint / regex, do you think pidigits / edigits / regex-redux numbers are helpful for you to make the decision or not?
They'd probably be at least partially useful to a wise person who looks at the source code of the programs and the libraries being used. To a person who just looks at the benchmark results, there is potential for the results to lead to a bad decision. Case in point: do you know why regex-redux is called regex-REDUX? Basically it's because over two dozen of the original regex-DNA programs weren't following the CLBG (Computer Language Benchmarks Game) rules about using regular expressions to do the searching and instead were just doing plain text searching. See this thread where I brought up this issue several years ago. The regex-DNA benchmark was flawed, and it would have been a bad idea to make a decision based on its results. Similarly, one of the PCRE developers made some benchmarks at https://zherczeg.github.io/sljit/regex_perf.html that show RE2 struggling pretty badly on a regular expression like "[a-q][^u-z]{13}x" but otherwise performing well. If the regex-redux problem only used regular expressions like that, someone might think RE2 is no good, whereas better benchmarks like Zoltan's show that it is actually quite good.
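The pattern-dependent performance point is easy to poke at yourself. A rough sketch using Python's backtracking `re` engine on synthetic input (not the actual benchmark data; absolute numbers will vary wildly by engine and machine):

```python
import random
import re
import time

random.seed(0)
# Synthetic lowercase haystack; real regex benchmarks use larger, structured inputs.
text = "".join(random.choice("abcdefghijklmnopqrstuvwxyz") for _ in range(200_000))

# The pattern from Zoltan's page that different engines handle very unevenly.
pattern = re.compile(r"[a-q][^u-z]{13}x")

start = time.perf_counter()
matches = pattern.findall(text)
elapsed = time.perf_counter() - start
print(f"{len(matches)} matches in {elapsed:.4f}s")
```

Feeding the same pattern to a backtracking engine, a JIT-compiled engine, and a DFA-based engine gives very different timings for identical program code, which is the whole concern.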
Do you agree that the performance of a language should be evaluated in a library-agnostic way or not?
Yes, benchmarks of a programming language and its implementation should be done in a way that tries to avoid heavy influence from the quality of any libraries being used.
Besides these, there are also numbers for the same source code with different compilers in the same language; do you think using libraries makes that comparison invalid?
Again, it depends. If you're using the exact same compiled libraries with multiple programs, the libraries should perform the same for each program, and any performance differences between the programs are then a result of the remaining non-library code. That should be a valid comparison for seeing which program has the fastest non-library code. However, if the goal is to compare the programming languages and their implementations and some libraries are doing ~99% of the program's work (which is the case for some of the pidigits programs), then the comparison can be pretty useless (but still valid!) when only the remaining ~1% of the work is done by actual code in that programming language and its implementation.
Comparisons of programs can easily be invalidated. You can invalidate a comparison of two programs built from the same program source code, the same library source code, and the same compiler just by using different compiler options for the library, for instance. I'm pretty sure this is already happening for the pidigits problem in this project and, at least for a while, on the CLBG. Chapel is using its own GMP library that is a little more optimized than the GMP library being used by some of the other programs. (Edit: This is correct for Chapel's preferred configuration (see https://chapel-lang.org/docs/usingchapel/QUICKSTART.html#using-chapel-in-its-preferred-configuration) and probably correct for the CLBG, but it turns out this project is using the chapel/chapel:latest docker image, which sets the CHPL_GMP environment variable to "system" and should make Chapel use Debian's GMP library.) I was looking into this many months ago for the CLBG and came to the conclusion that it was due to better compilation options and/or better-configured GMP source code (this could be considered using different library source code, but GMP has a program for tuning some of its parameters; see https://gmplib.org/manual/Performance-optimization). I verified for the CLBG that it is in fact the library that is faster by making a program use Chapel's GMP library, which causes it to run just as fast as the Chapel program.
At the end of the day, if you write your real-life program with those libraries, do the benchmarks of those libraries reflect an important aspect of the language's performance or not?
Again, it depends.
For comparing programming languages and their implementations, I think that avoiding heavy usage of libraries leads to much more useful and valid comparisons.
How about if your program is using a better algorithm and running much faster? That only means you are a better programmer, and the same applies if you are using a better library. Though, as I mentioned, I tend to agree in this case that the libraries should be written in the same language.
If someone is using a better, accepted algorithm in their program, then the other programmers can just copy that same algorithm if it works better for them too. This has happened many times on the CLBG.
The same does NOT currently apply if someone uses a better library. If one programming language implementation's standard library uses some great library made in assembly but another implementation's standard library does not, then programs made with that second implementation are at a disadvantage, and nothing can be done about it except trying to convince the developers of that implementation to include such a library in a future release.
programs that make heavy usage of libraries that may be written in different programming languages and use different algorithms.
A different language can be questionable, but a different algorithm? Just use the better one, as a good engineer would do.
Yeah, you can likely just use the better algorithm, but you can NOT just use a different programming language for the library (that's disallowed by the rules), and that can almost always put you at a potential disadvantage unless you're programming in assembly.
Were the conditions about the allowable use of “tools” agreed upon beforehand? If yes, and even subcontractors were allowed, and you picked a better (in this case faster) one, then what is the problem? If not, then it is called cheating and you are out of the competition.
You're missing the point. What if we agree to allow the hiring of companies but for some reason I have more money than you and can hire a better company so we still are faster than you? The point is that >> I << would not be doing the work so it wouldn't be right to say that >> I << am faster than you.
One more note: if the library is written in the same language, doesn't that automatically imply that you are benchmarking that particular language? Though I do agree that not all libraries are of the same quality. But then you can ask the library developers to improve it, or help them out by providing better/faster code.
Yes, you'd be evaluating the performance of the programming language implementation that the library is implemented in. However, you might be adding a lot of other variables that you don't want to add. Maybe the library is built with a different implementation of the language than the one your program is using. Maybe the library's implementation doesn't support all the same features (like SIMD or multithreading) that your program's implementation does. Maybe the library was compiled with less optimal compilation options. In my opinion it's better to avoid heavy use of libraries so that you reduce the influence of these other variables.
This is a tough one and we can argue back and forth for ages.
How about letting the site maintainer decide about the rules? It should be his responsibility.
After he or she communicates them clearly, we have to abide by the rules and stick with them.
At the moment I am kind of thinking about dropping out altogether, since it looks like I can't even convince the site maintainer of a simple thing: SBCL always allocates on the heap unless stack allocation is specifically requested. Even after the language maintainers confirmed it and the heap allocation profiler clearly confirmed it, I have not been getting any reply for days. This is just embarrassing and very frustrating. It looks like being right from the beginning is something that is not welcomed here. All CL binary-trees codes were removed for being too fast compared to Java. I don't even want to go into the tricky area of Java's very advanced escape analysis and what guarantees we actually have in Java. And if the compiler, current or future, decides to put something on the stack, should I intervene somehow?
I have even started improving one of the C++ codes, since I am worried about what can happen with the nsieve codes that are faster than the C++ using the same algorithm: C++ is using the slowish std::vector against the fast simple-bitvector in the Common Lisp code. I rewrote the C++ code using the Boost library, but I don't get a reply on whether that is allowed or not either. So I can make the C++ code beat the CL code only by using an external library, namely boost::dynamic_bitset<>.
Could you please have a look at the 3.cpp nsieve code sitting in my unaccepted PR and speed it up by at least 25% using standard library only to beat the 3.cl code that is using the same algorithm and is also sitting in my unaccepted PR?
So it looks like at the moment that I am wasting my time.
Correction: nsieve 3.cl is about 25% faster than 3.cpp and about 30% faster than 2.cpp on my CPU (i7-7700HQ), but both C++ codes are a bit faster on the Intel(R) Xeon(R) Platinum 8272CL. Good, I don't have to worry any more.
Could you please have a look at the 3.cpp nsieve code sitting in my unaccepted PR and speed it up by at least 25% using standard library only to beat the 3.cl code that is using the same algorithm and is also sitting in my unaccepted PR?
I'm not sure if you were referring to me. If you were, I think I'd rather port your existing, accepted 5.cl program to C/C++ instead. It looks like that is using a different and more optimal algorithm.
There must be C code for it somewhere, since as far as I know the CL code is a loose port from C.
I might have found the C code: https://github.com/mayerrobert/Primes/blob/drag-race/PrimeC/solution_2/sieve_5760of30030_only_write_read_bits.c You can amend it to suit our spec here for running and formatting output.
I have made the amendments to the above C code to work here and it runs quite a bit faster than the CL code on my particular CPU.
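For anyone following along, the flag-vector sieve idea at the center of this (simple-bitvector vs std::vector) can be sketched in a few lines. This is a plain byte-per-flag Python illustration, not the 3.cl/3.cpp code or the linked C solution:

```python
def nsieve(limit):
    """Count primes below `limit` with a sieve of Eratosthenes."""
    if limit < 2:
        return 0
    # One byte per candidate here; a packed bit-vector stores 8 flags per
    # byte instead, trading memory (and cache pressure) for bit twiddling.
    flags = bytearray([1]) * limit
    flags[0] = flags[1] = 0
    count = 0
    for i in range(2, limit):
        if flags[i]:
            count += 1
            # Clear multiples starting at i*i; smaller ones are already cleared.
            flags[i * i :: i] = bytes(len(range(i * i, limit, i)))
    return count

print(nsieve(1_000_000))
```

Whether the flags live in whole bytes or packed bits is exactly the kind of data-structure choice that makes "same algorithm" comparisons across languages slippery.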
However, I don't see any point in sending a PR since it looks like nothing is accepted from me.
No matter how nicely I ask, or what I ask about or for, I don't get any kind of reply or action for something like 7 days. Not even a yes, a no, or a "get out of here" since then.
The last message I've received 7 days ago was " I believe I've clearly explained everything I could with examples and profiling reports, and all I need is a profiler report to prove the fairness which I believe is clear and actionable. If you still cannot or refuse to understand my point, please feel free to fork the code and publish your own results, as this talk is leading nowhere but wasting my time, I'm closing it."
I have contacted the SBCL language maintainers, who confirmed my point, and a day or two later, with the help of the head SBCL maintainer, I sent the profiling report clearly proving that I was right from the beginning, but there has been no reply or action since then.
And all this hassle and annoyance and time wasting is based on this:
I've shown how other FP langs can be unfair in a similar way with profiler report. The number is not possible, there's no chance CL's number can be better or even close to JVM's.
The OCaml code was profiled. And if OCaml is doing something, then CL must be doing the same.
The numbers are not even better!!! They are better for small trees like d=14 and close or slightly better at 18, but when the JIT can really kick in at 20 or 21, Java clearly wins.
           depth 14   depth 18   depth 21
CL 1.cl:   46ms       550ms      4733ms    (new single-threaded 1.cl with cons cell)
CL 5.cl:   68ms       880ms      6110ms    (single-threaded 5.cl with struct)
Java:      132ms      651ms      3774ms    (single-threaded 2.java, java-17)
Of course the multi-threaded CL codes are still faster than the single-threaded Java code, for a reason.
I hope I am just imagining it, but how this issue is being handled feels very upsetting, annoying and extremely unprofessional.
At least on the CLBG site I first got messages like "You are not welcome here", "Harassment is not acceptable", "Bullying is not acceptable", "Anything you've asked for is rejected" before my account was cancelled altogether. And that was for complaining about using a 12-year-old CPU for the benchmark, which gave very strange relative speeds among languages. After the CPU upgrade, going from a Core 2 Q6600 (launched in '07) to an i5-3330 (launched in '12, still pretty old), the relative speed figures changed dramatically, giving relative speeds much closer to more recent CPUs. None of my code got accepted; it is just sitting among the open issues while much slower codes are listed. But the funniest thing was that anyone using my code as the basis and modifying it was also rejected, with the pointer that it was originally my code. It is like knowing the history of VW and never buying a VW car because of it. What a strange world.
But anyway, I am still hoping that the site maintainer is just away from the site, in which case my apology is very much warranted, though he was replying to you and one of the other guys in the meanwhile, so I don't know.
No matter how nicely I ask, or what I ask about or for, I don't get any kind of reply or action for something like 7 days. Not even a yes, a no, or a "get out of here" since then.
I would wait at least several more days before complaining. We just had a couple holidays in the last couple weeks and it's not uncommon for people to be taking extended vacations this time of the year so people might not be responding to things as quickly as usual. I've also got a pull request that hasn't had any responses or actions for six days so you're not the only one.
None of my code got accepted; it is just sitting among the open issues while much slower codes are listed. But the funniest thing was that anyone using my code as the basis and modifying it was also rejected, with the pointer that it was originally my code.
You did make a couple of mistakes (which Isaac and I pointed out), called Isaac's results misleading and said he must have messed something up (I verified that the results were probably correct and that he didn't mess anything up), made an unreasonable request for him to benchmark on newer hardware with at least 8 GiB of memory (for a benchmark that was demonstrated not to require much memory), and were making a lot of comments/edits/new issues. It was definitely wasting a decent amount of his time, and overall I can see how he considered it harassment. I do think it was wrong of him not to accept code from others, though, unless he had a good reason to believe they were just submitting it on your behalf.
Jeremy,
When you see speed figures on about 15 more recent CPUs that are completely different from what is shown on a benchmark site using an (at that time) 12-year-old CPU, you try to guess what is going on.
I guessed the lack of RAM as well, not just the CPU. I was indeed wrong there.
But my main point was that the 12-year-old CPU he used was absolutely not adequate for comparing languages and reporting speed figures when, I am quite certain, >95% of developers had long forgotten about that CPU architecture.
As I mentioned, I did spend a lot of time and benchmarked on about 15 more recent CPUs, ranging from something like 6-7 years old to 1-2 years old, to see if I was doing something wrong. The numbers I was getting were absolutely different from the ones on the site, so I was trying to guess the reasons.
You quite rightly defended Isaac for using 2 GB of RAM, but if I remember correctly you did not do the same for the CPU.
When you do something and you want to do it with credibility, you don't do it with ancient tools. I don't feel that asking an influential site maintainer for a more recent CPU was an unreasonable request at all, and I wasn't the only one requesting it. We were talking about a 12-YEAR-OLD CPU; in my opinion it was a joke to use it. Twelve years in CPU technology is like half a century in a lot of other technology fields. Quite a few developers stopped contributing to the CLBG because of that. Just look at the age of the Common Lisp code base there and look at the ones here, and at how much speed improvement has been achieved on those codes. And mind you, I am not even a programmer. I even wanted to send him an i7-7700 CPU that I had just replaced, but my account got cancelled so I never posted it.
The small CPU upgrade at the CLBG and the relative speed figures on this site did prove my point.
I still stand by it, and in my opinion it wasn't a mistake to point out that the figures were very misleading to people. I never accused him of doing anything wrong intentionally.
I actually started contributing to the CLBG site because quite a few people I knew were refusing to consider Common Lisp for development and pointed me to the CLBG site, stating that Common Lisp was just too slow to develop on.
But anyway, since I am not allowed to contribute there anymore, it is pointless to argue further about it. However, I sometimes connect back to the CLBG to see what is going on there, and what Isaac did to those other people who might have acted upon my request to improve and contribute my code was the most shameful thing I have seen from any site maintainer.
How shameful is it to punish someone whose only sin was trying to help someone else out of sympathy?
I hope this site's maintainer is indeed enjoying his well-earned holiday, and I cordially apologize to him for anything I said that might have offended him.
You're missing the point. What if we agree to allow the hiring of companies, but for some reason I have more money than you and can hire a better company, so we are still faster than you? The point is that >> I << would not be doing the work, so it wouldn't be right to say that >> I << am faster than you.
I am not missing the point. What if a sportsman, say a runner, has more money and hires a better trainer and uses better training facilities and becomes faster? He is abiding by the rules and gets faster than you. It happens everywhere, including software development.
> I am not missing the point. What if a sportsman, say a runner, has more money and hires a better trainer and uses better training facilities and becomes faster? He is abiding by the rules and gets faster than you.
I'm not sure if you're trying to support your argument or my argument here. In your example the sportsman is still doing the work, no one else is doing the work for him. Similarly I'm trying to argue that the programming language and its implementation should be doing most of the work for these benchmarks, not libraries.
Additionally if your sportsman is using a better training regimen, I should be allowed and capable of copying that training regimen to make things more fair. Similarly I'm OK with allowing programs to copy different algorithms (and I even argued that it should be allowed) as long as it's allowed for everyone to do it, feasible for everyone to do it, and is done without using a different programming language. I'm trying to argue that libraries should not be doing most of the work for these benchmarks since these are supposed to be benchmarks for programming languages and their implementations, not libraries that will likely be using different algorithms than one another and implemented in different programming languages.
P.S. I could comment a little bit more about your other comment regarding the CLBG but I'm going to ignore it so this issue doesn't get off track. If you want to continue that discussion, send me a message at https://JeremyZerfas.com/Contact.jsp .
> I'm not sure if you're trying to support your argument or my argument here. In your example the sportsman is still doing the work, no one else is doing the work for him. Similarly I'm trying to argue that the programming language and its implementation should be doing most of the work for these benchmarks, not libraries.
I was referring to the “have more money” part of your message, and I see how it wasn't clear whose argument I was supporting. :)
I fully understand your point, and I'm also fully with you on, if libraries are allowed at all, only allowing same-language libraries, without exception.
I can't, though, be with you at the moment on altogether disallowing external libraries, since one language might have a lot of things in its standard library that in another might only be supported externally. My preferred language, Common Lisp, was standardized in 1994, and all new language inventions and extensions can only come through external libraries.
And how should those languages be handled where even the standard libraries are written in a different language? It is a tricky field for sure.
Nevertheless, I could also accept your stance on limiting the use of libraries, and even support it, if it is decided on and can be done in a transparent and enforceable way.
These transparency and enforcement requirements, though, might not be the easiest to meet.
My only strong points are that the rules should be clear to everyone from the beginning and also enforceable without much room for argument. However, will that be possible?
> I can't, though, be with you at the moment on altogether disallowing external libraries, since one language might have a lot of things in its standard library that in another might only be supported externally. My preferred language, Common Lisp, was standardized in 1994, and all new language inventions and extensions can only come through external libraries.
I said it earlier but I'll say it again, I'm OK with allowing libraries to enable the use of primitive/basic type things (say like adding support for multithreading or SIMD) which may be important for performance but ultimately are not responsible for doing most of the work. What I'm not OK with is problems like the pidigits problem where often 90+% of the time is spent in arbitrary precision math libraries or the regex-redux problem where often 60+% of the time is spent in regular expression libraries.
> And how should those languages be handled where even the standard libraries are written in a different language? It is a tricky field for sure.
That's fine just as long as the problems aren't making heavy usage of the libraries. Problems like fannkuch-redux, fasta, mandelbrot, nbody, and spectral-norm are good problems because they don't make heavy usage of libraries (pretty much the most complex things libraries are used for in these problems are fairly basic operations like square roots and outputting data, and those don't make up the vast majority of the work). They primarily have the program running lots of its own loops, conditional statements, functions, etc. knucleotide is a bit borderline because of its heavy hash table usage, but it still involves the program doing a decent amount of the work.
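To make that borderline case concrete, the core of a k-nucleotide-style count looks roughly like the sketch below (Python used purely for illustration; the real CLBG programs also read FASTA input and often use specialized hash tables). The hash table handles the lookups, but the windowing loop, slicing, and counter updates are the program's own work:

```python
def count_kmers(seq, k):
    """Count every overlapping k-length substring of seq.

    The library's hash table does the lookups, but the program
    itself runs the window loop and the counter arithmetic.
    """
    counts = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        counts[kmer] = counts.get(kmer, 0) + 1
    return counts

# Example: the 2-mers of a short DNA fragment.
print(count_kmers("GGTATTTTAATT", 2))
```

Whether runs like this count as "heavy" library usage depends on how much of the time lands in the table versus the loop, which is exactly why the problem sits on the borderline.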
> Nevertheless, I could also accept your stance on limiting the use of libraries, and even support it, if it is decided on and can be done in a transparent and enforceable way.
> These transparency and enforcement requirements, though, might not be the easiest to meet.
> My only strong points are that the rules should be clear to everyone from the beginning and also enforceable without much room for argument. However, will that be possible?
If you design the problems so that they don't require heavy library usage (and assuming programs are allowed to use different algorithms), then there shouldn't be a need for many rules and enforcement should be fairly easy. Pretty much the only rules you would need are that a submitted program cannot contain any code in a different programming language and that it should only be allowed to use its own standard libraries and a few other approved libraries for enabling the use of primitive/basic type things.
> What I'm not OK with is problems like the pidigits problem where often 90+% of the time is spent in arbitrary precision math libraries or the regex-redux problem where often 60+% of the time is spent in regular expression libraries.
But as far as I know, those two benchmarks were specifically developed to test arbitrary precision math and regular expression operations. Of course most of the time is spent on those operations. Apart from Java and Common Lisp, not many languages have arbitrary precision math built in, not to mention regex, so they have to rely on external libraries.
> knucleotide is a bit borderline because of its heavy hash table usage, but it still involves the program doing a decent amount of the work.
I don't think that the use of a hash table should be a problem at all.
Regarding your last paragraph, I can relate to it, but I still feel a bit uncomfortable limiting the benchmark problems to such an extent; I could easily accept it if needed, though.
> But as far as I know, those two benchmarks were specifically developed to test arbitrary precision math and regular expression operations.
That brings us back to my original issue, they are "Problems requiring heavy library usage" and they "might not be good for benchmarking." See my original comment for reasons why this is bad.
> Apart from Java and Common Lisp, not many languages have arbitrary precision math built in, not to mention regex, so they have to rely on external libraries.
Slight nitpick, but by my count slightly over half the programming languages in these benchmarks now have built-in support for arbitrary precision math. I'm pretty sure regular expression support is even better.
> slightly over half the programming languages in these benchmarks now have built-in support for arbitrary precision math. I'm pretty sure regular expression support is even better.
Slight nitpick, but if there is such strong built-in language support for those, then why do we need heavy library usage? :)
Someone has to decide what is allowed and what is not, and then we have to stick to it. As I have said again and again, if the rules are transparently enforceable then either decision is OK with me, provided that, if libraries are allowed at all, only same-language libraries are allowed.
> slightly over half the programming languages in these benchmarks now have built-in support for arbitrary precision math. I'm pretty sure regular expression support is even better.
> Slight nitpick, but if there is such strong built-in language support for those, then why do we need heavy library usage? :)
We don't need heavy library usage (as demonstrated by some of the good problems I mentioned two comments ago) and we should avoid it. Although about half of the programming languages have built-in support for arbitrary precision math, most of the best performing programs for those languages on the CLBG pi-digits benchmark use the GMP C/assembly library instead. They're presumably using GMP because it is faster than the built-in arbitrary precision math code; it almost certainly wasn't done because it was easier or required less source code. This just goes to show that GMP is possibly the fastest performing arbitrary precision math library, and if problems are made to require heavy library usage, then it will just turn those problems into benchmarks for comparing libraries instead of programming languages and their implementations.
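To illustrate why pidigits behaves this way, here is a sketch of the unbounded spigot algorithm commonly used for it (Gibbons' algorithm, shown in Python purely as an illustration, using Python's built-in big integers rather than GMP). The program's own logic is a handful of comparisons; nearly everything else is multiplication and division of ever-growing integers, so almost all of the run time lands in whatever bignum implementation sits underneath:

```python
def pi_digits(n):
    """First n decimal digits of pi via Gibbons' unbounded spigot.

    Each step performs a few big-integer multiplications and
    divisions on state values that grow without bound, so the
    bignum implementation does essentially all of the work.
    """
    digits = []
    q, r, t, k, m, x = 1, 0, 1, 1, 3, 3
    while len(digits) < n:
        if 4 * q + r - t < m * t:
            digits.append(m)  # the next digit is settled
            q, r, m = 10 * q, 10 * (r - m * t), (10 * (3 * q + r)) // t - 10 * m
        else:
            q, r, t, k, m, x = (q * k, (2 * q + r) * x, t * x, k + 1,
                                (q * (7 * k + 2) + r * x) // (t * x), x + 2)
    return "".join(map(str, digits))

print(pi_digits(15))  # → 314159265358979
```

In a language with fast built-in bignums this is all "native" code; in the CLBG programs the same arithmetic is typically routed to GMP, which is exactly why the benchmark ends up measuring the library.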
But if we limit the use of libraries to same-language ones only, then the above problem just vanishes.
You have a strong enough argument and I can go with either decision, so let the site maintainer decide when he is back.
> But if we limit the use of libraries to same-language ones only, then the above problem just vanishes.
That wouldn't make all of the above problems vanish; it would only solve one part of them. Multiple libraries written in the same programming language can and often will have different performance from each other due to the use of different algorithms and different code quality. I previously linked to some benchmarks from one of the PCRE developers, and that alone shows how different algorithms in PCRE have different performance from one another depending on whether you use the basic, JIT, or DFA engines. It also shows how other C regex engines like TRE and Oniguruma have different performance.
It would also be impractical to limit the use of libraries to ones written in the same language, since some of the standard, built-in libraries for some language implementations are written in different programming languages. Chapel, GHC, Julia, and PHP usually or always include GMP for arbitrary precision math, for instance. Even more implementations have arbitrary precision math code which is not written purely in the relevant programming language but instead uses a considerable amount of assembly or C/C++ code to do the hard work.
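The algorithm-dependence point can be shown within a single language. The toy matcher below (purely hypothetical; it is not the code of any real engine) counts the attempts a naive backtracking approach makes while failing to match the classic pathological pattern `(a+)+b`. A DFA-based engine answers the same question in one linear scan, while the backtracker tries every way to split the run of `a`s into groups, which is exponential:

```python
def backtrack_calls(s):
    """Count the attempts a naive backtracking matcher makes on the
    pattern (a+)+b against s (simplified: groups before the 'b' are
    optional here, which is enough to show the blow-up).

    Each call tries to place the next (a+) group; on failure it
    backtracks to a different split, so a string of n 'a's with no
    'b' costs 2**n attempts, while a DFA would scan it once.
    """
    calls = 0

    def match(i):
        nonlocal calls
        calls += 1
        if i < len(s) and s[i] == "b":
            return True  # the trailing 'b' closes the pattern
        j = i
        while j < len(s) and s[j] == "a":
            j += 1  # extend the current (a+) group by one 'a'
            if match(j):
                return True
        return False  # exhausted all splits: backtrack

    match(0)
    return calls

# A DFA scans these strings once; the backtracker's cost doubles per 'a'.
print(backtrack_calls("a" * 10))  # → 1024
print(backtrack_calls("a" * 16))  # → 65536
```

Two "same-language" regex libraries that happen to pick different engine strategies can therefore differ by orders of magnitude on the same input, which is why restricting libraries to the host language doesn't remove the comparison problem.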
I did get your point from the beginning, but this logic leads us to not allowing the use of any libraries whatsoever, since even the standard libraries, whether written in the same language or not, can have wildly different performance.
So what would be your recommended solution? No library use at all, not even the standard libraries? Should all the problems be defined using some kind of common-denominator approach, and where that's not possible, should the programmer reimplement any feature a particular language is missing? And who would be the wise one to define or pick the problems and then police them in a transparent way acceptable to everyone?
As I have reiterated several times, I could go with any approach when it is clearly defined, but what I foresee with this approach is an even bigger opportunity for disagreement.
> I did get your point from the beginning, but this logic leads us to not allowing the use of any libraries whatsoever, since even the standard libraries, whether written in the same language or not, can have wildly different performance.
> So what would be your recommended solution? No library use at all, not even the standard libraries?
Remember that I keep on saying that HEAVY library usage is not good for benchmarking. I certainly don't think that ALL library usage should be disallowed, that would be somewhat crazy (it would make it harder to do input/output from the programs for one). In my opinion the problems should be designed so that probably no more than a third (and preferably less) of the CPU time is spent in libraries. The fannkuch-redux, fasta, mandelbrot, nbody, and spectral-norm problems are ones where I think this is mostly the case and why I think those are good problems (and it's probably also part of the reason why there hasn't been much controversy on the CLBG for these benchmarks as compared to the binary-trees, k-nucleotide, pi-digits, regex-redux, and reverse-complement problems which rely more on libraries). The problems should be designed to be more like those problems.
Dear Jeremy ( @JZerf ),
I hope you can fight out your ideas for a trustworthy pool of problems with clear and transparent set of criteria for acceptance.
I also hope, though very skeptical, that a clear and transparent specification for the maximum level of library use can also be agreed upon.
Good luck.
Kindest Regards and my Respect to You, Bela