
ArrayFire benchmarks

Open dmjio opened this issue 4 years ago • 10 comments

Would be very nice to have comparisons of ArrayFire vs. libraries like hmatrix, accelerate, etc. This might even warrant its own package due to the difficulty in procuring all the dependencies.

dmjio avatar Nov 09 '19 20:11 dmjio

@lehins would this interest you as well? :)

dmjio avatar Nov 14 '19 21:11 dmjio

@dmjio Of course it would, but how did you know that? ;P

lehins avatar Nov 14 '19 21:11 lehins

@lehins saw your work on massiv :1st_place_medal:

dmjio avatar Nov 14 '19 23:11 dmjio

@lehins so would you be interested in potentially making a new repo w/ me that had massiv, hmatrix, arrayfire benchmarks (maybe accelerate, grenade too ?) Think that might be of interest to others as well.

dmjio avatar Nov 15 '19 15:11 dmjio

Adding @chessai

dmjio avatar Nov 15 '19 15:11 dmjio

@dmjio That is definitely something I'd be willing to put some effort into. I even tried starting a project that would compare performance of array libraries: https://github.com/lehins/massiv-benchmarks For me it is driven by my work on massiv, of course, and the desire to compare it to others. That attempt ended in a couple of repa benchmarks and then stalled. This is a bit too much of a side project for a single person, so I certainly welcome your suggestion of collaboration on this.

How do you wanna do this, any thoughts, plans, ideas, etc.?

The way I'd start this is by figuring out administrative questions first:

  • Github account for the repo? A group?
  • Means of communication. Using GitHub issues isn't gonna work for live conversations, so something like a Slack or Gitter room should do.

Construct a plan

  • Figure out a set of libraries and modes to start with. The choice is quite diverse, even if we limit the set of libraries to a short list that interests me:
    • CPU (GHC native compilation single core): arrayfire, hmatrix, massiv, repa, vector (only functions on flat arrays)
    • CPU (GHC native compilation parallelized): arrayfire, massiv, repa
    • CPU (LLVM backend, multi-core): accelerate, arrayfire, massiv, repa
    • GPU (here the list is pretty slim): accelerate, arrayfire
  • Come up with an initial basic set of functions worth benchmarking. Something very small until we can solidify the overall structure.
  • Come up with a package structure for the benchmarks. My idea is to have:
    • a top-level package, something like bench-arrays, that generates the input data in a common format, or, if that's not feasible, at least gathers common functionality
    • a separate package per library that is being benchmarked
  • A unified way to view the benchmark results. A criterion report will probably do at first.
  • The hardest one: a source of truth for reports, since we all run different hardware and different operating systems.

The last two don't need to be solved immediately. The list of libraries can always be expanded, but I think it would be good if we could start with just 2 or 3 tops. The initial set of functions and inputs to benchmark we can discuss later.
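To make the "separate package per library" idea concrete, here is a minimal sketch of what one library's benchmark executable could look like with criterion. The module layout, `mkInput`, and `sumSq` are hypothetical placeholders; a real suite would benchmark hmatrix/massiv/arrayfire operations on input read from the shared format produced by the top-level package.

```haskell
-- Sketch of a per-library benchmark executable (names are placeholders).
module Main where

import Criterion.Main

-- Placeholder for the shared input; the real suite would read a
-- common on-disk format produced by the top-level bench-arrays package.
mkInput :: Int -> [Double]
mkInput n = map fromIntegral [1 .. n]

-- Stand-in operation; a real suite would benchmark matrix multiply,
-- convolution, decompositions, etc. from the library under test.
sumSq :: [Double] -> Double
sumSq = sum . map (^ (2 :: Int))

main :: IO ()
main =
  defaultMain
    [ bgroup "sumSq"
        [ bench "n=1000"  $ nf sumSq (mkInput 1000)
        , bench "n=10000" $ nf sumSq (mkInput 10000)
        ]
    ]
```

Running this with `--output report.html` gives criterion's standalone HTML report, which would cover the "unified way to view results" point at first.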

lehins avatar Nov 16 '19 00:11 lehins

@lehins this all sounds great. Regarding your questions.

How do you wanna do this, any thoughts, plans, ideas, etc.?

I'd say we try to contribute to the existing Data Haskell movement, and use their group to house this repo if one doesn't already exist, since I think it would be largely beneficial to the Haskell community. So maybe we could cc @ocramz @sdiehl @chessai @NickSeagull and discuss how we can contribute.

Github account for the repo? A group?

Answered above, pending Data Haskell community response.

Means of communication. Using a github issues isn't gonna work for live conversations, so something like a slack or gitter room should do.

They seem to have a Gitter.

Finally, I think I could really help by procuring all of the dependencies w/ nix into a mega-repo, and then also writing NixOps deployment scripts for AWS so we can run the benchmarks there. AWS does support on-demand GPU instances. We can make a script to automatically create an instance, run the suite in a systemd unit, upload the results to an S3 bucket, and host that.

Regarding the actual benchmark suite, we could pin the hardware to whatever AWS instance type we pick, and start on Linux for now. I'd rather classify things by operation (successive matrix multiplies, convolutions, matrix decompositions), and make a histogram-like thing that shows the timings for things like massiv w/ LLVM, massiv w/ NCG, ArrayFire GPU, ArrayFire CPU. It'd be nice as well if everyone started from the same initial data set in memory. Anyways, those are my thoughts.
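The "same initial data set in memory" and "classify by operation" points map naturally onto criterion's `env` combinator, which builds the input once and feeds it to every benchmark in a group. A rough sketch, where `dotA`/`dotB` are hypothetical stand-ins for the same operation implemented by two different backends (e.g. hmatrix vs. massiv):

```haskell
-- Sketch: one bgroup per operation, one bench per backend, all fed
-- from the same in-memory input via criterion's `env`.
module Main where

import Criterion.Main

-- Two stand-in "backends" computing the same operation (dot product);
-- a real suite would call into the actual libraries here.
dotA, dotB :: ([Double], [Double]) -> Double
dotA (xs, ys) = sum (zipWith (*) xs ys)
dotB (xs, ys) = foldl (\acc (x, y) -> acc + x * y) 0 (zip xs ys)

main :: IO ()
main =
  defaultMain
    [ env (pure (map fromIntegral [1 .. 10000], map fromIntegral [1 .. 10000])) $
        \input ->
          bgroup "dot-product"
            [ bench "backend-A" $ nf dotA input
            , bench "backend-B" $ nf dotB input
            ]
    ]
```

Grouping by operation like this also lines up with the histogram idea: each `bgroup` becomes one cluster of bars, one bar per backend.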

@NickSeagull, @lehins how does this sound ?

dmjio avatar Nov 16 '19 18:11 dmjio

I've been disconnected from DataHaskell for a while, but I'm sure there'll be someone who will want to help with that. I'd ask in the Gitter channel 😄

NickSeagull avatar Nov 16 '19 19:11 NickSeagull

@dmjio happy to help! adding @Magalame to the thread

ocramz avatar Nov 16 '19 20:11 ocramz

Happy to help too! @dmjio to my knowledge DataHaskell has no up-to-date benchmark regarding array libraries; the most we have is a matrix library benchmark. Regarding the structure of the benchmarks, based on my past experience, I strongly suggest we include memory benchmarking with weigh along with time benchmarking with criterion.
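For reference, allocation benchmarks with the weigh package look very similar in shape to criterion suites, so the two could share the per-operation structure. A minimal sketch, where `sumTo` is a placeholder for a real array operation:

```haskell
-- Sketch of an allocation benchmark with the weigh package; it reports
-- bytes allocated and GCs per case, complementing criterion's timings.
module Main where

import Weigh (mainWith, func)

-- Placeholder function under test.
sumTo :: Int -> Int
sumTo n = sum [1 .. n]

main :: IO ()
main =
  mainWith $ do
    func "sumTo 10000"  sumTo 10000
    func "sumTo 100000" sumTo 100000
```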

Magalame avatar Nov 16 '19 21:11 Magalame