WasmBench

More Googleable name for this repo

Open titzer opened this issue 3 years ago • 7 comments

This dataset of benchmarks is an absolute goldmine and I believe it needs to be known more widely. Unfortunately, its name is not very memorable or Googleable.

Is there another name that could be used that would be easier to find/google/talk about? Unfortunately "WasmBench" hides this repository's value in a bland name.

titzer, Jul 13 '22 17:07

I agree the name might not be perfect, but I'm not sure about renaming, since it is linked to and referenced in the paper. What do you think, @danleh ?

Creating a PR to add WasmBench to this list may be another idea to increase visibility.

hilbigan, Jul 15 '22 12:07

Thanks, Ben, for the suggestion. Any ideas for a better name? Would a longer name that includes "WasmBench" help, e.g., "WasmBench-WebAssembly-Benchmark"?

michaelpradel, Jul 15 '22 12:07

I was thinking something even more unique. E.g. what about "Sola Wasm Binary Dataset [year]"?

As for avoiding link rot, maybe you could add the old repo (this one) as a git submodule? The big zips with the binaries in them are part of a GitHub release, and not "checked in" AFAICT? I am not sure where GitHub physically stores those.

titzer, Jul 15 '22 13:07

> Creating a PR to add WasmBench to this list may be another idea to increase visibility.

@hilbigan Good idea! I submitted a PR to this list with a link to this repo and our paper.

I agree that the name is a bit generic. "Bench" also evokes the intuition of "performance testing" a bit too much. Maybe something with "dataset" in it?

danleh, Jul 15 '22 13:07

@michaelpradel What might help discoverability is also adding a repo description. Could you add something like "A large dataset of real-world WebAssembly binaries, collected from the Web, GitHub, NPM and more sources. Useful for test data, for training machine learning models, or just for fun"?

danleh, Jul 15 '22 13:07

> The big zips with the binaries in them are part of a GitHub release, and not "checked in" AFAICT? I am not sure where GitHub physically stores those.

That's right, they are too large to be under version control. I added two direct links to the dataset (full and filtered) at the beginning of the README, so they are easier to access.

danleh, Jul 15 '22 13:07

> @michaelpradel What might help discoverability is also adding a repo description. Could you add something like "A large dataset of real-world WebAssembly binaries, collected from the Web, GitHub, NPM and more sources. Useful for test data, for training machine learning models, or just for fun"?

Done.

michaelpradel, Jul 15 '22 15:07