htmlparser-benchmark
htmlparser-benchmark copied to clipboard
New Benchmark
This repository started as a fork from this repo (It doesn't share much code anymore, except from the original HTMLs),
I tried to make it more accurate and also added RAM usage as a metric.
If anyone is interested: https://github.com/victornpb/benchmark-html-parser-libraries
@victornpb Are you interested in adding another parser (see node-html-parser)? I added in my pr for Andreas' repo already. You could see my pr here.
sure, ill take a look
Hi @victornpb, very cool project! I just revived this one again, before seeing yours. My main goal was to automatically update results whenever a new version is released, which would prevent things from becoming stale again. All the code is here, but unfortunately @AndreasMadsen would have to enable auto-merges to make it actually work 🙈
It would be super cool to have a benchmark that actually updates itself, without manual intervention. Are you interested in making that a reality with your parser? If so, I'm happy to link to it from the htmlparser2 README.
@fb55 Happy to enable anything needed or transfer the repository etc, although automatic merging seems like it could be abused. I'm obviously not maintaining this. Let me know what you would prefer.
@fb55 I would also like to make this happen. Can you explain how the setup works? Are you using GitHub actions / circle code or something similar?
Thanks for getting back to me, both of you!
The flow I've implemented here is:
- Dependabot will open a pull request whenever there is a new version of a dependency of this repo; see https://github.com/AndreasMadsen/htmlparser-benchmark/blob/e78cd8fc6c2adac08deedd4f274c33537451186b/.github/dependabot.yml
- Dependabot PRs will automatically be merged (*) into the repo; see https://github.com/AndreasMadsen/htmlparser-benchmark/blob/e78cd8fc6c2adac08deedd4f274c33537451186b/.github/workflows/dependabot-automerge.yml
- Whenever a commit is written to the
masterbranch, the benchmark will run and results will be written to thestats.txtfile; see https://github.com/AndreasMadsen/htmlparser-benchmark/blob/e78cd8fc6c2adac08deedd4f274c33537451186b/.github/workflows/nodejs-benchmark.yml
(*) This requires branch protections to be enabled for the master branch, and auto-merge to be enabled for the repo.
The one thing I didn't think of, and which will break this flow, is that GitHub Actions cannot trigger other GitHub Actions; merged Dependabot PRs will not actually update results. That means that (3) will have to be changed:
- On all PRs (& pushes), run benchmarks and add results to the PR.
@fb55 I would be fine with auto-merging if it could be restricted to only Dependabot. However, that does not appear possible. The problem with allowing auto-merging without "Require pull request reviews before merging", is that it becomes an easy attack vector for arbitrary code execution. Essentially anybody that passes a simple check can add their own .github/workflows, modify the benchmark script, add a malicious dependency, etc.
If this a compromise you are fine with, I suggest we just transfer the repository to one of your accounts.