Kevin M Jablonka issues

Results 336 issues of Kevin M Jablonka

record screencast on how to run benchmark

this is a bit conditional on what our timeline for refactoring the codebase is

documentation

simplify report

If we limit ourselves to one question per file, stuff will be a lot simpler it seems

Refactor the creation of multiple-choice options into utility

this seems to be reused in many places _Originally posted by @kjappelbaum in https://github.com/lamalab-org/chem-bench/pull/435#discussion_r1703126065_
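A minimal sketch of what such a shared utility could look like. The function name `create_multiple_choice_options`, the `target_scores` input format (answer text mapped to score), and the `permute`/`seed` parameters are assumptions for illustration, not the existing chem-bench API.

```python
import random
import string


def create_multiple_choice_options(target_scores, permute=False, seed=42):
    """Enumerate answer options and map each option letter to its score.

    `target_scores` is assumed to map answer text to a score (hypothetical format).
    Returns the formatted option block and a letter -> score mapping.
    """
    answers = list(target_scores.keys())
    scores = list(target_scores.values())
    if len(answers) > len(string.ascii_uppercase):
        raise ValueError("more options than available letters")

    if permute:
        # Shuffle answers and scores together, reproducibly for a given seed.
        order = list(range(len(answers)))
        random.Random(seed).shuffle(order)
        answers = [answers[i] for i in order]
        scores = [scores[i] for i in order]

    letters = string.ascii_uppercase[: len(answers)]
    prompt = "\n".join(f"{letter}. {answer}" for letter, answer in zip(letters, answers))
    answer_to_score = dict(zip(letters, scores))
    return prompt, answer_to_score
```

Callers that currently build the option list inline could then share this one function, for example `prompt, answer_to_score = create_multiple_choice_options({"water": 1, "ethanol": 0})`.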

interactive leaderboard based on datatable

it would be nice if we could filter based on the tags and in this way look at very many different subsets
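The tag-based subsetting could be prototyped in plain Python before wiring it into an interactive table. This is only a sketch: the record layout (a `tags` list and a per-question `all_correct` flag) and the helper names `filter_by_tags` and `fraction_correct` are assumptions, not the actual report schema.

```python
def filter_by_tags(results, required_tags):
    """Keep only the results that carry every tag in `required_tags`."""
    required = set(required_tags)
    return [r for r in results if required.issubset(r["tags"])]


def fraction_correct(results):
    """Fraction of results flagged as correct, or None for an empty subset."""
    if not results:
        return None
    return sum(r["all_correct"] for r in results) / len(results)


# Hypothetical per-question records for a single model.
results = [
    {"name": "q1", "tags": ["organic", "safety"], "all_correct": 1},
    {"name": "q2", "tags": ["organic"], "all_correct": 0},
    {"name": "q3", "tags": ["safety"], "all_correct": 1},
]

organic = filter_by_tags(results, ["organic"])
print(fraction_correct(organic))
```

Computing the score per subset on demand, rather than storing one number per model, is what would let the leaderboard show many different subsets.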

Document how to obtain a final report

- What scripts do I have to run to obtain a final report?
- Why does `all_correct` change between reports?

documentation

make link to docs more prominent

- [ ] ideally also a link on chembench.org
- [ ] there should also be one in the README

generate more meaningful UUIDs for report directories

- UUIDs in the style of what we have for the chem-bench app might be nice (or what wandb does)
- https://github.com/nebbles/hruid-python
- https://github.com/orf/human_id
- If we do datetime +...
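One possible shape for such names, sketched with the standard library only. It does not use the APIs of the two linked packages, and the word lists, the name format, and the function `make_report_dir_name` are all hypothetical. Since the last bullet above is cut off, combining a timestamp with a word pair is just one assumption about what was meant.

```python
import random
from datetime import datetime

# Hypothetical word lists; a real implementation would want far more entries
# to keep collisions unlikely.
ADJECTIVES = ("amber", "brisk", "calm", "eager", "lucid")
NOUNS = ("beaker", "flask", "ligand", "orbital", "solvent")


def make_report_dir_name(now=None, rng=None):
    """Build a sortable, human-readable directory name such as
    '20240805_123000_calm-flask' (timestamp plus an adjective-noun pair)."""
    now = now or datetime.now()
    rng = rng or random.Random()
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    return f"{timestamp}_{rng.choice(ADJECTIVES)}-{rng.choice(NOUNS)}"


print(make_report_dir_name())
```

Putting the timestamp first keeps report directories in chronological order when listed, while the word pair gives each run a name that is easier to refer to than a raw UUID.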

are our tokenizers initialized correctly?

perhaps not for batch inference

low-priority

add note about gated repository to readme/docs

in readme it is not clear which `main.py` is supposed to be run

which folder?

documentation