Allow dataset_names to update without package re-release
Currently, dataset metadata such as df_summary and dataset_names is static, so every time a dataset is added or changed we have to re-release the package to bring everything up to date. I propose turning these into functions that fetch the relevant metadata at call time (perhaps with a session cache). I already made this change in pmlbr, https://github.com/EpistasisLab/pmlbr/pull/5 (a new release should be on CRAN in a day or two), but I'll leave the Python implementation for someone with more expertise. 🙏🏽 @lacava @weixuanfu
https://github.com/EpistasisLab/pmlb/blob/7c1f4bdc00136dc2e55c87fa6b8ba6e8af6d1a68/pmlb/dataset_lists.py#L29-L32
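To make the proposal concrete, here is a rough sketch of what the Python side might look like. It is not the existing pmlb code. The URL is a placeholder, the function names `summary_rows` and `dataset_names` are hypothetical, and the column names `dataset` and `task` are assumptions about the layout of the summary table. The session cache is just a module-level dict.

```python
import csv
import io
import urllib.request

# Placeholder URL (assumption): point this at wherever the summary table lives.
SUMMARY_URL = "https://example.org/pmlb/all_summary_stats.tsv"

# Session cache: maps URL -> parsed rows, lives as long as the interpreter.
_cache = {}


def _download(url):
    """Fetch the raw text of the summary table over HTTP."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def summary_rows(url=SUMMARY_URL, fetch=_download, refresh=False):
    """Return the summary table as a list of dicts, fetching at most once
    per session unless refresh=True."""
    if refresh or url not in _cache:
        text = fetch(url)
        _cache[url] = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    return _cache[url]


def dataset_names(task=None, **kwargs):
    """Return sorted dataset names, optionally filtered by task.

    Assumes the table has 'dataset' and 'task' columns (hypothetical).
    """
    rows = summary_rows(**kwargs)
    return sorted(r["dataset"] for r in rows if task is None or r["task"] == task)
```

With something like this, `dataset_names()` would pick up newly added datasets on the next fresh session (or immediately via `summary_rows(refresh=True)`) without a package release. Passing `fetch` as a parameter also keeps the functions testable offline.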