
Auto-merge option for `convert-to-parquet`

Open klamike opened this issue 9 months ago • 3 comments

Feature request

Add a command-line option, e.g. `--auto-merge-pull-request`, that automatically merges the pull requests created by the `convert-to-parquet` tool.

Motivation

Large datasets may result in dozens of PRs due to the splitting mechanism. Each of these has to be manually accepted via the website.

Your contribution

Happy to look into submitting a PR if this is of interest to maintainers.

klamike, Apr 18 '25 16:04

Alternatively, there could be an option to switch from submitting PRs to just committing changes directly to main.

klamike, Apr 18 '25 16:04

Why not? I'd be in favor of a `--merge-pull-request` option that calls `HfApi().merge_pull_request()` at the end of the conversion :) Feel free to open a PR if you'd like.

lhoestq, May 06 '25 13:05
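
The suggestion above could be approximated outside the tool with a short script. This is only a sketch under assumptions, not how `convert-to-parquet` works or would have worked: the helper names `merge_open_pull_requests` and `merge_for_repo` and the `title_prefix` filter are hypothetical, and the script assumes the pull requests to merge are open, non-draft PRs on a dataset repo that the caller has write access to. It relies on `HfApi.get_repo_discussions()` and `HfApi.merge_pull_request()` from `huggingface_hub`.

```python
from typing import Any, List


def merge_open_pull_requests(api: Any, repo_id: str, title_prefix: str = "") -> List[int]:
    """Merge every open PR on a dataset repo whose title starts with `title_prefix`.

    `api` is expected to behave like `huggingface_hub.HfApi`; it is passed in
    so the function can be exercised without network access.
    Returns the numbers of the PRs that were merged, in listing order.
    """
    merged: List[int] = []
    for discussion in api.get_repo_discussions(repo_id=repo_id, repo_type="dataset"):
        # Skip plain discussions and anything that is not an open PR
        # (closed, already merged, or still a draft).
        if not discussion.is_pull_request or discussion.status != "open":
            continue
        # Hypothetical filter: only touch PRs with a matching title.
        if not discussion.title.startswith(title_prefix):
            continue
        api.merge_pull_request(
            repo_id=repo_id,
            discussion_num=discussion.num,
            repo_type="dataset",
        )
        merged.append(discussion.num)
    return merged


def merge_for_repo(repo_id: str, title_prefix: str = "") -> List[int]:
    """Convenience wrapper using a real HfApi client (requires a logged-in token)."""
    from huggingface_hub import HfApi  # imported lazily so the helper above stays testable

    return merge_open_pull_requests(HfApi(), repo_id, title_prefix)
```

Calling `merge_for_repo("username/dataset_name")` would merge every open PR on that repo, so passing a `title_prefix` that matches only the conversion PRs is the safer choice; the exact titles the tool used are not stated in this thread.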

#self-assign

klamike, May 06 '25 17:05

Closing since `convert-to-parquet` has been removed: https://github.com/huggingface/datasets/pull/7592#issuecomment-3073053138

klamike, Jul 18 '25 19:07