datasets
Add `@transmit_format` in `flatten`
As suggested by @mariosasko in https://github.com/huggingface/datasets/pull/4411, we should add the `@transmit_format` decorator to `flatten`, `rename_column`, and `rename_columns` to ensure that the value of `_format_columns` in an `ArrowDataset` is properly updated.
Edit: according to @mariosasko's comment below, the `@transmit_format` decorator doesn't handle column renaming, so the format columns are updated manually in `rename_column` and `rename_columns` instead.
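To illustrate why a format-transmitting decorator fits `flatten` but not the rename methods, here is a minimal, self-contained sketch. It does not use or reproduce the library's code: `transmit_format_sketch` and `ToyDataset` are hypothetical names, and the filtering rule (keep only the format columns that still exist in the result) is an assumption made for the example.

```python
from functools import wraps


def transmit_format_sketch(method):
    """Hypothetical decorator: copy the caller's format columns onto the
    returned dataset, keeping only columns that still exist in the result."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        out = method(self, *args, **kwargs)
        if self._format_columns is not None:
            out._format_columns = [
                col for col in self._format_columns if col in out.column_names
            ]
        return out

    return wrapper


class ToyDataset:
    """Toy stand-in for a dataset; not the real class."""

    def __init__(self, schema, format_columns=None):
        # schema maps a column name to its list of sub-field names
        # (an empty list means the column is not nested).
        self.schema = dict(schema)
        self._format_columns = format_columns

    @property
    def column_names(self):
        return list(self.schema)

    @transmit_format_sketch
    def flatten(self):
        # Expand each nested column "a" with sub-field "x" into "a.x".
        flat = {}
        for name, subfields in self.schema.items():
            if subfields:
                for sub in subfields:
                    flat[f"{name}.{sub}"] = []
            else:
                flat[name] = []
        return ToyDataset(flat)

    def rename_column(self, old, new):
        # No decorator here: a name-based filter would drop the renamed
        # column from the format, so the mapping is applied by hand.
        schema = {(new if k == old else k): v for k, v in self.schema.items()}
        fmt = self._format_columns
        if fmt is not None:
            fmt = [new if col == old else col for col in fmt]
        return ToyDataset(schema, fmt)


ds = ToyDataset({"id": [], "answers": ["text", "start"]}, format_columns=["id", "answers"])
flat = ds.flatten()
renamed = ds.rename_column("id", "uid")
```

In this sketch, `flat._format_columns` keeps `"id"` and drops `"answers"`, which no longer exists after flattening, while `renamed._format_columns` carries `"uid"` because the rename was mapped explicitly. Had `rename_column` relied on the name-based filter, `"id"` would have been dropped from the format rather than renamed.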
@mariosasko please let me know whether we need to include some sort of tests to make sure that the decorator is working as expected. Thanks! 🤗
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hi, thanks for working on this! Yes, please add (simple) tests so we can avoid any unexpected behavior in the future.
@transmit_format doesn't handle column renaming, so I removed it from rename_column and rename_columns and added a comment to explain this.
Oops, I thought this PR had already been merged and deleted from the source repository. I'll create a new branch off main to re-create this PR. My bad 😩