
Question: update .py files only

Open DLumi opened this issue 1 year ago • 2 comments

Hello. I only started using this wonderful library today, so I am probably still missing some things. Namely, I can't find a CLI option to update (only) the paired .py file whenever I make changes in the .ipynb. Yes, I know about the jupytext --sync --update $FilePath$ command, but it doesn't really do what I want.

So, why do I need it? I work with .ipynb files directly (exclusively in DataSpell), and just want to have a backup option if (or rather, when) something goes wrong. That's why I want jupytext to be triggered by DataSpell, and I use a File Watcher to do that. Basically, any time a save happens, the File Watcher runs a specified CLI command. Since DataSpell for some weird reason saves the .ipynb file every time you run a cell (and this is not considered an autosave, so I cannot disable it), the File Watcher is triggered as well.

Why is that a problem? Well, if I use --sync with the File Watcher and the cell takes some time to execute, the outputs are simply erased (even with the --update flag provided), as the .ipynb file is updated as well. This is not a problem with a simple --to py:percent command, but I believe I might run into performance issues if I dump the whole notebook over and over again instead of just applying patches to it. So any advice on this would be appreciated.
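For reference, the one-way export mentioned above could be wrapped in a small script for the File Watcher to call. This is only a sketch: the helper names `export_command` and `export_notebook` are made up for illustration, and it assumes the `jupytext` executable is on the PATH. It uses the `--to py:percent` command from the post, which rewrites the .py file and leaves the .ipynb alone.

```python
import subprocess
import sys
from pathlib import Path


def export_command(notebook_path):
    """Build the one-way export command (.ipynb -> .py, percent format).

    Hypothetical helper: it only assembles the CLI call quoted in the post.
    """
    return ["jupytext", "--to", "py:percent", str(notebook_path)]


def export_notebook(notebook_path):
    """Run the export and return the expected path of the .py file.

    Raises subprocess.CalledProcessError if jupytext exits with an error.
    """
    subprocess.run(export_command(notebook_path), check=True)
    # Assumption: with no --output given, the text file is written next to
    # the notebook, with the extension changed to .py.
    return Path(notebook_path).with_suffix(".py")


if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1].endswith(".ipynb"):
    # A File Watcher would pass the saved notebook as the only argument.
    export_notebook(sys.argv[1])
```

The File Watcher would then run something like `python export_py.py $FilePath$` (the script name is arbitrary). This still rewrites the whole .py file on every save, so it does not address the performance concern.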

DLumi, Jul 07 '22 09:07

Hi @DLumi, thank you for sharing this. It is great to know that you are using file watchers in PyCharm; that might be helpful for users interested in #147.

One quick remark, I think there is no point in using --update in collaboration with --sync (I will double check and add a warning on the CLI if confirmed).

If I understand correctly, you would like to --sync but only in one direction: the action is triggered by a modification of the .ipynb file, so you expect the .ipynb to be newer. Is that correct? I think it would make sense to add this feature. Would an argument like --check-input-is-newer make sense (can you think of a better name?)? Should Jupytext raise an error if the file passed as the argument is not newer than the paired file?
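To make the question concrete, the check could look roughly like the sketch below. Note that --check-input-is-newer is only a name proposed in this thread, not an existing Jupytext option, and `input_is_newer` is a hypothetical helper that compares file modification times using the standard library only.

```python
import os


def input_is_newer(input_path, paired_path):
    """Return True if input_path was modified more recently than paired_path.

    Hypothetical helper, not part of Jupytext. A missing paired file counts
    as older, so that a first export is always allowed.
    """
    if not os.path.exists(paired_path):
        return True
    # Strict comparison: equal timestamps are treated as "not newer".
    return os.path.getmtime(input_path) > os.path.getmtime(paired_path)
```

A caller could then either skip the update silently or raise an error when the function returns False, which is exactly the open design question.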

mwouts, Jul 13 '22 20:07

> I think there is no point in using --update in collaboration with --sync

If you omit --update then the outputs in .ipynb would be overwritten. Well, at least that is the behavior I experienced. Using --update does preserve the outputs, but only if the cell managed to output something before syncing actually takes place.

Regarding the actual implementation - I don't really know what would work best, as I haven't checked the code under the hood. But for me, the thing that would make sense is something like a --paired (or, perhaps, --paired-to) flag. So the full command would be something like jupytext --sync --paired-to $FilePath$, meaning "sync the contents of the input file with its pair (so only the pair is updated), no questions asked". As for the timestamps and the errors related to them - I have no idea, since I could see it going both ways. But you can play around with it to see what works best.

DLumi, Jul 15 '22 03:07