watermark icon indicating copy to clipboard operation
watermark copied to clipboard

[wip] initial crack at storing in metadata

Open bollwyvl opened this issue 10 years ago • 7 comments

This proposes an approach to #4.

I haven't updated the doc yet, as i got it working, and wanted some feedback.

  • adds a setup.py for easier developing with python setup.py develop
    • this can be made smoother
  • adds --output argument, which can take output (default), metadata or both
    • this is pretty awkward, open to renaming/changing behavior
  • changes everything to use yield
    • some magic naming could be nice?
  • changes string formatting a bit to be more ready for future things

Sorry there's so much in this one request... I can break it up into smaller pieces if you like!

bollwyvl avatar Sep 10 '15 06:09 bollwyvl

Wow, that was quick! Thanks a lot. I just skimmed over it and looks cool so far, but let me give it a more thorough look on the weekend and I will get back to you with some more feedback! One little request, can you please remove the docs/.ipynb_checkpoints/watermark-checkpoint.ipynb from the commit? I forgot to add .ipynb_checkpoints files to the .gitignore file when I set it up. Thanks a lot for the PR, I will get back to you soon!

rasbt avatar Sep 10 '15 18:09 rasbt

No worries, take your time!

Another approach to the string output would be to use jinja templating... it will always be there, because notebook :)

Also, everything being in a flat dictionary could introduce problems (i.e. if there was a package named machine). Putting it into a namespace/row structure would make it a little more clear.

One issue with doing javascript-based injection as i have done is that it won't work with something like runipynb... the other option would be to hack on the ipynb itself, but the kernel doesn't usually know that it is being used for a notebook... not sure if an environment variable gets set there.

I also explored allowing either notebook or cell metadata, but it's really hard to get a handle for the current cell from a particular output area, as the element hasn't actually been added to the DOM yet. However, nothing that a notebook does should change the output of watermarks (doing !pip install inside a notebook is pretty dirty), so multiple watermarks in a single notebook seem kind of silly, and it's easier to look at notebook.metadata.watermark than look through all of the cells.

bollwyvl avatar Sep 10 '15 20:09 bollwyvl

I just wanted to give a brief sign of life ... Sorry, I didn't get to it last weekend, but I will definitely check it out this weekend! There was a little delay with the final layouts of my book that I had to work on, but that's done, no excuses next time ;)

rasbt avatar Sep 17 '15 17:09 rasbt

Good luck with the book!

On 13:15, Thu, Sep 17, 2015 Sebastian Raschka [email protected] wrote:

I just wanted to give a brief sign of life ... Sorry, I didn't get to it last weekend, but I will definitely check it out this weekend! There was a little delay with the final layouts of my book that I had to work on, but that's done, no excuses next time ;)

— Reply to this email directly or view it on GitHub https://github.com/rasbt/watermark/pull/7#issuecomment-141155653.

bollwyvl avatar Sep 17 '15 17:09 bollwyvl

Thanks again for the pull request. Looks good to me so far :). Just a few comments:

  1. Thanks for adding *_checkpoints* to .gitignore and deleting .ipynb_checkpoints/*
  2. The setup.py is a good idea! Didn't make much changes after the initial version, but it comes in very handy now. Just a small note: Pls change the version number to 1.2.3dev the next time for more clarity.
  3. regarding the naming of of the --output argument. This is tricky indeed ;). I would suggest changing --output output to --output cell. Or even something like --write_to cell , --write_to metadata, and write_to meta+cell (both is shorter and more convenient, however a person who reads the notebook and is not familiar with the watermark extension may find it a little bit confusing without the context). Maybe also change "where to store the watermark (meta)data" to "where to store the generated watermark" to avoid confusion between "IPy nb metadata" and "watermark metadata."
  4. could you put the line @line_magic just above the def watermark()
  5. could you please remove the 'Authored by' from the --author flag. It's a good idea, however, it takes away some of the flexibility. E.g., if someone wants to put something like --author "Code by John Doe, reviewed by Jack Doe" or so.
  6. hm, maybe I am overlooking something, but it looks like _date_format property is not used if args.output is metadata. Is there a specific reason for that?

rasbt avatar Sep 21 '15 03:09 rasbt

I'll have a look at the other things, but on 6, the reasoning is that JSON dates should pretty much always be ISO 8601 formatted. This prevents having to parse the magic to figure out what date format was chosen, they can be reliably sorted, etc. Utc is best of all, as an omitted timezone makes the date much less portable.

On 23:34, Sun, Sep 20, 2015 Sebastian Raschka [email protected] wrote:

Thanks again for the pull request. Looks good to me so far :). Just a few comments:

  1. Thanks for adding _checkpoints to .gitignore and deleting .ipynb_checkpoints/*
  2. The setup.py is a good idea! Didn't make much changes after the initial version, but it comes in very handy now. Just a small note: Pls change the version number to 1.2.3dev the next time for more clarity.
  3. regarding the naming of of the --output argument. This is tricky indeed ;). I would suggest changing --output output to --output cell. Or even something like --write_to cell , --write_to metadata, and write_to meta+cell (both is shorter and more convenient, however a person who reads the notebook and is not familiar with the watermark extension may find it a little bit confusing without the context). Maybe also change "where to store the watermark (meta)data" to "where to store the generated watermark" to avoid confusion between "IPy nb metadata" and "watermark metadata."
  4. could you put the line @line_magic just above the def watermark()
  5. could you please remove the 'Authored by' from the --author flag. It's a good idea, however, it takes away some of the flexibility. E.g., if someone wants to put something like --author "Code by John Doe, reviewed by Jack Doe" or so.
  6. hm, maybe I am overlooking something, but it looks like _date_format property is not used if args.output is metadata. Is there a specific reason for that?

— Reply to this email directly or view it on GitHub https://github.com/rasbt/watermark/pull/7#issuecomment-141867975.

bollwyvl avatar Sep 21 '15 12:09 bollwyvl

but on 6, the reasoning is that JSON dates should pretty much always be ISO 8601 formatted.

Thanks for the clarification, makes total sense!

rasbt avatar Sep 21 '15 14:09 rasbt