maxio icon indicating copy to clipboard operation
maxio copied to clipboard

Make python based exportNotebook script (also add support for mac and potentially windows)

Open mseri opened this issue 6 years ago • 21 comments

This allows to drop the dependency on pdfinfo and makes both pdftk and ghostscript optional, at the price of introducing a dependency on PyPDF2 (completely python based) and paramiko (almost purely python based).

I also believe that the python3-based script is easier to read and maintain. I think it could even allow to run this script on Windows, but I have not tested it.

If you are happy with this PR and such a drastic change, I can update the README. We could also add a setup.py file with an entrypoint for exportNotebook and rM2svg, and package the whole thing as a standalone pip-installable python3 package. I am happy to add it if you believe this is worthwhile.

There is space for further cleanup and unit-testing. Currently I have tested this on multiple notebooks using the latest osx and ununtu, and I am very happy with the results.

Closes https://github.com/reHackable/maxio/issues/21

mseri avatar Jun 14 '18 12:06 mseri

I have been testing this on multiple PDFs, I have some examples in which the offsets are coming out incorrect (happens also with the old pdftk-based implementation though). Unfortunately I cannot publish the PDFs where this is happening. Interestingly enough the issue does not show up with remt, so there may be some incorrect computation here (I wonder if it has to do with the fact that we comput the x offset but we never touch the y offset).

It is not a regression but I will try to make some tests to see if it can be fixed.

I still believe this PR is useful as remt is problematic to use on windows due to its dependency on cairo and poppler, while this new version does not have any unusual dependency

mseri avatar Jun 20 '18 18:06 mseri

I have tweaked the exportNotebook script to work with US letter paper size (a scale factor and vertical offset are needed for this, whereas a horizontal offset is needed for A4 paper size). I have some pdf files that it does funny things on, so it is not quite ready to share. I'll hopefully update soon.

Lspencer avatar Jul 26 '18 06:07 Lspencer

@edupont is there any interest in this python conversion PR?

mseri avatar Jul 26 '18 07:07 mseri

@Lspencer neat! We should probably make exportNotebook generic on the page format!

mseri avatar Jul 26 '18 07:07 mseri

@mseri I like the idea of packaging it into a Python application. Thank you very much for your efforts and sorry for the late reply. @Ameb @edupont What do you think? Can we do this within here or should we work on a seperate repository?

florian-wagner avatar Jul 26 '18 08:07 florian-wagner

An update on my script to handle A4 and USletter paper with ExportNotebook: I was testing with a student thesis chapter and needed a vertical offset (plus a scaling factor) instead of a horizontal offset. I got this working and then tested on 2 other files. Both did not work out of the box. I found a work-around for one (us gs to scale by 1 and offset by 0, and then it would scale and offset properly), but the other one would not scale, so I had to scale the rM lines to the original page size. The latter is OK, but edge comments are clipped sometimes. I am not the author of any of these pdfs, so I do not know all of the differences, and haven't been able to duplicate these results on pdf files I create. With the pdf file that would not scale, the gs routine I was using modified the mediaBox, but not the artBox, clipBox, etc. I'll try to clean up my script and then post it, but I would like to better understand why this is failing on some pdfs. I can try to post a MWE but need these files to be my own (not student work), so I still am huntign around to recreate this problem.

LDSpencer avatar Jul 30 '18 17:07 LDSpencer

It may be useful to create a repository with a set of pairs (lines, PDF) to test, maybe even be included as a test suite ?

trou avatar Aug 01 '18 11:08 trou

@trou that would be great!

mseri avatar Aug 07 '18 18:08 mseri

I've hacked the rM2svg script to work for most (all?) PDF annotations with one minor caveat: annotations near the right or bottom edges of the tablet may be lost depending on the aspect ratio of the original PDF document (due to how the reMarkable maps the PDF to the tablet's 4:3 aspect ratio). The script now takes --width and --height arguments which should correspond to the original PDF document's geometry and the annotations are mapped accordingly. I've attached a test file I used for playing around with the updated script. Note loss of arrowhead on the right end of the top horizontal line. screendump-20180928143204

ericsfraga avatar Sep 28 '18 13:09 ericsfraga

@ericsfraga where can I find your updated script and the test file? I'm willing to play around with it

niklasb avatar Oct 22 '18 13:10 niklasb

@ericsfraga would you mind making a PR to my make-python-based with the fix?

mseri avatar Oct 22 '18 13:10 mseri

The modified python script is here along with the test file here. The script now expects a width and a height which you can extract from the PDF using pdfinfo. I have a shell script that uses rM2svg to create annotated PDF files.

The shell script expects a .lines file as the argument. It then looks for a .pdf with the same base name. If it finds the file, it extracts width and height from that file and then creates an annotated file. If no PDF is found, it assumes that the .lines file refers to a notebook on the remarkable and so simply creates a PDF from the lines file directly.

I've been using my shell script with the updated rM2svg script in anger for a few weeks now and it's working very well.

Also, my shell script will also incorporate any template that was used for notebook documents. It does assume the templates have been copied over from the remarkable tablet itself.

ericsfraga avatar Oct 22 '18 13:10 ericsfraga

@ericsfraga I tried your rM2svg script and it worked well with my letter-format test file. I'm not sure what the purpose of your hardcoded 2.3 constant is. I would suggest a slightly simpler version that just wraps everything in a <g transform> tag: https://gist.github.com/niklasb/8d4da7933b4eace54cd2c836ef62117e I've never tried to use it with multi-page documents though. It's honestly a mystery to me how the upstream PDF export can still be this buggy. Thank you @edupont and others for making these scripts.

niklasb avatar Oct 22 '18 14:10 niklasb

Ah, the 2.3 was a holdover from when I was manually playing with A4 size paper and comes from the ratio of 1404 to the width of an A4 page. It should probably be calculated directly as the 1404 divided by actual target width. This does assume that the height ratio is similar... bit of a hack, I guess.

Regarding your suggestion about wrapping within an appropriate tag, I'm sure that's probably the right way to go about it but I sure do not have the expertise in SVG...

In any case, I'm glad it worked with your letter format page as well!

And why the default export mechanism is so bad... ummm, no idea really.

ericsfraga avatar Oct 22 '18 15:10 ericsfraga

The correct "algorithm" seems to be to scale down the page to the correct width keeping the aspect ratio, and then truncate/extend the SVG so that it also has the correct target height. I tried to do this in my code.

niklasb avatar Oct 22 '18 17:10 niklasb

The correct "algorithm" seems to be to scale down the page to the correct width keeping the aspect ratio, and then truncate/extend the SVG so that it also has the correct target height. I tried to do this in my code.

On Mon, Oct 22, 2018 at 5:26 PM Eric S Fraga [email protected] wrote:

Ah, the 2.3 was a holdover from when I was manually playing with A4 size paper and comes from the ratio of 1440 to the width of an A4 page. It should probably be calculated directly as the 1440 divided by actual target width. This does assume that the height ratio is similar... bit of a hack, I guess.

In any case, I'm glad it worked with your letter format page as well!

And why the default export mechanism is so bad... ummm, no idea really.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/reHackable/maxio/pull/22#issuecomment-431868383, or mute the thread https://github.com/notifications/unsubscribe-auth/AAFyY9FhRIy1gROJkh9IwNAB4ZYITOqRks5uneOugaJpZM4Un3Y_ .

niklasb avatar Oct 22 '18 17:10 niklasb

@mseri Not sure if mixing this with the prepare_background functionality you have makes sense. In fact the '{translate -offset 0}' ghostscript command does not work at all for the documents I was trying to export, hence I switched over to generating the appropriate foreground image in the first place.

niklasb avatar Oct 22 '18 18:10 niklasb

I like the idea of generating the "right" foreground so that there are as few rescaling as possible in the document merge. It does not seem that this will be merged anytime soon, I'll play around with it and see if it resolves some of the issues I've found

mseri avatar Oct 22 '18 18:10 mseri

This has gotten stale. I am starting working on the version supporting the new format, integrating @ericsfraga changes as well. Would it make sense to just fork this. It seems to have become unmaintained

mseri avatar Apr 11 '19 08:04 mseri

I cannot speak for the original author but do let us know if you do fork this. I am waiting impatiently for the latest upgrade to the system to see if things break again; I don't expect they will but you never know!

ericsfraga avatar Apr 11 '19 11:04 ericsfraga

I will update the code to use the new rm2svg for .rm files, generate the appropriate foreground, and move everything to a separate repo. My hopes are for all this functionality to be integrated directly in rmapi (as it seems to be happening), but I want to have working colorful export for myself in the meanwhile. I'll ping here in some weeks, as soon as I have done it

mseri avatar Apr 11 '19 14:04 mseri