How not to lose meta tags when deleting bookmarks
Hi,
If I use qpdf --empty --linearize --pages cover.pdf infile.pdf 43-85 -- output1.pdf, all meta tags are gone: the title, the author, and the document language.
How can I keep this information? I have to delete the bookmarks without deleting the meta tags.
Best regards
Christoph
There isn't an easy way to do this with qpdf right now. It may be possible to do with qpdf json by manipulating the files at a low level, but it would require fairly deep knowledge of PDF.
This will be possible to do once I get through my "pages epic" which I will start on when I get through my current significant project. The pages epic has been in my head for years and is next up for qpdf.
Note that qpdf preserves metadata and outlines and everything else from the "main" file, so with qpdf in its current form, there is no way from the cli to preserve metadata from all the files. If there's one file you want to preserve metadata from, list that first. For example, qpdf cover.pdf --linearize --pages . infile.pdf 43-85 -- output1.pdf would preserve everything from cover.pdf but would drop metadata from infile.pdf. Or you could do qpdf infile.pdf --linearize --pages cover.pdf . 43-85 -- output1.pdf to preserve stuff from infile.pdf. (The . inside the --pages option is short-hand for the main input file. You can also just repeat the name of the input file.)
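As a rough sketch of the two orderings above, the following helper only prints the qpdf command to run, with whichever file you pick listed first as the main input. merge_cmd is a made-up name for illustration and is not part of qpdf; the sketch also assumes file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper, not part of qpdf: print the merge command that keeps
# metadata from the chosen file by listing that file first as the main input.
# Usage: merge_cmd cover|infile COVER INFILE RANGE OUTPUT
merge_cmd() {
    case "$1" in
        cover)
            # cover is the main file; "." refers back to it inside --pages
            printf 'qpdf %s --linearize --pages . %s %s -- %s\n' "$2" "$3" "$4" "$5"
            ;;
        infile)
            # infile is the main file; cover is pulled in as an extra input
            printf 'qpdf %s --linearize --pages %s . %s -- %s\n' "$3" "$2" "$4" "$5"
            ;;
        *)
            echo "first argument must be 'cover' or 'infile'" >&2
            return 1
            ;;
    esac
}

# Keep title, author, and language from infile.pdf
merge_cmd infile cover.pdf infile.pdf 43-85 output1.pdf
```

Printing the command rather than running it makes it easy to inspect the argument order before touching any files.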
At this point, you will have a linearized file but it will still contain bookmarks from the file that you preserved metadata from. If you want to remove bookmarks, you could try this. This assumes you have jq installed.
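#!/bin/bash
# Remove the outlines (bookmarks) from a PDF, keeping everything else.
set -e
infile=$1
outfile=$2
# Object ID of the document catalog, e.g. "1 0 R"
root=$(qpdf --json-output "$infile" - | jq -r '.qpdf[1].trailer.value."/Root"')
# Write a JSON update that nulls out the catalog's /Outlines entry
qpdf --json-output "$infile" - --json-object="$root" | \
jq --arg root "$root" '.qpdf[1]["obj:" + $root].value."/Outlines" = null' > "$infile-update.json"
# Apply the update; qpdf prunes the now-unused objects and re-linearizes
qpdf --linearize --update-from-json="$infile-update.json" "$infile" "$outfile"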
Run the above script with output1.pdf output2.pdf as arguments. You can drop --linearize from the earlier qpdf --pages command since it is specified in this script. The script basically just snips out the bookmarks and reruns the result through qpdf to prune unused objects and re-linearize the file.
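To check the result, something like the following could be used. It is a minimal sketch that assumes the same JSON layout the script above relies on (.qpdf[1], the trailer's /Root, and the obj: keys); has_outlines is a made-up helper name, and jq is assumed to be installed.

```shell
#!/bin/sh
# Hypothetical helper: read qpdf JSON on stdin and report whether the
# document catalog still has a non-null /Outlines entry.
has_outlines() {
    jq -r '.qpdf[1] as $objs
           | $objs.trailer.value."/Root" as $root
           | if $objs["obj:" + $root].value."/Outlines" != null
             then "yes" else "no" end'
}

# Usage, assuming output2.pdf was produced by the script:
#   qpdf --json-output output2.pdf - | has_outlines
```

The same pattern could be pointed at other catalog or trailer keys to confirm that the entries you care about survived.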
Hopefully this can help in the interim.
Thank you very much, this is a good idea. I can add the metadata to the cover file. Then everything is OK. Best regards, Christoph
This issue is part of the qpdf pages epic. If you are interested in following, please see #1104.