zotfile
When extracting annotations with pdf.js, non-ASCII characters are replaced with other characters
Zotero: 5.0.95.3 Zotfile: 5.0.16
What I am trying to do: I added annotations in Preview (the default macOS document viewer) that include, for example, the word "jūdō", and then used zotfile's extract annotations feature.
Result: The word "jūdō" from the above example was extracted as: "JkdM". The whole faulty annotation:
ÿþSee JkdM, KendM, KyudM etc. (note on p.74)
Original:
See Jūdō, Kendō, Kyudō etc.
What I expect: The extracted annotations should preserve the original text or, alternatively, zotfile should let me select the proper encoding (probably UTF-8, but other users might need other encodings).
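One possible explanation, offered as a guess rather than a confirmed diagnosis: the leading "ÿþ" is what the UTF-16 little-endian byte order mark (bytes FF FE) looks like when read one byte at a time, and "k" and "M" are the low bytes of "ū" (U+016B) and "ō" (U+014D). The sketch below reproduces the faulty output under that assumption. The function names are hypothetical and this is not zotfile or pdf.js code; it only simulates a decoder that treats each byte as one character and drops control bytes.

```python
import codecs


def encode_utf16le_with_bom(text: str) -> bytes:
    """Encode text the way the annotation is ASSUMED (not confirmed) to be stored."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def decode_bytewise(raw: bytes) -> str:
    """Hypothetical faulty decoder: one byte per character, control bytes dropped."""
    as_single_bytes = raw.decode("latin-1")
    return "".join(ch for ch in as_single_bytes if ord(ch) >= 0x20)


original = "See Jūdō, Kendō, Kyudō etc."
raw = encode_utf16le_with_bom(original)

garbled = decode_bytewise(raw)
recovered = raw.decode("utf-16")  # BOM-aware decoding restores the text

# Compare against the annotation text reported above (without the page note).
print(garbled == "ÿþSee JkdM, KendM, KyudM etc.")
print(recovered == original)
```

If this guess is right, the high byte of every non-ASCII character is lost during extraction, so the original text cannot be recovered from the extracted string alone. The fix would have to happen where the annotation bytes are decoded.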
I have a similar problem and can provide a reproducible example. The problem only occurs from time to time when extracting annotations from a PDF with zotfile.
Here is a simple example of this behavior with the PDF file from this link: https://www.scielo.br/j/rbef/a/j8y7vZt69DpS5kKYZWyV5Yz/?format=pdf&lang=pt
If I annotate the title: "Tradução comentada de um clássico de Copérnico", I get the following extracted annotation:
"Traduc òao comentada de um cl ¥assico de Cop ¥ernico" (Dias 2004:195)
Is there a way to correct this for PDF files that behave this way with zotfile's annotation extraction?
I use Zotero 5.0.96.3 on Arch Linux.