Add support for `pdfx`
Description
The `pdfx` package can produce PDF/X and PDF/A files, but it requires passing special options to compile correctly with XeTeX-based engines. The issue is well known and documented in the package manual (see 3.1.1, Limitations using XeLaTeX), so documents can be compiled with:
$ xelatex -shell-escape -output-driver="xdvipdfmx -z 0" <filename>.tex
Currently, people who want to produce PDF/X and PDF/A have to try their luck with another package such as `hyperxmp`. People who want to use `pdfx` are limited either to TeX Live's XeLaTeX with the extra flags and options, or to LuaLaTeX, where no special flags are needed.
https://github.com/tectonic-typesetting/tectonic/pull/708 added support for `-Z shell-escape`, which gets us halfway there, but `tectonic` only produces the extended XDV format (see https://github.com/tectonic-typesetting/tectonic/issues/824#issuecomment-941862219). Could `tectonic`'s behaviour be modified, or a new unstable flag added, to support this?
Steps to reproduce
`test-pdfx.tex`:
\documentclass{report}
\usepackage{pdfx}
\begin{document}
This is a test for pdfx.
\end{document}
Compile with:
$ tectonic -p test-pdfx.tex
Error message
error: pdfx.sty:1285: Package pdfx Error: CreationDate is not properly supported;
Please note that `tectonic` is no different from normal `xetex`: by itself, the TeX engine produces only `.xdv` (extended DVI) files. Usually, though, the file is automatically processed behind the scenes by `xdvipdfmx` to produce a PDF file.
It seems that the `pdfx` package itself hasn't been updated for newer XeTeX versions: since TeX Live 2019, XeTeX has had the primitive `\creationdate`. That means we can easily get around the `-shell-escape` part, since it is only used to write and execute a Lua file (with `texlua`, which isn't provided with `tectonic`) just to get the current date...
\let\pdfcreationdate=\creationdate
\documentclass{report}
\usepackage{pdfx}
\begin{document}
This is a test for pdfx.
\end{document}
Now, `xdvipdfmx -z 0` disables compression for the whole PDF file, which is overkill if only one object needs it (I don't know whether this option also propagates to images, etc.). But! It seems that `dvipdfmx` already accommodates this case and doesn't compress XMP metadata, and has done so since TeX Live 2016. That means the above should work automagically even with `tectonic`.
Unfortunately I get this:
Running xdvipdfmx ...
[1
warning: File "sRGB.icc" not found.
warning: File "sRGB.icc" not found.
warning: Interpreting special command fstream (pdf:) failed.
warning: Interpreting special command fstream (pdf:) failed.
warning: >> at page="1" position="(133.768, 667.198)" (in PDF)
warning: >> at page="1" position="(133.768, 667.198)" (in PDF)
warning: >> xxx "pdf:fstream @colorprofile (sRGB.icc) <</N 3 /Alternate/Devic..."
warning: >> xxx "pdf:fstream @colorprofile (sRGB.icc) <</N 3 /Alternate/Devic..."
warning: >> Reading special command stopped around >> <</N 3 /Alternate/DeviceRGB>><<
warning: >> Reading special command stopped around >> <</N 3 /Alternate/DeviceRGB>><<
warning: Could not find any valid object.
warning: Could not find any valid object.
warning: File "pdfa.xmpi" not found.
warning: File "pdfa.xmpi" not found.
warning: Interpreting special command fstream (pdf:) failed.
warning: Interpreting special command fstream (pdf:) failed.
warning: >> at page="1" position="(133.768, 667.198)" (in PDF)
warning: >> at page="1" position="(133.768, 667.198)" (in PDF)
warning: >> xxx "pdf:fstream @pdfx@Metadata (pdfa.xmpi) << /Type /Metadata /S..."
warning: >> xxx "pdf:fstream @pdfx@Metadata (pdfa.xmpi) << /Type /Metadata /S..."
warning: >> Reading special command stopped around >> << /Type /Metadata /Subtype /XML >> <<
warning: >> Reading special command stopped around >> << /Type /Metadata /Subtype /XML >> <<
]
warning: Object @colorprofile used, but not defined. Replaced by null.
warning: Object @colorprofile used, but not defined. Replaced by null.
warning: Object @pdfx@Metadata used, but not defined. Replaced by null.
warning: Object @pdfx@Metadata used, but not defined. Replaced by null.
Writing `test-pdfx.pdf` (4.80 KiB)
Writing `pdfa.xmpi` (4.92 KiB)
`sRGB.icc` is already known to and downloaded by `tectonic`, so I don't know why it can't be found. `pdfa.xmpi` also looks like a file-finding issue, although this one is generated by the TeX process, and I don't know how intermediate files are kept between the `xetex` and `xdvipdfmx` passes in `tectonic`.
@mnrvwl You can at least try submitting PDFs generated by `xelatex test-pdfx.tex` from TeX Live to some validators, to see whether fixing the above errors in `tectonic` would make `pdfx` usable for you.
Hm, this is interesting! In my day job I was just having a conversation about the difficulty of producing PDF/A files using freely-available tools, which is an issue for at least some researchers whose funders require them to deposit PDF/A versions of their articles in archival repositories.
If the issue is disabling compression in the `xdvipdfmx` stage, that is very fixable: this mode is extensively exercised in Tectonic's test framework because compression results are not typically portable between platforms. I don't think that we currently have a way to enable this mode in the V1 or V2 interfaces, but it would not be hard to add.
@vlasakm intermediate files are all kept between the `xetex` and `xdvipdfmx` passes, so that shouldn't be an issue. I note that the `pdfa.xmpi` file is being written at the end of the `xdvipdfmx` pass ... do we need to run it twice??
Finally, to provide a truly smooth user experience, it sounds like we would need to patch the `pdfx` package, but there are more and more cases arising where I think that we are going to need to patch up the TeX Live sources, so that isn't a huge issue in and of itself.
In this case, the `pdfx` package should be updated upstream: its suggestions for ordinary XeLaTeX users are outdated, as is its check for the `\pdfcreationdate` primitive. If we wanted to help during the transition, we could just do the `\let\pdfcreationdate=\creationdate` internally in the engine.
Uncompressed PDF output, while not necessary here, would be a nice option in general, although I am not sure what form it should take. It seems that all that's needed is figuring out how to pass in the user's preference instead of the hard-coded `false` I have below:
--- a/src/driver.rs
+++ b/src/driver.rs
@@ -1820,6 +1820,8 @@ impl ProcessingSession {
engine.build_date(self.build_date);
+ engine.enable_compression(false);
+
if let Some(ref ps) = self.unstables.paper_size {
engine.paper_spec(ps.clone());
}
Yeah. I am guessing that if `pdfx` upstream is that out-of-date, it probably won't be very responsive to proposed updates, but that might not be a correct assumption.
It would be straightforward to add a compression flag to `Tectonic.toml`. But, depending on what other settings need to be modified in the processing, it might make more sense to add a `pdf/a` output format that not only disables compression, but also adjusts everything else at the same time.
Hm, I checked and found a report of two `pdfx` bugs on the `pdftex` mailing list, with no reply from the maintainers. So there is probably little chance of getting docs or code fixes merged.
To be honest, I have never needed PDF/A or PDF/X, so I am not sure about the details, but I imagine that, as with the accessibility (tagging) features, this would really benefit from LuaTeX, i.e. a more thorough approach, rather than the limited possibilities offered by TeX macros.
And I don't see a reason to put effort into improving Tectonic's specific abilities to output PDF format variations like PDF/A and PDF/X. If you mean `pdf/a` in the sense that `tectonic` tries to help with reproducibility, then yes, maybe, but that should be more about other things, not about compression, and having it as a separate output format doesn't seem right to me.
I just tried a simple example, and disabling PDF compression seems to disable compression for everything. Because of how compression works in PDF, this also means uncompressed images. Yes, that means raw uncompressed pixel data: e.g. my 162K PNG screenshot results in a 6.0 MiB PDF file (1920*1080*3 bytes for the image). Normally this would also apply to JPG files, but `dvipdfmx` seems to handle them differently (so they are kept compressed).
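For reference, the size estimate above works out as follows, assuming an 8-bit RGB image with no alpha channel (3 bytes per pixel); the remainder of the 6.0 MiB would then be the rest of the PDF:

```latex
\[
  1920 \times 1080 \times 3\ \mathrm{bytes}
  = 6\,220\,800\ \mathrm{bytes}
  = \frac{6\,220\,800}{1024^{2}}\ \mathrm{MiB}
  \approx 5.93\ \mathrm{MiB}
\]
```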
AFAICT #953 will solve the file loading issue. So the following compiles correctly:
% test-pdfx.tex
\let\pdfcreationdate=\creationdate
\documentclass{report}
\usepackage{pdfx}
\begin{document}
This is a test for pdfx.
\end{document}
$ target/debug/tectonic test-pdfx.tex
Running TeX ...
Rerunning TeX because "test-pdfx.aux" changed ...
Running xdvipdfmx ...
warning: 1024-byte read failed
caused by: failed to fill whole buffer
warning: Could not find any valid object.
warning: 1024-byte read failed
caused by: failed to fill whole buffer
Writing `test-pdfx.pdf` (11.64 KiB)
Skipped writing 3 intermediate files (use --keep-intermediates to keep them)
The missing support for XeTeX's `\creationdate` is a problem in upstream `pdfx`. Though we could patch either the engine or the bundle in some capacity, an upstream fix would be preferable IMO. The code concerned with `\creationdate` in `pdfx` is trickier than I am willing to get into now. The right person to contact seems to be Ross Moore.
Hey @vlasakm thanks a lot for researching this! You always provide invaluable insight 🖤
I 100% agree `pdfx` should get an update or two upstream.
Definitely `\let\pdfcreationdate=\creationdate` fixes the file loading issue. I made a wrapper function in my code using latex3's `\sys_if_engine_xetex:T` conditional that injects the `\let` when a XeTeX-based engine is detected, so all good on that front.
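A minimal sketch of what such a wrapper could look like (this is an illustration, not the actual code from my project). It assumes `expl3` can be loaded before `\documentclass`, and the extra `\cs_if_exist` guards are my own addition so that nothing is touched on engines without `\creationdate` or where `\pdfcreationdate` already exists:

```latex
% Harmless on recent LaTeX kernels (2020-02-02 and later), where expl3 is preloaded.
\RequirePackage{expl3}
\ExplSyntaxOn
% Only act on XeTeX-based engines (xelatex, tectonic).
\sys_if_engine_xetex:T
  {
    % Only if the engine really provides \creationdate ...
    \cs_if_exist:NT \creationdate
      {
        % ... and \pdfcreationdate is not already defined.
        \cs_if_exist:NF \pdfcreationdate
          { \cs_set_eq:NN \pdfcreationdate \creationdate }
      }
  }
\ExplSyntaxOff
\documentclass{report}
\usepackage{pdfx}
\begin{document}
This is a test for pdfx.
\end{document}
```

`\cs_set_eq:NN` is the expl3 counterpart of `\let`, so on XeTeX this should behave the same as the one-line workaround, while leaving LuaLaTeX and pdfLaTeX runs untouched.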
I got a bit of time to provide more info on how I use `pdfx`:
When loading the package, one generally passes options to select a particular PDF/A or PDF/X specification, version of the standard, variant, etc. These can be passed using `\PassOptionsToPackage` or inline as usual: `\usepackage[a-1b]{pdfx}` to make a PDF/A-1b file, or `\usepackage[x-1a]{pdfx}` to produce a PDF/X-1a file (see docs).
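For illustration, the `\PassOptionsToPackage` route could look like this (a sketch only; it is meant to be equivalent to `\usepackage[a-1b]{pdfx}`, and the `\let` line is the XeTeX workaround discussed above):

```latex
\let\pdfcreationdate=\creationdate
% Queue the option before the package is loaded ...
\PassOptionsToPackage{a-1b}{pdfx}
\documentclass{report}
% ... so a plain \usepackage (here or inside a class or another package) picks it up.
\usepackage{pdfx}
\begin{document}
This is a test for pdfx.
\end{document}
```

This form is handy when `pdfx` is loaded by a class or template that I don't control.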
Then I use a validator like Adobe Acrobat's Preflight feature. Admittedly it may not be complete, and it doesn't even cover all PDF/A or PDF/X versions, but let's hope, fingers crossed, that the same guys who made the format are also validating it correctly 🤞🏻
I tested your suggestion with `xelatex test-pdfx.tex`, and while the validator passes on XeTeX, it fails on Tectonic using said Preflight feature (picture). It seems that `xelatex` and `tectonic` are currently producing slightly different outputs.
I believe it is possible to generate apparently legitimate PDF/A and PDF/X using a combination of other means: `hyperxmp` to add the metadata, `xcolor` to use ICC, and adding the entry manually. Whether `pdfx` or any of the others could produce false positives that don't really comply I can't tell, because I haven't studied the standards in depth. `pdfx` seemed like a convenient package, but I'll check whether I can come up with another way that involves multiple packages, manual settings, etc.
MWE
\let\pdfcreationdate=\creationdate
\documentclass{report}
\usepackage[a-1b]{pdfx}
\begin{document}
This is a test for pdfx.
\end{document}
How to reproduce
- Analyze the PDF with Adobe Acrobat's `Standards > Preflight`
Attachments
[Image: tectonic-pdf:a-1b]
Very interesting discussion and great work on this topic! I'm afraid that I just don't have the hours in the day to help solve this problem, but if you have any questions for me I will do my best to answer them.