[Question] How to export an untagged PDF?
I noticed that the PDF file exported via Playwright is much larger than expected. I think this is because Playwright exports a tagged PDF by default, which seems to have been introduced in Chrome 85.
However, the tagged PDF is dozens of times larger than the untagged one, which is not acceptable to us.
I found that if I save an HTML page to PDF manually via the Edge (or Chrome) browser, with "Save as PDF" as the default option and the drop-down left untouched, the browser creates a tagged PDF. But if I change the drop-down to any other option and then change it back, it creates an untagged PDF.
From this result, I infer that there must be an option inside Edge/Chrome/Chromium that determines whether a tagged or untagged PDF is generated.
Now the question is: does anyone know what this option is? How can I always export an untagged PDF via Playwright?


@heku Indeed, this place suggests that we always generate tagged PDFs.
@dgozman Thanks. However, our app users don't think so: a 20 MB+ PDF (80 pages) is not acceptable to them, whereas the untagged PDF is smaller than 1 MB. Even though tagged PDF has its benefits, it would be better to have an option to choose between tagged and untagged.
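When comparing the two outputs, it can help to check which kind a given file actually is. The following is a minimal sketch, not a full PDF parser: it assumes a tagged PDF references a structure tree through a `/StructTreeRoot` key in its document catalog, and that this key appears uncompressed in the file. The helper names `containsAscii` and `looksTagged` are hypothetical.

```typescript
// Return true if the ASCII string `needle` occurs anywhere in `bytes`.
function containsAscii(bytes: Uint8Array, needle: string): boolean {
  const n = needle.length;
  outer: for (let i = 0; i + n <= bytes.length; i++) {
    for (let j = 0; j < n; j++) {
      if (bytes[i + j] !== needle.charCodeAt(j)) continue outer;
    }
    return true;
  }
  return false;
}

// Heuristic only: a tagged PDF carries a /StructTreeRoot entry in its catalog.
// If the catalog is stored inside a compressed object stream, this scan will
// miss it and report false.
function looksTagged(pdf: Uint8Array): boolean {
  return containsAscii(pdf, "/StructTreeRoot");
}
```

The bytes can come from reading the exported file from disk or from the buffer that page.pdf() resolves to. A `false` result from this scan is weaker evidence than a `true` one, for the reason given in the comment.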
See https://bugs.chromium.org/p/chromium/issues/detail?id=1309963 for upstream Chromium issue.
Thanks, glad to know that the Chromium team has noticed the issue. I'm just curious why manually changing the drop-down to another option and back leads to an untagged PDF. Very interesting.
@dgozman Upstream Chromium added the --disable-pdf-tagging switch in version 104.0.5101.0.
Not super cool as I guess it's not an option that can be set when calling playwright's page.pdf() method.
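Since the switch is a browser command-line flag rather than a page.pdf() option, the natural place to try it is the `args` list passed at launch. The sketch below shows that wiring under stated assumptions: the interfaces are structural stand-ins for the small part of Playwright's API used here (so the block does not import the library), and `launchOptionsWithoutTagging` and `exportPdf` are hypothetical helper names. Whether the flag actually changes the output depends on the bundled Chromium version; a later comment in this thread reports that it had no effect in one setup.

```typescript
// Structural stand-ins for the parts of Playwright's API used in this sketch.
interface PdfPage {
  goto(url: string): Promise<unknown>;
  pdf(options?: { path?: string }): Promise<Uint8Array>;
}
interface BrowserLike {
  newPage(): Promise<PdfPage>;
  close(): Promise<void>;
}
interface LauncherLike {
  launch(options?: { args?: string[] }): Promise<BrowserLike>;
}

// The switch named in this thread (reported for Chromium 104.0.5101.0 and later).
const DISABLE_PDF_TAGGING = "--disable-pdf-tagging";

// Hypothetical helper: add the switch to existing launch args without duplicating it.
function launchOptionsWithoutTagging(extraArgs: string[] = []): { args: string[] } {
  const args = extraArgs.filter((a) => a !== DISABLE_PDF_TAGGING);
  return { args: [...args, DISABLE_PDF_TAGGING] };
}

// Hypothetical helper: launch with the switch, print one URL to PDF,
// and close the browser even if navigation or printing fails.
async function exportPdf(
  launcher: LauncherLike,
  url: string,
  outPath: string,
): Promise<Uint8Array> {
  const browser = await launcher.launch(launchOptionsWithoutTagging());
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.pdf({ path: outPath });
  } finally {
    await browser.close();
  }
}
```

With Playwright installed, passing its `chromium` object as `launcher` (for example `exportPdf(chromium, "https://example.com", "out.pdf")`) is the intended usage, on the assumption that `chromium` matches the `LauncherLike` shape above.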
Thx @kapouer
Hello @dgozman @kapouer, when using Playwright to convert a URL/web page/HTML to PDF, it always creates a tagged PDF, even when passing --disable-pdf-tagging as an argument to Chromium. Could you please let us know how we can generate an untagged PDF using Playwright?
I don't know. If you really need to optimize the PDF, I suggest running Ghostscript on top of it. See how it's done in https://github.com/kapouer/express-dom-pdf
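As a rough illustration of the Ghostscript suggestion, the sketch below rewrites an exported PDF through Ghostscript's `pdfwrite` device from Node. It is not taken from express-dom-pdf and may differ from what that project does. It assumes a `gs` binary is on the PATH (on Windows the executable is usually named differently), and the `/ebook` preset is only an example choice. `ghostscriptArgs` and `shrinkPdf` are hypothetical names. How much the file shrinks, and whether the tag structure is dropped, is not confirmed in this thread.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Build the Ghostscript command line for rewriting a PDF with pdfwrite.
// `preset` is one of Ghostscript's PDFSETTINGS values such as /screen,
// /ebook, /printer or /prepress.
function ghostscriptArgs(
  inputPath: string,
  outputPath: string,
  preset: string = "/ebook",
): string[] {
  return [
    "-sDEVICE=pdfwrite",
    `-dPDFSETTINGS=${preset}`,
    "-dNOPAUSE", // do not pause between pages
    "-dBATCH", // exit after processing the input
    "-dQUIET", // suppress routine startup messages
    `-sOutputFile=${outputPath}`,
    inputPath,
  ];
}

// Run Ghostscript on an already exported PDF. Assumes `gs` is on the PATH;
// the returned promise rejects if the process cannot start or exits non-zero.
async function shrinkPdf(inputPath: string, outputPath: string): Promise<void> {
  await execFileAsync("gs", ghostscriptArgs(inputPath, outputPath));
}
```

A typical flow would be to write the PDF with page.pdf({ path: "raw.pdf" }) and then call `shrinkPdf("raw.pdf", "small.pdf")`, comparing the two file sizes before adopting it.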