[#712] Optional Content (aka PDF Layers) implementation
This is an implementation of the missing Optional Content / PDF Layers functionality (#712).
To introduce layer support, I tried to leverage the existing code base as much as possible:
- On input, the inclusion of HTML contents in layers is defined via non-inheritable extension CSS properties (`-fs-ocg-*` for contents belonging to optional content groups and `-fs-ocm-*` for contents belonging to optional content memberships). NOTE: An alternative implementation could move these definitions out of common rulesets into extension at-rules (`@-fs-ocg` for optional content groups and `@-fs-ocm` for optional content memberships), yielding somewhat tidier stylesheets at the cost of dedicated structures beside the existing standard at-rules (such as `@font-face` and `@import` rules) in `com.openhtmltopdf.css.sheet.Stylesheet`.
- On output, during page painting, the existing tagging calls (see `PdfBoxFastOutputDevice.startStructure(..)` and `endStructure(..)`) are intercepted, because their granularity is semantically compatible with layers, to inject layers into the content stream. To avoid ghost layer fragments (not all tagging calls wrap actual contents!), layer injection is applied lazily, on actual content painting calls.
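The lazy injection described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code of this PR: the class `LazyLayerWriter` and its methods are hypothetical, and it only models the idea that a layer's `BDC` operator is written on the first real paint call inside a scope, and that `EMC` is written only if `BDC` was.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of lazy layer injection; not the PR's actual classes. */
class LazyLayerWriter {
    /** One open structure scope: its layer resource name and whether BDC was written. */
    private static final class Scope {
        final String layerName; // e.g. "oc1"; null when the scope carries no layer
        boolean opened;         // true once "/OC /<name> BDC" has been written

        Scope(String layerName) {
            this.layerName = layerName;
        }
    }

    private final Deque<Scope> scopes = new ArrayDeque<>();
    private final StringBuilder stream = new StringBuilder();

    /** Called where a tagging scope starts; nothing is written yet. */
    void startStructure(String layerName) {
        scopes.addLast(new Scope(layerName));
    }

    /** Called on actual painting: opens every pending layer first, outermost first. */
    void paint(String operators) {
        for (Scope scope : scopes) { // iterates from the outermost to the innermost scope
            if (scope.layerName != null && !scope.opened) {
                stream.append("/OC /").append(scope.layerName).append(" BDC\n");
                scope.opened = true;
            }
        }
        stream.append(operators).append('\n');
    }

    /** Called where a tagging scope ends; EMC is written only if BDC was written. */
    void endStructure() {
        Scope scope = scopes.removeLast();
        if (scope.opened) {
            stream.append("EMC\n");
        }
    }

    String contentStream() {
        return stream.toString();
    }
}
```

With this approach, a scope that starts and ends without any paint call in between leaves no trace in the stream, which is the "ghost fragment" case mentioned above.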
Layer types:
- simple layer (aka group):
  - identity (`-fs-ocg-id`), which is used for reference (as parent of other groups or as member of memberships).
  - label (`-fs-ocg-label`), which maps to `PDOptionalContentGroup.getName()` (see the `Name` entry in the Optional Content Group dictionary) and is displayed in the viewer's layer tree.
  - visibility (`-fs-ocg-visibility={visible|hidden}`), which maps to `PDOptionalContentProperties.isGroupEnabled(..)` (see the `BaseState`, `ON` and `OFF` entries of the default Optional Content Configuration dictionary, i.e. the `D` entry of the document's Optional Content Properties dictionary).
  - parent (`-fs-ocg-parent={%ocg-id%}`), which maps to the `Order` entry of the same `D` configuration dictionary, for nesting into the viewer's layer tree. Unfortunately, arbitrary nesting does not seem to be natively supported by the currently used PDFBox version (2.0), as adding a group via `PDOptionalContentProperties.addGroup(..)` automatically builds a flat list inside the `Order` entry instead.
- compound layer (aka membership):
  - identity (`-fs-ocm-id`), which is used for internal reference.
  - visibility policy (`-fs-ocm-visible={all-visible|all-hidden|any-visible|any-hidden}`), which maps to `PDOptionalContentMembershipDictionary.getVisibilityPolicy()` (see the `P` entry of the Optional Content Membership dictionary).
  - members (`-fs-ocm-ocgs={%ocg-id%...}`), which map to `PDOptionalContentMembershipDictionary.getOCGs()` (see the `OCGs` entry of the Optional Content Membership dictionary).
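To make the membership visibility policies more concrete, here is a small standalone sketch of how a viewer would evaluate them. It assumes that the four CSS keywords map to the PDF `P` values `AllOn`, `AllOff`, `AnyOn` and `AnyOff` in the obvious way; that mapping, the enum `VisibilityPolicy` and its methods are illustrative assumptions, not code from this PR or from PDFBox.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical model of the membership visibility policy (P entry). */
enum VisibilityPolicy {
    ALL_VISIBLE("AllOn"),
    ANY_VISIBLE("AnyOn"),
    ANY_HIDDEN("AnyOff"),
    ALL_HIDDEN("AllOff");

    /** Name of the corresponding P value in the PDF specification. */
    final String pdfName;

    VisibilityPolicy(String pdfName) {
        this.pdfName = pdfName;
    }

    /** Parses a CSS keyword such as "any-hidden" (assumed mapping). */
    static VisibilityPolicy fromCss(String keyword) {
        return valueOf(keyword.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
    }

    /**
     * Tells whether membership content is shown, given the current on/off
     * state of each group. Groups missing from the map are treated as visible.
     */
    boolean isVisible(List<String> memberIds, Map<String, Boolean> groupStates) {
        switch (this) {
            case ALL_VISIBLE:
                return memberIds.stream().allMatch(id -> groupStates.getOrDefault(id, true));
            case ANY_VISIBLE:
                return memberIds.stream().anyMatch(id -> groupStates.getOrDefault(id, true));
            case ANY_HIDDEN:
                return memberIds.stream().anyMatch(id -> !groupStates.getOrDefault(id, true));
            case ALL_HIDDEN:
                return memberIds.stream().allMatch(id -> !groupStates.getOrDefault(id, true));
            default:
                throw new IllegalStateException("Unknown policy: " + this);
        }
    }
}
```

For instance, a membership over two groups with policy `all-hidden` is shown only while both groups are switched off, and disappears as soon as either of them is switched on.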
For the sake of consistency, each content element inherits the full layer hierarchy of its ancestor nodes. For example, given
<div class="ocg2">
<p class="ocg1">This is a layered block inside another layered block (OCG 2/OCG 1).</p>
</div>
that paragraph element is rendered in the following way inside the content stream (NOTE: layer resource name assignment is an implementation detail internal to the PDF library (PDFBox); for clarity, here we assume that `/oc2` maps to layer `ocg2` and `/oc1` maps to layer `ocg1`):
/OC /oc2 BDC
/OC /oc1 BDC
. . .
(This is a layered block inside another layered block \(OCG 2/OCG 1\)) Tj
. . .
EMC
EMC
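The inheritance rule illustrated by this example can be sketched as a walk up the ancestor chain. The snippet below is only an illustrative model: `LayeredNode` is a hypothetical stand-in for a box in the render tree, and the resource names (`oc1`, `oc2`) are assumed, as in the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical node carrying an optional layer resource name. */
class LayeredNode {
    final LayeredNode parent; // null for the root
    final String layerName;   // e.g. "oc1"; null when the node has no layer

    LayeredNode(LayeredNode parent, String layerName) {
        this.parent = parent;
        this.layerName = layerName;
    }

    /** Collects the layers of this node and its ancestors, outermost first. */
    List<String> layerChain() {
        Deque<String> chain = new ArrayDeque<>();
        for (LayeredNode node = this; node != null; node = node.parent) {
            if (node.layerName != null) {
                chain.addFirst(node.layerName);
            }
        }
        return new ArrayList<>(chain);
    }

    /** Wraps painting operators in one BDC/EMC pair per inherited layer. */
    String wrap(String operators) {
        List<String> chain = layerChain();
        StringBuilder out = new StringBuilder();
        for (String name : chain) {
            out.append("/OC /").append(name).append(" BDC\n");
        }
        out.append(operators).append('\n');
        for (int i = 0; i < chain.size(); i++) {
            out.append("EMC\n");
        }
        return out.toString();
    }
}
```

Modelling the `div` as `new LayeredNode(null, "oc2")` and the paragraph as its child with layer `oc1`, wrapping the paragraph's text operator yields the same nesting as in the content stream above.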
[PR commit: 59c4701725978aeabefeab0cc80723debdffe471]
Here is a demonstration of its use (see the generating code below):
- Initial state (note that contents in layer "OCG 2" are hidden):
- All layers visible (note that contents in layer "OCG 2" are displayed):
- Membership's visibility policy (note that, when layers "OCG 2" and "OCG 3" are hidden, the pink paragraph in case 9 is displayed):
Users can of course toggle each layer interactively in their PDF viewer.
Generated PDF: 712-ocg.pdf
Source HTML: 712-ocg.html
Generating code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.pdfbox.pdmodel.PDDocument;

import com.openhtmltopdf.pdfboxout.PdfBoxRenderer;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;

public class LayerCase {
    public static void main(String[] args) throws Exception {
        // The caller owns the document, so it stays open after rendering.
        try (PDDocument document = new PDDocument()) {
            // Render the HTML source into the caller-owned document.
            try (PdfBoxRenderer renderer = new PdfRendererBuilder()
                    .usePDDocument(document)
                    .testMode(true)
                    .withFile(new File("712-ocg.html"))
                    .buildPdfRenderer()) {
                renderer.createPDFWithoutClosing();
            }
            // Save the rendered document once the renderer has been closed.
            try (OutputStream os = new FileOutputStream("712-ocg.pdf")) {
                document.save(os);
            }
        }
    }
}