prawn
Is it possible to add accessibility tags throughout the PDF?
Is it possible to add accessibility tags such as "paragraph" to text and alt text to images?
https://github.com/prawnpdf/prawn/issues/1077
We are using Prawn at the moment and, with the new accessibility laws, need to find a solution that allows tagging our generated PDF files. Tagging a PDF would be a useful addition. 👍
@brandoncc @sakamies This can already be done, but only in a very time-consuming way. Generally, it will be rather hard to do because Prawn has no notion of paragraphs, headers, and so on.
So you would probably need to do it yourself with something similar to what is shown in #1077 for the data and, additionally, you would need to insert special markup into the content stream for this to work.
As I said this is no small task.
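For orientation, the document-level half of that could look roughly like the sketch below. It is not taken from #1077. It assumes the catalog dictionary is reachable as a plain Ruby Hash (in Prawn that would presumably be `state.store.root.data`), and `mark_catalog_as_tagged` is a hypothetical helper name. The `/MarkInfo`, `/Lang`, and `/StructTreeRoot` keys are the ones the Tagged PDF chapter of the spec describes.

```ruby
# Hypothetical helper: adds the document-level entries a tagged PDF needs
# to a catalog dictionary represented as a Ruby Hash.
def mark_catalog_as_tagged(catalog, lang:, struct_tree_root:)
  catalog[:MarkInfo] = { Marked: true }        # declares the file as tagged
  catalog[:Lang] = lang                        # natural language of the document
  catalog[:StructTreeRoot] = struct_tree_root  # root of the structure tree
  catalog
end

# Stand-in for the catalog data. With Prawn this would presumably be
# document.state.store.root.data (an assumption, check pdf-core), and the
# structure tree root would have to be an indirect object, not an inline Hash.
catalog = { Type: :Catalog }
struct_tree_root = { Type: :StructTreeRoot, K: [] }

mark_catalog_as_tagged(catalog, lang: "en-US", struct_tree_root: struct_tree_root)
```

This only covers the catalog. The structure tree still has to be filled with elements that point at the marked content on each page.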
Appreciate the response. Have to keep searching for a practical solution to this.
@gettalong if I were to take on this task, do you have any code or resources you could point me toward on the topic? I don't know anything at all about accessibility in PDFs, but I have been tasked with this, so I am trying to figure out how we can achieve it.
I see on that linked issue that he sets data on the root, but how would I do the same on a specific page?
Also, you mentioned special markup that would be needed. Can you point me toward any resources that describe that markup so I can learn about it?
Thank you for the response!
@brandoncc If you wanna/have to do this, be prepared for much work and much to learn.
You should get yourself a copy of the PDF specification. The PDF 1.7 spec is freely available but if you have the possibility/money, get the newer PDF 2.0 spec because, as far as I have heard, the chapters on tagged PDF have been expanded and are better explained there.
Once you have the spec, read up on sections (I reference the 1.7 spec) 14.8 "Tagged PDF" and 14.9 "Accessibility Support". After that have a look at Prawn's code (split between prawn and pdf-core) to see how it internally works with respect to: the PDF object structure (document object, page objects, references) - you might wanna read sections 7.3 and 7.7 for a better understanding. See https://github.com/prawnpdf/pdf-core/blob/master/lib/pdf/core/document_state.rb, https://github.com/prawnpdf/pdf-core/blob/master/lib/pdf/core/object_store.rb, https://github.com/prawnpdf/pdf-core/blob/master/lib/pdf/core/reference.rb, https://github.com/prawnpdf/prawn/blob/master/lib/prawn/document.rb#L203.
Next, follow Prawn's code from the #text method down to where the actual PDF instructions are written to get a grasp on what is involved. This is purely for you to see how Prawn implements things. Read up on sections 7.8 and 14.6; the latter describes the marked content operators needed for tags.
When you have a firm grasp on things, decide on how you want to tackle this. First thing that came to my mind: defining #paragraph, #header, ... methods on Prawn's Document class (or better, in a mixin that gets included, see https://github.com/prawnpdf/prawn/blob/master/lib/prawn/document.rb#L62). These would only provide the marked content operations and would need to be called from the appropriate places in your app. They would yield to a block which then actually defines the contents. Please note that I haven't read the spec in detail, so I don't know if this approach would work.
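To make the idea a bit more concrete, here is a rough, untested-against-Prawn sketch of such a mixin. `TaggedContent`, `tagged`, and `FakeDocument` are made-up names, and the only thing the mixin assumes about its host is an `add_content(string)` method that appends raw operators to the current content stream. The stand-in class exists only so the snippet runs on its own; its `text` method is a simplification, not what Prawn actually writes.

```ruby
# Sketch of a tagging mixin. Assumes the including class responds to
# add_content(string), which appends raw operators to the content stream.
module TaggedContent
  # Wraps whatever the block draws in a marked-content sequence:
  #   /Tag <</MCID n>> BDC ... EMC
  # Returns the MCID so a structure element can reference it later.
  def tagged(tag)
    mcid = next_mcid
    add_content("/#{tag} <</MCID #{mcid}>> BDC")
    begin
      yield
    ensure
      add_content("EMC") # always close the sequence, even on errors
    end
    mcid
  end

  def paragraph(&block)
    tagged(:P, &block)
  end

  def header(level = 1, &block)
    tagged("H#{level}", &block)
  end

  private

  # MCIDs only need to be unique per page; a document-wide counter is
  # used here purely to keep the sketch short.
  def next_mcid
    @next_mcid ||= 0
    mcid = @next_mcid
    @next_mcid += 1
    mcid
  end
end

# Stand-in for a document class, NOT Prawn::Document.
class FakeDocument
  include TaggedContent
  attr_reader :stream

  def initialize
    @stream = []
  end

  def add_content(str)
    @stream << str
  end

  # Simplified text drawing, just enough to show the nesting.
  def text(string)
    add_content("BT (#{string}) Tj ET")
  end
end

doc = FakeDocument.new
doc.header(1) { doc.text("Title") }
doc.paragraph { doc.text("Body") }
puts doc.stream
```

The marked content on its own does not make a PDF tagged: each returned MCID would still have to be referenced from a structure element in the structure tree, which this sketch leaves out entirely.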
@brandoncc Ah, forgot to include: There are several good videos from the PDF Association regarding tagged PDF and accessibility, e.g. https://www.youtube.com/watch?v=Y0cEYt84sfE and https://www.youtube.com/watch?v=4xd2tpyaUwg.
Thank you so much for all of this wonderful direction. I sincerely appreciate it!
> When you have a firm grasp on things, decide on how you want to tackle this. First thing that came into my mind: Defining #paragraph, #header, ... methods on Prawn's Document class (or better in a mixin that gets included, see https://github.com/prawnpdf/prawn/blob/master/lib/prawn/document.rb#L62). These would only provide the marked content operations and would need to be called from the appropriate places in your app. They would yield to a block which then actually defines the contents. Please note that I haven't read the spec in detail to know if this approach would work.
Any chance these could be included in prawn?
@matthewford As far as I know there is no implementation of the idea, so there is nothing to include in Prawn.