ICML writer: add FirstParagraph and Bibliography styles.
Closes #11268.
Not sure this is the right approach. Would it be better to add FirstParagraph in addition to Paragraph? (And similarly for Bibliography?) If the styles are nestable in this way (I don't know a thing about ICML), then this would be less disruptive, as people who have customized the Paragraph style would not need to do anything special when using the new writer.
Alternatively, perhaps we could have the style define the FirstParagraph and Bibliography styles in terms of Paragraph. (But this is less ideal for various reasons.)
Nice! But I don't quite understand how the "in addition" part would work (I don't know enough about XML). I also don't know whether ICML or InDesign would support that, since in InDesign each style can have only one "based on" style, and each paragraph can have only one style.
Good point about disruption. It is true that the previous suggestion might break layouts until the designer sets FirstParagraph to be based on Paragraph.
But I did a quick test to see how InDesign references parent styles. It seems that the FirstParagraph and Bibliography styles can be based on Paragraph like this:
<BasedOn type="object">ParagraphStyle/Paragraph</BasedOn> (replaces $ID/NormalParagraphStyle)
The attached example does this and loads as expected in InDesign. When relinking an already placed ICML file to a new version, the following happens:
- if the style exists, the newly imported style does not override it
- if the style does not exist, it is added to the style list
I guess doing it like this (basing FirstParagraph and Bibliography on Paragraph) would mean that everything works as before. In old layouts the additional styles would appear in the style list and could be defined from there or ignored completely.
Yes, we could do it this way, but it would require some futzing with the way styles are now generated.
What about the other option: simply assigning both Paragraph and FirstParagraph styles? Does that work? Can styles override each other like in CSS?
I don't think multiple styles can be added to the same element. In ICML the paragraph style is set with an attribute, and to my understanding an XML attribute can have only one value. Since the value is a string, adding new items to it would just make it a different string. Trying to add more "AppliedParagraphStyle" key-value pairs to a ParagraphStyleRange element produces the error message "Duplicate attribute".
But nested definitions like this do seem to work, if they are a better match for pandoc:
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph">
  <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/FirstParagraph">
    <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
      <Content>First paragraph.</Content>
    </CharacterStyleRange>
  </ParagraphStyleRange>
</ParagraphStyleRange>
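The "Duplicate attribute" error mentioned above is not specific to InDesign: a repeated attribute makes the XML itself ill-formed, so any conforming parser should reject it. Here is a minimal sketch using Python's standard library; the two fragments and the helper name `is_well_formed` are made up for illustration, and this only checks XML well-formedness, not how InDesign interprets the nested form.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the same attribute given twice on one element.
DUPLICATE = (
    '<ParagraphStyleRange '
    'AppliedParagraphStyle="ParagraphStyle/Paragraph" '
    'AppliedParagraphStyle="ParagraphStyle/FirstParagraph"/>'
)

# Hypothetical fragment: one attribute per element, with the ranges nested.
NESTED = (
    '<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph">'
    '<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/FirstParagraph">'
    '<CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">'
    '<Content>First paragraph.</Content>'
    '</CharacterStyleRange>'
    '</ParagraphStyleRange>'
    '</ParagraphStyleRange>'
)


def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(DUPLICATE))  # the parser rejects the repeated attribute
print(is_well_formed(NESTED))     # the nested form is at least well-formed XML
```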
We do use multiple styles, though. E.g. for block quotes.
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Blockquote > Paragraph">
  <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
    <Content>hi</Content>
  </CharacterStyleRange>
</ParagraphStyleRange>
Hmm, I see (and maybe finally understand, too), but isn't that just a single style that is named to look like two styles, from the XML point of view? There is a specific style with a matching name in the ICML root styles section:
<ParagraphStyle Self="ParagraphStyle/Blockquote > Paragraph" Name="Blockquote > Paragraph" LeftIndent="10">
  <Properties>
    <BasedOn type="object">$ID/NormalParagraphStyle</BasedOn>
  </Properties>
</ParagraphStyle>
Pandoc's multiple styles are a single style in ICML (one key-value pair). Every style that is used must also exist in the ICML root styles section to work. If, e.g., "ParagraphStyle/Paragraph > first" does not exist there, it is undefined, and paragraphs referring to it will default to [Basic Paragraph] in InDesign. InDesign does not combine styles from a list of styles; it reads the "list of styles" as one specific style name. Because of this, these styles look slightly odd: "Blockquote > Paragraph" instead of "Blockquote" (styles can be renamed, but if the linked ICML is updated the old styles appear again and the renamed styles go unused).
So while the "Name" and "Self" attribute values can look like a list of styles, that list is a single string identifier in ICML. A working example that mimics this is attached.
It would be nice if the new first-paragraph root style had a BasedOn tag with "ParagraphStyle/Paragraph" to keep existing layouts the same (as in the attached example ICML). But for me, getting first paragraphs and the bibliography tagged is more important, as that change is quick to make in InDesign.
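Since an applied style that is missing from the root styles section falls back to [Basic Paragraph], it may help to check an ICML file for such references before placing it. The following is only a sketch: the sample document is a heavily simplified, hypothetical ICML skeleton, the function name is invented here, and skipping names that start with `$ID/` rests on the assumption that those are InDesign built-ins that need no definition in the file.

```python
import xml.etree.ElementTree as ET

# Heavily simplified, hypothetical ICML skeleton (real files carry much more).
SAMPLE = """<Document>
  <RootParagraphStyleGroup>
    <ParagraphStyle Self="ParagraphStyle/Blockquote > Paragraph"
                    Name="Blockquote > Paragraph"/>
  </RootParagraphStyleGroup>
  <Story>
    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Blockquote > Paragraph"/>
    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph > first"/>
  </Story>
</Document>"""


def undefined_paragraph_styles(icml_text):
    """Return applied paragraph style identifiers that have no definition."""
    root = ET.fromstring(icml_text)
    # Each style is identified by the single string in its Self attribute.
    defined = {s.get("Self") for s in root.iter("ParagraphStyle")}
    used = {
        r.get("AppliedParagraphStyle")
        for r in root.iter("ParagraphStyleRange")
    }
    used.discard(None)
    # Assumption: "$ID/..." names are built-ins and need no definition.
    used = {name for name in used if not name.startswith("$ID/")}
    return used - defined


print(undefined_paragraph_styles(SAMPLE))
```

The comparison is plain string equality on the whole identifier, which mirrors the point that "Blockquote > Paragraph" is one name rather than a combination of two styles.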
Hm. Wouldn't we also need to say what Blockquote > FirstParagraph is based on? (And so on?)
Hmm, true, I didn't think of that. But yes, if those styles are added, it would be preferable to base them on the element's basic style. Then the change would be invisible in existing layouts with linked content (and ready for styling as a bonus, since these new styles just appear in InDesign's style list). It also makes sense to have them (e.g. for styling the first paragraph of a blockquote).
If every block has a first paragraph, can that create a long list of emitted styles in ICML with nested content (I have no clue about pandoc's nesting levels or rules)? E.g. BlockQuote > FirstParagraph > BlockQuote > FirstParagraph > List > FirstItem etc.? That might look a little messy in InDesign.
Block quotes can't be nested inside paragraphs. So you should only have FirstParagraph at the end of one of these sequences.
That sounds fine! Is this complicated to do? Bibliography paragraphs probably have no need for a "first" style, and maybe there are others that can be omitted.
OK, I have implemented a system where FirstParagraph is based on Paragraph, x > y > FirstParagraph is based on x > y > Paragraph, etc. Please test this thoroughly!
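To make the naming rule concrete for testers, here is a small Python sketch of the mapping as described in the comment above. It is not the writer's actual Haskell code; the function name `based_on` and the " > " separator handling are assumptions made for illustration.

```python
def based_on(style_name):
    """Return the style a FirstParagraph variant would be based on.

    Sketch of the rule from the discussion: swap a trailing
    "FirstParagraph" component for "Paragraph" and keep the prefix.
    Returns None for names that do not end in "FirstParagraph".
    """
    parts = style_name.split(" > ")
    if parts[-1] != "FirstParagraph":
        return None
    return " > ".join(parts[:-1] + ["Paragraph"])


print(based_on("FirstParagraph"))          # Paragraph
print(based_on("x > y > FirstParagraph"))  # x > y > Paragraph
print(based_on("Blockquote > Paragraph"))  # None
```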
Thanks, I managed to compile it and it works! I tried it with a ≈160k-character document and at least no content was lost compared to ordinary pandoc.
I'm wondering what the best practices are for style naming and generation.
The current version emits FirstParagraph for every style, including custom styles. This can lead to a somewhat messy set of styles compared to e.g. docx export. Docx styles are also named differently (e.g. "Body Text" and "First Paragraph" vs "Paragraph" and "FirstParagraph"), but I guess pandoc does not have codified style names across writers?
Currently the first-item style for bullet lists is named "BulList > first". Would it be good practice to name FirstParagraph similarly, as "Paragraph > first"? And maybe omit "FirstParagraph" and "Paragraph" for the bibliography and possibly for custom styles (and treat custom styles as one exact style)? Bibliography could just be "Bibliography". Also, would it be feasible to generate "FirstParagraph" after a forced empty line (to me, empty space indicates a new section of text even without a heading)?
Also, the current FirstParagraph styles don't have a base style set (maybe not a problem).
Would it be possible to skip some of these styles (or alternatively enable them), e.g. in a metadata block? That might be a bad fit for document sources other than Markdown, and adding command-line options for this is probably not wanted either.
Example of the styles generated in InDesign (Markdown with some custom styles; "[Basic Paragraph]" is InDesign's base style):
| ICML | DOCX |
| --- | --- |
| [Basic Paragraph] | [Basic Paragraph] |
| BulList | Normal |
| BulList > first | Body Text |
| Footnote > Paragraph | First Paragraph |
| Header1 | Compact |
| Header1 (unnumbered) | Title |
| Header2 | Author |
| Header3 | Bibliography |
| Paragraph | Heading 1 |
| author > Paragraph | Heading 2 |
| quote > Blockquote > Paragraph | Heading 3 |
| quote > Paragraph | Block Text |
| quoteauthor > Blockquote > Paragraph | Footnote Text |
| quoteauthor > Paragraph | quote |
| thanks > BulList | quoteauthor |
| thanks > BulList > first | thanks |
| thanks > Header2 | |
| thanks > Paragraph | |
| title > Header1 | |
| Bibliography > Paragraph | |
| Bibliography > FirstParagraph | |
| FirstParagraph | |
| author > FirstParagraph | |
| quote > FirstParagraph | |
| quoteauthor > Blockquote > FirstParagraph | |
| thanks > FirstParagraph | |