Escape any HTML special characters in GraphViz HTML

Open • 17o2 opened this issue 4 years ago • 1 comment

YAML and Python accept special characters (<, >, etc.) within strings without problems, but they cause issues when such a string is embedded inside the GraphViz HTML. Therefore, these characters should be escaped (&lt;, &gt;, etc.) when generating GraphViz HTML.
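
As a minimal sketch of the escaping proposed here, Python's standard `html.escape` already covers the characters in question. The function name `escape_for_gv_html` is hypothetical and not part of WireViz.

```python
import html


def escape_for_gv_html(text: str) -> str:
    """Escape &, <, > (and quote characters) before embedding text in GraphViz HTML."""
    # html.escape replaces & first, so already-escaped output is not double-mangled
    # within a single call.
    return html.escape(text)


print(escape_for_gv_html("Length < 5 m & AWG > 20"))
```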

17o2, Mar 23 '21 22:03

We need clear rules on how to handle such characters when generating the different output formats, as they have different limitations:

  • .bom.tsv does not support TAB, CR, or LF in text fields. Hyperlinks and formatting tags are not supported as such.
  • .gv designators probably have the same limitations as above, and also cannot contain characters that Graphviz interprets as other syntax elements, unless quoted.
  • .gv HTML labels support a limited set of formatting tags and no hyperlinks in the text (only as table attributes). TAB, CR, and LF might improve file readability, but are treated as spaces when rendered. <br/> is needed to force a line break. See doc.
  • .html supports hyperlinks and a wider set of formatting tags. TAB, CR, and LF might improve file readability, but are treated as spaces when rendered. <br/> is needed to force a line break.
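
To make the non-HTML limitations in the list above concrete, here is a rough sketch of two helpers. Both names (`clean_tsv_field`, `quote_gv_id`) are hypothetical and not taken from WireViz. The quoting helper relies only on the DOT rule that a double-quoted ID may contain escaped quotes (`\"`).

```python
import re


def clean_tsv_field(text: str) -> str:
    """Collapse each run of TAB, CR or LF into one space for a .bom.tsv field."""
    return re.sub(r"[\t\r\n]+", " ", text)


def quote_gv_id(text: str) -> str:
    """Wrap a designator in double quotes so DOT does not parse it as syntax."""
    # Inside a quoted DOT ID, an embedded double quote is written as \".
    return '"' + text.replace('"', '\\"') + '"'


print(clean_tsv_field("Cable\tW1\r\nshielded"))
print(quote_gv_id("X1->X2"))
```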

I agree that we should probably escape HTML special characters and convert newlines to <br/> (and perhaps replace('\u00b2', '&sup2;')) for the two HTML output formats. However, it should also be possible to disable all of this when the user wants to include hyperlinks or formatting tags, e.g. bold or italic. Such markup would be used in the output formats that support it and filtered out in the others (already partly implemented in #164). I guess we need a way to specify which of these two alternatives applies to each input text attribute.

Would it be possible to use a leading specifier flag in the attribute text to select the non-default alternative? E.g. a text attribute with a leading < character (or perhaps something like <!wireviz!>, which may be better) would select the second alternative.
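
A rough sketch of how the two alternatives could be selected per attribute, assuming the `<!wireviz!>` prefix is the chosen flag. The names `RAW_MARKER` and `to_html_text` are hypothetical, and nothing here reflects existing WireViz code.

```python
import html

# Hypothetical opt-out flag, borrowed from the suggestion above.
RAW_MARKER = "<!wireviz!>"


def to_html_text(text: str) -> str:
    """Prepare an input text attribute for the two HTML output formats."""
    if text.startswith(RAW_MARKER):
        # Non-default alternative: the user supplies markup, so pass it through as-is.
        return text[len(RAW_MARKER):]
    # Default alternative: escape &, <, > and map the superscript two to its entity.
    escaped = html.escape(text, quote=False).replace("\u00b2", "&sup2;")
    # Normalise line endings, then force visible line breaks.
    return escaped.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "<br/>")


print(to_html_text("0.25 mm\u00b2\n<red>"))
print(to_html_text("<!wireviz!><b>bold</b>"))
```

The non-HTML output formats would still need their own step to strip the flag and filter out any tags, which this sketch does not cover.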

See also https://github.com/formatc1702/WireViz/pull/168#issuecomment-776115110

kvid, Jan 30 '22 14:01