liblouisutdml

Stray non-breaking space in BRF output

Open rbeezer opened this issue 1 year ago • 3 comments

I'm getting what I think is a stray non-breaking space in BRF output.

  1. I apply file2brf (Version 2.11.0) to an HTML file purpose-built for translation via this method.

  2. HTML contains

<div data-braille="tableofcontents">Contents</div>
  3. Semantic file contains

contentsheader div,data-braille,tableofcontents

  4. Output BRF has

,3t5ts

as the ToC header, where there is a single U+00A0 after the final "s" and before the newline. It is clearly visible in my pager (less) and by other means.
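For anyone who wants to confirm the same thing on their own output, here is a minimal Python sketch that scans raw BRF bytes for U+00A0. It is only an illustration: `find_nbsp` is a hypothetical helper name, and because the report does not say how the character is encoded on disk, the sketch checks for both the single byte 0xA0 (Latin-1) and the pair 0xC2 0xA0 (UTF-8).

```python
def find_nbsp(data: bytes):
    """Return (line_number, byte_offset, form) for each U+00A0 found.

    Hypothetical helper; 'form' says whether the character appeared as
    a lone 0xA0 byte (Latin-1) or as 0xC2 0xA0 (UTF-8).
    """
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        i = 0
        while i < len(line):
            if line[i:i + 2] == b"\xc2\xa0":
                hits.append((lineno, i, "utf-8"))
                i += 2
            elif line[i] == 0xA0:
                hits.append((lineno, i, "latin-1"))
                i += 1
            else:
                i += 1
    return hits


# Made-up sample resembling the ToC header line from the report,
# assuming the stray character is stored as a single 0xA0 byte.
sample = b",3t5ts\xa0\nnext line\n"
print(find_nbsp(sample))
```

To check a real file, pass in the result of `open(path, "rb").read()`.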

I looked through the source but couldn't see where to make a change that I could test and then turn into a pull request.

Thanks for any help you can provide. This is causing me to use an incorrect encoding in a Python program that parses the BRF.

https://github.com/PreTeXtBook/pretext/blob/d402bdb3613d95984708150abe2fdb33123f565a/pretext/pretext.py#L2209
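As a stopgap on the parsing side, one possible approach is to normalize the stray character before decoding, so the rest of the file can be read as plain ASCII. This is a rough sketch and not the code at the link above. `read_brf_text` is a hypothetical name, and replacing U+00A0 with an ordinary space (rather than dropping it) is an assumption about what the output should have contained.

```python
def read_brf_text(data: bytes) -> str:
    """Decode BRF bytes as ASCII after normalizing any U+00A0.

    Hypothetical helper. Handles both the UTF-8 form (0xC2 0xA0) and
    the Latin-1 form (0xA0); the two-byte form must be replaced first
    so its trailing 0xA0 is not matched on its own.
    """
    cleaned = data.replace(b"\xc2\xa0", b" ").replace(b"\xa0", b" ")
    # Any other non-ASCII byte still raises UnicodeDecodeError, which
    # keeps unexpected characters from slipping through unnoticed.
    return cleaned.decode("ascii")


# Made-up sample resembling the ToC header line from the report.
print(repr(read_brf_text(b",3t5ts\xa0\n")))
```

Decoding strictly as ASCII afterwards is deliberate: if some other unexpected byte turns up, the parser fails loudly instead of guessing an encoding.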

rbeezer, Nov 03 '22 19:11