Whitespace handling in `Content::plain_text`
I initially noticed that for code like this:
= My heading \
that goes on two lines!
it shows up in the PDF outline as `My heading  that goes on two lines!`. Note the double space.
I traced this back to `Content::plain_text`. I don't think this output is correct: if you joined the two lines by replacing "\\\n" with "", you would end up with only one space.
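To make that argument concrete, here is a small Python sketch. It only manipulates the heading source as a string; the `observed` line is a toy model of what the outline seems to contain (the line break turned into a space of its own) and is an assumption, not how Typst actually implements `plain_text`.

```python
# The heading source as typed: a trailing space, a backslash line break, then a newline.
source = "My heading \\\nthat goes on two lines!"

# Joining the two lines by deleting the backslash-newline pair keeps only the typed space.
joined = source.replace("\\\n", "")

# Toy model (assumption): the line break is emitted as an additional space.
observed = source.replace("\\\n", " ")

print(repr(joined))    # single space between "heading" and "that"
print(repr(observed))  # two spaces between "heading" and "that"
```

The first result is what I would expect in the outline; the second matches what I actually see.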
I was just looking for another issue regarding content and whitespace and noticed that this one seems to be fixed already in Typst 0.11.0 (2bf9f95d).
No, I can confirm that the issue still occurs with multiple spaces in the outline text.
It does not look like that for me.
Maybe you are using an older version of Typst than I am?
Yeah the initial example was not entirely correct. Here was my test case:
#heading(level: 1, [My heading \
that goes on multiple lines])
That still works for me.
This is the issue. Notice the double space.
This does not explain how you get to that state.
It would be easier with a minimal example that reproduces your issue. As shown above, the information here is still not enough to reproduce it.
I suspect that you have some other interaction going on which might or might not be an actual problem.
The minimal example is as given above. Look at the outline in a PDF viewer capable of showing it. There is a double space in the title.
You can show this in code using PyMuPDF:
>>> import fitz
>>> doc = fitz.open('/tmp/a.pdf') # PDF generated from typst code given above.
>>> toc = doc.get_toc()
>>> print(toc)
[[1, 'My heading  that goes on multiple lines', 1]]
Maybe there is a difference between a linebreak and an actual newline character in PDF? At least I finally understand your issue, but I don't know enough about PDF or Typst to be able to help you there.
In Typst you could try using the newline character directly, written as a Unicode escape:
#outline()
#heading(level: 1, [
My heading \
that goes on multiple lines
])
#heading(level: 1, [
This heading\u{000A}goes on multiple lines
])
Loading the resulting document with fitz code like this:
import fitz
pdf_document = "quick_tests.pdf" # Replace with your actual PDF file path
doc = fitz.open(pdf_document)
page1 = doc.load_page(0)
page1text = page1.get_text()
print("Text from PDF: ", page1text)
toc = doc.get_toc()
print(toc)
gives me output like this:
(.venv) C:\Users\26383\typst_issue>py test.py
Text from PDF: Contents
My heading
that goes on multiple lines ..................................................................................................................................... 1
This heading
goes on multiple lines .............................................................................................................................................. 1
My heading
that goes on multiple lines
This heading
goes on multiple lines
[[1, 'My heading that goes on multiple lines', 1], [1, 'This heading\ngoes on multiple lines', 1]]
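Since a double space is easy to miss when reading the printed list, a small check can flag it. This is only a sketch: `titles_with_repeated_spaces` is a hypothetical helper name, it assumes the `[level, title, page]` entries that `get_toc()` returns, and the sample entries below are illustrative rather than copied from a real run.

```python
import re


def titles_with_repeated_spaces(toc):
    """Return the titles that contain two or more consecutive spaces.

    `toc` is a list of [level, title, page] entries, the shape used by
    PyMuPDF's `get_toc()`.
    """
    return [title for _level, title, _page in toc if re.search(r" {2,}", title)]


# Illustrative entries (assumption): one title with a double space,
# one with a literal newline and single spaces only.
sample_toc = [
    [1, "My heading  that goes on multiple lines", 1],
    [1, "This heading\ngoes on multiple lines", 1],
]

print(titles_with_repeated_spaces(sample_toc))
```

With a real document, passing `doc.get_toc()` instead of `sample_toc` should list only the affected headings.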