IndexError: string index out of range

Open mariohuq opened this issue 4 years ago • 5 comments

When I try to produce a PDF from my HTML file with pandoc, using WeasyPrint as the PDF engine, I get an error.

tema1.html:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="ru-RU" xml:lang="ru-RU">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
</head>
<body>
<pre class="cpp"><code>
//XXX XXX XXXX XXXXXX компилятор :
</code></pre>
</body>
</html>

Command: $ pandoc tema1.html -otema1.pdf --pdf-engine=weasyprint

Output:

[WARNING] This document format requires a nonempty <title> element.
  Defaulting to 'tema1' as the title.
  To specify a title, use 'title' in metadata or --metadata title="...".
WARNING: Ignored `text-rendering: optimizeLegibility` at 18:7, unknown property.
WARNING: Expected a media type, got (max-width: 600px)
WARNING: Invalid media type " (max-width: 600px) " the whole @media rule was ignored at 21:5.
WARNING: Ignored `overflow-x: auto` at 103:7, unknown property.
WARNING: Ignored `text-decoration: inherit` at 142:46, invalid value.
WARNING: Ignored `user-select: none` at 162:32, unknown property.
Traceback (most recent call last):
  File "/home/mariohuq/.local/bin/weasyprint", line 8, in <module>
    sys.exit(main())
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/__main__.py", line 214, in main
    getattr(html, 'write_' + format_)(output, **kwargs)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/__init__.py", line 222, in write_pdf
    self.render(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/__init__.py", line 172, in render
    return Document._render(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/document.py", line 406, in _render
    [Page(page_box, enable_hinting) for page_box in page_boxes],
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/document.py", line 406, in <listcomp>
    [Page(page_box, enable_hinting) for page_box in page_boxes],
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/__init__.py", line 123, in layout_document
    pages = list(make_all_pages(context, root_box, html, pages))
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/pages.py", line 801, in make_all_pages
    page, resume_at = remake_page(i, context, root_box, html)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/pages.py", line 738, in remake_page
    page, resume_at, next_page = make_page(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/pages.py", line 548, in make_page
    root_box, resume_at, next_page, _, _ = block_level_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 58, in block_level_layout
    return block_level_layout_switch(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 72, in block_level_layout_switch
    return block_box_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 126, in block_box_layout
    block_container_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 517, in block_container_layout
    collapsing_through) = block_level_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 58, in block_level_layout
    return block_level_layout_switch(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 72, in block_level_layout_switch
    return block_box_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 126, in block_box_layout
    block_container_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 517, in block_container_layout
    collapsing_through) = block_level_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 58, in block_level_layout
    return block_level_layout_switch(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 72, in block_level_layout_switch
    return block_box_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 126, in block_box_layout
    block_container_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 517, in block_container_layout
    collapsing_through) = block_level_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 58, in block_level_layout
    return block_level_layout_switch(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 72, in block_level_layout_switch
    return block_box_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 126, in block_box_layout
    block_container_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/blocks.py", line 379, in block_container_layout
    for i, (line, resume_at) in enumerate(lines_iterator):
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 48, in iter_line_boxes
    line, resume_at = get_next_linebox(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 103, in get_next_linebox
    last_letter, float_width) = split_inline_box(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 760, in split_inline_box
    split_inline_level(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 618, in split_inline_level
    last_letter, float_widths) = split_inline_box(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 760, in split_inline_box
    split_inline_level(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 623, in split_inline_level
    new_box = atomic_box(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 505, in atomic_box
    box = inline_block_box_layout(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 531, in inline_block_box_layout
    inline_block_width(box, context, containing_block)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/min_max.py", line 17, in wrapper
    result = function(box, *args)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/inlines.py", line 568, in inline_block_width
    box.width = shrink_to_fit(context, box, containing_block.width)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 31, in shrink_to_fit
    min_content_width(context, box, outer=False),
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 48, in min_content_width
    return block_min_content_width(context, box, outer)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 169, in block_min_content_width
    return _block_content_width(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 98, in _block_content_width
    children_widths = [
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 99, in <listcomp>
    function(context, child, outer=True) for child in box.children
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 52, in min_content_width
    return inline_min_content_width(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 194, in inline_min_content_width
    widths = list(widths)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 272, in inline_line_widths
    lines = list(lines)
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/layout/preferred.py", line 298, in inline_line_widths
    text.split_first_line(
  File "/home/mariohuq/.local/lib/python3.8/site-packages/weasyprint/text.py", line 1181, in split_first_line
    if text[len(new_first_line_text)] == soft_hyphen:
IndexError: string index out of range
Error producing PDF.
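
The last frame of the traceback indexes `text` at `len(new_first_line_text)`. Below is a minimal sketch of how that pattern can fail and how a bounds check avoids it, assuming the first line can end up as long as the whole text. The function names are hypothetical; this is not WeasyPrint's actual code or its fix, and it does not explain why the lengths matched in this particular document.

```python
SOFT_HYPHEN = "\u00ad"


def next_char_is_soft_hyphen_unguarded(text, first_line_text):
    # Same shape as the failing expression in the traceback:
    # raises IndexError when first_line_text covers all of text.
    return text[len(first_line_text)] == SOFT_HYPHEN


def next_char_is_soft_hyphen_guarded(text, first_line_text):
    # Check the index before reading the character after the first line.
    index = len(first_line_text)
    return index < len(text) and text[index] == SOFT_HYPHEN


text = "компилятор :"
try:
    next_char_is_soft_hyphen_unguarded(text, text)
except IndexError as error:
    print("unguarded:", error)
print("guarded:", next_char_is_soft_hyphen_guarded(text, text))
```

With the guard, a first line that consumes the entire string simply reports that no soft hyphen follows.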

WeasyPrint version:

$ weasyprint --version
WeasyPrint version 52.5

Pandoc version:

$ pandoc --version
pandoc 2.13
Compiled with pandoc-types 1.22, texmath 0.12.2, skylighting 0.10.5,
citeproc 0.3.0.9, ipynb 0.1.0.1
User data directory: /home/mariohuq/.local/share/pandoc
Copyright (C) 2006-2021 John MacFarlane. Web:  https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

OS version:

$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.2 LTS
Release:	20.04
Codename:	focal

P.S. However, if I run

$ weasyprint tema1.html tema1.pdf

instead, I get no errors.

mariohuq, May 02 '21 03:05

Hello!

Thanks a lot for the bug report. Could you please share the stylesheet used with your document?

liZe, May 05 '21 15:05

> Thanks a lot for the bug report. Could you please share the stylesheet used with your document?

The document has no links to stylesheets. Maybe pandoc modifies the file when it processes it and adds a stylesheet to it.

Can you reproduce this issue without a stylesheet?

mariohuq, May 06 '21 06:05

> Can you reproduce this issue without a stylesheet?

No, I can’t.

I’ll check pandoc’s code to see whether it uses a custom stylesheet.

liZe, May 12 '21 16:05

> I’ll check pandoc’s code to see whether it uses a custom stylesheet.

I couldn’t find anything here: https://github.com/jgm/pandoc/blob/60974538b25657c9aa37e72cc66ca3957912ddec/src/Text/Pandoc/PDF.hs#L418

Maybe the pandoc team can help you debug this issue?
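
The warnings in the report mention CSS properties (`text-rendering`, `overflow-x`, `user-select`) that do not appear in tema1.html, which suggests that the HTML reaching WeasyPrint carries CSS added along the way. One way to check is to scan the intermediate HTML for embedded or linked stylesheets. Here is a rough sketch using only the Python standard library. `StyleCollector` and `find_css` are hypothetical names, and the sample string only stands in for whatever HTML pandoc actually hands over:

```python
from html.parser import HTMLParser


class StyleCollector(HTMLParser):
    """Collect inline <style> contents and linked stylesheet hrefs."""

    def __init__(self):
        super().__init__()
        self._in_style = False
        self.styles = []   # text of each <style> element
        self.linked = []   # href of each <link rel="stylesheet">

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style = True
            self.styles.append("")
        elif tag == "link":
            attributes = dict(attrs)
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.linked.append(attributes.get("href"))

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        if self._in_style:
            self.styles[-1] += data


def find_css(html_text):
    parser = StyleCollector()
    parser.feed(html_text)
    parser.close()
    return parser.styles, parser.linked


# Placeholder input, not pandoc's real output.
sample = """<html><head><style>
pre { overflow-x: auto; }
</style></head><body><pre><code>x</code></pre></body></html>"""
styles, linked = find_css(sample)
print(len(styles), "inline style block(s);", len(linked), "linked stylesheet(s)")
```

Running `find_css` on a standalone HTML file written by pandoc (for example with its `--standalone` option, assuming that output resembles what the PDF pipeline uses) would show whether a stylesheet is being injected and give a file to test with WeasyPrint directly.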

liZe, May 12 '21 16:05

@mariohuq Is there anything more we can do for you? Text handling changed a lot in version 53, and without a way to reproduce the issue, it’s hard for us to know whether it’s fixed.

liZe, Aug 18 '21 16:08