TypeError: can only concatenate list (not "NoneType") to list
Describe the bug
When processing a PDF using marker_single, a TypeError occurs during the line merging process.
Traceback
Traceback (most recent call last):
  File "/Users/xxxxx/.local/bin/marker_single", line 8, in <module>
    sys.exit(convert_single_cli())
             ^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/marker/scripts/convert_single.py", line 35, in convert_single_cli
    rendered = converter(fpath)
               ^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/marker/converters/pdf.py", line 154, in __call__
    document = self.build_document(filepath)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/marker/converters/pdf.py", line 149, in build_document
    processor(document)
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/marker/processors/line_merge.py", line 130, in __call__
    self.merge_lines(lines, block)
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/marker/processors/line_merge.py", line 104, in merge_lines
    line.merge(other_line)
  File "/Users/xxxxx/.local/pipx/venvs/marker-pdf/lib/python3.12/site-packages/marker/schema/text/line.py", line 99, in merge
    self.structure = self.structure + other.structure
                     ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
TypeError: can only concatenate list (not "NoneType") to list
Cause
The error occurs in the merge method of the Line class (marker/schema/text/line.py). The line self.structure = self.structure + other.structure concatenates the two structure attributes directly. The message in the traceback indicates that other.structure was None while self.structure was a list. If self.structure were None instead, the same line would also raise a TypeError, but with a different message.
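The failure is plain Python list behaviour, so it can be reproduced without marker installed. A minimal sketch follows; the variable names and the "span-1" value are made up for illustration and are not marker objects:

```python
# Stand-ins for self.structure and other.structure (illustrative values only).
left = ["span-1"]
right = None

try:
    merged = left + right  # same shape as the failing line in Line.merge
except TypeError as exc:
    message = str(exc)

print(message)
```

The printed message matches the last line of the traceback.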
Proposed Fix
Modify the merge method to handle potential None values by treating them as empty lists before concatenation:
def merge(self, other: "Line"):
    self.polygon = self.polygon.merge([other.polygon])
    # Handle potential None values for structure
    self_structure = self.structure if self.structure is not None else []
    other_structure = other.structure if other.structure is not None else []
    self.structure = self_structure + other_structure
    if self.formats is None:
        self.formats = other.formats
    elif other.formats is not None:
        self.formats = list(set(self.formats + other.formats))
I am not sure whether the fix is acceptable for the original intended purpose of merge.
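To check the behaviour of the proposed change in isolation, here is a small sketch using a stand-in class. FakeLine is hypothetical and only mirrors the two fields the fix touches; the polygon merge is left out because it needs marker's own geometry types:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeLine:
    """Hypothetical stand-in for marker's Line, reduced to structure and formats."""
    structure: Optional[List[str]] = None
    formats: Optional[List[str]] = None

    def merge(self, other: "FakeLine") -> None:
        # Treat a missing structure on either side as an empty list.
        self_structure = self.structure if self.structure is not None else []
        other_structure = other.structure if other.structure is not None else []
        self.structure = self_structure + other_structure
        if self.formats is None:
            self.formats = other.formats
        elif other.formats is not None:
            self.formats = list(set(self.formats + other.formats))


# Case from the traceback: a list on the left, None on the right.
a = FakeLine(structure=["span-1"], formats=["plain"])
a.merge(FakeLine(structure=None, formats=["bold"]))

# Both sides None: the result becomes an empty list rather than None.
c = FakeLine(structure=None)
c.merge(FakeLine(structure=None))
```

One side effect worth noting: after the change, merging two lines that both have structure set to None leaves an empty list instead of None. Whether that matters depends on whether other marker code distinguishes None from an empty structure, which I have not checked.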
Environment (if relevant)
- marker-pdf version: (Please add the version you are using)
- Python version: 3.12
- OS: macOS Sonoma
Additional context
This error was encountered while processing a PDF converted from Microsoft Word. The documents are quite dense with text.
Same issue with some PDFs:
"error": "Marker failed: 2025-04-04 15:45:58.866704: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on.
self.structure = self.structure + other.structure\nTypeError: can only concatenate list (not \"NoneType\") to list\n",
The same issue occurred while parsing this PDF document: https://www.indiabudget.gov.in/doc/eb/allsbe.pdf
@rjrobben @VikParuchuri Is this issue fixed by this commit? https://github.com/VikParuchuri/marker/commit/c6dae45c76b389b10a109eac27fe461cacce913d
Yes, this should be fixed now