[FEAT]: Tables to JSON
At the moment your markdown output renders tables so that they look like a table. OK, that's markdown!
Maybe an LLM can read that if the context size is large, but once you embed the text, the table becomes unreadable.
Would it be possible to replace the markdown for tables with a JSON format? In that case an embedder could read a table very well!
Hi @kalle07 - have you used output_format=json? It should give you a Table block with the nested TableCells when we detect them. You might also consider html since it preserves some of the hierarchical / grouping relationships with <table> tags.
Or did you want everything in markdown but only the table itself in JSON?
Not yet ... I am waiting until everything is available in a nice GUI ... I know that's a lot of work.
And yes, JSON only for tables.
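A minimal sketch of what "markdown everywhere, JSON only for tables" could look like as a post-processing step on the markdown output. This is not part of marker and uses only the Python standard library; `tables_to_json` and `_split_row` are hypothetical helper names made up for illustration. It assumes simple pipe tables with a header row and a separator row, and it does not handle escaped pipes inside cells or merged cells.

```python
import json
import re

# A separator row looks like |---|:---:|---| (dashes with optional colons).
_SEPARATOR = re.compile(r"^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)*\|?\s*$")


def _split_row(line):
    """Split one pipe-table row into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def tables_to_json(markdown):
    """Replace each pipe table in `markdown` with a JSON array of row objects."""
    lines = markdown.splitlines()
    out = []
    i = 0
    while i < len(lines):
        # A table starts with a header row followed directly by a separator row.
        is_header = (
            "|" in lines[i]
            and i + 1 < len(lines)
            and _SEPARATOR.match(lines[i + 1]) is not None
        )
        if not is_header:
            out.append(lines[i])
            i += 1
            continue

        headers = _split_row(lines[i])
        i += 2  # skip the header row and the separator row

        rows = []
        while i < len(lines) and "|" in lines[i]:
            # Pair each cell with its column header: one object per table row.
            rows.append(dict(zip(headers, _split_row(lines[i]))))
            i += 1

        out.append(json.dumps(rows, ensure_ascii=False))
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Intro\n\n| Name | Qty |\n|---|---|\n| apple | 3 |\n| pear | 5 |\n\nOutro"
    print(tables_to_json(sample))
```

Each table row becomes one object keyed by the column headers, so every cell value stays next to its header even if an embedder splits the text into small chunks.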
It would be nice if you described more specifically what gets installed and how many GB have to be downloaded. It is also a bit unclear what exact steps are needed: is "marker_single /path/to/file.pdf" typed into CMD, or into Python, or ???
By the way, I wrote a very smart one myself (multicore, with a GUI) and a raw-docling (OCR-parallel) version ... but my coding abilities are exhausted, I only have more ideas ;)