docx2python
Is there any way to extract the table into markdown format?
I want to extract the tables in a .docx file into markdown format while maintaining each table's position in the document. That rules out python-docx's document.paragraphs and document.tables, which handle paragraphs and tables separately and so destroy the positional relationship between them. docx2python is very easy to use. I would like to know whether docx2python can save tables in markdown format, or whether it can separate tables, images and paragraphs in output.body. Thank you!
I am going to leave this issue open for a bit and think about how this might be seamlessly accomplished. Until then, here’s a script that will identify tables for you.
https://github.com/ShayHill/transpose_docx_tables
As of docx2python v3.0.0, tables are guaranteed to be n x m (n rows by m columns) and are straightforward to identify. See details near the top of the README file. I've also left an example of exporting tables as markdown in the tests folder. It's referenced in the README.
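For anyone who wants a rough idea of the conversion step itself, here is a minimal sketch. It is not the example from the tests folder. It assumes a table has already been pulled out as a list of rows, where each row is a list of cells and each cell is a list of paragraph strings. The names cell_text and table_to_markdown are hypothetical, and treating the first row as the header is also an assumption.

```python
def cell_text(cell):
    """Collapse a cell's paragraphs into one line and escape pipe characters."""
    # Assumption: a cell is a list of paragraph strings; empty ones are skipped.
    text = " ".join(p.strip() for p in cell if p.strip())
    return text.replace("|", "\\|")


def table_to_markdown(table):
    """Render an n x m table (rows -> cells -> paragraphs) as a markdown table."""
    rows = [[cell_text(cell) for cell in row] for row in table]
    if not rows:
        return ""
    # Assumption: the first row is the header row.
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data standing in for one extracted table.
    sample = [
        [["Name"], ["Qty"]],
        [["apple"], ["3"]],
    ]
    print(table_to_markdown(sample))
```

Because the table is rectangular, every row yields the same number of columns as the header, so no padding logic is needed. Writing each converted table back at the spot where it appeared in the body would keep the document order intact.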