docxtractr
:scissors: Extract Tables from Microsoft Word Documents with R
First of all, thanks for the amazing package! I am trying to read the contents of a docx table and am having issues with numbered lists. If I enter the numbers...
Hello, This package has been incredibly helpful. Is there a way to include (or get) page numbers for each table? Or can we read in a particular number of pages and...
First off, thank you for this package, it's really useful. I've run into an interesting scenario where the argument include_text = TRUE fails for a Word document. Here are two...
I've run across a valid DOCX XML structure where the commentStartRange and commentEndRange nodes are not siblings: This is an attempt to allow for accurate comment and anchor text extraction...
It would be much nicer if the `docx_extract_all_cmnts()` function added a `selected_text` column containing the block of selected text that corresponds to each comment. This would allow users to...
Thank you for docxtractr. While reading a .docx file, I have a special symbol (tick mark) within a table. Currently, docxtractr renders it as a null character. Requesting to see...
Thanks a lot for such a great package. I was trying out `docxtractr::read_docx` on `doc` files in `Windows 10` using `LibreOffice Version: 6.2.5.2 (x64)`. It was horribly slow (_due to...
This package is incredibly handy, thanks! I don't know much about XML, but looking at an unzipped docx file, it appears that, if the footer exists, each section of document...
Hello, everyone. When I used the function docx_extract_all_tbls() to extract data from a docx file that was output from SAS, I got the following error: "Error: Must pass in...
Hi there, thanks for the package, very useful! I get the following error when assigning a row as a column name if the scraped Word table has only one column....