Steve Canny

28 issues by Steve Canny

**Summary** Remove double-decoration from EML and MSG. **Additional Context** - These needed to wait until the end because `partition_email()` and `partition_msg()` can use any other partitioner for one of their...

Currently, DOCX content nested in revision marks is skipped when partitioning a .docx file. Add an "accept-all-revisions" step before partitioning to bring the document to the state most likely intended by...

enhancement
docx
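
A rough sketch of what an accept-all-revisions step could look like on raw WordprocessingML, assuming that "accepting" means dropping `w:del` content and unwrapping `w:ins` wrappers. The helper name `accept_all_revisions` is hypothetical and not taken from any library; moves, paragraph-mark revisions, and formatting changes are ignored here.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
INS = f"{{{W}}}ins"
DEL = f"{{{W}}}del"


def accept_all_revisions(elem: ET.Element) -> None:
    """Hypothetical helper: mutate `elem` in place so that tracked
    insertions are kept and tracked deletions are removed."""
    new_children = []
    for child in list(elem):
        if child.tag == DEL:
            # Accepting a deletion removes the deleted content entirely.
            continue
        # Handle revisions nested deeper in the tree first.
        accept_all_revisions(child)
        if child.tag == INS:
            # Accepting an insertion keeps its content without the wrapper.
            new_children.extend(list(child))
        else:
            new_children.append(child)
    elem[:] = new_children
```

Run on a paragraph element before text extraction, this leaves inserted runs as ordinary children of the paragraph, so a partitioner that only looks at `w:r` children would no longer skip them.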

**Summary** A DOCX with a large number of sections combined with a large number of paragraphs triggers an O(N^2) process to determine the paragraphs in each section. This greatly slows...

bug
docx
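
To illustrate the complexity issue in general terms: if each section re-scans the full paragraph list, the cost grows with sections times paragraphs, whereas a single pass can assign every paragraph to its section in linear time. The sketch below assumes a simplified, hypothetical input (each paragraph is a `(text, ends_section)` pair); it is not the library's actual data model or fix.

```python
from typing import Iterable, Iterator


def paragraphs_by_section(
    paragraphs: Iterable[tuple[str, bool]],
) -> Iterator[list[str]]:
    """Group paragraphs into sections in one pass, O(N) overall.

    `ends_section` is a hypothetical flag marking the last paragraph
    of a section (in a real DOCX this would come from a section break).
    """
    current: list[str] = []
    for text, ends_section in paragraphs:
        current.append(text)
        if ends_section:
            yield current
            current = []
    # Paragraphs after the last break belong to the final section.
    if current:
        yield current
```

Each paragraph is visited exactly once, so the work no longer multiplies with the number of sections.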

**Summary** Eliminate historical "idiosyncrasies" of `table.metadata.text_as_html` HTML introduced by `partition_xlsx()`. Produce minified `.text_as_html` consistent with that formed by chunking. **Additional Context** - XLSX `.text_as_html` is minified (no extra whitespace or...

In order to modify an existing presentation to suit a new purpose, as a developer using python-pptx, I need the ability to delete a slide. API perhaps: `slides.remove(slide)`...

slide

HTML content contained in custom HTML tags is currently skipped during partitioning. Enhance the HTML parser to include content in custom HTML tags. **Additional Context** Challenges include determining whether to treat...

enhancement
html