Dean Malmgren

102 comments by Dean Malmgren

That sounds reasonable to me. Thanks!

On Fri, Jul 30, 2021, 16:55 traverseda wrote:
> Some of the libraries textract uses are no longer supporting python3 with newer...

Contributions welcome! Please feel free to put together a PR.

Sorry I didn't get back to you sooner on this. What do you mean by "my table approach"?

I just took a closer look at your branch and I see what you're talking about. I agree that it would be better to have a lookup table of supported...

`shell=True` creates a security vulnerability that was fixed in #114. I do not recommend using `shell=True`, particularly if you have `textract` connected to a web application that others can use....
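To illustrate the concern, here is a minimal sketch of calling an external program with an argument list instead of `shell=True`. It is not textract's actual code: the helper name `run_command` and the hostile filename are made up for this example, and the child process is just the Python interpreter echoing back its first argument.

```python
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments, without a shell.

    Hypothetical helper for illustration; not part of textract.
    """
    # With a list and the default shell=False, each element is handed to the
    # program as one argument, so shell metacharacters are never interpreted.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    return proc.returncode, stdout, stderr


# A filename that would run a second command if it were pasted into a shell string.
hostile_name = "report.pdf; echo INJECTED"

# The child simply prints the argument it received.
code, out, err = run_command(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile_name]
)
print(out.decode().strip())
```

Because no shell is involved, the `;` in the filename stays part of a single argument. With `shell=True` and a command string built from the same filename, the shell would treat everything after the `;` as a separate command, which is the kind of exposure that matters when filenames come from users of a web application.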

Can you confirm that this is still an issue with textract 1.6.1? If so, can you also try to diagnose what the `args` variable is in the `subprocess.Popen` call [here](https://github.com/deanmalmgren/textract/blob/master/textract/parsers/utils.py#L82)?...

This seems like it is connected to not having PDF Miner (the `pdf2txt.py` script) available from `sys.path`. `pip install textract` should install `pdf2txt.py` via [the requirements](https://github.com/deanmalmgren/textract/blob/master/requirements/python) in `/usr/local/bin` or equivalent...
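One quick way to check whether the script can be found is sketched below. It assumes `pdf2txt.py` is located through the `PATH` environment variable when it is run as an external command; the helper name `find_executable` is hypothetical and only wraps the standard library's `shutil.which`.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is on the PATH, otherwise None.

    Hypothetical helper for illustration; not part of textract.
    """
    return shutil.which(name)


path = find_executable("pdf2txt.py")
if path is None:
    print("pdf2txt.py was not found on the PATH")
else:
    print("pdf2txt.py found at", path)
```

If this reports that the script is missing, the directory that `pip` installed it into (for example `/usr/local/bin`) is probably not on the `PATH` of the process running textract.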

I recently started using pyup-bot to track dependency version changes. It looks like https://github.com/chardet/chardet/issues/98 is fixed now in `chardet` and, according to pyup-bot and PR #166, chardet 3.0.4 is passing...

Huh. Filing the issue with chardet will definitely be good. I know that the maintainer was actively working on some major revisions and test cases are helpful. For debugging, I...

You have to edit the textract source code directly using the instructions above. Does that make sense?