David Huggins-Daines
Note also that you still need to disable Webpack's "mocking" of __filename, at the very least, or you will not be able to load your WASM unless it is in...
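As a rough sketch of what disabling that mocking can look like: webpack's `node.__filename` option accepts `false`, which leaves `__filename` untouched. The `disableFilenameMock` helper and the `NodeAwareConfig` type are hypothetical names used only for illustration, and how you get hold of the config object depends on your build setup.

```typescript
// Minimal structural type for the part of a webpack config touched here
// (an assumption for this sketch; a real project would use webpack's own types).
interface NodeAwareConfig {
  node?: false | { __filename?: boolean | string; [key: string]: unknown };
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with `__filename` mocking turned off.
function disableFilenameMock(config: NodeAwareConfig): NodeAwareConfig {
  // `node: false` already disables all of webpack's Node-style mocks, so keep it as-is.
  if (config.node === false) {
    return config;
  }
  return {
    ...config,
    // Preserve any other `node` settings and only override `__filename`.
    node: { ...config.node, __filename: false },
  };
}
```

The helper copies rather than mutates, so it can be applied to a config generated by another tool without side effects.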
> @dhdaines I once managed to get a working solution with setting the Emscripten build to `EXPORT_ES6`, and then simply importing the .js file in my Webpack based (actually Angular)...
Just to follow up on this. The problem is actually *Angular* and its mysterious built-in webpack configuration, which **disables** `module.parser.javascript.url` for some unknown reason! If you re-enable it in a...
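A minimal sketch of re-enabling that option, assuming you have some way to patch the webpack configuration that Angular generates. The `enableUrlParsing` helper and the `PartialWebpackConfig` type are hypothetical names for illustration; `module.parser.javascript.url` itself is the webpack option named above.

```typescript
// Minimal structural type for the parts of a webpack config touched here
// (an assumption for this sketch; a real project would use webpack's own types).
interface PartialWebpackConfig {
  module?: {
    parser?: {
      javascript?: { url?: boolean | "relative"; [key: string]: unknown };
      [key: string]: unknown;
    };
    [key: string]: unknown;
  };
  [key: string]: unknown;
}

// Hypothetical helper: returns a copy of `config` with `new URL(...)` parsing re-enabled.
function enableUrlParsing(config: PartialWebpackConfig): PartialWebpackConfig {
  return {
    ...config,
    module: {
      ...config.module,
      parser: {
        ...config.module?.parser,
        javascript: {
          // Keep any other JavaScript parser options and only flip `url` back on.
          ...config.module?.parser?.javascript,
          url: true,
        },
      },
    },
  };
}
```

Only the `url` flag changes; rules, plugins, and every other parser option pass through untouched.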
Hi! Deleted my previous comment because now I see what's going on here. > 1. Was this intentional? No, but it's not actually a problem, because inline content streams aren't...
> > 2. Have you tested this with PDFs containing inline content streams to ensure they're still processed correctly? > > That, I haven't done! I think there are some...
You can definitely simply take the new implementation (which is not 100% correct but is about 50% faster and will succeed in extracting images without crashing in all circumstances) from...
Hmm! That PDF has another (luckily not fatal) issue: the `startxref` location is incorrect. In the meantime you can also use [PAVÉS](https://github.com/dhdaines/paves): ```python import paves.miner as pm from...
Oh. This is a **documentation** bug: it should be `tool.hatch.build.targets.wheel.hooks.mypyc` (what a mouthful!!) when using `pyproject.toml`. It would be good to make this clearer in the README for dunces like...
Note also that if the guidance is to ASCII-encode inline images, then this comment in section 7.4.1 should also be revised: > ASCII filters serve no useful purpose in a...
Indeed, you're absolutely right - finding the end of an inline image pre-PDF 2.0 isn't so much ambiguous as it is exceedingly complex, for the surprisingly common case where you...