python-poppler
python-poppler copied to clipboard
Building on windows
Ok, so I'm putzing about trying to get this to build on windows
- Install toolchain and dependencies
- You'll need visual studio of the appropriate version for your python distro
- Also Cmake
- And pkgconfig (https://chocolatey.org/packages/pkgconfiglite from chocolatey is nice in that it doesn't need glib)
- I think we could probably conditionally not use pkgconfig on windows, but I really don't feel like debugging cmakefiles today, so I just rolled with it. I'm kind of curious why pkgconfig was used instead of cmake's built-in
find_library()call.
- I think we could probably conditionally not use pkgconfig on windows, but I really don't feel like debugging cmakefiles today, so I just rolled with it. I'm kind of curious why pkgconfig was used instead of cmake's built-in
- Get precompiled poppler from somewhere (it's a giant pain to build). I'm using releases from https://github.com/oschwartz10612/poppler-windows/releases
- These builds seem to have some hard-coded paths in their
*.pcfiles.prefix=D:/bld/poppler_1595515154908/_h_env/Library. I replaced them with relative paths and that seems to have worked:prefix=../../_h_env/Library
- These builds seem to have some hard-coded paths in their
- Add an environment variable
PKG_CONFIG_PATHpointing to thelib\pkgconfigsubdirectory of wherever you unzipped the poppler library - At this point,
python setup.py bdistwill configure successfully, and then try to build.
I'm now at the point where I'm hitting compiler differences:
C:\code\python-poppler\src\cpp\image.cpp(47,27): error C2440: '<function-style-cast>': cannot convert from 'initializer list' to 'pybind11::buffer_info' [C:\code\python-poppler\build\temp.win-amd64-3.8\Release\image.vcxproj]
C:\code\python-poppler\src\cpp\image.cpp(54,5): message : No constructor could take the source type, or constructor overload resolution was ambiguous [C:\code\python-poppler\build\temp.win-amd64-3.8\Release\image.vcxproj]
C:\code\python-poppler\src\cpp\image.cpp(47,16): error C2064: term does not evaluate to a function taking 6 arguments [C:\code\python-poppler\build\temp.win-amd64-3.8\Release\image.vcxproj]
So it looks like it's not hugely difficult to get this to do things on windows.
I went in and added intermediates to figure out exactly what's breaking.
So it looks like it's barfing on this line
Specifically,
{ static_cast<py::ssize_t>(img.height()), static_cast<py::ssize_t>(img.width()), bytes_per_color },
This is passed through as an initializer list to the buffer_info constructor, where the relevant parameter is of type detail::any_container<ssize_t>.
However, looking at the definition of detail::any_container, it doesn't provide any constructors that take 3 arguments. I'm actually not sure how this is supposed to work. I think it's casting to a container type, and then the any_container(const Container &c) constructor, but for some reason MSVC++ isn't seeing that possible type conversion path..
The values are being passed as a single initializer list to the constructor, not to the constructor directly. It still doesn't work, though.
Replacing the contents of image_buffer_info() with the following seems to make things compile, at least:
py::buffer_info image_buffer_info(image &img)
{
py::ssize_t bytes_per_color = img.bytes_per_row() / img.width();
const std::array<ssize_t, 3> shape_in = {static_cast<py::ssize_t>(img.height()), static_cast<py::ssize_t>(img.width()), bytes_per_color};
const std::array<ssize_t, 3> strides_in = {static_cast<py::ssize_t>(img.bytes_per_row()), bytes_per_color, 1L};
py::buffer_info ret = py::buffer_info(
static_cast<void*>(img.data()),
1L,
py::format_descriptor<char>::format(),
3L,
shape_in,
strides_in
);
return ret;
}
(and adding #include <array>)
Ok, I tracked down the broken dependency. First, the poppler DLLs need to be placed in the poppler/cpp/ subdirectory so they're found properly.
Additionally, there's a dependency on libiconv that's missing from the poppler package. Strangely, the loader process was looking for iconv.dll, rather then what it's called everywhere I looked (libiconv.dll).
Grabbing a version from conda-forge (https://anaconda.org/conda-forge/libiconv/files) worked, but I had to rename the DLL before putting it into the poppler/cpp/ subdirectory as well.
Note: Conda-forge seems to also ship compiled binaries for poppler as well (https://anaconda.org/conda-forge/poppler/files). I think these are actually built by the same process as the ones from the github repo above, the paths in the packageconfig files have the same format.

Ha Ha!
I had to manually push some DLLs around (they need to be specified in the appropriate setup.py parameters), but all tests pass except for one that is broken by the windows line endings.
Thank you for this effort! I'm not very familiar with conda, but this is probably the easiest way to get a Windows version of poppler. Will you able to submit a pull request when all problems will be solved?
For the failing test, you could simply add something like:
text = text.replace(os.linesep, "\n"
Will you able to submit a pull request when all problems will be solved?
That would be the plan.
os.linesep
Huh, I didn't realize that was a thing.
Currently, I'm just changing the expected text based on sys.platform
I need to look at python packages I did for work a while ago and try to remember what the nicest way to package DLLs would be.
Note that if you ship a precompiled package (which is probably by far the best option for windows), the only person who'd have to do the dll hijinks would be the person building the package.
If you provide me all the needed steps to build the library, I should be able to handle the packaging. My plan is to add GitHub actions to test the package under Windows, and to upload the binary package to PyPI.
Allright, so currently https://github.com/fake-name/python-poppler/blob/master/setup.py will package everything correctly for windows, provided you manually provide the relevant DLLs.
This is a first attempt, I'm pretty sure I'm doing at least some of the packaging wrong. I'm not totally how setup.py operates when installing, rather then building a package (and it'll probably fail on windows if it's executed as part of installing a egg).
You need a bunch of conda packages for the associated DLLs, and these paths are hard coded for my machine, which is super not great. Ideally, it should only need to be hard-coded for the machine that builds the packages, so you might need to just tweak them, I'm not sure how github actions handle windows.
Also, you could probably use MSYS or similar to cross-compile, but I've got no idea how to go about doing that.
conda has some mechanism for resolving dependencies like this automatically (you can conda install -c conda-forge poppler and it does things), but I don't have a test VM with conda at the moment, and don't want to deal with multiple python installs on my desktop (which is my windows machine). I think if you can get conda into the windows github actions/travisci system, that'd make things a lot easier.
You'll also need to either cherry pick (or I can PR) 85091ae67434cf01b5d0f81000395cb2ee96d737, which fixes MSVC++ not seeing a type-conversion correctly. That was the one build issue that wasn't just missing dependencies I hit.
Ok. Could you open a pull request, with the change needed for image_buffer_info, and with the correction needed for the test? You should also add a line to NEWS.txt. Do not touch the version number for now, I will handle it.
When this will be merged, I will try to find some time to write the configuration to automatically build the windows package on GitHub.
Will do.
It would be useful to have this available as a guide for when someone is on windows. I am mostly using linux (and ruby, but I also use python); getting things compiled on windows is a bit tedious, so any documentation may help (if it works that is; not everyone will find this github issue though).
If you came here like me looking for a solution to install a package under windows. You can use this set of steps to set up your conda environment. Also, I have installed Visual Studio with cpp compiler.
conda create -n poppler_env python
conda activate poppler_env
conda install -c conda-forge poppler
conda install cmake
conda install -c conda-forge pkg-config
pip install python-poppler
After installing poppler in windows 10, python version is 3.8.7 Error of ImportError comes:
from poppler import load_from_file, PageRenderer
File "C:\Users\User\AppData\Local\Programs\Python\Python38\lib\site-packages\poppler_init_.py", line 20, in
You may try to run https://github.com/lucasg/Dependencies on the python poppler pyd to find if a dependency is missing.
Hi @fake-name (and @cbrunet ),
Did anyone had any luck in getting the package to work on Linux? I'm working on a package that would benefit a lot from not having to shell out calls to the poppler binaries (which I install through msys2 but maybe there's a better way..).
It seems that fake-name's fork got things kinda working, but now it's a dozen commits behind so I'm not sure if it would need major changes or not.
Thanks in advance for any pointers!