python-igraph icon indicating copy to clipboard operation
python-igraph copied to clipboard

wasm wheel

Open flange-ipb opened this issue 3 years ago • 48 comments

Hi folks!

I'd like to use the igraph library inside the Pyodide environment. With their latest release, they added the possibility to load binary wheels specifically built for WebAssembly. Having such wheels deployed at PyPI would be fantastic.

I already figured out how cross-compile the igraph C library to wasm using Emscripten. Thanks to the existing cmake infrastructure, this is pretty easy.

  • follow the instructions from WesThorburn's gist
  • run emcmake cmake .. -DIGRAPH_WARNINGS_AS_ERRORS:BOOL=OFF (Emscripten seems to be a bit more pedantic than gcc)

I'm not sure how to proceed now. I guess the C glue code in this repo also needs to be compiled to wasm.

flange-ipb avatar Aug 10 '22 12:08 flange-ipb

It's amazing what you can do in the browser nowadays. I don't have any experience with these tools, but it seems quite sure that you also need to convert the C glue code in this module to wasm. This is just pure guesswork, but how about something like wasmpy-build? It seems to be built for the purpose of compiling Python C extensions to wasm.

Some additional hackery might be required, because setup.py in this repo builds igraph's C core on its own using CMake, so you'll need to convert that part to use emcmake as well.

ntamas avatar Aug 10 '22 12:08 ntamas

I already figured out how cross-compile the igraph C library to wasm using Emscripten. Thanks to the existing cmake infrastructure, this is pretty easy.

Did you try this with version 0.9?

Would you be able to check whether it works without issues for the 0.10 pre-release as well? Here's a tarball.

szhorvat avatar Aug 12 '22 06:08 szhorvat

@szhorvat The cross-compile works for your 0.10 pre-release, for version 0.9.9 and for the current master.

flange-ipb avatar Aug 12 '22 07:08 flange-ipb

No progress from my side yet. I'm collecting a few things:

flange-ipb avatar Aug 12 '22 08:08 flange-ipb

Almost done (I think). I patched setup.py to invoke emcmake and then run pyodide build instead of python setup.py build, which compiles the C extensions with emscripten. pyodide-build 0.21.0 requires Python 3.10, which is why I moved to Docker with the following Dockerfile:

FROM python:3.10.6-bullseye

SHELL ["/bin/bash", "-c"]

COPY setup.py.diff.txt /setup.py.diff

RUN set -ex \
	&& echo "Installing flex and bison" \	
	&& apt-get update \
	&& apt-get -y install flex bison \
	&& echo "Installing Emscripten" \
	&& git clone https://github.com/emscripten-core/emsdk.git \
	&& cd emsdk \
	&& ./emsdk install latest \
	&& ./emsdk activate latest \
	&& source "/emsdk/emsdk_env.sh" \
	&& cd .. \
	&& echo "Cloning igraph repositories" \
	&& git clone https://github.com/igraph/python-igraph.git \
	&& cd python-igraph/vendor/source \
	&& git clone https://github.com/igraph/igraph.git \
	&& cd ../.. \
	&& echo "Applying patch to setup.py" \
	&& git apply < /setup.py.diff \
	&& echo "Installing pyodide-build" \
	&& pip install pyodide-build \
	&& echo "Building igraph wheel" \
	&& pyodide build

Now, while linking of the C extensions I'm facing the following problems:

c++ build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/arpackobject.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/attributes.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/bfsiter.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/common.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/convert.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/dfsiter.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/edgeobject.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/edgeseqobject.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/error.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/filehandle.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/force_cpp_linker.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/graphobject.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/igraphmodule.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/indexing.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/operators.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/pyhelpers.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/random.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/vertexobject.o build/temp.emscripten_3_1_14_wasm32-cpython-310/src/_igraph/vertexseqobject.o vendor/install/igraph/lib/libigraph.a -Lvendor/install/igraph/lib -L/usr/local/lib -L/usr/lib/x86_64-linux-gnu -L/usr/lib -L/lib64 -L/lib -o build/lib.emscripten_3_1_14_wasm32-cpython-310/igraph/_igraph.cpython-310-wasm32-emscripten.so
wasm-ld: error: function signature mismatch: igraph_gml_yylex_destroy
>>> defined as (i32) -> void in vendor/install/igraph/lib/libigraph.a(gml.c.o)
>>> defined as (i32) -> i32 in vendor/install/igraph/lib/libigraph.a(gml-lexer.c.o)

wasm-ld: error: function signature mismatch: s_copy
>>> defined as (i32, i32, i32, i32) -> i32 in vendor/install/igraph/lib/libigraph.a(dneupd.c.o)
>>> defined as (i32, i32, i32, i32) -> void in vendor/install/igraph/lib/libigraph.a(s_copy.c.o)

wasm-ld: error: function signature mismatch: igraph_pajek_yylex_destroy
>>> defined as (i32) -> void in vendor/install/igraph/lib/libigraph.a(pajek.c.o)
>>> defined as (i32) -> i32 in vendor/install/igraph/lib/libigraph.a(pajek-lexer.c.o)

wasm-ld: error: function signature mismatch: s_cat
>>> defined as (i32, i32, i32, i32, i32) -> i32 in vendor/install/igraph/lib/libigraph.a(dormhr.c.o)
>>> defined as (i32, i32, i32, i32, i32) -> void in vendor/install/igraph/lib/libigraph.a(s_cat.c.o)

wasm-ld: error: function signature mismatch: igraph_lgl_yylex_destroy
>>> defined as (i32) -> void in vendor/install/igraph/lib/libigraph.a(lgl.c.o)
>>> defined as (i32) -> i32 in vendor/install/igraph/lib/libigraph.a(lgl-lexer.c.o)

wasm-ld: error: function signature mismatch: igraph_dl_yylex_destroy
>>> defined as (i32) -> void in vendor/install/igraph/lib/libigraph.a(dl.c.o)
>>> defined as (i32) -> i32 in vendor/install/igraph/lib/libigraph.a(dl-lexer.c.o)

wasm-ld: error: function signature mismatch: igraph_ncol_yylex_destroy
>>> defined as (i32) -> void in vendor/install/igraph/lib/libigraph.a(ncol.c.o)
>>> defined as (i32) -> i32 in vendor/install/igraph/lib/libigraph.a(ncol-lexer.c.o)
em++: error: '/emsdk/upstream/bin/wasm-ld @/tmp/emscripten_c7kh4wet.rsp.utf-8' failed (returned 1)
error: command '/tmp/tmpn9kjzxnp/c++' failed with exit code 1

flange-ipb avatar Aug 12 '22 14:08 flange-ipb

The issues about igraph_*_yylex_destroy() are probably easy to fix. We declare these functions on our own in our C files with a void return type, while it seems like bison generates them with an int return type. We can simply change our own declarations in vendor/source/igraph/src/io/gml.c and similar to return int.

As for s_cat and s_copy, they do not seem as straightforward because these come from f2c, but the fix (workaround) is probably similar.

ntamas avatar Aug 12 '22 15:08 ntamas

https://github.com/igraph/igraph/commit/c307fc4089435f95587df7394a56a0f443a804f1 changes the yylex_destroy() functions to return int; this should fix most of the problems above. The remaining two are about s_cat and s_copy; I need to test those more with and without vendored LAPACK, but in the meanwhile you can try this patch:

diff --git a/vendor/lapack/dneupd.c b/vendor/lapack/dneupd.c
index 4dc419ac6..442d9d0b5 100644
--- a/vendor/lapack/dneupd.c
+++ b/vendor/lapack/dneupd.c
@@ -331,7 +331,7 @@ static doublereal c_b71 = -1.;
     /* Builtin functions */
     double pow_dd(doublereal *, doublereal *);
     integer s_cmp(char *, char *, ftnlen, ftnlen);
-    /* Subroutine */ int s_copy(char *, char *, ftnlen, ftnlen);
+    /* Subroutine */ void s_copy(char *, char *, ftnlen, ftnlen);

     /* Local variables */
     integer j, k, ih;
diff --git a/vendor/lapack/dormhr.c b/vendor/lapack/dormhr.c
index 6ff1704d4..d7c170f06 100644
--- a/vendor/lapack/dormhr.c
+++ b/vendor/lapack/dormhr.c
@@ -205,7 +205,7 @@ f">
     char ch__1[2];

     /* Builtin functions
-       Subroutine */ int s_cat(char *, char **, integer *, integer *, ftnlen);
+       Subroutine */ void s_cat(char *, char **, integer *, integer *, ftnlen);

     /* Local variables */
     integer i1, i2, nb, mi, nh, ni, nq, nw;

ntamas avatar Aug 12 '22 15:08 ntamas

This is Bison version dependent, we should not change it arbitrarily.

szhorvat avatar Aug 12 '22 18:08 szhorvat

I don't remember the details but I seemed to recall that we've been here and changed that return type back and forth at least once. This function is generated by Flex. I checked that both the latest released version as well as the ancient version included with macOS use int, so I might be misremembering.

Note that we use these functions with IGRAPH_FINALLY, which assumes a return type of void, but I think this will not cause problems ...

szhorvat avatar Aug 12 '22 18:08 szhorvat

The cross-compile works for your 0.10 pre-release, for version 0.9.9 and for the current master.

Did you provide a custom arith.h when cross-compiling?

Or does this method of compiling produce actually runnable executables?

@flange-ipb I am asking this to decide how much more testing we need to do on cross-compilation.

szhorvat avatar Aug 12 '22 20:08 szhorvat

Flex generates int as the return type, although I have no idea why - a well-behaved "destroy" function shouldn't fail. IGRAPH_FINALLY() is okay with such a function, it will simply ignore the return value. So I recommend keeping things this way.

ntamas avatar Aug 13 '22 17:08 ntamas

The attached patch now makes the compilation work with the most recent master revision of igraph. I have no idea whether the output makes sense.

igraph.patch.zip

ntamas avatar Aug 13 '22 20:08 ntamas

@flange-ipb Do you know why wasm-ld complains about this? Is this really just a warning that's turned into an error by a flag? Or is it a true error?

@ntamas If it is truly a hard requirement that the signatures should match, I worry that there will be problems with the code that casts functions to a different signature before calling them. Specifically, the flex destructors that were changed to an int (as a response to this issue) are cast to a void return type in IGRAPH_FINALLY.

szhorvat avatar Aug 14 '22 06:08 szhorvat

Sanitizers did not reveal anything so I'm pretty convinced that it doesn't cause problems. But, just to be on the safe side, we can wrap them in a wrapper that truly returns void, and put that on the FINALLY stack.

ntamas avatar Aug 14 '22 12:08 ntamas

At least in C89, it is legal to call functions with no prototype at all. So this should be fine, no?

As for the changes to s_copy() and co., the prototype clearly doesn't match, so I guess we can commit this patch to the main repo once @flange-ipb confirms that it's sufficient, right?

szhorvat avatar Aug 14 '22 12:08 szhorvat

Re the changes to s_copy() and s_cat(): these are internal f2c functions (not native Fortran ones), but there are variants floating around where they clearly return an int (even though it's always zero):

https://github.com/juanjosegarciaripoll/f2c/blob/master/lib/s_cat.c

So I dug around a bit more, and it turns out that there's a macro named VOID that you can define when compiling f2c stuff, and that decides whether these functions return void or int. To remain compliant, we need to change the prototypes where the f2c-translated code refer to s_copy() or s_cat() to also use VOID, and then we will be okay. I'll do that.

ntamas avatar Aug 14 '22 12:08 ntamas

The VOID macro is used for the K&R style prototype. Does K&R C even support void? Maybe it doesn't, or some older versions don't, which is the reason why it was done this way?

In C, the default return type is int to this day. It's used when no return type is specified at all.

szhorvat avatar Aug 14 '22 12:08 szhorvat

P.S. We definitely don't want to activate using the ancient K&R style definitions.

szhorvat avatar Aug 14 '22 12:08 szhorvat

The second to last paragraph of this Wikipedia article section confirms that the original K&R C did not support the void return type. Instead, people were free to omit the return type and the compiler assumed it to be int.

This is almost certainly the reason for this setup in f2c, which means that we should go ahead and just change those ints to void, as in your patch. At the same time, we should make a note about this change in case we ever re-translate some sources from Fortran to C in the future. As I remember, there was a README to explain the steps of translating the LAPACK sources.

szhorvat avatar Aug 14 '22 12:08 szhorvat

Committed the prototype changes and updated the Python interface so now both the master and the develop branch should compile in the Docker container as outlined above, without any patches to igraph's source code.

ntamas avatar Aug 14 '22 12:08 ntamas

Amazing! So much activity here. :smiley: Let me proceed slowly.

@szhorvat I repeated my initial build of igraph (develop branch) with

apt-get update
apt-get install flex bison cmake
mkdir build
cd build
emcmake cmake .. -DIGRAPH_WARNINGS_AS_ERRORS:BOOL=OFF
cmake --build .

During build configuration there is a warning:

CMake Warning at vendor/f2c/CMakeLists.txt:21 (message): Cross-compiling with internal ARPACK, BLAS or LAPACK, but F2C_EXTERNAL_ARITH_HEADER was not set. The build is likely to fail. See igraph's installation instructions for more information.

Nevertheless, the build does not fail.

Did you provide a custom arith.h when cross-compiling?

The following arith.h is generated in build/vendor/f2c:

#define IEEE_8087
#define Arith_Kind_ASL 1
#define Double_Align
#define NANCHECK
#define QNaN0 0x0
#define QNaN1 0x7ff80000

I guess it is generated via the following rule in build/vendor/f2c/CMakeFiles/f2c_vendored.dir/build.make:

vendor/f2c/arith.h: vendor/f2c/arithchk.js
        @$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --blue --bold --progress-dir=/igraph/build/CMakeFiles --progress-num=$(CMAKE_PROGRESS_1) "Generating arith.h for f2c..."
        cd /igraph/build/vendor/f2c && /emsdk/node/14.18.2_64bit/bin/node --experimental-wasm-threads /igraph/build/vendor/f2c/arithchk.js > /igraph/build/vendor/f2c/arith.h

Or does this method of compiling produce actually runnable executables?

Do you refer to the executables in build/vendor/f2c? There is arithchk.js and arithchk.wasm.

flange-ipb avatar Aug 15 '22 08:08 flange-ipb

I also ran the tests with cmake --build . --target check: 354 of 491 failed, most of them with a "Permission denied" error. I have the feeling some program is missing in my Docker container. From build/Testing/Temporary/LastTest.log:

5/491 Testing: example::dqueue
5/491 Test: example::dqueue
Command: "/usr/bin/cmake" "-DTEST_EXECUTABLE=/igraph/build/tests/example_dqueue.js" "-DEXPECTED_OUTPUT_FILE=/igraph/examples/simple/dqueue.out" "-DOBSERVED_OUTPUT_FILE=/igraph/build/tests/ex
ample_dqueue.out" "-DDIFF_FILE=/igraph/build/tests/example_dqueue.diff" "-DDIFF_TOOL=/usr/bin/diff" "-DFC_TOOL=" "-DIGRAPH_VERSION=0.10.0-dev+8407c7a2" "-P" "/igraph/etc/cmake/run_legacy_tes
t.cmake"
Directory: /igraph/build/tests
"example::dqueue" start time: Aug 15 08:29 UTC
Output:
----------------------------------------------------------
Test exited abnormally with error: Permission denied
=========================================
CMake Error at /igraph/etc/cmake/run_legacy_test.cmake:45 (message):
  Exiting test.

/usr/bin/diff and /usr/bin/cmake exist, all permissions seem to be ok in the igraph directory. I can execute /emsdk/node/14.18.2_64bit/bin/node /igraph/build/tests/example_dqueue.js, which gives the correct result.

flange-ipb avatar Aug 15 '22 08:08 flange-ipb

Thank you, this answers my question. Unlike during normal cross-compilation, the system is able to run arithchk.js, so it is not necessary to set an arith.h header manually.

This means that we still need to test igraph in a more traditional cross-compilation environment before the 0.10 release.

szhorvat avatar Aug 15 '22 08:08 szhorvat

Tests whose output is verified are executed differently, through run_legacy_test.cmake, which uses execute_process. It is likely trying to execute the .js directly without prefixing it with the node command. arithchk is run in a different way, through add_custom_command, which is somehow made to work correctly by emcmake.

We'll get back to this in time, but there may be a pause in activity here for a week or two from now on.

szhorvat avatar Aug 15 '22 09:08 szhorvat

The key is probably to use CROSSCOMPILING_EMULATOR when using execute_process.

szhorvat avatar Aug 15 '22 09:08 szhorvat

Tests whose output is verified are executed differently, through run_legacy_test.cmake, which uses execute_process. It is likely trying to execute the .js directly without prefixing it with the node command.

Indeed, this is what I also just found. I'm checking Emscripten's documentation ...

flange-ipb avatar Aug 15 '22 09:08 flange-ipb

The information necessary for the detection of Emscripten is not available in run_legacy_test.cmake at the moment. CROSSCOMPILING_EMULATOR is also undefined. What I have in build/CMakeFiles/3.18.4/CMakeSystem.cmake looks interesting:

set(CMAKE_HOST_SYSTEM "Linux-5.10.0-16-amd64")
set(CMAKE_HOST_SYSTEM_NAME "Linux")
set(CMAKE_HOST_SYSTEM_VERSION "5.10.0-16-amd64")
set(CMAKE_HOST_SYSTEM_PROCESSOR "x86_64")

include("/emsdk/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake")

set(CMAKE_SYSTEM "Emscripten-1")
set(CMAKE_SYSTEM_NAME "Emscripten")
set(CMAKE_SYSTEM_VERSION "1")
set(CMAKE_SYSTEM_PROCESSOR "x86")

set(CMAKE_CROSSCOMPILING "TRUE")

set(CMAKE_SYSTEM_LOADED 1)

Emscripten.cmake can be found here. CMAKE_CROSSCOMPILING_EMULATOR becomes the node.js executable.

flange-ipb avatar Aug 15 '22 10:08 flange-ipb

Alright, I'm able to build the wasm wheel with the current master branch of igraph. :smiley: I'll try to include it in my web project now. (We only call the BLISS algorithm for a Graph converted from NetworkX.)

With igraph's develop branch I get a few compilation errors for the C glue code, see the log from my pyodide build run. Correction: The develop branch also builds successfully. I forgot to switch the branch to "develop" in the python-igraph repo.

flange-ipb avatar Aug 15 '22 11:08 flange-ipb

@flange-ipb Could you please try to build C/igraph from the latest develop branch and see if the test suite runs properly now?

szhorvat avatar Aug 15 '22 19:08 szhorvat

Looks better, but 36 tests failed. Here is the log.

If you want to setup an Action CI workflow, then mymindstorm/setup-emsdk might be an option. That's the one suggested by the Pyodide developers. Should be something like

runs-on: ubuntu-latest
  steps:
  - uses: actions/checkout@v3
  - uses: mymindstorm/setup-emsdk@v11
     with:
       version: 3.1.18
  - run: |
      mkdir -p build
      cd build
      emcmake cmake .. -DIGRAPH_WARNINGS_AS_ERRORS:BOOL=OFF
      cmake --build .
      cmake --build . --target check

flange-ipb avatar Aug 16 '22 07:08 flange-ipb