VTKPythonPackage

aarch64 support

Open odidev opened this issue 5 years ago • 50 comments

Installing VTK with pip on an aarch64 machine fails with the following error:

pip3 install vtk       
Collecting vtk
  Could not find a version that satisfies the requirement vtk (from versions: )
No matching distribution found for vtk
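
For context: pip picks wheels by matching the platform tags that the running interpreter supports, and it reports the error above when the index offers nothing installable for those tags. Below is a minimal sketch, using only the standard library, that prints the values involved on a given machine. The function name `describe_platform` is illustrative, not part of pip or VTK.

```python
import platform
import sysconfig


def describe_platform():
    """Collect the values that feed into wheel platform matching."""
    return {
        "machine": platform.machine(),         # CPU architecture, e.g. "aarch64"
        "platform": sysconfig.get_platform(),  # platform string, e.g. "linux-aarch64"
        "python": platform.python_version(),   # interpreter version
    }


if __name__ == "__main__":
    for key, value in describe_platform().items():
        print(f"{key}: {value}")
```

Running this on both the failing aarch64 machine and a working x86_64 machine makes the difference in platform strings easy to see.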

odidev avatar Apr 02 '20 12:04 odidev

Thanks for the report, this is related to the discussion here:

  • https://github.com/scikit-build/cmake-python-distributions/issues/96
  • https://gitlab.kitware.com/vtk/vtk/-/merge_requests/5543

A first step would be to try to build and test VTK on aarch64. Have you had any success doing so?

jcfr avatar Apr 02 '20 12:04 jcfr

@jcfr, I have built the VTK wheel on the aarch64 architecture, but I am not sure how to test it. Can you please help me here?

odidev avatar Apr 03 '20 11:04 odidev

@jcfr, is there anything else required from my side?

odidev avatar Apr 22 '20 14:04 odidev

@odidev if you built the wheel, you can just run python -m pip install /path/to/vtk-xyz.whl to install it.
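
Once the wheel is installed, a quick smoke test is to import the package and print its version. The sketch below is one way to do that; the helper name `smoke_test` is hypothetical, and it returns `None` instead of raising when the module is not installed.

```python
import importlib


def smoke_test(module_name="vtk"):
    """Try to import a module; return it, or None if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        print(f"import of {module_name!r} failed: {exc}")
        return None


if __name__ == "__main__":
    vtk = smoke_test("vtk")
    if vtk is not None:
        # vtkVersion.GetVTKVersion() returns the version as a string.
        print("VTK version:", vtk.vtkVersion.GetVTKVersion())
```

This only shows that the compiled modules load; it says nothing about rendering or numerical correctness.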

@martinken @tjcorona @mathstuf

thewtex avatar Apr 24 '20 22:04 thewtex

I doubt we're going to be able to provide aarch64 wheels as long as we lack hardware for testing and building them regularly.

mathstuf avatar Apr 27 '20 13:04 mathstuf

I have build the VTK wheel on aarch64 architecture but not sure how to test it. Can you please help me here?

is there anything else required from my side.

@odidev This is exciting and thanks for sharing updates and thanks for your patience :pray:

As @mathstuf mentioned, we do not have the infrastructure to test here but we shouldn't let that stop us from having wheels for aarch64. I am sure we could find community members willing to help support this.

Now, would you like to document the steps used to build the VTK wheel?

Also, since the process for building the wheel has been completely revamped (see here), it would be great to try the new steps.

jcfr avatar Apr 27 '20 14:04 jcfr

@jcfr, I went through the link you suggested (https://gitlab.kitware.com/vtk/vtk/-/blob/master/Documentation/dev/build.md#python-wheels). Below are the steps I used to build the VTK aarch64 wheel.

Steps to build aarch64 wheel

  • apt install -y build-essential mesa-common-dev mesa-utils freeglut3-dev ninja-build git cmake

  • git clone --recursive https://gitlab.kitware.com/vtk/vtk.git

  • mkdir build

  • cd build

  • cmake -GNinja -DVTK_WHEEL_BUILD=ON -DVTK_WRAP_PYTHON=ON ../vtk/

  • ninja

  • python setup.py bdist_wheel

Installation/import logs

root@b605f0ff0d7c:~# pip install /build/dist/vtk-9.0.0-cp37-cp37m-linux_aarch64.whl 
Processing /build/dist/vtk-9.0.0-cp37-cp37m-linux_aarch64.whl
Installing collected packages: vtk
Successfully installed vtk-9.0.0
root@b605f0ff0d7c:~# python
Python 3.7.7 (default, Apr 21 2020, 08:59:39) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import vtk
>>> print(vtk.vtkVersion.GetVTKSourceVersion())
vtk version 9.0.0
>>> 
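
The wheel filename in the log above encodes which interpreter, ABI and platform the wheel targets. Below is a minimal sketch that splits a filename into those parts, following the PEP 427 naming convention. The function name `parse_wheel_filename` is illustrative, and the sketch assumes the distribution name itself contains no hyphen.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives 5 or 6 fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    print(parse_wheel_filename("vtk-9.0.0-cp37-cp37m-linux_aarch64.whl"))
```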

odidev avatar Apr 28 '20 06:04 odidev

@jcfr, can you please let me know if I can help with anything else?

odidev avatar Jun 01 '20 05:06 odidev

We don't do aarch64 testing of the codebase, so without that, I'm hesitant to make official releases of builds. Is there some way to mark uploads to PyPI as experimental or the like (not per-version, but per artifact)?

mathstuf avatar Jun 01 '20 16:06 mathstuf

@mathstuf - thanks for this. I work with @odidev on the AArch64 enablement. I think you're asking about marking the aarch64 wheel as experimental on PyPI, which I don't believe is supported.

We may be able to help by adding some automatic testing. As a simple end-to-end case, I guess running something like https://vtk.org/Wiki/VTK/Examples/Python/Cylinder with xvfb and comparing the result to a known good rendering would be a start. Would that provide sufficient confidence to move forward with an AArch64 wheel?
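
To make the "compare against a known good rendering" idea concrete, here is a minimal sketch of a pixel comparison over raw byte buffers. It is not the image-difference metric VTK's own regression tests use; the function names and the threshold value are hypothetical and chosen only for illustration.

```python
def mean_abs_difference(pixels_a, pixels_b):
    """Average absolute per-byte difference between two equal-sized buffers."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images differ in size")
    if not pixels_a:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / len(pixels_a)


def images_match(pixels_a, pixels_b, threshold=1.0):
    """Treat two renderings as equal if they differ by at most `threshold` on average."""
    return mean_abs_difference(pixels_a, pixels_b) <= threshold


if __name__ == "__main__":
    baseline = bytes([10, 20, 30, 40])   # stand-in for the known good image
    candidate = bytes([10, 20, 30, 200])  # stand-in for the aarch64 rendering
    print(mean_abs_difference(baseline, candidate))
    print(images_match(baseline, candidate))
```

A small tolerance rather than exact equality is the usual choice here, since software rasterizers on different architectures can differ by a few intensity levels.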

rhenwood-arm avatar Jun 08 '20 14:06 rhenwood-arm

I'd like a full test suite run on aarch64 (with Python enabled). This would be a normal (non-wheel) build of VTK, since our wheel build configuration doesn't work with the test suite (due to the library relocations). You should be able to submit a run to our CDash with ctest -D Experimental once VTK is configured. You can also run the steps individually, but the CTest docs are better for that. A CTest script may also be more reproducible if it isn't just a simple run.

mathstuf avatar Jun 08 '20 15:06 mathstuf

Thanks for this @mathstuf, I'll work with @odidev to look into this.

rhenwood-arm avatar Jun 08 '20 16:06 rhenwood-arm

Hi @mathstuf, I tried running the tests, but it looks like they are not getting picked up. Below are the steps I followed. Can you please point out anything I am missing?

Steps to build vtk

  • apt install -y build-essential mesa-common-dev mesa-utils freeglut3-dev ninja-build git cmake
  • git clone --recursive https://gitlab.kitware.com/vtk/vtk.git
  • mkdir build
  • cd build
  • cmake -GNinja -DVTK_WRAP_PYTHON=ON ../vtk/ # Non-wheel, python enabled build
  • ninja

Running ctest

root@fc2408cb3eaa:~/build# ctest -D Experimental
   Site: fc2408cb3eaa
   Build name: Linux-c++
Create new tag: 20200609-1139 - Experimental
Configure project
   Each . represents 1024 bytes of output
    . Size of output: 0K
Build project
   Each symbol represents 1024 bytes of output.
   '!' represents an error and '*' a warning.
    . Size of output: 0K
   0 Compiler errors
   0 Compiler warnings
Test project /root/build
No tests were found!!!
Performing coverage
 Cannot find any coverage files. Ignoring Coverage request.
Submit files
   SubmitURL: http://open.cdash.org/submit.php?project=VTK
   Uploaded: /root/build/Testing/20200609-1139/Configure.xml
   Uploaded: /root/build/Testing/20200609-1139/Build.xml
   Uploaded: /root/build/Testing/20200609-1139/Test.xml
   Uploaded: /root/build/Testing/20200609-1139/Done.xml
   Submission successful
root@fc2408cb3eaa:~/build#

odidev avatar Jun 09 '20 12:06 odidev

The tests are off by default. Should be VTK_BUILD_TESTING=WANT.

mathstuf avatar Jun 09 '20 12:06 mathstuf

OK. Let me try that. Thanks for the quick response.

odidev avatar Jun 09 '20 12:06 odidev

@mathstuf, I tried configuring VTK with the command "cmake -GNinja VTK_BUILD_TESTING=WANT -DVTK_WRAP_PYTHON=ON ../vtk/", but that did not help either. The logs are below:

root@fc2408cb3eaa:~/build# ctest -D Experimental
   Site: fc2408cb3eaa
   Build name: Linux-c++
Create new tag: 20200609-1320 - Experimental
Configure project
   Each . represents 1024 bytes of output
    . Size of output: 0K
Build project
   Each symbol represents 1024 bytes of output.
   '!' represents an error and '*' a warning.
    ... Size of output: 3K
   0 Compiler errors
   0 Compiler warnings
Test project /root/build
No tests were found!!!
Performing coverage
 Cannot find any coverage files. Ignoring Coverage request.
Submit files
   SubmitURL: http://open.cdash.org/submit.php?project=VTK
   Uploaded: /root/build/Testing/20200609-1320/Configure.xml
   Uploaded: /root/build/Testing/20200609-1320/Build.xml
   Uploaded: /root/build/Testing/20200609-1320/Test.xml
   Uploaded: /root/build/Testing/20200609-1320/Done.xml
   Submission successful
root@fc2408cb3eaa:~/build#

odidev avatar Jun 09 '20 13:06 odidev

@mathstuf, can you please suggest what needs to be done here?

odidev avatar Jul 07 '20 14:07 odidev

For reference, build options for VTK are documented here: https://github.com/Kitware/VTK/blob/master/Documentation/dev/build.md#build-settings

Instead of running ctest -D Experimental, I suggest you try to directly use ctest to run the test. What do you get doing the following:

git clone --recursive https://gitlab.kitware.com/vtk/vtk.git
mkdir build
cd build
cmake -GNinja -DVTK_WRAP_PYTHON=ON -DVTK_BUILD_TESTING=WANT ../vtk/ 
ninja
ctest -N

Running ctest -N should list all tests that have been enabled. See https://cmake.org/cmake/help/latest/manual/ctest.1.html#options
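
If the listing is long, the output of ctest -N can be summarized with a few lines of Python. The sketch below parses text in the shape ctest -N prints (one "Test #N: name" line per test); the sample text and test names are made up for illustration, and `list_tests` is a hypothetical helper name.

```python
import re

# Matches lines such as "  Test   #12: SomeTestName".
TEST_LINE = re.compile(r"^\s*Test\s+#(\d+):\s+(.+?)\s*$")


def list_tests(ctest_output):
    """Return the test names found in the text printed by `ctest -N`."""
    names = []
    for line in ctest_output.splitlines():
        match = TEST_LINE.match(line)
        if match:
            names.append(match.group(2))
    return names


if __name__ == "__main__":
    sample = """Test project /root/build
  Test #1: ExampleTestA
  Test #2: ExampleTestB

Total Tests: 2
"""
    names = list_tests(sample)
    print(len(names), "tests:", names)
```

An empty list here corresponds to the "No tests were found!!!" situation in the earlier logs.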

jcfr avatar Jul 07 '20 15:07 jcfr

cmake -GNinja VTK_BUILD_TESTING=WANT

It looks like you missed -D on the testing argument.
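
This kind of slip is easy to miss, because without -D the token is not read as a cache entry. Below is a small sketch that flags NAME=VALUE tokens lacking the -D prefix in a cmake command line. The helper name `find_bare_cache_entries` is hypothetical and the check is a heuristic, not something cmake provides.

```python
import shlex


def find_bare_cache_entries(command):
    """Return tokens that look like NAME=VALUE but do not start with an option dash."""
    suspects = []
    for token in shlex.split(command)[1:]:  # skip the program name
        if "=" in token and not token.startswith("-"):
            suspects.append(token)
    return suspects


if __name__ == "__main__":
    bad = "cmake -GNinja VTK_BUILD_TESTING=WANT -DVTK_WRAP_PYTHON=ON ../vtk/"
    good = "cmake -GNinja -DVTK_BUILD_TESTING=WANT -DVTK_WRAP_PYTHON=ON ../vtk/"
    print(find_bare_cache_entries(bad))
    print(find_bare_cache_entries(good))
```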

mathstuf avatar Jul 07 '20 15:07 mathstuf

@mathstuf Thanks for your support. I am able to run the test cases on an x86 machine, but a few of them are failing. I looked into possible reasons for the failures, and I think they are caused by some system dependency. Could you please look at the logs and point out possible reasons for the failures? I am attaching the test logs for your reference. VTK_Text_logs.txt

odidev avatar Sep 01 '20 07:09 odidev

(I'm working with @odidev). @odidev - I guess the good news is that all the failures look the same: 'numerical' :)

Can you pick a single failing test and put together a complete example showing how to reproduce that failure? Also, can you please produce more verbose logging output, as this may reveal the source of the problem?

rhenwood-arm avatar Sep 01 '20 21:09 rhenwood-arm

I have attached the test file TestConeLayoutStrategy.txt which looks to be a minimal program to reproduce the issue.

odidev avatar Nov 05 '20 12:11 odidev

I have run an example program that displays the cylinder on both the aarch64 and x86_64 architectures. Here is the test program: test.txt

Please see the outputs on both architectures below.

On aarch64: [image: aarch64]

On x86_64: [image: x86_64]

odidev avatar Jan 20 '21 06:01 odidev

@mathstuf, I ran a single test case with debug logs enabled using the command ctest -R TestGPURayCastSlicePlane --debug 2>&1 | tee debug-log-aarch64.txt. The debug log file is attached for your reference. Could you please help me debug this test failure? Thanks in advance.

odidev avatar Feb 10 '21 07:02 odidev

I have no idea how to figure out rendering/OpenGL failures, sorry. @martinken?

mathstuf avatar Feb 10 '21 15:02 mathstuf

@mathstuf

Of the many issues in testing VTK on Linux/ARM64 machines, the most prominent seems to be an import error when importing the ‘vtkCommonCore’ module. The error logs are below:

/home/ubuntu/cmake-3.20.0-rc3/Source/CTest/cmCTestRunTest.cxx:41 24:   File "/home/ubuntu/01_VTK_Project/build/lib/python3.8/site-packages/vtkmodules/__init__.py", line 53, in <module>
/home/ubuntu/cmake-3.20.0-rc3/Source/CTest/cmCTestRunTest.cxx:41 24:     from . import vtkCommonCore
/home/ubuntu/cmake-3.20.0-rc3/Source/CTest/cmCTestRunTest.cxx:41 24: ImportError: /home/ubuntu/01_VTK_Project/build/lib/python3.8/site-packages/vtkmodules/vtkCommonCore.cpython-38-aarch64-linux-gnu.so: undefined symbol: _Py_NotImplementedStruct
/home/ubuntu/cmake-3.20.0-rc3/Source/CTest/cmCTestRunTest.cxx:41 24: During handling of the above exception, another exception occurred:
/home/ubuntu/cmake-3.20.0-rc3/Source/CTest/cmCTestRunTest.cxx:41 24:   File "/home/ubuntu/01_VTK_Project/vtk/Utilities/vtkTclTest2Py/rtImageTest.py", line 12, in <module>
/home/ubuntu/cmake-3.20.0-rc3/Source/CTest/cmCTestRunTest.cxx:41 24:     import vtk

Above logs are the result of following test command:

ctest  -R PythonContext2DPython-testPythonItem --debug

The above error logs are the same in all the test cases, which shows the import error.

According to the logs, ‘vtkCommonCore’ fails to import because of an undefined symbol error for ‘_Py_NotImplementedStruct’.

I have tried following resolutions to fix this error:

  • I have verified that the vtk modules are successfully generated and are present at “/home/ubuntu/01_VTK_Project/build/lib/python3.8/site-packages/vtkmodules”.
  • I have updated cmake to the latest version (cmake-3.20.0-rc3) and recompiled VTK with the new cmake, but encountered the same error during testing.
  • I have verified that Python (v3.8) is installed for 64-bit ARM.
  • I have exported the environment variable ‘PYTHONPATH’ with “/home/ubuntu/01_VTK_Project/build/lib/vtk”, but that has not helped.
  • I have tried setting the LD_LIBRARY_PATH environment variable in the “/home/ubuntu/01_VTK_Project/build/lib/python3.8/site-packages/vtkmodules/__init__.py” file by adding the following code to the file:
import os
modules_rough = "/home/ubuntu/01_VTK_Project/build/lib/python3.8/site-packages/vtkmodules"
os.environ['LD_LIBRARY_PATH'] = modules_rough
try:
    from . import vtkCommonCore
except ImportError:
    import _vtkmodules_static

But the error remains the same.
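
One likely reason the in-process assignment has no effect: on Linux the dynamic loader reads LD_LIBRARY_PATH when the process starts, so changing os.environ afterwards does not change where the already running interpreter looks for shared libraries. The variable has to be set before Python is launched. Below is a minimal sketch that starts a child interpreter with the variable set; the helper name `run_with_library_path` and the directory used in the example are hypothetical.

```python
import os
import subprocess
import sys


def run_with_library_path(lib_dir, code):
    """Run `code` in a fresh interpreter whose LD_LIBRARY_PATH starts with `lib_dir`."""
    env = dict(os.environ)
    existing = env.get("LD_LIBRARY_PATH")
    # Prepend so the given directory is searched first.
    env["LD_LIBRARY_PATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = run_with_library_path(
        "/tmp/hypothetical-libs",
        "import os; print(os.environ['LD_LIBRARY_PATH'])",
    )
    print(result.stdout.strip())
```

Note that this only addresses library search paths. An undefined symbol such as the one above can also mean the symbol is simply not exported by anything that gets loaded, in which case no search path will help.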

Can you please provide me with some pointers to this import issue during VTK testing on Linux/ARM64 machines? Once this error gets resolved, I will look into the rendering issue in other test cases.

odidev avatar Mar 05 '21 11:03 odidev

That's odd. Can you try with -DVTK_PYTHON_OPTIONAL_LINK=OFF?

Though it is used in the generated code and is in the Python headers. Is it not in libpython?
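
The libpython question can be probed from Python itself with ctypes. The sketch below checks whether a symbol resolves through `ctypes.pythonapi`, which exposes the C API of the running interpreter. It tests the interpreter process rather than a particular libpython file on disk, and `symbol_visible` is a hypothetical helper name.

```python
import ctypes


def symbol_visible(name):
    """Return True if `name` resolves in the running interpreter's C API."""
    # ctypes raises AttributeError on lookup failure, which hasattr turns into False.
    return hasattr(ctypes.pythonapi, name)


if __name__ == "__main__":
    for name in ("_Py_NotImplementedStruct", "Py_IsInitialized"):
        print(name, symbol_visible(name))
```

If the first name prints False on the failing machine while the second prints True, that would point at how that Python was built or linked rather than at VTK.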

mathstuf avatar Mar 05 '21 14:03 mathstuf

@mathstuf Thank you for the suggestion. However, passing the -DVTK_PYTHON_OPTIONAL_LINK=OFF flag to cmake led to a "dangerous relocation: unsupported relocation" error during the ninja step.

I suspected that the Python import issue was caused by some difference in system configuration, so I moved to another ARM64 server and built and tested VTK there. The results are much better than before, and there are no Python import issues.

About 95% of the tests pass now; only 112 of 2156 tests fail.

  • Of the 112 tests still failing on the ARM64 server machine (accessed via MobaXterm), the majority (92) fail because of a glBlitFramebuffer operation. Below are the exact error logs from running a single test in debug mode:
/io/src/Source/CTest/cmCTestRunTest.cxx:41 706: (  30.212s) [main thread     ]vtkOpenGLFramebufferObj:1390  WARN| failed at glBlitFramebuffer 1 OpenGL errors detected
/io/src/Source/CTest/cmCTestRunTest.cxx:41 706:   0 : (1282) Invalid operation
…
…
…
Failed Image Test ( TestSeedWidget2.png ) : 487.931

This error comes from this line in the VTK project. According to the documentation here, it seems that the glBlitFramebuffer function copies the block of pixels from one texture buffer object to another. However, I don’t have a Graphics card/GPU attached to my server machine. I was suspecting that the test fails to copy the pixels from objects, as there is no GPU available.

I ran the test suite again on an AMD64 server machine (also accessed via MobaXterm) and found that, of the 92 tests failing on ARM64, only 50 showed the same issue there. On a local AMD64 desktop machine, just 16 of those 92 tests failed. Hence, I am doubtful whether this issue is related to the GPU or whether other complications are involved. Also, certain tests fail with the above error only on the x64 server machine but pass on the ARM64 server machine.

  • Another common issue seen on ARM64 test failure results is the hardware limitation issue, as mentioned below:
/io/src/Source/CTest/cmCTestRunTest.cxx:41 705: (  12.898s) [main thread     ]   vtkTextureObject.cxx:1025   ERR| vtkTextureObject (0xaaab19090980): Attempt to use a texture buffer exceeding your hardware's limits. This can happen when trying to color by cell data with a large dataset. Hardware limit is 65536 values while 146404 was requested.

The error logs come from this line in VTK project. vtkglGetIntegerv(GL_MAX_TEXTURE_BUFFER_SIZE, &maxSize) function is used to fetch the maximum texture buffer size and place it in maxSize variable. As I do not have a GPU, I am suspecting that 65536 is taken as a default buffer size, and is compared to the requested numValues variable. As 65536 is smaller than the requested value, the check fails and hence the test fails. 11 tests are failing on ARM64 server machine due to this issue. However these tests are passing on the local AMD64 desktop machine.

  • The few remaining tests fail because of image differences between the valid and test images, assertion failures, etc.
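
The figures above can be tallied in a few lines. The sketch below recomputes the pass rate and the number of failures not covered by the two named categories, assuming those categories (92 glBlitFramebuffer failures and 11 texture-buffer-limit failures) do not overlap. The function name `summarize` is illustrative.

```python
def summarize(total, failed, *categorized):
    """Summarize a test run from the total, the failure count and per-category counts."""
    passed = total - failed
    return {
        "passed": passed,
        "pass_rate_percent": round(100 * passed / total, 1),
        "uncategorized_failures": failed - sum(categorized),
    }


if __name__ == "__main__":
    # 2156 tests, 112 failures, of which 92 and 11 fall into the two named categories.
    print(summarize(2156, 112, 92, 11))
```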

As the test results vary across machines, can you please provide some pointers and check whether these issues are related to the lack of a GPU? Is there an alternative way to get these tests to pass? This would help me make progress on this activity.

Also, could you please let me know whether any of the above test failures can be ignored?

Please let me know if you need any further information from me.

odidev avatar Mar 23 '21 14:03 odidev

@martinken Any thoughts on the OpenGL failures/differences here?

mathstuf avatar Mar 23 '21 14:03 mathstuf

The texture buffer limit is a mesa issue and we have an MR in to fix it, so feel free to ignore those. Haven't had a chance to look at the others.
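
For readers following along, the check behind that error boils down to comparing the requested number of values with the limit the driver reports. The sketch below mirrors that comparison using the numbers from the log; it is not the VTK source, and `check_texture_buffer` is a hypothetical name.

```python
def check_texture_buffer(requested_values, hardware_limit):
    """Return an error message if the request exceeds the limit, else None."""
    if requested_values > hardware_limit:
        return (
            f"Hardware limit is {hardware_limit} values "
            f"while {requested_values} was requested."
        )
    return None


if __name__ == "__main__":
    # Values taken from the ARM64 error log above.
    print(check_texture_buffer(146404, 65536))
```

As far as I know, 65536 is also the minimum value the OpenGL specification requires for GL_MAX_TEXTURE_BUFFER_SIZE, which would be consistent with a software renderer reporting only that minimum.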

martinken avatar Mar 23 '21 20:03 martinken