RTNeural
RTNEURAL_DEFAULT_ALIGNMENT=8 on armv7, EIGEN backend
Hi, I'm using RTNeural at commit 4a540403e115bae18d29142a5f54e7c3598b6e51, built in an emulated ARM container as follows:
docker buildx create --name mybuilder
docker buildx use mybuilder
docker run -it --rm --privileged tonistiigi/binfmt --install all # Install all qemu emulators
docker run --rm -it -u $UID --platform linux/arm -v "$(pwd):/workdir" debian:buster-slim bash
apt-get update && apt-get install -y build-essential cmake
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX="" -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON -DBUILD_BENCH=ON ../
make install DESTDIR="/workdir"
For this environment, cmake/SIMDExtensions.cmake sets RTNEURAL_DEFAULT_ALIGNMENT=16. During the build, I see a lot of warnings like:
warning: requested alignment 16 is larger than 8
These warnings come from compiling the Eigen backend.
So far so good: I'm able to run both the dynamic and the templated implementations:
./rtneural_layer_bench lstm 10 1 12
Benchmarking lstm layer, with input size 1 and output size 12, with signal length 10 seconds
Processed 10 seconds of signal in 6.04064 seconds
1.65545x real-time
Testing templated implementation...
Processed 10 seconds of signal in 4.94528 seconds
2.02213x real-time
Templated layer is 1.2215x faster!
Next, I temporarily edited cmake/SIMDExtensions.cmake to set RTNEURAL_DEFAULT_ALIGNMENT=8 and recompiled. The build finishes without errors, but running the benchmark fails:
./rtneural_layer_bench lstm 10 1 12
Benchmarking lstm layer, with input size 1 and output size 12, with signal length 10 seconds
Processed 10 seconds of signal in 6.02594 seconds
1.65949x real-time
Testing templated implementation...
rtneural_layer_bench: /workdir/modules/RTNeural/modules/Eigen/Eigen/src/Core/MapBase.h:201: void Eigen::MapBase<Derived, 0>::checkSanity(typename Eigen::internal::enable_if<(Eigen::internal::traits<OtherDerived>::Alignment > 0), void*>::type) const [with T = Eigen::Map<Eigen::Matrix<float, 12, 1, 0, 12, 1>, 16, Eigen::Stride<0, 0> >; Derived = Eigen::Map<Eigen::Matrix<float, 12, 1, 0, 12, 1>, 16, Eigen::Stride<0, 0> >; typename Eigen::internal::enable_if<(Eigen::internal::traits<OtherDerived>::Alignment > 0), void*>::type = void*]: Assertion `( ((internal::UIntPtr(m_data) % internal::traits<Derived>::Alignment) == 0) || (cols() * rows() * minInnerStride * sizeof(Scalar)) < internal::traits<Derived>::Alignment ) && "data is not aligned"' failed.
Aborted (core dumped)
I would conclude that RTNEURAL_DEFAULT_ALIGNMENT=8 is not supported by the Eigen backend: the Eigen::Map in the failed assertion still requests 16-byte alignment. However, 8 should be the correct alignment for a 32-bit processor.
NOTES: I'm not able to report on the XSIMD backend in this same environment yet, since I'm hitting several compilation errors there (WIP). Compiler versions reported by CMake:
-- The C compiler identification is GNU 8.3.0
-- The CXX compiler identification is GNU 8.3.0