manif
manif copied to clipboard
faster implementation of SO3 random
Following my comment on sophus I suspected that Marsaglia's rejection sampling is faster than the more elegant method that is currently being used.
So I implemented this method.
Below are some benchmarks. Please check out my branch and test on your machine and see if your experience matches mine. It seems we can get about 20% more performance.
I am confused why the same code for generating a random quaternion is duplicated in SE3_base.h
and in SO3_base.h
. If anyone has ideas for refactoring it so that it is no longer duplicated, I will be happy to hear.
Benchmarks:
Benchmark code:
#include <Eigen/Dense>
#include <chrono>
#include <fstream>
#include <iostream>
#include <vector>
#include "manif/SO3.h"
int main() {
std::ofstream fout("so3_randomization.txt");
const auto tic = std::chrono::steady_clock::now();
const size_t n = 20'000'000;
std::vector<manif::SO3d, Eigen::aligned_allocator<manif::SO3d>> stuff;
stuff.reserve(n);
for (size_t i = 0; i < n; i++) {
stuff.push_back(manif::SO3d::Random());
}
const auto toc = std::chrono::steady_clock::now();
std::cout << "Time elapsed: "
<< std::chrono::duration<double>(toc - tic).count() << " s"
<< std::endl;
for (const auto &zxcv : stuff) {
fout << zxcv.w() << ", " << zxcv.x() << ", " << zxcv.y() << ", "
<< zxcv.z() << std::endl;
}
return 0;
}
compiler | manif branch | result (seconds) |
---|---|---|
clang++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp |
devel | 1.62653 |
clang++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp |
dllu/faster-so3-random | 1.0261 |
g++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp |
devel | 1.46156 |
g++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp |
dllu/faster-so3-random | 1.22057 |
Compilers used:
manif-test » clang++ --version
clang version 10.0.0 (/startdir/llvm-project-git bfb5b0cb86cf90d9fa794f873644aa642b652c43)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
manif-test » g++ --version
g++ (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Test System
- AMD Ryzen 9 3900X 12-Core Processor
- Arch Linux with kernel version Linux 5.2.0-rc1-amd-staging-drm-next-git-b8cd95e15410+
Randomness tests
I used a variant of qqiu's MATLAB script.
clear; clc; close all;
qua = dlmread('so3_randomization.txt')(1:200000,:);
axang_arr = quat2axang(qua);
figure(1)
plot3(axang_arr(:,1), axang_arr(:,2), axang_arr(:,3), '.');
figure(2)
hist(axang_arr(:,4),100);
qua = dlmread('so3_randomization_new.txt')(1:200000,:);
axang_arr = quat2axang(qua);
figure(3)
plot3(axang_arr(:,1), axang_arr(:,2), axang_arr(:,3), '.');
figure(4)
hist(axang_arr(:,4),100);
Because I am using a different quat2axang
implementation, mine outputs angle in [0, 2pi] rather than [-pi, pi] like @qqfly 's. But it does not make a difference.
Top: devel branch; bottom: my branch. Results look identical.
Codecov Report
Merging #105 into devel will decrease coverage by
0.15%
. The diff coverage is93.75%
.
@@ Coverage Diff @@
## devel #105 +/- ##
==========================================
- Coverage 98.04% 97.89% -0.16%
==========================================
Files 32 32
Lines 1025 1044 +19
==========================================
+ Hits 1005 1022 +17
- Misses 20 22 +2
Hi @dllu,
Thanks for the interesting PR.
First, concerning the code duplication, you could make a standalone random_quaternion
function (or such) in the manif/impl/utils.h
file.
Although I'm not sure a faster random quaternion generator is something critical, the plots looks good and the gain in speed is substantial. I don't see any reason not to merge it after the little refactoring suggested above.
@joansola, thoughts?
Hey, sorry I am a bit late for this one.
Thoughts: yes, faster is always good.
Also, the plots show a perfect match of performance. Mind that it is important to not fall back again to the issues discussed in #68 . So OK also in this regard.
For me it is OK to merge.