faster implementation of SO3 random

Open dllu opened this issue 4 years ago • 3 comments

Following my comment on sophus I suspected that Marsaglia's rejection sampling is faster than the more elegant method that is currently being used.

So I implemented this method.

Below are some benchmarks. Please check out my branch and test on your machine and see if your experience matches mine. It seems we can get about 20% more performance.

I am confused why the same code for generating a random quaternion is duplicated in SE3_base.h and in SO3_base.h. If anyone has ideas for refactoring it so that it is no longer duplicated, I will be happy to hear.

Benchmarks:

Benchmark code:

#include <Eigen/Dense>
#include <chrono>
#include <fstream>
#include <iostream>
#include <vector>

#include "manif/SO3.h"

int main() {
    std::ofstream fout("so3_randomization.txt");
    const auto tic = std::chrono::steady_clock::now();
    const size_t n = 20'000'000;
    std::vector<manif::SO3d, Eigen::aligned_allocator<manif::SO3d>> stuff;
    stuff.reserve(n);
    for (size_t i = 0; i < n; i++) {
        stuff.push_back(manif::SO3d::Random());
    }
    const auto toc = std::chrono::steady_clock::now();
    std::cout << "Time elapsed: "
              << std::chrono::duration<double>(toc - tic).count() << " s"
              << std::endl;

    for (const auto &zxcv : stuff) {
        fout << zxcv.w() << ", " << zxcv.x() << ", " << zxcv.y() << ", "
             << zxcv.z() << std::endl;
    }
    return 0;
}

compiler	manif branch	result (seconds)
`clang++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp`	devel	1.62653
`clang++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp`	dllu/faster-so3-random	1.0261
`g++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp`	devel	1.46156
`g++ -O3 -I../manif/include -I/usr/include/eigen3 -I../manif/external/lt -o test test.cpp`	dllu/faster-so3-random	1.22057

Compilers used:

manif-test » clang++ --version
clang version 10.0.0 (/startdir/llvm-project-git bfb5b0cb86cf90d9fa794f873644aa642b652c43)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
manif-test » g++ --version
g++ (GCC) 9.1.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Test System

AMD Ryzen 9 3900X 12-Core Processor
Arch Linux with kernel version Linux 5.2.0-rc1-amd-staging-drm-next-git-b8cd95e15410+

Randomness tests

I used a variant of qqiu's MATLAB script.

clear; clc; close all;
qua = dlmread('so3_randomization.txt')(1:200000,:);
axang_arr = quat2axang(qua);
figure(1)
plot3(axang_arr(:,1), axang_arr(:,2), axang_arr(:,3), '.');
figure(2)
hist(axang_arr(:,4),100); 

qua = dlmread('so3_randomization_new.txt')(1:200000,:);
axang_arr = quat2axang(qua);
figure(3)
plot3(axang_arr(:,1), axang_arr(:,2), axang_arr(:,3), '.');
figure(4)
hist(axang_arr(:,4),100);

Because I am using a different quat2axang implementation, mine outputs angle in [0, 2pi] rather than [-pi, pi] like @qqfly 's. But it does not make a difference.

Top: devel branch; bottom: my branch. Results look identical.

Sep 19 '19 11:09 dllu

Codecov Report

Merging #105 into devel will decrease coverage by 0.15%. The diff coverage is 93.75%.

@@            Coverage Diff             @@
##            devel     #105      +/-   ##
==========================================
- Coverage   98.04%   97.89%   -0.16%     
==========================================
  Files          32       32              
  Lines        1025     1044      +19     
==========================================
+ Hits         1005     1022      +17     
- Misses         20       22       +2

Sep 19 '19 22:09 codecov-io

Hi @dllu, Thanks for the interesting PR. First, concerning the code duplication, you could make a standalone random_quaternion function (or such) in the manif/impl/utils.h file. Although I'm not sure a faster random quaternion generator is something critical, the plots looks good and the gain in speed is substantial. I don't see any reason not to merge it after the little refactoring suggested above. @joansola, thoughts?

Sep 20 '19 06:09 artivis

Hey, sorry I am a bit late for this one.

Thoughts: yes, faster is always good.

Also, the plots show a perfect match of performance. Mind that it is important to not fall back again to the issues discussed in #68 . So OK also in this regard.

For me it is OK to merge.

Nov 15 '19 19:11 joansola

manif manif copied to clipboard

faster implementation of SO3 random

Benchmarks:

Compilers used:

Test System

Randomness tests

Codecov Report

manif
manif copied to clipboard