reactphysics3d icon indicating copy to clipboard operation
reactphysics3d copied to clipboard

Benchmarks over 9x slower against Rapier3D

Open mrakh opened this issue 3 years ago • 1 comments

Hey Daniel,

Thanks for your work on this library. I tried comparing the performance of ReactPhysics3D against Rapier3D, a physics engine written in Rust. My results show that Rapier3D is over 9x faster:

Benchmark: 1000 ticks, 1/60 time step, 512 spheres
ReactPhysics3D: mean 1528.6 ms, stddev 77.3 ms
Rapier3D w/ SIMD: mean 163.6 ms, stddev 2.5 ms
Rapier3D w/o SIMD: mean 207.4 ms, stddev 12.4 ms

I'd like to figure out the reason behind this large discrepancy. Both tests initialize the world with the same objects in the same positions, run the same number of physics steps, and use the same solver iteration counts. I even tried using a custom allocator to discount the possibility of pathological allocation patterns, but that didn't change the results much. Perhaps you could take a look and tell me whether I am configuring something erroneously in the C++ code?

Here is the C++ code I am using to benchmark ReactPhysics3D:

#include <stdint.h>

#include <array>
#include <iostream>
#include <chrono>

#include <reactphysics3d/reactphysics3d.h>

static rp3d::DefaultAllocator allocator;
static rp3d::PoolAllocator pool_allocator{allocator};
static rp3d::PhysicsCommon common{&pool_allocator};

uint64_t benchmark(uint64_t cbrt_num_objs) {
    auto world = common.createPhysicsWorld();
    world->setNbIterationsVelocitySolver(15);
    world->setNbIterationsPositionSolver(8);
    world->setGravity(rp3d::Vector3{0, -9.81f, 0});

    auto ground = world->createCollisionBody(rp3d::Transform::identity());
    ground->addCollider(common.createBoxShape(rp3d::Vector3{200, 0.2f, 200}), rp3d::Transform::identity());
    auto sphere_shape = common.createSphereShape(0.5f);

    rp3d::RigidBody* body;
    for(uint64_t x = 0; x < cbrt_num_objs; ++x) {
        for(uint64_t y = 0; y < cbrt_num_objs; ++y) {
            for(uint64_t z = 0; z < cbrt_num_objs; ++z) {
                auto center = cbrt_num_objs / 2.0f;
                auto pos = rp3d::Vector3{
                    (x - center) * 1.05f,
                    1.0f + y * 1.05f,
                    (z - center) * 1.05f
                };
                auto rigid_body = world->createRigidBody(rp3d::Transform{pos, rp3d::Quaternion::identity()});
                auto collider = rigid_body->addCollider(sphere_shape, rp3d::Transform::identity());
                collider->getMaterial().setBounciness(0.7f);
                body = rigid_body;
            }
        }
    }

    auto start_time = std::chrono::steady_clock::now();
    for(uint64_t i = 0; i < 1000; ++i) {
        world->update(1.0f / 60);
    }
    auto end_time = std::chrono::steady_clock::now();

    // Do not let compiler optimize away whole simulation!
    if(body->getTransform().getPosition().x == -100.0f) {
        std::cout << "IUEGFOEIYUFGE " << body->getTransform().getPosition().x << std::endl;
    }

    auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(end_time - start_time).count();
    
    return static_cast<uint64_t>(elapsed);
}

int main(int argc, char** argv) {
    for(size_t i = 0; i < 8; ++i) {
        auto result = benchmark(8);
        std::cout << result << std::endl;
    }
    return 0;
}

And here is the flame graph from a perf run of the above code:

perf

For comparison, here is the Rust code for the Rapier3D benchmark:

use rapier3d::prelude::*;

fn benchmark(cbrt_num_objs: u64) -> u64 {
    let mut rigid_body_set = RigidBodySet::new();
    let mut collider_set = ColliderSet::new();

    /* Create the ground. */
    let collider = ColliderBuilder::cuboid(100.0, 0.1, 100.0).build();
    collider_set.insert(collider);

    let mut handle = None;
    for x in 0..cbrt_num_objs {
        for y in 0..cbrt_num_objs {
            for z in 0..cbrt_num_objs {
                let center = (cbrt_num_objs as f32) / 2.0;
                let rigid_body = RigidBodyBuilder::new_dynamic()
                    .translation(vector![
                        ((x as f32) - center) * 1.05,
                        1.0 + (y as f32) * 1.05,
                        ((z as f32) - center) * 1.05
                    ]).build();
                let collider = ColliderBuilder::ball(0.5).restitution(0.7).build();
                let rigid_body_handle = rigid_body_set.insert(rigid_body);
                collider_set.insert_with_parent(collider, rigid_body_handle, &mut rigid_body_set);
                handle = Some(rigid_body_handle);
            }
        }
    }

    /* Create other structures necessary for the simulation. */
    let gravity = vector![0.0, -9.81, 0.0];
    let integration_parameters = IntegrationParameters {
        dt: 1.0 / 60.0,
        max_velocity_iterations: 15,
        max_position_iterations: 8,
        ..Default::default()
    };
    let mut physics_pipeline = PhysicsPipeline::new();
    let mut island_manager = IslandManager::new();
    let mut broad_phase = BroadPhase::new();
    let mut narrow_phase = NarrowPhase::new();
    let mut joint_set = JointSet::new();
    let mut ccd_solver = CCDSolver::new();
    let physics_hooks = ();
    let event_handler = ();

    let start_time = std::time::Instant::now();

    /* Run the game loop, stepping the simulation once per frame. */
    for _ in 0..1000 {
        physics_pipeline.step(
            &gravity,
            &integration_parameters,
            &mut island_manager,
            &mut broad_phase,
            &mut narrow_phase,
            &mut rigid_body_set,
            &mut collider_set,
            &mut joint_set,
            &mut ccd_solver,
            &physics_hooks,
            &event_handler,
        );
    }

    let end_time = std::time::Instant::now();

    {
        // Do not let compiler optimize away the whole simulation!
        let body = &rigid_body_set[handle.unwrap()];
        if body.translation().x == -100.0f32 {
            println!("IUEGFOEIYUFGE {}", body.translation().x);
        }
    }
    (end_time - start_time).as_micros() as u64
}

fn main() {
    for _ in 0..8 {
        let result = benchmark(8);
        println!("{}", result);
    }
}

And here is the Cargo.toml for the Rapier3D test (Non-SIMD omits the features = [ "simd-stable"] entry):

[dependencies]
rapier3d = { version = "0.11.1", features = [ "simd-stable" ] }

[profile.release]
lto = false

The ReactPhysics3D benchmark was compiled with g++ v9.4.0 and -O3. The Rapier3D benchmark was built with rustc v1.59.0 and cargo build --release.

mrakh avatar Apr 04 '22 19:04 mrakh

Hello. Thanks for your feedback.

I don't see anything wrong with your code using ReactPhysics3D. I don't really know Rapier3D but I see that in IntegrationParameters, the two iterations parameters are named max_velocity_iterations and max_position_iterations. In the documentation here, it's written that those values are maximum number of iterations. Does it mean that during some update steps the number of iterations used by Rapier3D is smaller than those maximum values ? If so, it would be difficult to compare because in ReactPhysics3D those are not maximum values but those are the exact number of iterations that are taken at each step. Moreover, the values you have chosen for ReactPhysics3D (nb velocity iterations=15 and nb position iterations=8) are quite high. For instance by default in ReactPhysics3D, those values are nb velocity iterations=6 and nb position iterations=3.

I guess that the 9x speedup comes from the difference between those iterations steps (maybe they are not used the exact same way in both libraries) but I am not sure. Maybe, instead, you should try to tweak those values in the two libraries in order to have a 'similar' visual result of the simulation or the result that you consider good enough for your use with the smallest number of iterations in both libraries and then choose the library with the better performance.

DanielChappuis avatar Apr 05 '22 05:04 DanielChappuis

Just wanted to update, it turns out that I didn't set up the collisions properly in my simulations. Explicitly adding a static rigidbody to the ground in each simulation fixed the large discrepancy in run time. Both physics engines run more or less equally fast now, but I haven't benchmarked them to get precise numbers.

mrakh avatar Aug 24 '22 19:08 mrakh

Thanks a lot for your feedback.

DanielChappuis avatar Aug 24 '22 20:08 DanielChappuis