200+ FPS (on a simple laptop)

Open ker2x opened this issue 2 years ago • 12 comments

I'm still busy refactoring, and I'm currently relying on Intel's oneAPI and TBB for multithreading. You can check the code here: https://github.com/ker2x/particle-life/tree/oneapi-dpl/particle_life/src , and perhaps backport the modifications to a standard compiler and standard libraries (or I'll do it myself some day, I guess).

It's not fully optimized yet, but the notable changes are:

  • Using a vertex buffer (VBO) instead of brute-forcing one circle-drawing call per particle.
	void Draw(colorGroup group)
	{
		ofSetColor(group.color);
		vbo.setVertexData(group.pos.data(), group.pos.size(), GL_DYNAMIC_DRAW);
		vbo.draw(GL_POINTS, 0, group.pos.size());

	}
  • Using SoA (structure of arrays) instead of AoS (array of structures). It allows better vectorization, and it was needed to use the VBO efficiently anyway.
struct colorGroup {
	std::vector<ofVec2f> pos;
	std::vector<float> vx;
	std::vector<float> vy;
	ofColor color;
};
  • It should also make it easier to add more colors (I hope).

  • Major cleanup of the interaction code.

void ofApp::interaction(colorGroup& Group1, const colorGroup& Group2, 
		const float G, const float radius, bool boundsToggle) const
{
	
	assert(Group1.pos.size() % 64 == 0);
	assert(Group2.pos.size() % 64 == 0);
	
	const float g = G / -100;	// attraction coefficient

//		oneapi::tbb::parallel_for(
//			oneapi::tbb::blocked_range<size_t>(0, group1size), 
//			[&Group1, &Group2, group1size, group2size, radius, g, this]
//			(const oneapi::tbb::blocked_range<size_t>& r) {

	for (size_t i = 0; i < Group1.pos.size(); i++)
	{
		float fx = 0;	// force on x
		float fy = 0;	// force on y
		
		for (size_t j = 0; j < Group2.pos.size(); j++)
		{
			const float distance = Group1.pos[i].distance(Group2.pos[j]);
			if ((distance < radius)) {
				const float force = 1 / std::max(std::numeric_limits<float>::epsilon(), distance);	// avoid dividing by zero
				fx += ((Group1.pos[i].x - Group2.pos[j].x) * force);
				fy += ((Group1.pos[i].y - Group2.pos[j].y) * force);
			}
		}

		// Wall Repel
		if (wallRepel > 0.0F)
		{
			if (Group1.pos[i].x < wallRepel) Group1.vx[i] += (wallRepel - Group1.pos[i].x) * 0.1;
			if (Group1.pos[i].x > boundWidth - wallRepel) Group1.vx[i] += (boundWidth - wallRepel - Group1.pos[i].x) * 0.1;
			if (Group1.pos[i].y < wallRepel) Group1.vy[i] += (wallRepel - Group1.pos[i].y) * 0.1;
			if (Group1.pos[i].y > boundHeight - wallRepel) Group1.vy[i] += (boundHeight - wallRepel - Group1.pos[i].y) * 0.1;
		}

		// Viscosity & gravity
		Group1.vx[i] = (Group1.vx[i] + (fx * g)) * (1.0 - viscosity);
		Group1.vy[i] = (Group1.vy[i] + (fy * g)) * (1.0 - viscosity) + worldGravity;
//		Group1.vx[i] = std::fmaf(Group1.vx[i], (1.0F - viscosity), std::fmaf(fx, g, 0.0F));
//		Group1.vy[i] = std::fmaf(Group1.vy[i], (1.0F - viscosity), std::fmaf(fy, g, worldGravity));

		//Update position
		Group1.pos[i].x += Group1.vx[i];
		Group1.pos[i].y += Group1.vy[i];
	}

	if (boundsToggle) {
		for (auto& p : Group1.pos)
		{
			p.x = std::min(std::max(p.x, 0.0F), static_cast<float>(boundWidth));
			p.y = std::min(std::max(p.y, 0.0F), static_cast<float>(boundHeight));
		}
	}	
}

i still have some crap to clean :)

  • Using oneapi::tbb::parallel_invoke for parallelization.
	oneapi::tbb::parallel_invoke(
		[&] { interaction(red,   red,   powerSliderRR, vSliderRR, boundsToggle); },
		[&] { interaction(red,   green, powerSliderRG, vSliderRG, boundsToggle); },
		[&] { interaction(red,   blue,  powerSliderRB, vSliderRB, boundsToggle); },
		[&] { interaction(red,   white, powerSliderRW, vSliderRW, boundsToggle); },
		[&] { interaction(green, red,   powerSliderGR, vSliderGR, boundsToggle); },
		[&] { interaction(green, green, powerSliderGG, vSliderGG, boundsToggle); },
		[&] { interaction(green, blue,  powerSliderGB, vSliderGB, boundsToggle); },
		[&] { interaction(green, white, powerSliderGW, vSliderGW, boundsToggle); },
		[&] { interaction(blue,  red,   powerSliderBR, vSliderBR, boundsToggle); },
		[&] { interaction(blue,  green, powerSliderBG, vSliderBG, boundsToggle); },
		[&] { interaction(blue,  blue,  powerSliderBB, vSliderBB, boundsToggle); },
		[&] { interaction(blue,  white, powerSliderBW, vSliderBW, boundsToggle); },
		[&] { interaction(white, red,   powerSliderWR, vSliderWR, boundsToggle); },
		[&] { interaction(white, green, powerSliderWG, vSliderWG, boundsToggle); },
		[&] { interaction(white, blue,  powerSliderWB, vSliderWB, boundsToggle); },
		[&] { interaction(white, white, powerSliderWW, vSliderWW, boundsToggle); }
	);
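
For readers who prefer a formula, the per-particle update performed by the interaction routine posted above can be written out as follows. This is only a transcription of that code under my own notation (the symbols are not from the original post), and it leaves out the wall-repel term, which is added to the velocity before this update when wallRepel > 0.

```latex
% Symbols (notation assumed for this sketch):
% p_i, v_i = position and velocity of particle i in Group1;
% p_j = position of particle j in Group2; r = radius; \mu = viscosity;
% w = worldGravity; \varepsilon = std::numeric_limits<float>::epsilon().
\begin{align*}
d_{ij} &= \lVert p_i - p_j \rVert, \qquad g = -\frac{G}{100} \\
f_i &= \sum_{j \,:\, d_{ij} < r} \frac{p_i - p_j}{\max(\varepsilon,\, d_{ij})} \\
v_i &\leftarrow (v_i + g\, f_i)\,(1 - \mu) + (0,\, w) \\
p_i &\leftarrow p_i + v_i
\end{align*}
```

Each summand is a unit vector whenever d_ij is larger than epsilon, so every neighbour inside the radius contributes a pull or push of the same strength regardless of its distance.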

This is me slowly learning to use oneAPI and SYCL so that I can offload all the parallel code to the GPU in the future (in a new project).

The biggest performance improvement comes from the use of SoA and VBO.
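
To make the layout change concrete, here is a minimal sketch in plain C++ with no openFrameworks dependency. The names Vec2, ParticleAoS, GroupSoA, toSoA and integrate are hypothetical and exist only for this illustration; Vec2 stands in for ofVec2f, and ParticleAoS is an assumed per-particle layout, not a copy of the project's old struct.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ofVec2f: two packed floats (hypothetical type for this sketch).
struct Vec2 {
    float x;
    float y;
};

// Array-of-structures: every field of one particle sits next to the others,
// so positions are interleaved with velocities and colour in memory.
struct ParticleAoS {
    float x, y;
    float vx, vy;
    int r, g, b;
};

// Structure-of-arrays, mirroring colorGroup: one contiguous array per field.
struct GroupSoA {
    std::vector<Vec2> pos;
    std::vector<float> vx;
    std::vector<float> vy;
};

// Convert an AoS group into the SoA layout.
GroupSoA toSoA(const std::vector<ParticleAoS>& particles) {
    GroupSoA group;
    group.pos.reserve(particles.size());
    group.vx.reserve(particles.size());
    group.vy.reserve(particles.size());
    for (const ParticleAoS& p : particles) {
        group.pos.push_back({p.x, p.y});
        group.vx.push_back(p.vx);
        group.vy.push_back(p.vy);
    }
    return group;
}

// Position update: the loop reads vx and vy with unit stride and writes pos,
// without skipping over unrelated fields such as colour.
void integrate(GroupSoA& group) {
    assert(group.pos.size() == group.vx.size());
    assert(group.pos.size() == group.vy.size());
    for (std::size_t i = 0; i < group.pos.size(); ++i) {
        group.pos[i].x += group.vx[i];
        group.pos[i].y += group.vy[i];
    }
}
```

Because pos is one contiguous block of x,y pairs, pos.data() and pos.size() can be handed to a vertex buffer directly, which is what the Draw function above does with vbo.setVertexData.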

ker2x avatar Dec 07 '22 23:12 ker2x

Great job looking forward to it 👍 👍 💯

hunar4321 avatar Dec 08 '22 18:12 hunar4321

It is not working for me

KhadrasWellun avatar Dec 08 '22 18:12 KhadrasWellun

> It is not working for me

yes. I'll try to make something mergeable with the main project, and independent of oneAPI.

ker2x avatar Dec 08 '22 20:12 ker2x

It took me a while. The code is unfortunately much slower with MSVC than with the Intel compiler, but it is still faster than the previous version, of course.

I also removed the dependency on Intel TBB, so no 200 FPS (the TBB version can still be seen in the commented-out code, however).

ker2x avatar Dec 19 '22 17:12 ker2x

Hi! I want to set different particle sizes depending on their type. For example, red should be 1.0 pixels, green 1.2 pixels, blue 1.4 pixels, and so on. I saw in the code that a size is defined for all particles in ofApp.h. How could I condition this particle size with an "if"? I'm referring to this piece of code:

	void draw() const
	{
		ofSetColor(r, g, b, 100); //set particle color + some alpha
		ofDrawCircle(x, y, 1.5F); //draws a point at x,y coordinates, the size of a 1.5 pixel circle
	}

My colors are defined by generic names:

	void ofApp::restart()
	{
		if (numberSliderα > 0) { alpha = CreatePoints(numberSliderα, 0, 0, ofRandom(64, 255)); }
		if (numberSliderβ > 0) { betha = CreatePoints(numberSliderβ, 0, ofRandom(64, 255), 0); }
		if (numberSliderγ > 0) { gamma = CreatePoints(numberSliderγ, ofRandom(64, 255), 0, 0); }
		if (numberSliderδ > 0) { elta = CreatePoints(numberSliderδ, ofRandom(64, 255), ofRandom(64, 255), 0); }
		if (numberSliderε > 0) { epsilon = CreatePoints(numberSliderε, ofRandom(64, 255), 0, ofRandom(64, 255)); }
		if (numberSliderζ > 0) { zeta = CreatePoints(numberSliderζ, 0, ofRandom(64, 255), ofRandom(64, 255)); }
		if (numberSliderη > 0) { eta = CreatePoints(numberSliderη, ofRandom(64, 255), ofRandom(64, 255), ofRandom(64, 255)); }
		if (numberSliderθ > 0) { teta = CreatePoints(numberSliderθ, 0, 0, 0); }
	}

I would like to define something like:

	void draw() const
	{
		ofSetColor(r, g, b, 100); //set particle color + some alpha
		if (numberSliderα > 0) { ofDrawCircle(x, y, 1.0F); } //draw a 1.0 pixel circle at x,y
		if (numberSliderβ > 0) { ofDrawCircle(x, y, 1.2F); } //draw a 1.2 pixel circle at x,y
		if (numberSliderγ > 0) { ofDrawCircle(x, y, 1.4F); } //draw a 1.4 pixel circle at x,y
		if (numberSliderδ > 0) { ofDrawCircle(x, y, 1.6F); } //draw a 1.6 pixel circle at x,y
		if (numberSliderε > 0) { ofDrawCircle(x, y, 1.8F); } //draw a 1.8 pixel circle at x,y
		if (numberSliderζ > 0) { ofDrawCircle(x, y, 2.0F); } //draw a 2.0 pixel circle at x,y
		if (numberSliderη > 0) { ofDrawCircle(x, y, 2.2F); } //draw a 2.2 pixel circle at x,y
		if (numberSliderθ > 0) { ofDrawCircle(x, y, 2.4F); } //draw a 2.4 pixel circle at x,y
	}

But Visual Studio gives me errors because these sliders are not defined here (they are defined in the GUI, class ofApp final : public ofBaseApp). Please help me!

KhadrasWellun avatar Dec 26 '22 20:12 KhadrasWellun

I assume you are referring to the old codebase. Can you post a link to your code?

The easiest way to do this would be to add a radius property to the point struct; then you would just call ofDrawCircle(x, y, radius).
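
A rough sketch of that suggestion, written as plain C++ so it compiles without openFrameworks. Point, DrawCall and createPoints are hypothetical names for this illustration (the project's CreatePoints takes other arguments), and draw() records the call instead of rendering; in the real app the body would be the ofDrawCircle(x, y, radius) call mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One recorded circle; stands in for an actual ofDrawCircle(x, y, radius) call.
struct DrawCall {
    float x;
    float y;
    float radius;
};

// Hypothetical per-particle struct with an added radius member.
struct Point {
    float x = 0.0F;
    float y = 0.0F;
    float radius = 1.5F;  // the previously hard-coded size as the default

    // The particle carries its own size, so draw() needs no access to GUI sliders.
    void draw(std::vector<DrawCall>& out) const {
        out.push_back({x, y, radius});
    }
};

// Create `count` points that all share the radius chosen for their type.
std::vector<Point> createPoints(int count, float radius) {
    assert(count >= 0);
    assert(radius > 0.0F);
    std::vector<Point> points(static_cast<std::size_t>(count));
    for (Point& p : points) {
        p.radius = radius;
    }
    return points;
}
```

With this shape, each type gets its size once at creation time (for example createPoints(n, 1.0F) for one group and createPoints(n, 1.2F) for the next), and the drawing code stays identical for every group.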

ker2x avatar Dec 27 '22 11:12 ker2x

Here is the latest version of my code: Manuel_src.zip

KhadrasWellun avatar Dec 27 '22 15:12 KhadrasWellun

> I assume you are referring to the old codebase. Can you post a link to your code?
>
> The easiest way to do this would be to add a radius property to the point struct; then you would just call ofDrawCircle(x, y, radius).

I tried the new code, but it runs extremely slowly. I couldn't get more than 8 fps with 8 colors of 1000 particles each. Another shortcoming of the new code is that the particles look extremely small, like little dots whose color shades you can't really distinguish. The structures formed don't look good at all because of this.

KhadrasWellun avatar Dec 27 '22 15:12 KhadrasWellun

I also tried to insert a fullscreen button but it didn't work. I used ofToggleFullscreen(), but the screen kept blinking without showing anything. I'm also trying to figure out how to introduce the 3D vision function (with 3D glasses). Also, I don't know how to save the color palette generated before saving the model. The particle colors are generated when pressing the buttons that trigger the restart. But when I save the model, the existing colour palette is not saved so that I can load it later. Do you have any idea how I could save this color palette?

KhadrasWellun avatar Dec 27 '22 18:12 KhadrasWellun

I'll take a look at your code, and also patch my code to allow drawing circles. My code shouldn't be slower; this is weird. Are you using OpenMP in your code? I might have forgotten to re-enable it.

ker2x avatar Dec 28 '22 05:12 ker2x

I'll check the fullscreen problem as well.

ker2x avatar Dec 28 '22 05:12 ker2x

> I'll take a look at your code, and also patch my code to allow drawing circles. My code shouldn't be slower; this is weird. Are you using OpenMP in your code? I might have forgotten to re-enable it.

I took your exact code and just added more colors and those extra buttons. After compiling, the code ran extremely slowly. I had to set a small number of particles (under 1000 of each color) to reach 10 fps.

Here is your code with my additions, done the old way, which works fine with 17313 big particles at 61 fps: src 1.7.6.5.zip

Here is a proof screenshot: 202301060731

Here is your code with my additions, done the new way, which runs slowly and shows very small particles (nearly dots), with 9600 particles at 4 fps: src 1.8.5.zip

Here is a proof screenshot: 202301060705

KhadrasWellun avatar Dec 28 '22 05:12 KhadrasWellun