Efficiency Inquiry
Dear lettuce friends,
I have been using your code for a few weeks. I have been very successful with the 2D single phase model.
Now I would like to extend the code to perform flow sims in 3D. In my first attempt I noticed that the iteration speed did not scale very well, so I wanted to see if you have some ideas on how to implement the model better.
o. In the example below, I have a lot of solid. Is it possible to mask the operations there to speed up the iterations? Or can you think of something else? The current 256^3 problem takes about 1 hr on an A100 (for reference, a parallel code on ~4 cores would finish in 1 minute).
o. I'm also getting weird pressure values. Can you spot anything dumb in my implementation?
https://github.com/je-santos/lettuce/blob/bug/examples/3d_frac.ipynb
Many thanks for all your help, your code is great.
-Javier
Hi Javier,
At first sight, I don't see anything inefficient in your code. That said, the current lettuce version is not optimized for speed (~100 MLUPS on a V100 in 3D). This is going to change soon with our upcoming native CUDA kernels (#78) which give at least one order of magnitude speedup.
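To put that throughput figure in context, here is a small sketch of the MLUPS arithmetic (million lattice updates per second) for a 256^3 grid. The helper names `mlups` and `steps_per_second` are made up for illustration and are not part of lettuce, and the 100 MLUPS value is the rough V100 number quoted above, so it is only an assumption for other GPUs.

```python
def mlups(nodes, steps, seconds):
    """Million lattice updates per second for a run of `steps` iterations."""
    return nodes * steps / seconds / 1e6


def steps_per_second(nodes, rate_mlups):
    """Iterations per second that a given MLUPS rate allows on `nodes` lattice nodes."""
    return rate_mlups * 1e6 / nodes


nodes = 256 ** 3                       # 16,777,216 lattice nodes
rate = steps_per_second(nodes, 100.0)  # assumed ~100 MLUPS (V100 figure from above)
steps_in_one_hour = rate * 3600        # iterations that fit into one hour at that rate

print(f"{rate:.2f} steps/s, about {steps_in_one_hour:.0f} steps per hour")
```

At roughly 100 MLUPS, a 256^3 domain advances about six time steps per second, so a one-hour run corresponds to roughly 21,000 iterations if that rate is sustained.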
Unfortunately, there isn't currently a good way to mask operations. However, I doubt that implementing a masked stream + collide on the Python API level would make the code faster on GPUs.
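To illustrate what masking at the array level could look like, here is a minimal NumPy sketch of a BGK-style relaxation applied only to fluid nodes. It is not lettuce code: the function names, the array layout `(q, nx, ny, nz)`, and the boolean `fluid` mask are all assumptions made for this example.

```python
import numpy as np


def relax_all(f, feq, tau):
    """Relax every node toward equilibrium (dense update)."""
    return f - (f - feq) / tau


def relax_masked(f, feq, tau, fluid):
    """Relax only nodes where `fluid` is True; solid nodes are left unchanged."""
    out = f.copy()
    # Boolean indexing gathers the fluid nodes into a temporary array
    # and scatters the result back into `out`.
    out[:, fluid] = f[:, fluid] - (f[:, fluid] - feq[:, fluid]) / tau
    return out


rng = np.random.default_rng(0)
q, shape = 19, (8, 8, 8)               # hypothetical D3Q19-sized toy problem
f = rng.random((q, *shape))
feq = rng.random((q, *shape))
fluid = rng.random(shape) > 0.7        # mostly solid, as in the example above

masked = relax_masked(f, feq, tau=0.8, fluid=fluid)
dense = relax_all(f, feq, tau=0.8)
```

The masked version does less arithmetic, but the gather and scatter add memory traffic of their own, which is one reason such a mask may not pay off for an update that is already limited by memory bandwidth.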
Weird pressure values might be related to #103 (stream > collide > boundary vs. collide > stream > boundary). But it might just be because the flow hasn't reached a steady state yet.
Just out of curiosity: in terms of speed, what code are you comparing to? Is it an LBM code or a classical solver for steady flows?
Cheers, Andreas
Hi Andreas,
That is exciting news! I was comparing it to Palabos running on my MacBook Air.
Many thanks for your response.