picongpu

Problems encountered in running examples on GPU

Open summer7807 opened this issue 2 years ago • 18 comments

Hi guys, when I ran the example on Ubuntu 18.04, I encountered the following problem after pic-build, even though CUDA shows up on my PC and its version is available. I don't know why. I would be grateful if you could give me some suggestions. [Screenshot from 2022-09-15 14-19-15]

summer7807 avatar Sep 15 '22 06:09 summer7807

Hello @summer7807 ,

Having nvidia-smi and an NVIDIA GPU does not automatically imply you have a CUDA compiler. Or it could be that your system has such a compiler, but somehow cmake does not see it.

Does nvcc -v work on your system with that environment? I suspect it would not find nvcc either. If so you need to install CUDA Toolkit first, or perhaps it was installed but something is wrong with your paths.
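A minimal sketch of this check, assuming a POSIX-compatible shell. The function name check_nvcc is hypothetical and used only for illustration; nvcc --version prints the CUDA Toolkit release when a compiler is found.

```shell
#!/bin/sh
# Report whether a CUDA compiler is reachable from the current environment.
# check_nvcc is a hypothetical helper name, not part of CUDA or PIConGPU.
check_nvcc() {
    if command -v nvcc >/dev/null 2>&1; then
        echo "nvcc found at: $(command -v nvcc)"
        nvcc --version   # prints the CUDA Toolkit release
    else
        echo "nvcc not found in PATH"
        return 1
    fi
}

check_nvcc || echo "Install the CUDA Toolkit or fix PATH before running pic-build."
```

If the first branch is taken but cmake still reports no CUDA compiler, the environment in which pic-build runs probably differs from the one in the terminal.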

sbastrakov avatar Sep 15 '22 07:09 sbastrakov

Hi @sbastrakov, before I do this, I want to ask whether there is anything to pay attention to regarding the CUDA Toolkit version. It seems that I have installed it before, but an error was still reported.

summer7807 avatar Sep 15 '22 08:09 summer7807

So can you access your CUDA compiler from your terminal? For example, does the nvcc -v command I suggested before work?

sbastrakov avatar Sep 15 '22 08:09 sbastrakov

Hi @sbastrakov, I'm sorry that my reply was late due to network problems. Yes, I ran nvcc -V but nothing happened.

summer7807 avatar Sep 15 '22 09:09 summer7807

I am sorry, could you clarify what you mean by "nothing"? If you have an nvcc and it was found, the output should have been nvcc fatal : No input files specified; use option --help for more information (or something similar to it). If you do not have it, or it is not in your paths, it should be something like Command 'nvcc' not found.

In the latter case please check your CUDA installation or perhaps try to install it again.

sbastrakov avatar Sep 15 '22 10:09 sbastrakov

In case the issue is with paths, please follow this documentation: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#post-installation-actions
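As a rough sketch of the kind of path setup that guide describes: the install prefix /usr/local/cuda below is an assumption (a common default), so adjust CUDA_HOME to wherever the toolkit actually lives on your system.

```shell
#!/bin/sh
# Assumed install prefix; replace it if the CUDA Toolkit is installed elsewhere.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"

# Prepend the compiler and library directories, keeping any existing entries.
export PATH="${CUDA_HOME}/bin${PATH:+:${PATH}}"
export LD_LIBRARY_PATH="${CUDA_HOME}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

Adding these lines to a shell startup file such as ~/.bashrc makes them apply to new terminals as well.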

sbastrakov avatar Sep 15 '22 10:09 sbastrakov

Hi @sbastrakov, "nothing" means Command 'nvcc' not found. I then installed the CUDA Toolkit and configured the environment variables. Now the later runs get stuck here. I don't know whether the previous problems have been solved. Can you give me some suggestions? [Screenshot from 2022-09-15 20-00-28] [Screenshot from 2022-09-15 20-00-42] [Screenshot from 2022-09-15 20-00-48]

summer7807 avatar Sep 15 '22 12:09 summer7807

Okay, so now the compiler was found and the build process started.

Based on your output I see you are using the latest PIConGPU release, 0.6.0, which is the same as our master branch. However, your output is cropped, so I do not see all of it and thus do not know which host-side compiler you used. Which version of gcc or another compiler did you use? It is shown at the start of the pic-build output, e.g.

-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0

In general, please attach such information in full and as text (in a message or a file), not as screenshots.

sbastrakov avatar Sep 15 '22 12:09 sbastrakov

I think you mean this? CMakeOutput.log

summer7807 avatar Sep 15 '22 12:09 summer7807

Yes. So everything appears right, and I am not sure what caused the issue.

Perhaps there is some problem with the older PIConGPU on a more modern software environment. Could you try switching to the dev branch (this is our current development version) and building it? The steps to make a setup and compile are generally the same, but please make a new setup directory with pic-create and follow these docs.

sbastrakov avatar Sep 15 '22 13:09 sbastrakov

I can try to build, but I don't quite understand what you mean by the dev version. Can you give me a link or explain it again? Thank you.

summer7807 avatar Sep 15 '22 14:09 summer7807

For that you need to clone the repository as described here. Then do the update as suggested there, and finally run git checkout dev.
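These steps could look roughly like the sketch below. The target directory $HOME/src/picongpu and the helper name fetch_picongpu_dev are assumptions for illustration; the linked documentation remains the reference for the exact procedure.

```shell
#!/bin/sh
# Upstream repository of PIConGPU on GitHub.
PICONGPU_URL="https://github.com/ComputationalRadiationPhysics/picongpu.git"

# fetch_picongpu_dev is a hypothetical helper: clone if needed, then switch to dev.
fetch_picongpu_dev() {
    src_dir="${1:-$HOME/src/picongpu}"   # assumed location of the source tree
    if [ ! -d "$src_dir/.git" ]; then
        git clone "$PICONGPU_URL" "$src_dir" || return 1
    fi
    git -C "$src_dir" fetch origin || return 1        # update remote branches
    git -C "$src_dir" checkout dev || return 1        # current development version
    git -C "$src_dir" pull --ff-only origin dev       # bring the local dev up to date
}

# Usage (not run automatically):
#   fetch_picongpu_dev "$HOME/src/picongpu"
```

After switching branches, a fresh setup directory made with pic-create and a new pic-build are needed, as mentioned earlier in the thread.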

sbastrakov avatar Sep 15 '22 14:09 sbastrakov

Maybe I don't need to use a very modern software (or compile) environment. Can you provide me with the environment you are using? I'm not sure whether this is a good solution?

summer7807 avatar Sep 15 '22 14:09 summer7807

Maybe I don't need to use a very modern software (or compile) environment. Can you provide me with the environment you are using? I'm not sure whether this is a good solution?

PIConGPU (dev branch) supports all CUDA 11.X versions. The master branch/latest release supports CUDA 10.X up to 11.2.

@summer7807 The best option is what @sbastrakov suggested: please switch to the dev branch. The latest release of PIConGPU does not support CUDA 11.4; the reason is a bug in the NVIDIA nvcc compiler. Switching to an old version of CUDA is sometimes not easy, and it depends on your operating system. We cannot help with this process, so using the current development branch is the easiest way.

psychocoderHPC avatar Sep 15 '22 15:09 psychocoderHPC

Hi @sbastrakov, following your suggestion, I have successfully built PIConGPU. But when I ran the Bremsstrahlung example, I encountered the following errors. I don't know the reason. Is there a problem with my configuration? Can you give me some suggestions?

[Screenshot from 2022-09-16 19-49-54]

summer7807 avatar Sep 16 '22 11:09 summer7807

It could be that you are running out of GPU memory; this setup is quite demanding. It is possible to reduce the grid size by modifying the .cfg file. But first, could you try the LaserWakefield example as our documentation suggests?

sbastrakov avatar Sep 16 '22 13:09 sbastrakov

I have successfully run LaserWakefield. In fact, the example I want to run is Bremsstrahlung. I previously ran it on the CPU, and it failed because of insufficient memory. After reducing the grid, it ran successfully. However, because of the reduced (less precise) grid, the results are inaccurate, and many expected physical processes do not show up in the simulation, so I switched to running on the GPU. To sum up, what I want to ask is whether it is possible to run Bremsstrahlung without changing its original parameters. Do you have any good suggestions for me?

summer7807 avatar Sep 17 '22 00:09 summer7807

Memory requirements per particle and per cell depend on the simulation parameters in the .param files. However, once the "physics" is defined, they are basically fixed and mostly cannot be reduced. Thus the user-side parameter to adjust is how many grid cells you can fit in your GPU memory, as controlled by your .cfg file. For that, a trial-and-error process can be used, basically similar to what you have tried already, or we have a memory calculator.

With that calculator, or even some back-of-the-envelope calculations, you could estimate how much overall memory is needed for your desired grid size. You can then compare it to your GPU memory, or to that of multiple GPUs if you have access to such a system, and find a suitable configuration.
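A back-of-the-envelope estimate of this kind might look like the sketch below. It is not the PIConGPU memory calculator: the helper estimate_mib is hypothetical, and the byte counts per cell and per particle are illustrative assumptions that in reality depend on the .param files.

```shell
#!/bin/sh
# Rough memory estimate in MiB for a grid of nx * ny * nz cells.
# Arguments: nx ny nz particles_per_cell [bytes_per_cell] [bytes_per_particle]
estimate_mib() {
    nx=$1; ny=$2; nz=$3
    particles_per_cell=$4
    bytes_per_cell=${5:-36}       # assumed: three fields with 3 float32 components each
    bytes_per_particle=${6:-32}   # assumed: position, momentum, weighting, bookkeeping
    cells=$(( nx * ny * nz ))
    total=$(( cells * (bytes_per_cell + particles_per_cell * bytes_per_particle) ))
    echo $(( total / 1024 / 1024 ))
}

# Example with made-up numbers: a 128^3 grid with 2 particles per cell.
echo "Estimated memory: $(estimate_mib 128 128 128 2) MiB"

# The total GPU memory in MiB can be queried for comparison, for example with:
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits
```

Such an estimate ignores temporary buffers and plugin memory, so the usable grid is likely smaller than the raw number suggests.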

sbastrakov avatar Sep 19 '22 08:09 sbastrakov

Will close. Original issue seems to be resolved. Further problems seem to originate in lack of resources.

steindev avatar Oct 14 '22 17:10 steindev