Segmentation fault when calculating the vacancy energy of 256 iron atoms

Open ieiue opened this issue 1 year ago • 8 comments

Describe the bug

When calculating the vacancy energy of 256 (actually 255) iron atoms with ABACUS (v3.7.3) on 64 CPUs and 2 processors, the run fails with a segmentation fault. With 28 CPUs and 4 processors, the same calculation runs.

(1) 64 CPUs & 2 processors: [screenshot]
(2) 28 CPUs & 4 processors: [screenshot]

Expected behavior

What is the relationship between whether the calculation succeeds and the number of CPUs and processes used?

To Reproduce

Fe256_vacancy.zip

Environment

No response

Additional Context

No response

Task list for Issue attackers (only for developers)

  • [ ] Verify the issue is not a duplicate.
  • [ ] Describe the bug.
  • [ ] Steps to reproduce.
  • [ ] Expected behavior.
  • [ ] Error message.
  • [ ] Environment details.
  • [ ] Additional context.
  • [ ] Assign a priority level (low, medium, high, urgent).
  • [ ] Assign the issue to a team member.
  • [ ] Label the issue with relevant tags.
  • [ ] Identify possible related issues.
  • [ ] Create a unit test or automated test to reproduce the bug (if applicable).
  • [ ] Fix the bug.
  • [ ] Test the fix.
  • [ ] Update documentation (if necessary).
  • [ ] Close the issue and inform the reporter (if applicable).

ieiue avatar Aug 23 '24 02:08 ieiue

I think something went wrong in the 3.7.x versions. I recently ran into problems as well, though with signals 9 and 6 rather than a segmentation fault. I suggest trying version 3.6.5: https://github.com/deepmodeling/abacus-develop/releases/tag/v3.6.5

Let me know if it works.

xdzhu avatar Aug 29 '24 15:08 xdzhu

I tried v3.6.4 before. It gave the same segmentation fault. [screenshot] @xdzhu

ieiue avatar Aug 30 '24 10:08 ieiue

Does it work with a single MPI process?

xdzhu avatar Aug 30 '24 10:08 xdzhu

With 1 processor, it still ends in a segmentation fault. [screenshot] [screenshot]

ieiue avatar Sep 02 '24 06:09 ieiue

There seems to be a bug with ELPA. Could you please try setting ks_solver to scalapack_gvx?
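For anyone wanting to script this change, here is a minimal sketch in Python. It assumes the usual ABACUS INPUT layout (an `INPUT_PARAMETERS` header followed by `key value` lines, with `#` starting a comment). The helper name `set_input_param` and the sample INPUT text are hypothetical and are not taken from the attached Fe256_vacancy.zip.

```python
def set_input_param(text: str, key: str, value: str) -> str:
    """Return INPUT text with `key` set to `value` (replace it, or append if absent)."""
    out = []
    replaced = False
    for line in text.splitlines():
        # Ignore anything after '#' when looking for the parameter name.
        tokens = line.split("#", 1)[0].split()
        if tokens and tokens[0] == key:
            out.append(f"{key} {value}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"{key} {value}")
    return "\n".join(out) + "\n"


# Hypothetical INPUT content, for illustration only.
sample = (
    "INPUT_PARAMETERS\n"
    "basis_type lcao\n"
    "ks_solver genelpa  # ELPA-based solver\n"
)
print(set_input_param(sample, "ks_solver", "scalapack_gvx"))
```

The function only returns the modified text, so writing it back to the INPUT file (and keeping a backup of the original) is left to the caller.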

maki49 avatar Sep 02 '24 08:09 maki49

Setting ks_solver to scalapack_gvx doesn't help. [screenshot]

ieiue avatar Sep 03 '24 10:09 ieiue

Can you try the latest version, 3.7.5, to check whether the problem is solved?

xdzhu avatar Sep 10 '24 03:09 xdzhu

with elpa

xdzhu avatar Sep 10 '24 03:09 xdzhu

This bug is now fixed in ABACUS v3.8.0.

YuLiu98 avatar Oct 23 '24 07:10 YuLiu98