
'std::bad_alloc' Error

Open abhirup86 opened this issue 3 years ago • 11 comments

Hi, I am trying to train the cubic scaling after performing VASP single-point calculations for the 8485 structures predicted by my "cubic.pattern_ANHARM3" file. However, I am getting the following error:

 OPTIMIZATION

  LMODEL = least-squares

  Training data file (DFSET) : DFSET_cubic

  NSTART = 1; NEND = 8484
  8484 entries will be used for training.

  Total Number of Parameters : 52693
  Total Number of Free Parameters : 44192

 terminate called after throwing an instance of 'std::bad_alloc'
   what():  std::bad_alloc
 Aborted

I tried allocating more memory (>120 GB), but I am still stuck with this error. Can anyone suggest how to solve this issue? I did not have this issue while reproducing the Si example.

Thanks, Abhirup

abhirup86 avatar Nov 22 '21 17:11 abhirup86

This likely occurs because of the large RAM requirement. Please see #47.
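As a rough illustration of the RAM requirement, the sketch below estimates the size of a dense, double-precision sensing matrix, assuming one row per force component (3 per atom per structure) and one column per free parameter. The 8484 structures and 44192 free parameters are taken from the log above; the number of atoms in the supercell is not stated in this thread, so the value 80 is only a placeholder, and `dense_matrix_gib` is a hypothetical helper, not part of ALM.

```python
def dense_matrix_gib(n_structures: int, n_atoms: int, n_params: int,
                     bytes_per_value: int = 8) -> float:
    """Memory in GiB for a dense matrix with one row per force component
    (3 per atom per structure) and one column per fitting parameter."""
    n_rows = 3 * n_atoms * n_structures
    return n_rows * n_params * bytes_per_value / 2**30


# 8484 and 44192 come from the log in this thread.
# n_atoms=80 is a PLACEHOLDER: the supercell size is not given here.
estimate = dense_matrix_gib(n_structures=8484, n_atoms=80, n_params=44192)
print(f"Dense sensing matrix: about {estimate:.0f} GiB")
```

Replacing the placeholder with the real supercell size gives a quick check of whether a dense matrix could fit on a given node at all. Any additional work arrays the program allocates are not counted here.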

ttadano avatar Nov 23 '21 16:11 ttadano

I tried SPARSE = 1 and also tried running on larger-memory nodes (512 GB), but I still get the same error. It seems the program allocates more than just the sensing matrix. Any suggestions on how to solve this?

abhirup86 avatar Nov 24 '21 16:11 abhirup86

I have not encountered this issue before. Perhaps, the error occurs only when the input array length is very large. Could you provide more detailed information including the input files for ALM? They are necessary for identifying the error location. (If you feel reluctant to upload the files, please send them directly to me via email.)

ttadano avatar Nov 25 '21 00:11 ttadano

Thanks for the email. I just sent an email to the Gmail address listed on your website. If you don't see it, please check your spam folder.

Thank you so much once again.

Abhirup


Abhirup Patra Research Scientist II Delaware Energy Institute University of Delaware, Newark, DE



abhirup86 avatar Nov 25 '21 01:11 abhirup86

Thank you for the input files.

I've found that you use SPARSE = 1 combined with LMODEL = enet, but SPARSE = 1 is effective only for ordinary least squares (LMODEL = ols). Please set the tags as follows:

 &optimize
     LMODEL = ols
     SPARSE = 1
     ...

The calculation of the 3rd-order force constants has not finished yet, but the bad_alloc error has not appeared so far.
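A mismatch like this is easy to miss in a long input file. The following is a minimal sketch of a pre-flight check, assuming the ALM input uses `&optimize ... /` blocks with `TAG = value` entries, `;` between entries on one line, and `#` for comments. The helper names and the set of accepted OLS spellings (only the two that appear in this thread) are assumptions; ALM may accept other aliases.

```python
import re

# Spellings seen in this thread; other aliases may exist (assumption).
OLS_NAMES = {"ols", "least-squares"}


def read_optimize_tags(alm_input: str) -> dict:
    """Return the TAG = value pairs found in the &optimize ... / block."""
    match = re.search(r"&optimize(.*?)^\s*/", alm_input,
                      flags=re.IGNORECASE | re.DOTALL | re.MULTILINE)
    if match is None:
        return {}
    tags = {}
    for line in match.group(1).splitlines():
        line = line.split("#", 1)[0]          # drop trailing comments
        for entry in line.split(";"):         # several entries per line
            if "=" in entry:
                key, value = entry.split("=", 1)
                tags[key.strip().upper()] = value.strip()
    return tags


def sparse_ignored(tags: dict) -> bool:
    """True when SPARSE = 1 is set but LMODEL is explicitly not OLS."""
    lmodel = tags.get("LMODEL")
    if tags.get("SPARSE") != "1" or lmodel is None:
        return False
    return lmodel.lower() not in OLS_NAMES


sample = "&optimize\n  LMODEL = enet\n  SPARSE = 1\n/\n"
if sparse_ignored(read_optimize_tags(sample)):
    print("Warning: SPARSE = 1 has no effect unless LMODEL = ols")
```

The check deliberately stays silent when LMODEL is absent, since it makes no assumption about the program's default.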

ttadano avatar Nov 25 '21 11:11 ttadano

Thanks. I will set LMODEL = ols and give it a try on a larger-memory node.



abhirup86 avatar Nov 25 '21 14:11 abhirup86

Hi, I tried SPARSE = 1 with LMODEL = ols. This time I did not get the error message, but after more than 24 hours I still don't see any progress in the calculation.

abhirup86 avatar Nov 27 '21 00:11 abhirup86

The calculation using the same inputs finished in ~12 hours using a Xeon node as shown below:

 OPTIMIZATION
 ============

  LMODEL = least-squares

  Training data file (DFSET) : DFSET_cubic

  NSTART = 1; NEND = 8484
  8484 entries will be used for training.

  Total Number of Parameters : 56005
  Total Number of Free Parameters : 44192

  Now, start fitting ...
  Solve least-squares problem by Eigen SimplicialLDLT.
  Residual sum of squares for the solution: 0.534217
  Fitting error (%) : 11.5154

  Time Elapsed: 41419.6 sec.

 -------------------------------------------------------------------

 The following files are created:

 Force constants in a human-readable format : al2o3_cubic.fcs
 Input data for the phonon code ANPHON      : al2o3_cubic.xml

 Job finished at Thu Nov 25 23:33:02 2021

You may need to wait longer depending on the performance of the CPU chip. The long execution time may be shortened by using another sparse solver, such as Pardiso, instead of Eigen, but I have not implemented it yet.
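For context on where the time goes: SimplicialLDLT factorizes a square symmetric matrix, so a least-squares fit solved with it presumably goes through the normal equations. A sketch of that formulation, with A the sparse sensing matrix, b the vector of forces, and x the free parameters (these symbols are illustrative and not taken from the ALM source):

```latex
\min_{x}\ \lVert A x - b \rVert_2^2
\quad\Longrightarrow\quad
A^{\mathsf T} A\, x = A^{\mathsf T} b,
\qquad
A^{\mathsf T} A = L D L^{\mathsf T}
```

Once the factors are available, x follows from a forward substitution with L, a diagonal scaling with D, and a backward substitution with the transpose of L. With 44192 free parameters the matrix being factorized is 44192 × 44192, and the factorization step is the likely bottleneck, which is why a different sparse solver might shorten the run.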

ttadano avatar Nov 27 '21 00:11 ttadano

Thanks. Would it be possible for you to share those two files here? I do not have access to Xeon nodes right now. I will try on some other large-memory nodes.

Thanks, Abhirup

abhirup86 avatar Nov 27 '21 15:11 abhirup86

I've sent the files to you directly via email.

ttadano avatar Nov 30 '21 03:11 ttadano

Thank you so much.

abhirup86 avatar Nov 30 '21 20:11 abhirup86