alamode
'std::bad_alloc' Error
Hi, I am trying to train the cubic scaling after performing VASP single-point calculations for the 8485 structures predicted by my "cubic.pattern_ANHARM3" file. However, I am getting the following error:
OPTIMIZATION
LMODEL = least-squares
Training data file (DFSET) : DFSET_cubic
NSTART = 1; NEND = 8484
8484 entries will be used for training.
Total Number of Parameters : 52693
Total Number of Free Parameters : 44192
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted
I tried allocating more memory (>120 GB), but I am still stuck with this error. Can anyone suggest how to solve this issue? I didn't have this issue while reproducing the Si example.
Thanks, Abhirup
This is likely caused by the large RAM requirement. Please see #47.
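For a rough sense of scale, the sketch below estimates the size of the sensing matrix if it were stored densely, with one row per force component and one column per free parameter. The supercell size `n_atoms` is a hypothetical value (the thread does not state it), double precision is assumed, and the function name is illustrative only; it says nothing about what ALM actually allocates internally.

```python
def dense_matrix_gib(n_structures: int, n_atoms: int, n_params: int,
                     bytes_per_value: int = 8) -> float:
    """Estimate the memory (GiB) of a densely stored sensing matrix.

    Rows: 3 force components per atom per training structure.
    Columns: one per free force-constant parameter.
    """
    n_rows = 3 * n_atoms * n_structures
    return n_rows * n_params * bytes_per_value / 2**30


# 8484 training entries and 44192 free parameters come from the log above.
# n_atoms = 80 is a HYPOTHETICAL supercell size used only for illustration.
estimate = dense_matrix_gib(n_structures=8484, n_atoms=80, n_params=44192)
print(f"dense sensing matrix: ~{estimate:.0f} GiB")
```

Replacing `n_atoms` with the real supercell size gives a quick check of whether a dense fit can fit on a given node at all.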
I tried SPARSE = 1 and also tried running on larger-memory nodes (512 GB), but it still gives me the same error. It seems that the program allocates more than just the sensing matrix. Any suggestions on how to solve this?
I have not encountered this issue before. Perhaps, the error occurs only when the input array length is very large. Could you provide more detailed information including the input files for ALM? They are necessary for identifying the error location. (If you feel reluctant to upload the files, please send them directly to me via email.)
Thanks for the email. I just sent an email to the Gmail address listed on your website. If you don't see my mail, please check your spam folder.
Thank you so much once again.
Abhirup
Abhirup Patra Research Scientist II Delaware Energy Institute University of Delaware, Newark, DE
Thank you for the input files.
I've found that you use SPARSE = 1 combined with LMODEL = enet, but SPARSE = 1 is effective only for ordinary least squares (LMODEL = ols). Please set it as follows:
&optimize
LMODEL = ols
SPARSE = 1
...
The calculation of the 3rd-order force constants has not finished yet, but the bad_alloc error has not appeared so far.
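A small pre-flight check along these lines can catch the mismatch before a long job is submitted. This is only a sketch: the helper names are hypothetical, the set of LMODEL values treated as ordinary least squares is an assumption that should be checked against the ALM documentation for your version, and the parser handles only plain KEY = VALUE entries inside an &optimize ... / block.

```python
import re

# Values assumed to select ordinary least squares; the exact alias set
# accepted by ALM is an ASSUMPTION here, so adjust it to your version.
OLS_ALIASES = {"ols", "ls", "least-squares", "1"}


def parse_optimize_block(text: str) -> dict:
    """Collect KEY = VALUE pairs from the &optimize ... / block."""
    match = re.search(r"&optimize(.*?)^\s*/", text, flags=re.S | re.M | re.I)
    if match is None:
        return {}
    settings = {}
    for raw_line in match.group(1).splitlines():
        line = raw_line.split("#", 1)[0]      # drop trailing comments
        for entry in line.split(";"):         # allow "A = 1; B = 2"
            if "=" in entry:
                key, value = entry.split("=", 1)
                settings[key.strip().upper()] = value.strip()
    return settings


def sparse_setting_is_effective(settings: dict) -> bool:
    """Return False when SPARSE = 1 is combined with a non-OLS LMODEL."""
    if settings.get("SPARSE", "0") != "1":
        return True
    # A missing LMODEL is treated as least squares (assumed default).
    return settings.get("LMODEL", "ols").lower() in OLS_ALIASES


# The combination reported in this thread is flagged as ineffective.
reported = parse_optimize_block("&optimize\n  LMODEL = enet\n  SPARSE = 1\n/\n")
print(sparse_setting_is_effective(reported))
```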
Thanks. I will set LMODEL = ols and give it a try in a bigger memory node.
Hi, I tried SPARSE = 1 with LMODEL = ols. This time I did not get the error message, but after more than 24 hours I still don't see any progress in the calculation.
The calculation using the same inputs finished in ~12 hours using a Xeon node as shown below:
OPTIMIZATION
============
LMODEL = least-squares
Training data file (DFSET) : DFSET_cubic
NSTART = 1; NEND = 8484
8484 entries will be used for training.
Total Number of Parameters : 56005
Total Number of Free Parameters : 44192
Now, start fitting ...
Solve least-squares problem by Eigen SimplicialLDLT.
Residual sum of squares for the solution: 0.534217
Fitting error (%) : 11.5154
Time Elapsed: 41419.6 sec.
-------------------------------------------------------------------
The following files are created:
Force constants in a human-readable format : al2o3_cubic.fcs
Input data for the phonon code ANPHON : al2o3_cubic.xml
Job finished at Thu Nov 25 23:33:02 2021
You may need to wait longer depending on the performance of the CPU chip. The long execution time may be shortened by using another sparse solver, such as Pardiso, instead of Eigen, but I have not implemented it yet.
Thanks. Would it be possible for you to share those two files here? I do not have access to Xeon nodes right now. I will try some other large-memory nodes.
Thanks, Abhirup
I've sent the files to you directly via email.
Thank you so much.