amuse
Brutus worker dies
Hello,
I get the following message: amuse.support.exceptions.CodeException: Exception when calling function 'evolve_model', of code 'BrutusInterface', exception was 'Error in code: no error message - code probably died, sorry.'
Conditions:
- Debian 10
- git commit 13ca566e32ba2fdd78bae0b33f188b6cf250d52b Author: Inti Pelupessy [email protected] Date: Mon Jan 13 18:25:44 2020 +0100
- small model with 10 bodies
- usage of gravity.set_bs_tolerance_string("1e-20") or even gravity.set_bs_tolerance(1e-20)
With bs_tolerance 1e-19 I can vary the word length from 112 to 130, and even eta over a wide range, without a crash. As soon as I use bs_tolerance 1e-20, the worker dies.
Thank you for the support.
@tjardaboekholt could you have a look at this?
Hi, thanks for the error message concerning Brutus.
I managed to do some runs with e=1e-20, and say, Lw=128 and dt_param=0.10, and the code did not crash. If I reduce Lw to 40 bits, then it crashes and gives the same error message you quoted. This is because the number of bits was too low to reach convergence of e=1e-20. If Brutus fails to reach a converged solution within the maximum number of iterations, it will give up and this causes the code to stop. However, for suitable combinations of (e, Lw, dt_param), the code should in principle work fine, i.e. make sure you have enough bits to resolve e.
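The advice to "have enough bits to resolve e" can be checked with a rough lower bound. The sketch below is not taken from Brutus: `min_mantissa_bits`, `word_length_is_plausible` and the `guard_bits` margin are hypothetical names, and the only fact used is that a relative precision e needs about -log2(e) mantissa bits.

```python
import math


def min_mantissa_bits(tolerance, guard_bits=0):
    """Lower bound on the mantissa bits needed to represent a relative
    precision of `tolerance`, plus an optional (hypothetical) safety margin."""
    if not 0.0 < tolerance < 1.0:
        raise ValueError("tolerance must lie strictly between 0 and 1")
    return math.ceil(-math.log2(tolerance)) + guard_bits


def word_length_is_plausible(tolerance, word_length, guard_bits=0):
    """True if word_length reaches the estimated lower bound."""
    return word_length >= min_mantissa_bits(tolerance, guard_bits)


if __name__ == "__main__":
    # Compare the two word lengths mentioned above for e = 1e-20.
    for lw in (40, 128):
        print(lw, word_length_is_plausible(1e-20, lw))
```

By this estimate 40 bits cannot resolve 1e-20 while 128 bits can, which is consistent with the behaviour described above. A real integration needs extra headroom for round-off, so treat the number as a lower bound rather than a recommended setting.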
Cheers!
Hello,
I identified the values you've written as:
gravity.set_bs_tolerance_string("1e-20") # as your "e"
gravity.set_word_length(130) # as your Lw
gravity.set_eta(0.01) # as your dt_param
Are these the wrong parameters? It is not working.
Hello,
Here is the build.log. Maybe something is missing; see the warnings.
Building code: brutus, target: all, in directory: src/amuse/community/brutus
make[1]: Entering directory '/home/pi/amuse/src/amuse/community/brutus'
mpicxx -g -O2 -fPIC -std=c++0x -I../mpfrc++ -I/home/pi/amuse/lib/stopcond -Impfrc++ -I./src -c -o interface.o interface.cc
In file included from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
mpfrc++/mpreal.h: In function ‘const mpfr::mpreal mpfr::root(const mpfr::mpreal&, long unsigned int, mpfr_rnd_t)’:
mpfrc++/mpreal.h:2201:50: warning: ‘int mpfr_root(mpfr_ptr, mpfr_srcptr, long unsigned int, mpfr_rnd_t)’ is deprecated [-Wdeprecated-declarations]
mpfr_root(y.mpfr_ptr(), x.mpfr_srcptr(), k, r);
^
In file included from mpfrc++/mpreal.h:121, from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
/usr/include/mpfr.h:693:21: note: declared here
__MPFR_DECLSPEC int mpfr_root (mpfr_ptr, mpfr_srcptr, unsigned long,
^~~~~~~~~
In file included from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
mpfrc++/mpreal.h:2201:50: warning: ‘int mpfr_root(mpfr_ptr, mpfr_srcptr, long unsigned int, mpfr_rnd_t)’ is deprecated [-Wdeprecated-declarations]
mpfr_root(y.mpfr_ptr(), x.mpfr_srcptr(), k, r);
^
In file included from mpfrc++/mpreal.h:121, from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
/usr/include/mpfr.h:693:21: note: declared here
__MPFR_DECLSPEC int mpfr_root (mpfr_ptr, mpfr_srcptr, unsigned long,
^~~~~~~~~
In file included from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
mpfrc++/mpreal.h: In function ‘const mpfr::mpreal mpfr::grandom(__gmp_randstate_struct (&)[1], mpfr_rnd_t)’:
mpfrc++/mpreal.h:2646:53: warning: ‘int mpfr_grandom(mpfr_ptr, mpfr_ptr, __gmp_randstate_struct*, mpfr_rnd_t)’ is deprecated [-Wdeprecated-declarations]
mpfr_grandom(x.mpfr_ptr(), NULL, state, rnd_mode);
^
In file included from mpfrc++/mpreal.h:121, from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
/usr/include/mpfr.h:502:21: note: declared here
__MPFR_DECLSPEC int mpfr_grandom (mpfr_ptr, mpfr_ptr, gmp_randstate_t,
^~~~~~~~~~~~
In file included from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
mpfrc++/mpreal.h:2646:53: warning: ‘int mpfr_grandom(mpfr_ptr, mpfr_ptr, __gmp_randstate_struct*, mpfr_rnd_t)’ is deprecated [-Wdeprecated-declarations]
mpfr_grandom(x.mpfr_ptr(), NULL, state, rnd_mode);
^
In file included from mpfrc++/mpreal.h:121, from ./src/Star.h:6, from ./src/Brutus.h:1, from interface.cc:12:
/usr/include/mpfr.h:502:21: note: declared here
__MPFR_DECLSPEC int mpfr_grandom (mpfr_ptr, mpfr_ptr, gmp_randstate_t,
^~~~~~~~~~~~
mpicxx -g -O2 -fPIC -std=c++0x -I../mpfrc++ -I/home/pi/amuse/lib/stopcond -I./src worker_code.cc src/libbrutus.a interface.o -o brutus_worker -L./src -lbrutus -L/home/pi/amuse/lib/stopcond -lstopcond -lmpfr -lgmp -lgmp
make[1]: Leaving directory '/home/pi/amuse/src/amuse/community/brutus'
Hello, http://gcc.gnu.org/onlinedocs/gcc-3.4.6/gcc/Deprecated-Features.html indicates that the code uses functionality which is no longer supported. Can you update the code to current syntax?
Hello,
I identified the values you've written as:
gravity.set_bs_tolerance_string("1e-20") # as your "e"
gravity.set_word_length(130) # as your Lw
gravity.set_eta(0.01) # as your dt_param
Are these the wrong parameters? It is not working.
Yes that is correct. Another way to set the parameters is:
code = Brutus()
code.parameters.bs_tolerance = "1e-20"
code.parameters.word_length = 128
code.parameters.dt_param = 0.10
print(code.parameters) # to check values are correctly set
Just to add to that: the latter way @tjardaboekholt mentioned is the preferred method.
Hello,
Thank you for showing me the preferred usage, but it has no effect. The problem is the "deprecated" warnings in the build.log. Please have a look at the result using your preferred method:
begin_time: 0.0 s default: 0.0 s
brutus_output_directory: /home/tst/amuse/data/brutus/output/ default: ./
bs_tolerance: 1e-20 default: 1e-08
dt_param: 0.1 default: 0.24
stopping_condition_maximum_density: 2.55293255306e+306 m**-3 * kg default: -0.0142011587158 m**-3 * kg
stopping_condition_maximum_internal_energy: inf m2 * s-2 default: -2558461176.91 m2 * s-2
stopping_condition_minimum_density: -0.0142011587158 m**-3 * kg default: -0.0142011587158 m**-3 * kg
stopping_condition_minimum_internal_energy: -2558461176.91 m2 * s-2 default: -2558461176.91 m2 * s-2
stopping_conditions_number_of_steps: 1 default: 1.0
stopping_conditions_out_of_box_size: 0.0 m default: 0.0 m
stopping_conditions_out_of_box_use_center_of_mass: 0 default: False
stopping_conditions_timeout: 4.0 s default: 4.0 s
timestep: 102715.479587 s default: 719008.357111 s
word_length: 128 default: 72
0.0 s
/home/pi/amuse/src/amuse/units/generic_unit_converter.py:189: RuntimeWarning: overflow encountered in double_scalars
  return new_quantity(number * factor, new_unit)
Traceback (most recent call last):
File "
  File "/usr/lib/python3/dist-packages/spyder_kernels/customize/spydercustomize.py", line 678, in runfile
    execfile(filename, namespace)
  File "/usr/lib/python3/dist-packages/spyder_kernels/customize/spydercustomize.py", line 106, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "/home/pi/tests/solar1.py", line 124, in
  File "/home/pi/tests/solar1.py", line 103, in gravity_minimal
    gravity.evolve_model(gravity.model_time + (10| units.day))
  File "/home/pi/amuse/src/amuse/support/methods.py", line 167, in __call__
    result = self.method(*list_arguments, **keyword_arguments)
  File "/home/pi/amuse/src/amuse/support/methods.py", line 167, in __call__
    result = self.method(*list_arguments, **keyword_arguments)
  File "/home/pi/amuse/src/amuse/support/methods.py", line 167, in __call__
    result = self.method(*list_arguments, **keyword_arguments)
  File "/home/pi/amuse/src/amuse/support/methods.py", line 266, in __call__
    return self.method(*list_arguments, **keyword_arguments)
  File "/home/pi/amuse/src/amuse/rfi/core.py", line 123, in __call__
    raise exceptions.CodeException("Exception when calling function '{0}', of code '{1}', exception was '{2}'".format(self.specification.name, type(self.interface).__name__, ex))
CodeException: Exception when calling function 'evolve_model', of code 'BrutusInterface', exception was 'lost connection to code'
For contrast, here is a run that works:
begin_time: 0.0 s default: 0.0 s
brutus_output_directory: /home/pi/amuse/data/brutus/output/ default: ./
bs_tolerance: 1e-19 default: 1e-08
dt_param: 0.01 default: 0.24
stopping_condition_maximum_density: 2.55293255306e+306 m**-3 * kg default: -0.0142011587158 m**-3 * kg
stopping_condition_maximum_internal_energy: inf m2 * s-2 default: -2558461176.91 m2 * s-2
stopping_condition_minimum_density: -0.0142011587158 m**-3 * kg default: -0.0142011587158 m**-3 * kg
stopping_condition_minimum_internal_energy: -2558461176.91 m2 * s-2 default: -2558461176.91 m2 * s-2
stopping_conditions_number_of_steps: 1 default: 1.0
stopping_conditions_out_of_box_size: 0.0 m default: 0.0 m
stopping_conditions_out_of_box_use_center_of_mass: 0 default: False
stopping_conditions_timeout: 4.0 s default: 4.0 s
timestep: 10271.5479587 s default: 719008.357111 s
word_length: 128 default: 72
0.0 s
/home/pi/amuse/src/amuse/units/generic_unit_converter.py:189: RuntimeWarning: overflow encountered in double_scalars
  return new_quantity(number * factor, new_unit)
864000.0 s
Hello, can you confirm that the issue is related to the following?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91226
https://github.com/BrianGladman/mpfr/blob/master/tests/tget_set_d64.c
/* The volatile below avoids _Decimal64 constant propagation, which is buggy for non-canonical encoding in various GCC versions on the x86 and x86_64 targets: failure with gcc (Debian 20190719-1) 10.0.0 20190718 (experimental) [trunk revision 273586]; the MPFR test was not failing with previous GCC versions, but GCC versions 5 to 9 are also affected on the simple testcase at: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91226 */
I have updated mpfr c++ to 3.6.6; this should get rid of the deprecation warnings. Can you try it out? (It is only updated in the GitHub master; there is no release on PyPI yet.)
If you still get the error, can you post a minimal example which triggers it?
the above comment was for @GFTwrt ;-)
By the way, thanks for bringing this up (I had not noticed mpfr c++ was updated; the website still lists the 2015 version as the latest).
Hello @ipelupessy,
Thank you for your action. Unfortunately it was not the solution :-( . Please have a look at my last comment (gcc bug).
Attached are the simple model file "solar1" (constellation of planets from the AMUSE book) and the build log. 1e-19 runs; 1e-20 fails.
Building code: brutus, target: all, in directory: src/amuse/community/brutus
make[1]: Entering directory '/home/tom/testam/src/amuse/community/brutus'
/home/tom/testam/build.py --type=c interface.py BrutusInterface -o worker_code.cc
/home/tom/testam/build.py --type=H -i amuse.support.codes.stopping_conditions.StoppingConditionInterface interface.py BrutusInterface -o worker_code.h
make -C src all CXXFLAGS="-g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++"
make[2]: Entering directory '/home/tom/testam/src/amuse/community/brutus/src'
g++ -O1 -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -c Star.cpp
g++ -O1 -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -c Cluster.cpp
g++ -O1 -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -c Bulirsch_Stoer.cpp
g++ -O1 -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -c Brutus.cpp
g++ -O1 -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -c main.cpp
rm -f libbrutus.a
ar crs libbrutus.a main.o Brutus.o Bulirsch_Stoer.o Cluster.o Star.o
ranlib libbrutus.a
make[2]: Leaving directory '/home/tom/testam/src/amuse/community/brutus/src'
mpicxx -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -I./mpfrc++ -I/home/tom/testam/lib/stopcond -Impfrc++ -I./src -c -o interface.o interface.cc
mpicxx -g -O2 -fPIC -I./mpfrc++ -std=c++0x -I../mpfrc++ -I./mpfrc++ -I/home/tom/testam/lib/stopcond -I./src worker_code.cc src/libbrutus.a interface.o -o brutus_worker -L./src -lbrutus -L/home/tom/testam/lib/stopcond -lstopcond -L/usr/lib/x86_64-linux-gnu/ -lmpfr -L/usr/lib/x86_64-linux-gnu/ -lgmp
make[1]: Leaving directory '/home/tom/testam/src/amuse/community/brutus'
There is an error in the state model for Brutus. The script will work if the parameters are set before the particles are added; I think if you do it the other way round ~~the changes in the word_length are not propagated to the integrator~~ the derived eta is no longer updated, hence the failure to converge! So the script can be made to work by:
...
gravity = Brutus(convert_nbody,number_of_workers=1)
gravity.parameters.bs_tolerance = 1e-20
gravity.parameters.word_length = 128
gravity.parameters.dt_param = 0.010
print(gravity.parameters) # to check values are correctly set
gravity.particles.add_particles(bodies)
...
but I will try to fix the state model, because the ordering should not matter...
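Until the state model is fixed, the ordering can be enforced in one place. The helper below is a minimal sketch and not part of AMUSE: `configure_then_load` is a hypothetical name, and it only assumes that the code object exposes `.parameters` and `.particles.add_particles`, as in the snippet above.

```python
def configure_then_load(code, particles, **parameters):
    """Set every parameter on code.parameters first, then add the particles.

    `code` is expected to behave like an AMUSE gravity code; this helper
    is hypothetical and only fixes the order of the two steps.
    """
    for name, value in parameters.items():
        setattr(code.parameters, name, value)
    # Particles are added only after all parameters have been set.
    code.particles.add_particles(particles)
    return code


# Intended use with the objects from the script above (not executed here):
# gravity = configure_then_load(
#     Brutus(convert_nbody, number_of_workers=1),
#     bodies,
#     bs_tolerance=1e-20,
#     word_length=128,
#     dt_param=0.010,
# )
```

Keyword arguments keep their order, so the parameters are applied in the order they are written in the call.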
hmm my explanation above was not entirely correct..
@tjardaboekholt I think the problem is in set_eta(tolerance): it is called in setup, which is called in commit_particles. According to the state model of gravitational_dynamics, commit_particles is also triggered when changing the parameters after adding particles. We could fix this by moving the setup, or by adding a set_eta call to the setter of the tolerance?
I confirm that setting the parameters before giving the particles to Brutus makes the script run, so please proceed with this temporary fix. Also, the current version of Brutus adapts eta to the value of epsilon that is given. In principle this should be OK, as you can then focus on just two parameters (epsilon, word length). Meanwhile I plan to update the Brutus version in AMUSE soon, together with a fix for this issue. Many thanks for pointing this out to us.
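The kind of failure discussed here, a derived value that is computed once in a setup step and then goes stale, can be shown with a toy class. This is not Brutus source: `ToyIntegrator`, `derive_eta` and its formula are invented for illustration, and the real call order inside Brutus may differ.

```python
class ToyIntegrator:
    """Toy code whose step-size factor eta is derived from the tolerance."""

    def __init__(self, refresh_on_set=False):
        self._tolerance = 1e-8
        self.eta = None  # derived value, filled in by setup()
        self.refresh_on_set = refresh_on_set

    @staticmethod
    def derive_eta(tolerance):
        # Made-up monotonic relation, for illustration only.
        return tolerance ** 0.1

    def setup(self):
        # Derive eta from whatever tolerance is stored right now.
        self.eta = self.derive_eta(self._tolerance)

    @property
    def tolerance(self):
        return self._tolerance

    @tolerance.setter
    def tolerance(self, value):
        self._tolerance = value
        # Proposed style of fix: re-derive eta inside the setter.
        if self.refresh_on_set and self.eta is not None:
            self.setup()


if __name__ == "__main__":
    stale = ToyIntegrator()
    stale.setup()
    stale.tolerance = 1e-20  # eta still corresponds to 1e-8
    fixed = ToyIntegrator(refresh_on_set=True)
    fixed.setup()
    fixed.tolerance = 1e-20  # eta is re-derived for 1e-20
    print(stale.eta, fixed.eta)
```

With the refresh in the setter, the order of "set tolerance" and "run setup" no longer matters, which is the property the state model is supposed to guarantee.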
Thank you for your support. The model is evolving now. Let's have a look at the result.
@tjardaboekholt: May I add a request, if you are updating Brutus anyway? As I mentioned in my first post to amuse on GitHub, I want to simulate our solar system including the solar wind. I expect to need a resolution in energy conservation better than 1 mW (milliwatt). At the moment the interface between the code and Python cannot carry this accuracy. Could you add a string-based interface that provides a number (the energy difference between two freely chosen timesteps, divided by the time difference) as well as the particle data? It would be very nice to have such an interface.
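Why a string-based interface matters can be seen without AMUSE at all. The sketch below uses only Python's standard `decimal` module; `energy_rate` is a hypothetical helper, the numbers are arbitrary, and it merely illustrates that a difference at the 1e-20 level disappears in double precision but survives when values travel as strings.

```python
from decimal import Decimal


def energy_rate(e_start, e_end, dt):
    """(E_end - E_start) / dt, computed in decimal arithmetic.

    All three arguments are strings, so no digits are lost on the way in.
    """
    return (Decimal(e_end) - Decimal(e_start)) / Decimal(dt)


if __name__ == "__main__":
    e0 = "1.00000000000000000000"
    e1 = "1.00000000000000000001"
    # In double precision both strings parse to the same float.
    print(float(e1) - float(e0))
    # As decimal strings the difference is preserved.
    print(energy_rate(e0, e1, "1"))
```

The default `decimal` context keeps 28 significant digits; a real interface would need a precision matched to the word length used by the integrator.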
dear GFT,
In principle, you can do that already by converting your 1 mW (which is basically a requirement on energy conservation) to a tolerance. The tolerance is then the inverse of the total binding energy of the Solar System expressed in units of 1 mW, i.e. the 1 mW budget as a fraction of that binding energy. Sounds like you are performing an interesting experiment.
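One possible reading of this conversion, written out as a sketch: a power budget only becomes an energy once it is multiplied by a time interval, and the relative tolerance is that energy divided by the total energy of the system. The function name and all inputs are assumptions for illustration; the numbers in the example are placeholders, not Solar System values.

```python
def relative_tolerance_from_power(power_watt, interval_s, total_energy_joule):
    """Relative energy error that corresponds to a power budget.

    power_watt * interval_s is the allowed energy drift over the interval;
    dividing by the magnitude of the total energy makes it dimensionless.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if total_energy_joule == 0:
        raise ValueError("total_energy_joule must be non-zero")
    return abs(power_watt * interval_s / total_energy_joule)


if __name__ == "__main__":
    # Placeholder inputs: 1 mW over 1000 s against a total energy of -1e35 J.
    print(relative_tolerance_from_power(1e-3, 1000.0, -1e35))
```

The result is dimensionless, and a tolerance this small would in turn require a correspondingly large word length.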
Simon
@spzwart: Thank you, Simon. I was not sure how to interpret eta (tolerance/bs_tolerance) from your paper; I was not sure whether it refers to potential or energy. So the unit of the tolerance (eta) is (1/(energy/power)) and therefore time?
@tjardaboekholt: if you add the get_
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 28 days if no further activity occurs. Thank you for your contributions.
The original issue seems solved, but I am not sure if the later brutus fixes proposed have been implemented; @tjardaboekholt there is mention of a new brutus version: has that been merged? Also note the full string interface functions should be checked?
Hi Inti, thanks for the reminder. The student Arend Moerman has implemented PN terms into Brutus. I will also check his Amuse interface/string treatment. I will work on merging this into Amuse as soon as I have some time!
Hello, maybe you should have a look at https://github.com/GFTwrt/amuse/tree/master/src/amuse/community/gpuhermite8 too. Thomas PS: The interface is the same as in https://github.com/GFTwrt/amuse/tree/master/src/amuse/community/brutus