BOUT-dev
BOUT-dev copied to clipboard
Pvcode + Cvode improvements
- Exposes more options to the user.
- Allows to dump if the solver fails. This can be useful for figuring out why the solver is slow. Currently only for pvode, but should also be possible to implement this for cvode (for cvode we can use a documented API)
My worry here is the tracking storing a lot of data if it's storing the full Field3D
on every single change. Should it also be a compile-time option so that it completely disappears for production runs?
My worry here is the tracking storing a lot of data if it's storing the full
Field3D
on every single change. Should it also be a compile-time option so that it completely disappears for production runs?
It could be, but I don't think it is needed. I would personally keep it enabled for production runs, as it has a runtime switch, that is disabled by default. And I would also keep it like this, at least by default.
If a simulation crashes, it can be very useful to immediately get why it failed (this is currently the only use of the name tracking). Having to restart + queue again, just to get to the point where it crashes can be very time consuming, and the cost should be really negligable, if disabled.
I did notice a slow down in my simulations with this enabled, but that was when it was enabled all the time and the names did not get reset, thus the names have been growing exponentially.
On 3/19/24 16:34, Peter Hill wrote:
@.**** commented on this pull request.
In include/bout/field.hxx https://github.com/boutproject/BOUT-dev/pull/2889#discussion_r1530625642:
@@ -683,4 +683,12 @@ inline T floor(const T& var, BoutReal f, const std::string& rgn = "RGN_ALL") {
#undef FIELD_FUNC
+template <typename T, typename = bout::utils::EnableIfField<T>, class... Types> +inline T setName(T&& f, const std::string& name, Types... args) {
I think most other setters like this are member functions
But then they return either Field, or need to be overwritten.
The cuda build fails with:
/__w/BOUT-dev/BOUT-dev/src/field/field3d.cxx: In member function 'Options* Field3D::track(const T&, std::string)':
/__w/BOUT-dev/BOUT-dev/src/field/field3d.cxx:851:63: error: expected ';' before '}' token
851 | (*tracking)[outname].setAttributes({
| ^
| ;
/__w/BOUT-dev/BOUT-dev/src/field/field3d.cxx:851:64: error: expected primary-expression before ',' token
851 | (*tracking)[outname].setAttributes({
| ^
/__w/BOUT-dev/BOUT-dev/src/field/field3d.cxx:851:66: error: expected primary-expression before '{' token
851 | (*tracking)[outname].setAttributes({
| ^
/__w/BOUT-dev/BOUT-dev/src/field/field3d.cxx:851:95: error: invalid use of void expression
851 | (*tracking)[outname].setAttributes({
| ^
[ 16%] Building CUDA object CMakeFiles/bout++.dir/src/field/field_data.cxx.o
The relevant code is:
(*tracking)[outname].setAttributes({
{"operation", operation},
#if BOUT_USE_TRACK
{"rhs.name", change.name},
#endif
});
Is gcc9.4 not able to use setAttributes?
It could be, but I don't think it is needed. I would personally keep it enabled for production runs, as it has a runtime switch, that is disabled by default. And I would also keep it like this, at least by default.
We do try to completely remove debugging features for production runs, see TRACE
for example. Being able to remove the function call entirely at lower CHECK
levels would make me much happier about this.
If a simulation crashes, it can be very useful to immediately get why it failed (this is currently the only use of the name tracking). Having to restart + queue again, just to get to the point where it crashes can be very time consuming, and the cost should be really negligable, if disabled.
This is true, but I would really like to understand the costs. It looks like it's storing potentially hundreds or thousands of copies of full fields, so although it looks like a really interesting and useful feature, it would be good to have a good idea of the costs.
I did notice a slow down in my simulations with this enabled, but that was when it was enabled all the time and the names did not get reset, thus the names have been growing exponentially.
Is that with the current form?
But then they return either Field, or need to be overwritten.
It should almost certainly return Field&
and then this wouldn't be an issue
It could be, but I don't think it is needed. I would personally keep it enabled for production runs, as it has a runtime switch, that is disabled by default. And I would also keep it like this, at least by default.
We do try to completely remove debugging features for production runs, see
TRACE
for example. Being able to remove the function call entirely at lowerCHECK
levels would make me much happier about this.
If at all, I would be in favor of adding a new configure flag. I use this in runs with CHECK=0.
If a simulation crashes, it can be very useful to immediately get why it failed (this is currently the only use of the name tracking). Having to restart + queue again, just to get to the point where it crashes can be very time consuming, and the cost should be really negligable, if disabled.
This is true, but I would really like to understand the costs. It looks like it's storing potentially hundreds or thousands of copies of full fields, so although it looks like a really interesting and useful feature, it would be good to have a good idea of the costs.
It is only storing fields, if the solver has failed. Unless that happens, no additional storage is used.
There are some calls, but they are all no-ops, if you think that might be a problem, I can certainly guard them with #if
statements, but I think that should be orthogonal to any of the flags we have.
In the case the solver crashes, I think the cost is really negligible. This is only done once, if the solver decided it cannot continue. For hermes-2, the dump file contains 53 Field3D
s, while the normal dump file contains 45 Field3D
s.
And part of that is just the metric, so I guess the overhead is less then 100% of the storage requirement of the run.
But if BOUT++ runs out of memory here, it would not be much worse then if not - the simulation finishes anyway.
I did notice a slow down in my simulations with this enabled, but that was when it was enabled all the time and the names did not get reset, thus the names have been growing exponentially.
Is that with the current form?
No, currently it is not noticeable. I activate it for all my runs.
But then they return either Field, or need to be overwritten.
It should almost certainly return
Field&
and then this wouldn't be an issue
That is exactly what I tried:
diff --git a/include/bout/field.hxx b/include/bout/field.hxx
index 04035f5b7..488aecaab 100644
--- a/include/bout/field.hxx
+++ b/include/bout/field.hxx
@@ -84,6 +84,14 @@ public:
return *this;
}
+ template <class... Types>
+ inline Field& setName(const std::string& name, Types... args) {
+#if BOUT_USE_TRACK
+ this->name = fmt::format(name, args...);
+#endif
+ return *this;
+ }
+
std::string name;
#if CHECK > 0
@@ -683,19 +691,4 @@ inline T floor(const T& var, BoutReal f, const std::string& rgn = "RGN_ALL") {
#undef FIELD_FUNC
-template <typename T, typename = bout::utils::EnableIfField<T>, class... Types>
-inline void setName(T& f, const std::string& name, Types... args) {
-#if BOUT_USE_TRACK
- f.name = fmt::format(name, args...);
-#endif
-}
-
-template <typename T, typename = bout::utils::EnableIfField<T>, class... Types>
-inline T setName(T&& f, const std::string& name, Types... args) {
-#if BOUT_USE_TRACK
- f.name = fmt::format(name, args...);
-#endif
- return f;
-}
-
#endif /* FIELD_H */
diff --git a/src/mesh/coordinates.cxx b/src/mesh/coordinates.cxx
index 32774d622..643e148b1 100644
--- a/src/mesh/coordinates.cxx
+++ b/src/mesh/coordinates.cxx
@@ -1542,7 +1542,7 @@ Field3D Coordinates::Grad_par(const Field3D& var, CELL_LOC outloc,
TRACE("Coordinates::Grad_par( Field3D )");
ASSERT1(location == outloc || outloc == CELL_DEFAULT);
- return setName(::DDY(var, outloc, method) * invSg(), "Grad_par({:s})", var.name);
+ return (::DDY(var, outloc, method) * invSg()).setName("Grad_par({:s})", var.name);
}
/////////////////////////////////////////////////////////
@@ -1601,7 +1601,7 @@ Field3D Coordinates::Div_par(const Field3D& f, CELL_LOC outloc,
f_B.yup(i) = f.yup(i) / Bxy_floc.yup(i);
f_B.ydown(i) = f.ydown(i) / Bxy_floc.ydown(i);
}
- return setName(Bxy * Grad_par(f_B, outloc, method), "Div_par({:s})", f.name);
+ return (Bxy * Grad_par(f_B, outloc, method)).setName("Div_par({:s})", f.name);
}
/////////////////////////////////////////////////////////
diff --git a/src/solver/impls/pvode/pvode.cxx b/src/solver/impls/pvode/pvode.cxx
index db28f64d8..c36985ae0 100644
--- a/src/solver/impls/pvode/pvode.cxx
+++ b/src/solver/impls/pvode/pvode.cxx
@@ -376,7 +376,7 @@ BoutReal PvodeSolver::run(BoutReal tout) {
for (auto& f : f3d) {
f.F_var->enableTracking(fmt::format("ddt_{:s}", f.name), debug);
- setName(*f.var, f.name);
+ f.var->setName(f.name);
}
run_rhs(simtime);
gcc14 complains with:
/u/dave/soft/BOUT-dev/merge/src/mesh/coordinates.cxx: In member function ‘Field3D Coordinates::Grad_par(const Field3D&, CELL_LOC, const std::string&)’: /u/dave/soft/BOUT-dev/merge/src/mesh/coordinates.cxx:1545:56: error: could not convert ‘operator*(const Field3D&, const Field2D&)((* &((Coordinates*)this)->Coordinates::invSg())).Field3D::Field.Field::setName<std::__cxx11::basic_string<char, std::char_traits<char>, std:
:allocator<char> > >(std::__cxx11::basic_string<char>(((const char*)"Grad_par({:s})"), std::allocator<char>()), std::__cxx11::basic_string<char>(var.Field3D::Field.Field::name))’ from ‘Field’ to ‘Field3D’
1545 | return (::DDY(var, outloc, method) * invSg()).setName("Grad_par({:s})", var.name);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| Field
/u/dave/soft/BOUT-dev/merge/src/mesh/coordinates.cxx: In member function ‘Field3D Coordinates::Div_par(const Field3D&, CELL_LOC, const std::string&)’:
/u/dave/soft/BOUT-dev/merge/src/mesh/coordinates.cxx:1604:55: error: could not convert ‘operator*(const Field2D&, const Field3D&)(Coordinates::Grad_par(const Field3D&, CELL_LOC, const std::string&)(f_B, outloc, (* & method))).Field3D::Field.Field::setName<std::__cxx11::
basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char>(((const char*)"Div_par({:s})"), std::allocator<char>()), std::__cxx11::basic_string<char>(f.Field3D::Field.Field::name))’ from ‘Field’ to ‘Field3D’
1604 | return (Bxy * Grad_par(f_B, outloc, method)).setName("Div_par({:s})", f.name); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
| |
| Field
@dschwoerer After discussing this with @bendudson, we're quite concerned about the potential overhead of the temporary field dumping mechanism, both in terms of memory and performance, for what seems like a small benefit when debugging. We'd really like to see some performance analysis and scaling to see it doesn't have an affect when disabled. I'm also worried that we'd need to see setName
everywhere in the code for this to be really useful.
Could something similar not be achieved with a custom monitor instead?
Exposing more CVODE options to the user seems very straightforward and useful, so maybe that could be split out?
@dschwoerer After discussing this with @bendudson, we're quite concerned about the potential overhead of the temporary field dumping mechanism, both in terms of memory and performance, for what seems like a small benefit when debugging.
But the memory overhead is only if the solver fails. Do you still think at that point that is an issue? I know of @totork using BOUT++ in a inner-loop of an optimization script, where just failing vs crashing would matter. Thus having an option to disable would make sense.
We'd really like to see some performance analysis and scaling to see it doesn't have an affect when disabled.
I can certainly run blob2d for this branch and next. But I am certain there are no significant differences, thus I haven't done it. Would you like to see something else? Would just the BOUT++ internal timings sufficient? Would you like mpi runs?
I'm also worried that we'd need to see
setName
everywhere in the code for this to be really useful.
Sure, having them makes things more easily readable. But that is optional, and I can certainly add some if you think this is worth merging an wanted before merging. It would make BOUT_USE_TRACK
also more useful. Right now
Could something similar not be achieved with a custom monitor instead?
I would not know how. If you can explain how to, I am happy to change the design.
Exposing more CVODE options to the user seems very straightforward and useful, so maybe that could be split out?
I can split them out :+1:
Ok, I think we understand this a bit better now. I'd still be much happier if this was a macro and only turned on for CHECK > 0
. I'm quite worried about this proliferating everywhere and performance dying a death by a thousand cuts.
Ok, I think we understand this a bit better now. I'd still be much happier if this was a macro and only turned on for
CHECK > 0
. I'm quite worried about this proliferating everywhere and performance dying a death by a thousand cuts.
That is a fair point. I do understand that worry, but at the same time I am convinced it is orthogonal to CHECK=0
.
I am happy to add a new configure option. -DBOUT_EXTRA_DEBUG_ON_FAILURE
? I guess it make sense to have it disabled by default, as unless you know about the option and will look at the debug files, it is rather useless.