Trixi.jl
Proof of concept: TrixiMPIArray
This is a rough draft of a possible MPI array type. A lot of TODO notes are left in the draft at the moment.
Partially implemented in a reduced version (only `ode_norm` and `ode_unstable_check`) in #1113. We will use this reduced version for now and see how it works in the wild.
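For readers unfamiliar with why a custom norm is needed at all, here is a minimal sketch, assuming the norm of interest is a root-mean-square over all entries of the solution vector (which, as far as I know, is what OrdinaryDiffEq.jl uses by default for arrays). It emulates the ranks serially, so it runs without MPI.jl; the names `local_contribution`, `emulated_allreduce`, and `rms_norm_distributed` are hypothetical and not part of Trixi.jl.

```julia
# Serial emulation of a rank-aware RMS norm; no MPI.jl is needed to run this.
# Each entry of `chunks` stands in for the part of the solution owned by one rank.

# Local contribution of one rank: sum of squares and number of local entries.
local_contribution(u_local) = (sum(abs2, u_local), length(u_local))

# Stand-in for an all-reduce with `+` (assumption: a real implementation would
# use an MPI reduction here so that every rank obtains the same global totals).
function emulated_allreduce(contributions)
    total_sumabs2 = sum(first, contributions)
    total_length = sum(last, contributions)
    return total_sumabs2, total_length
end

# Hypothetical helper: combine the local pieces into one global RMS norm.
function rms_norm_distributed(chunks)
    sumabs2, n = emulated_allreduce(map(local_contribution, chunks))
    return sqrt(sumabs2 / n)
end

u_global = collect(1.0:10.0)
chunks = [u_global[1:3], u_global[4:10]]  # uneven split across two "ranks"

norm_serial = sqrt(sum(abs2, u_global) / length(u_global))
norm_distributed = rms_norm_distributed(chunks)

# Averaging the per-rank norms instead does not reproduce the serial value.
norm_naive = sum(c -> sqrt(sum(abs2, c) / length(c)), chunks) / length(chunks)

println("serial:      ", norm_serial)
println("distributed: ", norm_distributed)
println("naive mean:  ", norm_naive)
```

The point of the sketch is only that both the sum of squares and the number of entries have to be reduced across ranks; reducing finished per-rank norms would make the step size control depend on the domain decomposition.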
TODO:
- [ ] Local reductions (`sum`) - to docstring or test whether we could also just use `localmapreduce` and parallel `ode_norm`?
- [ ] Check step rejections
- [ ] Check some complex setups (MPI shock capturing does not use alpha smoothing! but everything else should work, incl. AMR)
- [ ] Maybe performance of serial vs. one MPI rank (needs some hacks, `mpi_parallel` and `mpi_isparallel`)
Closes #329; closes #339
Codecov Report
Merging #1104 (cdcf828) into main (1b604a6) will increase coverage by 0.00%. The diff coverage is 98.81%.
@@ Coverage Diff @@
## main #1104 +/- ##
=======================================
Coverage 96.75% 96.75%
=======================================
Files 303 305 +2
Lines 23876 23931 +55
=======================================
+ Hits 23099 23153 +54
- Misses 777 778 +1
Flag | Coverage Δ | |
---|---|---|
unittests | 96.75% <98.81%> (+<0.01%) | :arrow_up: |
Impacted Files | Coverage Δ | |
---|---|---|
src/Trixi.jl | 66.67% <ø> (ø) | |
src/callbacks_step/save_restart_dg.jl | 89.36% <ø> (ø) | |
src/callbacks_step/save_solution_dg.jl | 95.89% <ø> (ø) | |
src/auxiliary/mpi_arrays.jl | 97.92% <97.92%> (ø) | |
src/callbacks_step/amr.jl | 97.07% <100.00%> (ø) | |
src/callbacks_step/analysis_dg2d_parallel.jl | 100.00% <100.00%> (ø) | |
src/callbacks_step/stepsize_dg2d.jl | 100.00% <100.00%> (ø) | |
src/callbacks_step/stepsize_dg3d.jl | 100.00% <100.00%> (ø) | |
src/callbacks_step/time_series_dg2d.jl | 100.00% <100.00%> (ø) | |
src/meshes/meshes.jl | 100.00% <100.00%> (ø) | |
... and 5 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 1b604a6...cdcf828.
Some results from 987407e8
julia --check-bounds=no --threads=2
julia> trixi_include("examples/tree_2d_dgsem/elixir_euler_ec.jl", tspan=(0.0, 10.0))
julia> sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false), dt=1.0, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 2.73s / 90.4% 23.5MiB / 97.3%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 4.24k 2.36s 95.9% 558μs 7.57MiB 33.1% 1.83KiB
volume integral 4.24k 1.94s 78.6% 458μs 1.16MiB 5.1% 288B
interface flux 4.24k 251ms 10.2% 59.2μs 1.62MiB 7.1% 400B
prolong2interfaces 4.24k 58.2ms 2.4% 13.7μs 0.97MiB 4.2% 240B
surface integral 4.24k 56.3ms 2.3% 13.3μs 1.23MiB 5.4% 304B
reset ∂u/∂t 4.24k 28.3ms 1.1% 6.68μs 0.00B 0.0% 0.00B
Jacobian 4.24k 22.3ms 0.9% 5.27μs 1.10MiB 4.8% 272B
~rhs!~ 4.24k 8.06ms 0.3% 1.90μs 1.50MiB 6.5% 370B
prolong2boundaries 4.24k 251μs 0.0% 59.2ns 0.00B 0.0% 0.00B
prolong2mortars 4.24k 177μs 0.0% 41.7ns 0.00B 0.0% 0.00B
mortar flux 4.24k 145μs 0.0% 34.3ns 0.00B 0.0% 0.00B
source terms 4.24k 91.7μs 0.0% 21.6ns 0.00B 0.0% 0.00B
boundary flux 4.24k 87.0μs 0.0% 20.5ns 0.00B 0.0% 0.00B
calculate dt 848 50.1ms 2.0% 59.0μs 0.00B 0.0% 0.00B
analyze solution 10 30.6ms 1.2% 3.06ms 174KiB 0.7% 17.4KiB
I/O 11 20.9ms 0.8% 1.90ms 15.1MiB 66.1% 1.38MiB
save solution 10 20.7ms 0.8% 2.07ms 15.1MiB 66.0% 1.51MiB
get element variables 10 97.2μs 0.0% 9.72μs 20.6KiB 0.1% 2.06KiB
~I/O~ 11 26.0μs 0.0% 2.37μs 7.20KiB 0.0% 671B
save mesh 10 785ns 0.0% 78.5ns 0.00B 0.0% 0.00B
────────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL35(), abstol=1.0e-4, reltol=1.0e-4, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 1.43s / 81.7% 15.5MiB / 86.5%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 2.35k 1.14s 97.7% 487μs 4.20MiB 31.4% 1.83KiB
volume integral 2.35k 924ms 79.0% 394μs 660KiB 4.8% 288B
interface flux 2.35k 121ms 10.3% 51.3μs 917KiB 6.7% 400B
prolong2interfaces 2.35k 32.3ms 2.8% 13.7μs 550KiB 4.0% 240B
surface integral 2.35k 30.9ms 2.6% 13.1μs 697KiB 5.1% 304B
reset ∂u/∂t 2.35k 17.4ms 1.5% 7.42μs 0.00B 0.0% 0.00B
Jacobian 2.35k 12.9ms 1.1% 5.51μs 624KiB 4.6% 272B
~rhs!~ 2.35k 4.41ms 0.4% 1.88μs 853KiB 6.2% 372B
prolong2boundaries 2.35k 158μs 0.0% 67.3ns 0.00B 0.0% 0.00B
prolong2mortars 2.35k 104μs 0.0% 44.2ns 0.00B 0.0% 0.00B
mortar flux 2.35k 79.6μs 0.0% 33.9ns 0.00B 0.0% 0.00B
source terms 2.35k 54.2μs 0.0% 23.1ns 0.00B 0.0% 0.00B
boundary flux 2.35k 50.1μs 0.0% 21.3ns 0.00B 0.0% 0.00B
analyze solution 6 18.3ms 1.6% 3.05ms 105KiB 0.8% 17.5KiB
I/O 7 9.09ms 0.8% 1.30ms 9.08MiB 67.8% 1.30MiB
save solution 6 9.00ms 0.8% 1.50ms 9.06MiB 67.7% 1.51MiB
get element variables 6 73.3μs 0.0% 12.2μs 12.4KiB 0.1% 2.06KiB
~I/O~ 7 16.2μs 0.0% 2.31μs 5.20KiB 0.0% 761B
save mesh 6 448ns 0.0% 74.7ns 0.00B 0.0% 0.00B
────────────────────────────────────────────────────────────────────────────────────
tmpi 2 julia --check-bounds=no --threads=1
julia> trixi_include("examples/tree_2d_dgsem/elixir_euler_ec.jl", tspan=(0.0, 10.0))
julia> sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false), dt=1.0, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 2.72s / 95.4% 19.2MiB / 98.0%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 4.24k 2.49s 95.9% 588μs 3.44MiB 18.3% 852B
volume integral 4.24k 2.01s 77.4% 475μs 0.00B 0.0% 0.00B
interface flux 4.24k 277ms 10.6% 65.3μs 0.00B 0.0% 0.00B
surface integral 4.24k 55.9ms 2.1% 13.2μs 0.00B 0.0% 0.00B
prolong2interfaces 4.24k 52.2ms 2.0% 12.3μs 0.00B 0.0% 0.00B
reset ∂u/∂t 4.24k 23.0ms 0.9% 5.44μs 0.00B 0.0% 0.00B
Jacobian 4.24k 19.6ms 0.8% 4.64μs 0.00B 0.0% 0.00B
MPI interface flux 4.24k 13.6ms 0.5% 3.22μs 0.00B 0.0% 0.00B
~rhs!~ 4.24k 11.8ms 0.5% 2.79μs 1.70MiB 9.0% 420B
finish MPI receive 4.24k 11.4ms 0.4% 2.68μs 530KiB 2.8% 128B
start MPI send 4.24k 9.67ms 0.4% 2.28μs 397KiB 2.1% 96.0B
prolong2mpiinterfaces 4.24k 3.17ms 0.1% 749ns 0.00B 0.0% 0.00B
finish MPI send 4.24k 1.03ms 0.0% 243ns 596KiB 3.1% 144B
start MPI receive 4.24k 912μs 0.0% 215ns 265KiB 1.4% 64.0B
prolong2mortars 4.24k 286μs 0.0% 67.5ns 0.00B 0.0% 0.00B
prolong2boundaries 4.24k 256μs 0.0% 60.3ns 0.00B 0.0% 0.00B
MPI mortar flux 4.24k 224μs 0.0% 52.8ns 0.00B 0.0% 0.00B
prolong2mpimortars 4.24k 210μs 0.0% 49.6ns 0.00B 0.0% 0.00B
mortar flux 4.24k 148μs 0.0% 35.0ns 0.00B 0.0% 0.00B
boundary flux 4.24k 91.0μs 0.0% 21.5ns 0.00B 0.0% 0.00B
source terms 4.24k 75.2μs 0.0% 17.8ns 0.00B 0.0% 0.00B
calculate dt 848 70.5ms 2.7% 83.2μs 79.5KiB 0.4% 96.0B
analyze solution 10 22.1ms 0.9% 2.21ms 2.61MiB 13.9% 267KiB
I/O 11 14.6ms 0.6% 1.33ms 12.6MiB 67.4% 1.15MiB
save solution 10 14.4ms 0.6% 1.44ms 12.6MiB 67.2% 1.26MiB
get element variables 10 178μs 0.0% 17.8μs 23.0KiB 0.1% 2.30KiB
~I/O~ 11 21.5μs 0.0% 1.95μs 7.20KiB 0.0% 671B
save mesh 10 991ns 0.0% 99.1ns 0.00B 0.0% 0.00B
────────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL35(), abstol=1.0e-4, reltol=1.0e-4, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 1.44s / 87.5% 12.3MiB / 90.0%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 2.35k 1.23s 98.1% 525μs 1.91MiB 17.3% 855B
volume integral 2.35k 978ms 77.7% 416μs 0.00B 0.0% 0.00B
interface flux 2.35k 135ms 10.7% 57.4μs 0.00B 0.0% 0.00B
surface integral 2.35k 31.0ms 2.5% 13.2μs 0.00B 0.0% 0.00B
prolong2interfaces 2.35k 30.3ms 2.4% 12.9μs 0.00B 0.0% 0.00B
reset ∂u/∂t 2.35k 12.6ms 1.0% 5.37μs 0.00B 0.0% 0.00B
finish MPI receive 2.35k 11.5ms 0.9% 4.90μs 294KiB 2.6% 128B
Jacobian 2.35k 11.2ms 0.9% 4.77μs 0.00B 0.0% 0.00B
MPI interface flux 2.35k 7.86ms 0.6% 3.35μs 0.00B 0.0% 0.00B
~rhs!~ 2.35k 7.16ms 0.6% 3.05μs 969KiB 8.6% 423B
start MPI send 2.35k 5.48ms 0.4% 2.33μs 220KiB 1.9% 96.0B
prolong2mpiinterfaces 2.35k 1.91ms 0.2% 813ns 0.00B 0.0% 0.00B
finish MPI send 2.35k 712μs 0.1% 303ns 330KiB 2.9% 144B
start MPI receive 2.35k 547μs 0.0% 233ns 147KiB 1.3% 64.0B
prolong2mortars 2.35k 184μs 0.0% 78.6ns 0.00B 0.0% 0.00B
prolong2mpimortars 2.35k 161μs 0.0% 68.7ns 0.00B 0.0% 0.00B
prolong2boundaries 2.35k 154μs 0.0% 65.5ns 0.00B 0.0% 0.00B
MPI mortar flux 2.35k 120μs 0.0% 51.3ns 0.00B 0.0% 0.00B
mortar flux 2.35k 109μs 0.0% 46.4ns 0.00B 0.0% 0.00B
source terms 2.35k 58.0μs 0.0% 24.7ns 0.00B 0.0% 0.00B
boundary flux 2.35k 47.8μs 0.0% 20.4ns 0.00B 0.0% 0.00B
analyze solution 6 13.3ms 1.1% 2.21ms 1.56MiB 14.1% 267KiB
I/O 7 10.8ms 0.9% 1.54ms 7.58MiB 68.6% 1.08MiB
save solution 6 10.6ms 0.8% 1.76ms 7.57MiB 68.4% 1.26MiB
get element variables 6 169μs 0.0% 28.1μs 13.8KiB 0.1% 2.30KiB
~I/O~ 7 12.8μs 0.0% 1.83μs 5.20KiB 0.0% 761B
save mesh 6 647ns 0.0% 108ns 0.00B 0.0% 0.00B
────────────────────────────────────────────────────────────────────────────────────
TL;DR: Looks reasonable
New results from Rocinante:
julia --project=. --check-bounds=no --threads=24
julia> using Trixi, OrdinaryDiffEq
julia> trixi_include("examples/tree_2d_dgsem/elixir_euler_ec.jl", tspan=(0.0, 10.0),
initial_refinement_level=6, save_solution=TrivialCallback())
julia> sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false), dt=1.0, save_everystep=false, callback=callbacks); summary_callback()
─────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 8.31s / 44.9% 18.3MiB / 87.6%
Section ncalls time %tot avg alloc %tot avg
─────────────────────────────────────────────────────────────────────────────────
rhs! 8.77k 3.00s 80.5% 343μs 15.7MiB 98.0% 1.83KiB
volume integral 8.77k 1.59s 42.5% 181μs 2.41MiB 15.1% 288B
reset ∂u/∂t 8.77k 883ms 23.6% 101μs 0.00B 0.0% 0.00B
interface flux 8.77k 289ms 7.7% 33.0μs 3.35MiB 20.9% 400B
prolong2interfaces 8.77k 92.6ms 2.5% 10.6μs 2.01MiB 12.6% 240B
surface integral 8.77k 89.8ms 2.4% 10.2μs 2.54MiB 15.9% 304B
~rhs!~ 8.77k 32.1ms 0.9% 3.66μs 3.09MiB 19.3% 369B
Jacobian 8.77k 29.6ms 0.8% 3.38μs 2.28MiB 14.2% 272B
prolong2mortars 8.77k 473μs 0.0% 54.0ns 0.00B 0.0% 0.00B
prolong2boundaries 8.77k 469μs 0.0% 53.5ns 0.00B 0.0% 0.00B
mortar flux 8.77k 291μs 0.0% 33.2ns 0.00B 0.0% 0.00B
boundary flux 8.77k 207μs 0.0% 23.5ns 0.00B 0.0% 0.00B
source terms 8.77k 205μs 0.0% 23.4ns 0.00B 0.0% 0.00B
calculate dt 1.75k 554ms 14.8% 316μs 0.00B 0.0% 0.00B
analyze solution 19 175ms 4.7% 9.22ms 328KiB 2.0% 17.3KiB
─────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false, thread=OrdinaryDiffEq.True()), dt=1.0, save_everystep=false, callback=callbacks); summary_callback()
─────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 3.56s / 80.3% 19.6MiB / 81.5%
Section ncalls time %tot avg alloc %tot avg
─────────────────────────────────────────────────────────────────────────────────
rhs! 8.77k 2.13s 74.5% 243μs 15.7MiB 98.0% 1.83KiB
volume integral 8.77k 1.54s 53.8% 175μs 2.41MiB 15.1% 288B
interface flux 8.77k 286ms 10.0% 32.6μs 3.35MiB 20.9% 400B
prolong2interfaces 8.77k 120ms 4.2% 13.7μs 2.01MiB 12.6% 240B
surface integral 8.77k 87.9ms 3.1% 10.0μs 2.54MiB 15.9% 304B
reset ∂u/∂t 8.77k 33.9ms 1.2% 3.87μs 0.00B 0.0% 0.00B
~rhs!~ 8.77k 31.2ms 1.1% 3.55μs 3.09MiB 19.3% 369B
Jacobian 8.77k 30.8ms 1.1% 3.52μs 2.28MiB 14.2% 272B
prolong2boundaries 8.77k 486μs 0.0% 55.4ns 0.00B 0.0% 0.00B
prolong2mortars 8.77k 378μs 0.0% 43.1ns 0.00B 0.0% 0.00B
mortar flux 8.77k 288μs 0.0% 32.8ns 0.00B 0.0% 0.00B
boundary flux 8.77k 204μs 0.0% 23.2ns 0.00B 0.0% 0.00B
source terms 8.77k 199μs 0.0% 22.7ns 0.00B 0.0% 0.00B
calculate dt 1.75k 555ms 19.4% 316μs 0.00B 0.0% 0.00B
analyze solution 19 172ms 6.0% 9.07ms 328KiB 2.0% 17.3KiB
─────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL35(), abstol=1.0e-4, reltol=1.0e-4, save_everystep=false, callback=callbacks); summary_callback()
─────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 4.52s / 35.2% 16.6MiB / 51.0%
Section ncalls time %tot avg alloc %tot avg
─────────────────────────────────────────────────────────────────────────────────
rhs! 4.64k 1.49s 93.7% 322μs 8.29MiB 97.8% 1.83KiB
volume integral 4.64k 687ms 43.2% 148μs 1.27MiB 15.0% 288B
reset ∂u/∂t 4.64k 474ms 29.8% 102μs 0.00B 0.0% 0.00B
interface flux 4.64k 142ms 8.9% 30.7μs 1.77MiB 20.9% 400B
~rhs!~ 4.64k 62.2ms 3.9% 13.4μs 1.64MiB 19.3% 370B
prolong2interfaces 4.64k 54.7ms 3.4% 11.8μs 1.06MiB 12.5% 240B
surface integral 4.64k 50.0ms 3.1% 10.8μs 1.34MiB 15.9% 304B
Jacobian 4.64k 18.5ms 1.2% 4.00μs 1.20MiB 14.2% 272B
prolong2mortars 4.64k 672μs 0.0% 145ns 0.00B 0.0% 0.00B
prolong2boundaries 4.64k 520μs 0.0% 112ns 0.00B 0.0% 0.00B
mortar flux 4.64k 345μs 0.0% 74.3ns 0.00B 0.0% 0.00B
source terms 4.64k 127μs 0.0% 27.4ns 0.00B 0.0% 0.00B
boundary flux 4.64k 108μs 0.0% 23.2ns 0.00B 0.0% 0.00B
analyze solution 11 101ms 6.3% 9.19ms 189KiB 2.2% 17.2KiB
─────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL35(thread=OrdinaryDiffEq.True()), abstol=1.0e-4, reltol=1.0e-4, save_everystep=false, callback=callbacks); summary_callback()
─────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 2.57s / 44.0% 17.8MiB / 47.7%
Section ncalls time %tot avg alloc %tot avg
─────────────────────────────────────────────────────────────────────────────────
rhs! 4.64k 1.03s 91.2% 223μs 8.29MiB 97.8% 1.83KiB
volume integral 4.64k 660ms 58.2% 142μs 1.27MiB 15.0% 288B
interface flux 4.64k 142ms 12.5% 30.6μs 1.77MiB 20.9% 400B
reset ∂u/∂t 4.64k 92.8ms 8.2% 20.0μs 0.00B 0.0% 0.00B
prolong2interfaces 4.64k 62.0ms 5.5% 13.4μs 1.06MiB 12.5% 240B
surface integral 4.64k 45.4ms 4.0% 9.79μs 1.34MiB 15.9% 304B
~rhs!~ 4.64k 17.1ms 1.5% 3.69μs 1.64MiB 19.3% 370B
Jacobian 4.64k 14.4ms 1.3% 3.11μs 1.20MiB 14.2% 272B
prolong2boundaries 4.64k 238μs 0.0% 51.3ns 0.00B 0.0% 0.00B
mortar flux 4.64k 189μs 0.0% 40.8ns 0.00B 0.0% 0.00B
prolong2mortars 4.64k 183μs 0.0% 39.5ns 0.00B 0.0% 0.00B
boundary flux 4.64k 108μs 0.0% 23.2ns 0.00B 0.0% 0.00B
source terms 4.64k 105μs 0.0% 22.7ns 0.00B 0.0% 0.00B
analyze solution 11 99.4ms 8.8% 9.04ms 190KiB 2.2% 17.2KiB
─────────────────────────────────────────────────────────────────────────────────
tmpi 2 julia --project=. --check-bounds=no --threads=12
julia> using Trixi, OrdinaryDiffEq
julia> trixi_include("examples/tree_2d_dgsem/elixir_euler_ec.jl", tspan=(0.0, 10.0),
initial_refinement_level=6, save_solution=TrivialCallback())
julia> sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false), dt=1.0, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 5.61s / 58.3% 46.1MiB / 97.2%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 8.77k 2.84s 86.7% 323μs 25.4MiB 56.7% 2.97KiB
volume integral 8.77k 1.54s 47.2% 176μs 2.81MiB 6.3% 336B
reset ∂u/∂t 8.77k 415ms 12.7% 47.4μs 0.00B 0.0% 0.00B
interface flux 8.77k 282ms 8.6% 32.2μs 3.35MiB 7.5% 400B
finish MPI receive 8.77k 194ms 5.9% 22.1μs 1.07MiB 2.4% 128B
surface integral 8.77k 94.0ms 2.9% 10.7μs 2.54MiB 5.7% 304B
start MPI send 8.77k 93.2ms 2.8% 10.6μs 822KiB 1.8% 96.0B
prolong2interfaces 8.77k 85.0ms 2.6% 9.69μs 2.01MiB 4.5% 240B
~rhs!~ 8.77k 33.6ms 1.0% 3.83μs 3.49MiB 7.8% 418B
MPI interface flux 8.77k 31.2ms 1.0% 3.55μs 3.35MiB 7.5% 400B
Jacobian 8.77k 29.2ms 0.9% 3.33μs 2.41MiB 5.4% 288B
prolong2mpiinterfaces 8.77k 27.1ms 0.8% 3.09μs 1.87MiB 4.2% 224B
finish MPI send 8.77k 2.14ms 0.1% 244ns 1.20MiB 2.7% 144B
start MPI receive 8.77k 1.89ms 0.1% 216ns 548KiB 1.2% 64.0B
prolong2boundaries 8.77k 547μs 0.0% 62.4ns 0.00B 0.0% 0.00B
prolong2mpimortars 8.77k 401μs 0.0% 45.7ns 0.00B 0.0% 0.00B
prolong2mortars 8.77k 388μs 0.0% 44.3ns 0.00B 0.0% 0.00B
MPI mortar flux 8.77k 368μs 0.0% 42.0ns 0.00B 0.0% 0.00B
mortar flux 8.77k 287μs 0.0% 32.7ns 0.00B 0.0% 0.00B
source terms 8.77k 203μs 0.0% 23.2ns 0.00B 0.0% 0.00B
boundary flux 8.77k 201μs 0.0% 22.9ns 0.00B 0.0% 0.00B
calculate dt 1.75k 335ms 10.2% 191μs 165KiB 0.4% 96.0B
analyze solution 19 101ms 3.1% 5.29ms 19.2MiB 42.9% 1.01MiB
────────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, CarpenterKennedy2N54(williamson_condition=false, thread=OrdinaryDiffEq.True()), dt=1.0, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 3.30s / 82.8% 48.5MiB / 92.5%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 8.77k 2.33s 85.5% 266μs 25.4MiB 56.7% 2.97KiB
volume integral 8.77k 1.53s 56.2% 175μs 2.81MiB 6.3% 336B
interface flux 8.77k 297ms 10.9% 33.9μs 3.35MiB 7.5% 400B
prolong2interfaces 8.77k 105ms 3.9% 12.0μs 2.01MiB 4.5% 240B
finish MPI receive 8.77k 98.0ms 3.6% 11.2μs 1.07MiB 2.4% 128B
surface integral 8.77k 86.5ms 3.2% 9.87μs 2.54MiB 5.7% 304B
start MPI send 8.77k 62.5ms 2.3% 7.12μs 822KiB 1.8% 96.0B
~rhs!~ 8.77k 33.9ms 1.2% 3.86μs 3.49MiB 7.8% 418B
MPI interface flux 8.77k 33.0ms 1.2% 3.77μs 3.35MiB 7.5% 400B
Jacobian 8.77k 28.2ms 1.0% 3.22μs 2.41MiB 5.4% 288B
reset ∂u/∂t 8.77k 27.1ms 1.0% 3.09μs 0.00B 0.0% 0.00B
prolong2mpiinterfaces 8.77k 20.6ms 0.8% 2.35μs 1.87MiB 4.2% 224B
finish MPI send 8.77k 2.44ms 0.1% 279ns 1.20MiB 2.7% 144B
start MPI receive 8.77k 1.81ms 0.1% 207ns 548KiB 1.2% 64.0B
prolong2boundaries 8.77k 404μs 0.0% 46.0ns 0.00B 0.0% 0.00B
prolong2mortars 8.77k 380μs 0.0% 43.3ns 0.00B 0.0% 0.00B
MPI mortar flux 8.77k 341μs 0.0% 38.9ns 0.00B 0.0% 0.00B
prolong2mpimortars 8.77k 341μs 0.0% 38.8ns 0.00B 0.0% 0.00B
mortar flux 8.77k 250μs 0.0% 28.5ns 0.00B 0.0% 0.00B
source terms 8.77k 203μs 0.0% 23.1ns 0.00B 0.0% 0.00B
boundary flux 8.77k 201μs 0.0% 22.9ns 0.00B 0.0% 0.00B
calculate dt 1.75k 295ms 10.8% 168μs 165KiB 0.4% 96.0B
analyze solution 19 101ms 3.7% 5.32ms 19.2MiB 42.9% 1.01MiB
────────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL35(), abstol=1.0e-4, reltol=1.0e-4, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 3.01s / 45.0% 29.0MiB / 84.7%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 4.64k 1.30s 95.7% 280μs 13.5MiB 54.7% 2.97KiB
volume integral 4.64k 677ms 50.0% 146μs 1.49MiB 6.0% 336B
reset ∂u/∂t 4.64k 233ms 17.2% 50.3μs 0.00B 0.0% 0.00B
interface flux 4.64k 137ms 10.1% 29.6μs 1.77MiB 7.2% 400B
surface integral 4.64k 47.6ms 3.5% 10.3μs 1.34MiB 5.5% 304B
finish MPI receive 4.64k 47.0ms 3.5% 10.1μs 580KiB 2.3% 128B
prolong2interfaces 4.64k 45.1ms 3.3% 9.72μs 1.06MiB 4.3% 240B
start MPI send 4.64k 44.5ms 3.3% 9.60μs 435KiB 1.7% 96.0B
~rhs!~ 4.64k 18.2ms 1.3% 3.92μs 1.85MiB 7.5% 419B
MPI interface flux 4.64k 15.9ms 1.2% 3.43μs 1.77MiB 7.2% 400B
Jacobian 4.64k 15.1ms 1.1% 3.26μs 1.27MiB 5.2% 288B
prolong2mpiinterfaces 4.64k 13.1ms 1.0% 2.82μs 0.99MiB 4.0% 224B
start MPI receive 4.64k 1.09ms 0.1% 235ns 290KiB 1.2% 64.0B
finish MPI send 4.64k 982μs 0.1% 212ns 652KiB 2.6% 144B
prolong2boundaries 4.64k 284μs 0.0% 61.3ns 0.00B 0.0% 0.00B
prolong2mpimortars 4.64k 238μs 0.0% 51.2ns 0.00B 0.0% 0.00B
prolong2mortars 4.64k 217μs 0.0% 46.8ns 0.00B 0.0% 0.00B
MPI mortar flux 4.64k 196μs 0.0% 42.4ns 0.00B 0.0% 0.00B
mortar flux 4.64k 140μs 0.0% 30.1ns 0.00B 0.0% 0.00B
boundary flux 4.64k 118μs 0.0% 25.4ns 0.00B 0.0% 0.00B
source terms 4.64k 112μs 0.0% 24.0ns 0.00B 0.0% 0.00B
analyze solution 11 57.7ms 4.3% 5.25ms 11.1MiB 45.3% 1.01MiB
────────────────────────────────────────────────────────────────────────────────────
julia> sol = solve(ode, RDPK3SpFSAL35(thread=OrdinaryDiffEq.True()), abstol=1.0e-4, reltol=1.0e-4, save_everystep=false, callback=callbacks); summary_callback()
────────────────────────────────────────────────────────────────────────────────────
Trixi.jl Time Allocations
─────────────────────── ────────────────────────
Tot / % measured: 2.17s / 55.6% 31.0MiB / 79.2%
Section ncalls time %tot avg alloc %tot avg
────────────────────────────────────────────────────────────────────────────────────
rhs! 4.64k 1.11s 92.2% 240μs 13.5MiB 54.7% 2.97KiB
volume integral 4.64k 662ms 54.9% 143μs 1.49MiB 6.0% 336B
interface flux 4.64k 135ms 11.2% 29.1μs 1.77MiB 7.2% 400B
finish MPI receive 4.64k 57.1ms 4.7% 12.3μs 580KiB 2.3% 128B
reset ∂u/∂t 4.64k 56.8ms 4.7% 12.2μs 0.00B 0.0% 0.00B
prolong2interfaces 4.64k 56.7ms 4.7% 12.2μs 1.06MiB 4.3% 240B
surface integral 4.64k 48.3ms 4.0% 10.4μs 1.34MiB 5.5% 304B
start MPI send 4.64k 32.3ms 2.7% 6.97μs 435KiB 1.7% 96.0B
~rhs!~ 4.64k 17.5ms 1.5% 3.78μs 1.85MiB 7.5% 419B
MPI interface flux 4.64k 15.7ms 1.3% 3.38μs 1.77MiB 7.2% 400B
Jacobian 4.64k 15.2ms 1.3% 3.28μs 1.27MiB 5.2% 288B
prolong2mpiinterfaces 4.64k 11.2ms 0.9% 2.42μs 0.99MiB 4.0% 224B
finish MPI send 4.64k 1.35ms 0.1% 292ns 652KiB 2.6% 144B
start MPI receive 4.64k 919μs 0.1% 198ns 290KiB 1.2% 64.0B
prolong2boundaries 4.64k 226μs 0.0% 48.7ns 0.00B 0.0% 0.00B
prolong2mpimortars 4.64k 215μs 0.0% 46.4ns 0.00B 0.0% 0.00B
prolong2mortars 4.64k 203μs 0.0% 43.7ns 0.00B 0.0% 0.00B
MPI mortar flux 4.64k 199μs 0.0% 42.9ns 0.00B 0.0% 0.00B
mortar flux 4.64k 137μs 0.0% 29.6ns 0.00B 0.0% 0.00B
source terms 4.64k 111μs 0.0% 24.0ns 0.00B 0.0% 0.00B
boundary flux 4.64k 106μs 0.0% 22.8ns 0.00B 0.0% 0.00B
analyze solution 11 93.9ms 7.8% 8.54ms 11.1MiB 45.3% 1.01MiB
────────────────────────────────────────────────────────────────────────────────────
Looks okay, doesn't it? In particular, there seems to be an effect of using multi-threading also for the RK solver.
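One rough way to quantify that effect from the numbers already posted: the timer only covers `rhs!`, the step size calculation, and the analysis callback, so the unmeasured share of the total time should contain the RK stage updates. The sketch below just does that arithmetic on the "Tot / % measured" lines of the Rocinante runs. Treating the unmeasured time as mostly RK work is an assumption (it can also contain other overhead), and the helper `unmeasured` is only illustrative.

```julia
# Totals (seconds) and measured fractions copied from the "Tot / % measured"
# lines of the Rocinante runs; `*_thread` are the runs with
# `thread=OrdinaryDiffEq.True()`.
runs = [
    (label = "24 threads, CarpenterKennedy2N54",
     total = 8.31, measured = 0.449, total_thread = 3.56, measured_thread = 0.803),
    (label = "24 threads, RDPK3SpFSAL35",
     total = 4.52, measured = 0.352, total_thread = 2.57, measured_thread = 0.440),
    (label = "2 ranks x 12 threads, CarpenterKennedy2N54",
     total = 5.61, measured = 0.583, total_thread = 3.30, measured_thread = 0.828),
    (label = "2 ranks x 12 threads, RDPK3SpFSAL35",
     total = 3.01, measured = 0.450, total_thread = 2.17, measured_thread = 0.556),
]

# Time not covered by any timer section (assumption: dominated by RK updates).
unmeasured(total, measured_fraction) = total * (1 - measured_fraction)

for r in runs
    before = unmeasured(r.total, r.measured)
    after = unmeasured(r.total_thread, r.measured_thread)
    println(r.label, ": unmeasured ", round(before; digits = 2), " s -> ",
            round(after; digits = 2), " s, total speedup ",
            round(r.total / r.total_thread; digits = 2), "x")
end
```

In all four pairs the unmeasured time drops when the RK solver is threaded, which is consistent with the observation above, but these are single runs of a small problem, so the numbers should not be over-interpreted.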
Yes, it looks ok. Although it's not clear yet what the performance impact really is (hard to tell with such a small problem size) and whether it makes more sense to use more threads or more ranks. Then again, this is often hardware dependent...
My intention was just to test whether it works at all - I'll leave the rest to you HLRS guys :sweat_smile:
Do you understand why the serial p4est runs fail? Why would the results change? Is it because we do not use raw `PtrArray`s anymore and thus OrdinaryDiffEq.jl does something different under the hood when computing the time step update?
No idea... It's `elixir_advection_basic.jl`, everything else passes :confused:
Positive: Now everything that was "weirdly" broken passes. Negative: macOS tests are still hanging...
Yeah... but I can't really debug the macOS part (since I don't have a Mac)
Could you see which test is the issue? If yes, we can try disabling it to check whether it's a singleton issue or a general problem. Although we should try to find the root cause either way.
Looks like it's `examples/tree_2d_dgsem/elixir_euler_ec.jl` with error-based step size control :cry:
@andrewwinters5000 It would be great if you could try to reproduce this issue.
I got rid of the global `length` completely, since it leads to hard-to-find bugs. Let's see what happens now...
MPI tests pass :partying_face: @sloede Please have a look at the new stuff. Right now, our calling convention must be
sol = solve(ode, alg; kwargs..., internalnorm=ode_norm, unstable_check=ode_unstable_check)
We should probably make it easier to use all this but it seems to be working.
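As one possible direction for making this easier, here is a small sketch of a helper that collects the two MPI-specific keyword arguments once and lets user-supplied keywords override them. Everything in it is hypothetical: `mpi_solve_kwargs` is not part of Trixi.jl, and the two `*_stub` functions are stand-ins so the snippet runs without Trixi.jl or OrdinaryDiffEq.jl. Their signatures `(u, t)` and `(dt, u, p, t)` are what I believe OrdinaryDiffEq.jl expects for `internalnorm` and `unstable_check`.

```julia
# Stand-ins for Trixi's `ode_norm` and `ode_unstable_check` (assumption: the
# real functions would be passed here in an actual session).
ode_norm_stub(u, t) = sqrt(sum(abs2, u) / length(u))
ode_unstable_check_stub(dt, u, p, t) = any(isnan, u)

# Hypothetical helper: MPI-specific defaults merged with user keywords.
# Keywords given by the caller take precedence over the defaults.
function mpi_solve_kwargs(; kwargs...)
    defaults = (; internalnorm = ode_norm_stub,
                  unstable_check = ode_unstable_check_stub)
    return merge(defaults, values(kwargs))
end

kw = mpi_solve_kwargs(dt = 1.0, save_everystep = false)
println(keys(kw))

# Intended use with the real solver (not executed here):
#   sol = solve(ode, alg; kw..., callback = callbacks)
```

A thin wrapper like this keeps `solve` itself untouched, at the price of users still having to remember to splat the keywords.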