TinyProfiler: only count profiled functions towards the topmost region
Summary
Previously, a profiled function counted towards every active region. With this PR, a function counts only towards the most recently activated region (more precisely, the region at the top of the region stack), which cleans up the output.
Whether this is a desirable change is open for discussion.
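To make the difference concrete, here is a minimal sketch of the two attribution rules. It is a toy model and not the TinyProfiler implementation: `ToyProfiler`, `Attribution`, `simulate`, the `"(default)"` region name, and the timing values are all hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model of region attribution; NOT AMReX's TinyProfiler code.
enum class Attribution { AllActiveRegions, TopmostRegionOnly };

class ToyProfiler {
public:
    explicit ToyProfiler(Attribution mode) : m_mode(mode) {}

    // Activating a region pushes it onto the region stack.
    void start_region(const std::string& name) { m_region_stack.push_back(name); }

    // Deactivating pops it again; the default region always stays at the bottom.
    void stop_region() {
        if (m_region_stack.size() > 1) { m_region_stack.pop_back(); }
    }

    // Attribute the time of one profiled function call to regions.
    void record(const std::string& func, double seconds) {
        assert(!m_region_stack.empty());
        if (m_mode == Attribution::TopmostRegionOnly) {
            // Rule proposed in this PR: only the top of the stack is charged.
            m_stats[m_region_stack.back()][func] += seconds;
        } else {
            // Previous rule: every active region is charged.
            for (const auto& region : m_region_stack) {
                m_stats[region][func] += seconds;
            }
        }
    }

    // True if `func` would show up in the table printed for `region`.
    bool has_entry(const std::string& region, const std::string& func) const {
        const auto it = m_stats.find(region);
        return it != m_stats.end() && it->second.count(func) > 0;
    }

private:
    Attribution m_mode;
    std::vector<std::string> m_region_stack{std::string("(default)")};
    std::map<std::string, std::map<std::string, double>> m_stats;
};

// One call outside any named region, one call inside an "Evolve" region.
// The seconds are illustrative placeholders, not measurements.
inline ToyProfiler simulate(Attribution mode) {
    ToyProfiler prof(mode);
    prof.record("Hipace::InitData()", 0.05);
    prof.start_region("Hipace::Evolve()");
    prof.record("Hipace::SolveOneSlice()", 20.0);
    prof.stop_region();
    return prof;
}
```

In this sketch, `simulate(Attribution::AllActiveRegions)` lists `Hipace::SolveOneSlice()` under both the default region and `Hipace::Evolve()`, whereas `simulate(Attribution::TopmostRegionOnly)` lists it under `Hipace::Evolve()` only. `Hipace::InitData()` stays in the default region in both modes.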
Additional background
Test in which I added an Evolve region to HiPACE++.
PR:
TinyProfiler total time across processes [min...avg...max]: 19.97 ... 20.41 ... 20.71
-----------------------------------------------------------------------------------------------
Name NCalls Excl. Min Excl. Avg Excl. Max Max %
-----------------------------------------------------------------------------------------------
Hipace::InitData() 1 0.005004 0.02718 0.03177 0.15%
sortBeamParticlesByBox() 0 0 0.002245 0.01796 0.09%
FabArray::setVal() 3 0.01305 0.01357 0.01475 0.07%
AnyDST::CreatePlan() 1 0.005245 0.005869 0.006669 0.03%
Fields::AllocData() 1 0.005029 0.005651 0.006021 0.03%
BeamParticleContainer::InitBeamFixedWeight3D() 1 9.62e-07 0.0006685 0.005339 0.03%
main() 1 0.001121 0.001193 0.001241 0.01%
AdaptiveTimeStep::CalculateFromDensity() 0 0 7.378e-06 5.902e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 0 0 9.619e-07 7.695e-06 0.00%
AdaptiveTimeStep::GatherMinUzSlice() 0 0 6.676e-07 5.341e-06 0.00%
-----------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------
Name NCalls Incl. Min Incl. Avg Incl. Max Max %
-----------------------------------------------------------------------------------------------
main() 1 19.97 20.41 20.71 100.00%
Hipace::InitData() 1 0.05507 0.05519 0.05527 0.27%
Fields::AllocData() 1 0.02344 0.02509 0.0267 0.13%
sortBeamParticlesByBox() 0 0 0.002245 0.01796 0.09%
FabArray::setVal() 3 0.01305 0.01357 0.01475 0.07%
AnyDST::CreatePlan() 1 0.005245 0.005869 0.006669 0.03%
BeamParticleContainer::InitBeamFixedWeight3D() 1 9.62e-07 0.0006685 0.005339 0.03%
AdaptiveTimeStep::CalculateFromDensity() 0 0 7.378e-06 5.902e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 0 0 9.619e-07 7.695e-06 0.00%
AdaptiveTimeStep::GatherMinUzSlice() 0 0 6.676e-07 5.341e-06 0.00%
-----------------------------------------------------------------------------------------------
BEGIN REGION Hipace::Evolve()
--------------------------------------------------------------------------------------------------
Name NCalls Excl. Min Excl. Avg Excl. Max Max %
--------------------------------------------------------------------------------------------------
hpmg::MultiGrid::solve1() 1000 6.525 6.544 6.592 31.83%
AnyDST::Execute() 6000 3.856 3.874 3.902 18.84%
AdvanceBeamParticlesSlice() 1000 2.697 2.713 2.737 13.21%
ExplicitDeposition() 1000 2.244 2.256 2.263 10.93%
AdvancePlasmaParticles() 1000 1.251 1.254 1.258 6.07%
DepositCurrent_PlasmaParticleContainer() 1001 1.008 1.015 1.024 4.94%
MultiBuffer::get_data() 1000 0.0009277 0.4132 0.8939 4.32%
FFTPoissonSolverDirichlet::SolvePoissonEquation() 3000 0.4715 0.4744 0.481 2.32%
Fields::InitializeSlices() 1000 0.4468 0.4524 0.4593 2.22%
Fields::ShiftSlices() 1000 0.2879 0.3412 0.4221 2.04%
Fields::SolvePoissonPsiExmByEypBxEzBz() 1000 0.3728 0.3748 0.3761 1.82%
Hipace::InitializeSxSyWithBeam() 1000 0.2203 0.222 0.2234 1.08%
FillBoundary_nowait() 4000 0.1177 0.1215 0.1284 0.62%
Fields::AddRhoIons() 1000 0.09325 0.09406 0.09442 0.46%
MultiBuffer::put_data() 1000 0.005097 0.0505 0.06129 0.30%
DepositCurrentSlice_BeamParticleContainer() 2000 0.04122 0.04178 0.04248 0.21%
shiftSlippedParticles() 678 0.03531 0.0359 0.03678 0.18%
BeamParticleContainer::InitBeamFixedWeightSlice() 125 0 0.004217 0.03374 0.16%
AdaptiveTimeStep::GatherMinUzSlice() 1000 0.029 0.02985 0.03123 0.15%
PlasmaParticleContainer::InitParticles() 1 0.01566 0.0163 0.01715 0.08%
Hipace::SolveOneSlice() 1000 0.007837 0.008237 0.008972 0.04%
Hipace::ExplicitMGSolveBxBy() 1000 0.003879 0.004016 0.004326 0.02%
BeamParticleContainer::resize() 3015 0.002181 0.002504 0.00269 0.01%
REG::Hipace::Evolve() 1 0.0008531 0.001721 0.00223 0.01%
FabArray::FillBoundary() 4000 0.00147 0.001559 0.00162 0.01%
FillBoundary_finish() 4000 0.0008026 0.0009062 0.001127 0.01%
FabArray::setVal() 1 0.000822 0.0008416 0.0008629 0.00%
FabArrayBase::getFB() 4000 0.0006824 0.0007352 0.0007725 0.00%
AdaptiveTimeStep::CalculateFromDensity() 1 6.487e-05 7.008e-05 7.951e-05 0.00%
FabArrayBase::FB::FB() 1 3.071e-05 3.599e-05 3.782e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 1 2.955e-06 3.839e-06 8.757e-06 0.00%
ParticleContainer::clearParticles() 1 3.5e-07 4.22e-07 5.01e-07 0.00%
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
Name NCalls Incl. Min Incl. Avg Incl. Max Max %
--------------------------------------------------------------------------------------------------
REG::Hipace::Evolve() 1 19.92 20.35 20.66 99.73%
Hipace::SolveOneSlice() 1000 19.88 20.31 20.62 99.56%
Hipace::ExplicitMGSolveBxBy() 1000 6.529 6.548 6.596 31.85%
hpmg::MultiGrid::solve1() 1000 6.525 6.544 6.592 31.83%
Fields::SolvePoissonPsiExmByEypBxEzBz() 1000 4.769 4.788 4.818 23.26%
FFTPoissonSolverDirichlet::SolvePoissonEquation() 3000 4.328 4.348 4.377 21.13%
AnyDST::Execute() 6000 3.856 3.874 3.902 18.84%
AdvanceBeamParticlesSlice() 1000 2.697 2.713 2.737 13.21%
ExplicitDeposition() 1000 2.244 2.256 2.263 10.93%
AdvancePlasmaParticles() 1000 1.251 1.254 1.258 6.07%
DepositCurrent_PlasmaParticleContainer() 1001 1.008 1.015 1.024 4.94%
MultiBuffer::get_data() 1000 0.03601 0.4186 0.8952 4.32%
Fields::InitializeSlices() 1000 0.4468 0.4524 0.4593 2.22%
Fields::ShiftSlices() 1000 0.2879 0.3412 0.4221 2.04%
Hipace::InitializeSxSyWithBeam() 1000 0.2769 0.2811 0.2864 1.38%
FabArray::FillBoundary() 4000 0.1209 0.1247 0.1319 0.64%
FillBoundary_nowait() 4000 0.1184 0.1222 0.1292 0.62%
Fields::AddRhoIons() 1000 0.09325 0.09406 0.09442 0.46%
MultiBuffer::put_data() 1000 0.005097 0.05073 0.06158 0.30%
DepositCurrentSlice_BeamParticleContainer() 2000 0.04122 0.04178 0.04248 0.21%
shiftSlippedParticles() 678 0.03596 0.03651 0.03717 0.18%
BeamParticleContainer::InitBeamFixedWeightSlice() 125 0 0.004386 0.03509 0.17%
AdaptiveTimeStep::GatherMinUzSlice() 1000 0.029 0.02985 0.03123 0.15%
PlasmaParticleContainer::InitParticles() 1 0.01566 0.0163 0.01715 0.08%
BeamParticleContainer::resize() 3015 0.002181 0.002504 0.00269 0.01%
FillBoundary_finish() 4000 0.0008026 0.0009062 0.001127 0.01%
FabArray::setVal() 1 0.000822 0.0008416 0.0008629 0.00%
FabArrayBase::getFB() 4000 0.0007194 0.0007711 0.0008088 0.00%
AdaptiveTimeStep::CalculateFromDensity() 1 6.487e-05 7.008e-05 7.951e-05 0.00%
FabArrayBase::FB::FB() 1 3.071e-05 3.599e-05 3.782e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 1 2.955e-06 3.839e-06 8.757e-06 0.00%
ParticleContainer::clearParticles() 1 3.5e-07 4.22e-07 5.01e-07 0.00%
--------------------------------------------------------------------------------------------------
Dev:
TinyProfiler total time across processes [min...avg...max]: 20.14 ... 20.58 ... 20.92
--------------------------------------------------------------------------------------------------
Name NCalls Excl. Min Excl. Avg Excl. Max Max %
--------------------------------------------------------------------------------------------------
hpmg::MultiGrid::solve1() 1000 6.581 6.593 6.607 31.58%
AnyDST::Execute() 6000 3.878 3.885 3.897 18.63%
AdvanceBeamParticlesSlice() 1000 2.784 2.796 2.814 13.45%
ExplicitDeposition() 1000 2.289 2.302 2.315 11.06%
AdvancePlasmaParticles() 1000 1.265 1.269 1.273 6.09%
DepositCurrent_PlasmaParticleContainer() 1001 1.01 1.015 1.021 4.88%
MultiBuffer::get_data() 1000 0.00102 0.3955 0.881 4.21%
FFTPoissonSolverDirichlet::SolvePoissonEquation() 3000 0.4738 0.4754 0.478 2.28%
Fields::InitializeSlices() 1000 0.4425 0.448 0.4525 2.16%
Fields::ShiftSlices() 1000 0.2885 0.3433 0.4114 1.97%
Fields::SolvePoissonPsiExmByEypBxEzBz() 1000 0.3736 0.3756 0.3768 1.80%
Hipace::InitializeSxSyWithBeam() 1000 0.222 0.2226 0.2233 1.07%
FillBoundary_nowait() 4000 0.1176 0.123 0.1257 0.60%
Fields::AddRhoIons() 1000 0.09399 0.0944 0.0955 0.46%
MultiBuffer::put_data() 1000 0.005381 0.05095 0.06064 0.29%
DepositCurrentSlice_BeamParticleContainer() 2000 0.04133 0.04216 0.04298 0.21%
shiftSlippedParticles() 678 0.0352 0.03672 0.03822 0.18%
BeamParticleContainer::InitBeamFixedWeightSlice() 125 0 0.004371 0.03497 0.17%
AdaptiveTimeStep::GatherMinUzSlice() 1000 0.03028 0.03094 0.03231 0.15%
Hipace::InitData() 1 0.004988 0.01514 0.01775 0.08%
PlasmaParticleContainer::InitParticles() 1 0.0137 0.01417 0.01475 0.07%
Hipace::SolveOneSlice() 1000 0.007951 0.008304 0.008721 0.04%
FabArray::setVal() 4 0.007093 0.007664 0.008265 0.04%
AnyDST::CreatePlan() 1 0.005701 0.006122 0.006449 0.03%
Fields::AllocData() 1 0.005323 0.005739 0.006122 0.03%
sortBeamParticlesByBox() 0 0 0.0005569 0.004456 0.02%
BeamParticleContainer::InitBeamFixedWeight3D() 1 9.32e-07 0.0005567 0.004446 0.02%
Hipace::ExplicitMGSolveBxBy() 1000 0.003986 0.004133 0.004308 0.02%
main() 1 0.001152 0.001526 0.003954 0.02%
REG::Hipace::Evolve() 1 0.001529 0.002525 0.003085 0.01%
BeamParticleContainer::resize() 3016 0.002765 0.002876 0.003002 0.01%
FabArray::FillBoundary() 4000 0.001996 0.002069 0.002138 0.01%
FillBoundary_finish() 4000 0.001346 0.001428 0.001509 0.01%
FabArrayBase::getFB() 4000 0.001058 0.001126 0.001233 0.01%
AdaptiveTimeStep::CalculateFromDensity() 1 6.735e-05 8.429e-05 0.0001311 0.00%
FabArrayBase::FB::FB() 1 3.032e-05 3.535e-05 3.68e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 1 3.196e-06 4.429e-06 1.062e-05 0.00%
ParticleContainer::clearParticles() 1 6.31e-07 7.365e-07 8.22e-07 0.00%
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
Name NCalls Incl. Min Incl. Avg Incl. Max Max %
--------------------------------------------------------------------------------------------------
main() 1 20.14 20.58 20.92 100.00%
REG::Hipace::Evolve() 1 20.11 20.54 20.89 99.83%
Hipace::SolveOneSlice() 1000 20.08 20.52 20.86 99.71%
Hipace::ExplicitMGSolveBxBy() 1000 6.585 6.597 6.611 31.60%
hpmg::MultiGrid::solve1() 1000 6.581 6.593 6.607 31.58%
Fields::SolvePoissonPsiExmByEypBxEzBz() 1000 4.79 4.803 4.814 23.01%
FFTPoissonSolverDirichlet::SolvePoissonEquation() 3000 4.352 4.361 4.373 20.90%
AnyDST::Execute() 6000 3.878 3.885 3.897 18.63%
AdvanceBeamParticlesSlice() 1000 2.784 2.796 2.814 13.45%
ExplicitDeposition() 1000 2.289 2.302 2.315 11.06%
AdvancePlasmaParticles() 1000 1.265 1.269 1.273 6.09%
DepositCurrent_PlasmaParticleContainer() 1001 1.01 1.015 1.021 4.88%
MultiBuffer::get_data() 1000 0.03736 0.4011 0.8823 4.22%
Fields::InitializeSlices() 1000 0.4425 0.448 0.4525 2.16%
Fields::ShiftSlices() 1000 0.2885 0.3433 0.4114 1.97%
Hipace::InitializeSxSyWithBeam() 1000 0.2799 0.284 0.2862 1.37%
FabArray::FillBoundary() 4000 0.1221 0.1277 0.1304 0.62%
FillBoundary_nowait() 4000 0.1187 0.1242 0.127 0.61%
Fields::AddRhoIons() 1000 0.09399 0.0944 0.0955 0.46%
MultiBuffer::put_data() 1000 0.005381 0.05126 0.06101 0.29%
DepositCurrentSlice_BeamParticleContainer() 2000 0.04133 0.04216 0.04298 0.21%
shiftSlippedParticles() 678 0.03583 0.03742 0.0389 0.19%
BeamParticleContainer::InitBeamFixedWeightSlice() 125 0 0.004543 0.03634 0.17%
Hipace::InitData() 1 0.03329 0.03493 0.03616 0.17%
AdaptiveTimeStep::GatherMinUzSlice() 1000 0.03028 0.03094 0.03231 0.15%
Fields::AllocData() 1 0.01841 0.01867 0.01933 0.09%
PlasmaParticleContainer::InitParticles() 1 0.0137 0.01417 0.01475 0.07%
FabArray::setVal() 4 0.007093 0.007664 0.008265 0.04%
AnyDST::CreatePlan() 1 0.005701 0.006122 0.006449 0.03%
sortBeamParticlesByBox() 0 0 0.0005569 0.004456 0.02%
BeamParticleContainer::InitBeamFixedWeight3D() 1 9.32e-07 0.0005567 0.004446 0.02%
BeamParticleContainer::resize() 3016 0.002765 0.002876 0.003002 0.01%
FillBoundary_finish() 4000 0.001346 0.001428 0.001509 0.01%
FabArrayBase::getFB() 4000 0.001094 0.001161 0.001266 0.01%
AdaptiveTimeStep::CalculateFromDensity() 1 6.735e-05 8.429e-05 0.0001311 0.00%
FabArrayBase::FB::FB() 1 3.032e-05 3.535e-05 3.68e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 1 3.196e-06 4.429e-06 1.062e-05 0.00%
ParticleContainer::clearParticles() 1 6.31e-07 7.365e-07 8.22e-07 0.00%
--------------------------------------------------------------------------------------------------
BEGIN REGION Hipace::Evolve()
--------------------------------------------------------------------------------------------------
Name NCalls Excl. Min Excl. Avg Excl. Max Max %
--------------------------------------------------------------------------------------------------
hpmg::MultiGrid::solve1() 1000 6.581 6.593 6.607 31.58%
AnyDST::Execute() 6000 3.878 3.885 3.897 18.63%
AdvanceBeamParticlesSlice() 1000 2.784 2.796 2.814 13.45%
ExplicitDeposition() 1000 2.289 2.302 2.315 11.06%
AdvancePlasmaParticles() 1000 1.265 1.269 1.273 6.09%
DepositCurrent_PlasmaParticleContainer() 1001 1.01 1.015 1.021 4.88%
MultiBuffer::get_data() 1000 0.00102 0.3955 0.881 4.21%
FFTPoissonSolverDirichlet::SolvePoissonEquation() 3000 0.4738 0.4754 0.478 2.28%
Fields::InitializeSlices() 1000 0.4425 0.448 0.4525 2.16%
Fields::ShiftSlices() 1000 0.2885 0.3433 0.4114 1.97%
Fields::SolvePoissonPsiExmByEypBxEzBz() 1000 0.3736 0.3756 0.3768 1.80%
Hipace::InitializeSxSyWithBeam() 1000 0.222 0.2226 0.2233 1.07%
FillBoundary_nowait() 4000 0.1176 0.123 0.1257 0.60%
Fields::AddRhoIons() 1000 0.09399 0.0944 0.0955 0.46%
MultiBuffer::put_data() 1000 0.005381 0.05095 0.06064 0.29%
DepositCurrentSlice_BeamParticleContainer() 2000 0.04133 0.04216 0.04298 0.21%
shiftSlippedParticles() 678 0.0352 0.03672 0.03822 0.18%
BeamParticleContainer::InitBeamFixedWeightSlice() 125 0 0.004371 0.03497 0.17%
AdaptiveTimeStep::GatherMinUzSlice() 1000 0.03028 0.03094 0.03231 0.15%
PlasmaParticleContainer::InitParticles() 1 0.0137 0.01417 0.01475 0.07%
Hipace::SolveOneSlice() 1000 0.007951 0.008304 0.008721 0.04%
Hipace::ExplicitMGSolveBxBy() 1000 0.003986 0.004133 0.004308 0.02%
REG::Hipace::Evolve() 1 0.001529 0.002525 0.003085 0.01%
BeamParticleContainer::resize() 3016 0.002765 0.002876 0.003002 0.01%
FabArray::FillBoundary() 4000 0.001996 0.002069 0.002138 0.01%
FillBoundary_finish() 4000 0.001346 0.001428 0.001509 0.01%
FabArrayBase::getFB() 4000 0.001058 0.001126 0.001233 0.01%
FabArray::setVal() 1 0.0008215 0.0008513 0.0009144 0.00%
AdaptiveTimeStep::CalculateFromDensity() 1 6.735e-05 7.693e-05 0.0001112 0.00%
FabArrayBase::FB::FB() 1 3.032e-05 3.535e-05 3.68e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 1 2.966e-06 3.472e-06 4.108e-06 0.00%
ParticleContainer::clearParticles() 1 6.31e-07 7.365e-07 8.22e-07 0.00%
--------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
Name NCalls Incl. Min Incl. Avg Incl. Max Max %
--------------------------------------------------------------------------------------------------
REG::Hipace::Evolve() 1 20.11 20.54 20.89 99.83%
Hipace::SolveOneSlice() 1000 20.08 20.52 20.86 99.71%
Hipace::ExplicitMGSolveBxBy() 1000 6.585 6.597 6.611 31.60%
hpmg::MultiGrid::solve1() 1000 6.581 6.593 6.607 31.58%
Fields::SolvePoissonPsiExmByEypBxEzBz() 1000 4.79 4.803 4.814 23.01%
FFTPoissonSolverDirichlet::SolvePoissonEquation() 3000 4.352 4.361 4.373 20.90%
AnyDST::Execute() 6000 3.878 3.885 3.897 18.63%
AdvanceBeamParticlesSlice() 1000 2.784 2.796 2.814 13.45%
ExplicitDeposition() 1000 2.289 2.302 2.315 11.06%
AdvancePlasmaParticles() 1000 1.265 1.269 1.273 6.09%
DepositCurrent_PlasmaParticleContainer() 1001 1.01 1.015 1.021 4.88%
MultiBuffer::get_data() 1000 0.03736 0.4011 0.8823 4.22%
Fields::InitializeSlices() 1000 0.4425 0.448 0.4525 2.16%
Fields::ShiftSlices() 1000 0.2885 0.3433 0.4114 1.97%
Hipace::InitializeSxSyWithBeam() 1000 0.2799 0.284 0.2862 1.37%
FabArray::FillBoundary() 4000 0.1221 0.1277 0.1304 0.62%
FillBoundary_nowait() 4000 0.1187 0.1242 0.127 0.61%
Fields::AddRhoIons() 1000 0.09399 0.0944 0.0955 0.46%
MultiBuffer::put_data() 1000 0.005381 0.05126 0.06101 0.29%
DepositCurrentSlice_BeamParticleContainer() 2000 0.04133 0.04216 0.04298 0.21%
shiftSlippedParticles() 678 0.03583 0.03742 0.0389 0.19%
BeamParticleContainer::InitBeamFixedWeightSlice() 125 0 0.004543 0.03634 0.17%
AdaptiveTimeStep::GatherMinUzSlice() 1000 0.03028 0.03094 0.03231 0.15%
PlasmaParticleContainer::InitParticles() 1 0.0137 0.01417 0.01475 0.07%
BeamParticleContainer::resize() 3016 0.002765 0.002876 0.003002 0.01%
FillBoundary_finish() 4000 0.001346 0.001428 0.001509 0.01%
FabArrayBase::getFB() 4000 0.001094 0.001161 0.001266 0.01%
FabArray::setVal() 1 0.0008215 0.0008513 0.0009144 0.00%
AdaptiveTimeStep::CalculateFromDensity() 1 6.735e-05 7.693e-05 0.0001112 0.00%
FabArrayBase::FB::FB() 1 3.032e-05 3.535e-05 3.68e-05 0.00%
AdaptiveTimeStep::CalculateFromMinUz() 1 2.966e-06 3.472e-06 4.108e-06 0.00%
ParticleContainer::clearParticles() 1 6.31e-07 7.365e-07 8.22e-07 0.00%
--------------------------------------------------------------------------------------------------
Checklist
The proposed changes:
- [ ] fix a bug or incorrect behavior in AMReX
- [x] add new capabilities to AMReX
- [ ] changes answers in the test suite to more than roundoff level
- [ ] are likely to significantly affect the results of downstream AMReX users
- [ ] include documentation in the code and/or rst files, if appropriate