openmc icon indicating copy to clipboard operation
openmc copied to clipboard

[Point Detector] Add distribution get_pdf functionality

Open itay-space opened this issue 4 months ago • 26 comments

Description

This PR is the first step in breaking down the point detector tally project into smaller, reviewable pieces, following the large PR #3109 and the paper Development and benchmarking of a point detector in OpenMC.

As suggested during the earlier discussion (thanks @Guysten, @gridley, @shimwell !), the natural starting point is to add the ability to correctly calculate PDFs on the C++ side.

Accordingly, this PR introduces the get_pdf implementations for the distribution classes. This is essentially a clean cut from the original large PR, with the goal of keeping the scope focused and review manageable.

We plan to follow up with tests and additional pieces of the point detector work in subsequent PRs.

Checklist

  • [x] I have performed a self-review of my own code
  • [x] I have run clang-format (version 15) on any C++ source files (if applicable)
  • [x] I have followed the style guidelines for Python source files (if applicable)
  • [x] I have made corresponding changes to the documentation (if applicable)
  • [ ] I have added tests that prove my fix is effective or that my feature works (if applicable)

itay-space avatar Aug 27 '25 15:08 itay-space

Hi @itay-space and @GuySten thanks for the great work on this so far! I want to put a quick hold on this merge for coordination purposes. I have a number of comments below.

  1. We need this functionality as well for source biasing (and any general distribution biasing capabilities) as outlined in PR #3460 where there is some duplication of this work. We should coordinate the development efforts to suit everyone's needs.
  2. I'd prefer to go with the terminology evaluate for these methods instead of get_pdf. In my opinion the latter has the connotation of returning the PDF not the result of evaluating the PDF. This would also be consistent with the terminology we use for evaluating surface functions.
  3. To merge this capability we should use the override specifier on the methods for classes that subclass Distribution and we should implement the evaluate method for all types of PDFs that are currently available and not return -1 as the not implemented default.
  4. None of these functions should be taking in random seeds I don't think.

Pinging @j-fletcher as he has been working on some of this over on PR #3460.

eepeterson avatar Sep 03 '25 20:09 eepeterson

@eepeterson, thanks for your feedback.

After looking at the source biasing PR and thinking further I have reached the following conclusions: We have less functionality in common that may appear at first glance.

The common functionality is only for pdf evaluation of 1D distributions.

I propose coordinating in the following way:

  1. I am also not that happy with get_pdf. I prefer to call it simply pdf. I propose that for 1d distributions we will change the signature from get_pdf to pdf. I do not like evaluate because I do not think of a distribution as a function but as an object which has a pdf and the pdf is a function.
  2. For 2d angle energy distributions the name get_pdf is actually confusing because what we really need is a function that calculate the energy integrated pdf (which does not need a seed) for which I propose the name angular_pdf and a function that sample energy conditional on a specific angle (which do need a seed) that I propose we call conditional_sample_energy.
  3. After looking up the override keyword in cpp, I am fine with using it. It is good practice.
  4. I think the current way of default pdf implementation of returning fatal error is fine. We will implement overrides for the 1d distributions we need and the rest will be implemented in the source biasing PR. We just have to agree on the signature.
  5. In the source biasing PR you will implement the 2d pdf function if I understand correctly that you need it.

@eepeterson, @itay-space, @j-fletcher what do you think?

GuySten avatar Sep 04 '25 03:09 GuySten

thanks for the comment @eepeterson and thank you @GuySten for the huge help with the code. regarding the points in the last comment by @GuySten:

  1. I am okay with both names.
  2. I am not sure why implementing the energy-integrated pdf would be an advantage. We need the full 2D angle-energy pdf because we want to get the probability of a particle with some energy to fly into a certain angle.
  3. Agree.
  4. Ok.
  5. As I mentioned, we need the 2D pdf for the point detector.

itay-space avatar Sep 04 '25 08:09 itay-space

  1. We need the energy integrated (integral over E_out) pdf because we only care that the particle scatter with some specific mu and not what is the final energy.

The 2d pdf is deterministic so there is no need to use seed to calculate it. The reason we need the seed is to sample the resulting E_out when the scattering cosine mu is given.

GuySten avatar Sep 04 '25 09:09 GuySten

@GuySten Could you please point me to an example in the code where you plan to do this?

itay-space avatar Sep 04 '25 11:09 itay-space

@itay-space, in my latest commit I changed all the function signatures according to my suggestion. I think I can complete the new approach today and you will be able to look at it.

It will mathematically do everything almost the same as before just with a new clearer structure.

GuySten avatar Sep 04 '25 14:09 GuySten

@GuySten I saw the changes you made, and I have a few conclusions. You’re right that the PDF functions themselves don’t need a seed. But there’s an important caveat: we do need the outgoing energy that was sampled in the original transport step, because the PDF of mu depends on it (see, for example, the Kalbach–Mann distribution).

The distributions where you implemented angular_pdf and conditional_sample_energy are only the special cases where they are independent.

itay-space avatar Sep 04 '25 18:09 itay-space

@itay-space, I am still working on the implementation for more distributions. Some of them are harder than others. Regarding the kalbach-mann distribution I think it is theoretically possible to calculate the energy independent pdf but is sure mathematically complicated. I want more time to think about it.

GuySten avatar Sep 04 '25 18:09 GuySten

@itay-space, after looking at your comment and further thinking I have reached the following conclusions:

Theoretically the angular_pdf for every distribution is defined and computable but it may be hard to calculate it. A trick that we can do and should do especially in the case of the kalbach mann distribution is to sample the pdf and the outgoing energy simultaneously so the get_pdf method in those cases should be named sample_energy_and_pdf and not split into two functions.

Theoretically I could try to derive the exact mathematical expressions for the angular_pdf but it is a headache.

GuySten avatar Sep 04 '25 19:09 GuySten

@GuySten Right, that’s exactly why the seed was used - to sample the outgoing energy. If we can somehow share the outgoing energy that was sampled in the sample function with the get_pdf function, that would be great. But if not, then I suggest we go back to the original implementation with the seed.

itay-space avatar Sep 04 '25 19:09 itay-space

@GuySten Right, that’s exactly why the seed was used - to sample the outgoing energy. If we can somehow share the outgoing energy that was sampled in the sample function with the get_pdf function, that would be great. But if not, then I suggest we go back to the original implementation with the seed.

The fact that we need the seed to sample the outgoing energy is not that surprising. The surprising fact (to me) is that we use the seed to return not the real angular pdf but a stochastic unbiased estimate of the angular pdf.

Because in theory the exact angular pdf does not depend on the outgoing energy at all!

GuySten avatar Sep 04 '25 19:09 GuySten

If all the tests will pass and nobody will object I am planning to merge this PR at the start of next week.

Now the names we use should not clash with the source biasing PR.

@eepeterson, @itay-space, @j-fletcher, if you have further comments you have until start of next week.

GuySten avatar Sep 04 '25 20:09 GuySten

I do have additional comments, thanks for your patience.

  1. I fear the name sample_energy_and_pdf is now less clear than it was before.
  2. Fundamentally we are just dealing with probability distributions here - there are univariate and multivariate distributions and they can be correlated or independent. Regardless, we need to be able to do two things with them for the features under discussion: we need to sample from them and evaluate them. That's why I believe this is the appropriate public interface to expose for both clarity and extensibility. To go back to the surface analogy, a Surface is an object that has an equation that we evaluate to find the value of the equation describing the surface. The surface and the equation that describes it are effectively synonymous. I'd argue that the PDF or PMF are likewise synonymous with the Distribution to warrant this terminology.
  3. For the point detector tally we shouldn't have to sample an energy in the direction of the detector as long as we have the distribution of energies given the virtual particle scatters towards the detector. You can then integrate this spectrum with the attenuation factor that is also energy dependent and the detector response function to get the tally contribution and further reduce the variance. This way we could avoid some of the ambiguity in these functions as well as the duplication of code. My proposal here would be to have a polymorphic sample function for joint probability distributions with an enum class that functions as a case switch to determine whether to sample both E and mu, or mu given E, or E given mu, etc.
  4. It would be good to add some unit tests to make sure functionality is working. As it stands, tests are passing because nothing new or different is being tested.

eepeterson avatar Sep 05 '25 02:09 eepeterson

I do have additional comments, thanks for your patience.

  1. I fear the name sample_energy_and_pdf is now less clear than it was before.
  2. Fundamentally we are just dealing with probability distributions here - there are univariate and multivariate distributions and they can be correlated or independent. Regardless, we need to be able to do two things with them for the features under discussion: we need to sample from them and evaluate them. That's why I believe this is the appropriate public interface to expose for both clarity and extensibility. To go back to the surface analogy, a Surface is an object that has an equation that we evaluate to find the value of the equation describing the surface. The surface and the equation that describes it are effectively synonymous. I'd argue that the PDF or PMF are likewise synonymous with the Distribution to warrant this terminology.
  3. For the point detector tally we shouldn't have to sample an energy in the direction of the detector as long as we have the distribution of energies given the virtual particle scatters towards the detector. You can then integrate this spectrum with the attenuation factor that is also energy dependent and the detector response function to get the tally contribution and further reduce the variance. This way we could avoid some of the ambiguity in these functions as well as the duplication of code. My proposal here would be to have a polymorphic sample function for joint probability distributions with an enum class that functions as a case switch to determine whether to sample both E and mu, or mu given E, or E given mu, etc.
  4. It would be good to add some unit tests to make sure functionality is working. As it stands, tests are passing because nothing new or different is being tested.
  1. I am open to suggestions. Please bear in mind that the name should reflect what the current implementation do.
  2. If you are insistent on evaluate for what I called pdf I do not care enough to argue.
  3. If we want to integrate numerically the spectrum with the attenuation factor then we actually need to sample an energy from the spectrum and multiply it with the attenuation factor which is energy dependent (unless you want to do quadrature based approach on an energy grid which might be possible but IMO out of scope for this PR). I also thought that having sample function that sample E given mu etc is a good idea. But then I did not manage to code it for the more complicated distributions (like kalbach mann). If you can do it I agree that the sample_energy_and_pdf might not be needed anymore. Can we agree that we can merge it as is and If someone will manage to do it we will change it later?
  4. Currently we only test that the program can be built and is formatted correctly. More tests will have to be added in subsequent PRs before this functionality is exposed to users.

GuySten avatar Sep 05 '25 03:09 GuySten

@GuySten @eepeterson Thanks for the feedback! I’m fine with evaluate instead of pdf. In my opinion, for joint distributions the best approach is to get E_out from the original sample and somehow share that with the evaluate/pdf function. If we don’t share it, then yes, we’d need to resample E_out inside evaluate. Just to illustrate: in a one-collision case, if the originally sampled E_out kinematically prevents the particle from reaching the detector, then using the original sample keeps the virtual particle blocked as it should be. Resampling inside evaluate might yield an E_out that allows it to reach, which I think is less consistent (though both converge in the limit of large histories). I suggest we merge this as a baseline and refine the interface + add proper unit tests in follow-up PRs.

itay-space avatar Sep 05 '25 07:09 itay-space

Hello all, apologies for the delay here.

  1. I am inclined to agree with @eepeterson that evaluate fits better with the functionality we're trying to add here. Looking at the univariate Tabular distribution for example, there are already methods called integral and cdf, with the difference being that cdf returns the "running total," so to speak, while integral simply returns the total probability under the curve of the PDF. In keeping with this point of view, pdf would in my mind return a vector containing each probability p(x) used to construct the distribution, which we're already able to access via the .p attribute. What we're after instead is PDF evaluated at a specific point, thus evaluate.
  2. I also agree with @GuySten that I think returning fatal error if evaluate is not implemented would be fine. I'm open to persuasion, however.

j-fletcher avatar Sep 06 '25 19:09 j-fletcher

@itay-space, I noticed that the calculation of the uncollided contribution to the point detector tally is using the probability density function of emitting a particle in a specific direction.

I think that we should extend this PR (or maybe add another PR) that will add evaluate function for the classes that inherit from UnitSphereDistribution.

I propose the signature: double evaluate(Direction u) to match the signature of the rest of the 1d distribution functions.

GuySten avatar Sep 24 '25 16:09 GuySten

@GuySten , Thanks for the suggestion! These classes already rely on two independent 1D distributions (polar and azimuthal), and the evaluate methods are implemented at that level. The sphere distributions just combine those underlying 1D PDFs, so the functionality is already covered.

itay-space avatar Sep 25 '25 04:09 itay-space

@itay-space, the conversion is not trivial because of the reference direction, so I think that adding this functionality is worth while. If you do not object I can add the relevant functionality myself.

GuySten avatar Sep 25 '25 05:09 GuySten

@GuySten I’m not sure why this would be different from the other cases (I can’t see it right now), but sure, go ahead and add the functionality. I have no objections. Thanks!

itay-space avatar Sep 25 '25 05:09 itay-space

I am waiting for #3582 to be merged to add the functionality I talked about. Without that PR it is unclear what is the correct mapping between a point on the unit sphere and the mu, phi angles to calculate the pdf for.

GuySten avatar Sep 28 '25 16:09 GuySten

@paulromano, I am reminding you to look at this PR when you have some time.

GuySten avatar Oct 14 '25 22:10 GuySten

I am waiting for #3582 to be merged to add the functionality I talked about. Without that PR it is unclear what is the correct mapping between a point on the unit sphere and the mu, phi angles to calculate the pdf for.

Thanks for breaking out this small PR and great to see it is now merged

shimwell avatar Nov 03 '25 08:11 shimwell

I've added the functionality I wanted.

@paulromano, could you take some time to review this PR? You asked to review this RP before we are merging it, and I want to merge this PR and continue to the next parts of the point detector work.

GuySten avatar Nov 03 '25 17:11 GuySten

@GuySten Yes, I will try to get to it this week

paulromano avatar Nov 03 '25 17:11 paulromano

I simplified this PR significantly. @paulromano, are these changes enough?

GuySten avatar Dec 03 '25 22:12 GuySten