math Feature/issue 2682 add 7 parameter ddm pdf

Summary

We provide an implementation of the LPDF drift diffusion model with up to 7 parameters. There will be two new functions available in the stan::math namespace: wiener_full_lpdf and wiener_full_prec_lpdf. The only difference between both functions is the control of the precision in the computation of the partial derivatives. We did not deal with the CDF functions. With the new funtion it is possible to compute the 4-, 5-, 6-, 7- parameter drift diffusion models. Therefore, it is also an alternative to the wiener_lpdf by @kendalfoster.

The PR consists of the following new files: Main file: -stan/math/prim/prob/wiener_full_lpdf.hpp. Helper files: -stan/math/prim/fun/wiener5_lpdf.hpp: log density and partial derivatives for 5 parameter model -stan/math/prim/fun/wiener7_lpdf.hpp: log density and partial derivatives for 7 parameter model -stan/math/prim/functor/hcubature.hpp: new multidimensional integrator, adapted from Steven Johnson -test/unit/math/prim/prob/wiener_full_test.cpp: own unit tests -test/unit/math/prim/prob/wiener_full_prec_test.cpp: own unit tests for function with precision control

The following files are adapted in order to be able to implement a function with 8 parameters: -stan/math/prim/functor/operands_and_partials.hpp -stan/math/rev/functor/operands_and_partials.hpp

Discourse threads: Testing for 7-parameter function with known partials -> Therefore, we only let scalar input be accepted and no vectorization. Is there a multidimensional integrate-function? -> Therefore, we implemented hcubature. How to document correctly?/How to add a new function’s documentation to the Stan Function Reference? -> Up to now, we have the doxygen documentation. Adding a function with LARGE known partial derivatives -> We introduce 3 helper files. HMC: Does precision of derivatives affect validity of results?

Issue: Extension operands_and_partials to 8 edges #2698

Tests

stan/math/test/unit/math/prim/prob/wiener_full_test.cpp and wiener_full_prec_test.cpp In this discourse thread we asked on how to test the new function. Since we have more parameters than all other functions, we wrote own unit tests to test all exceptions for invalid input, and tested whether valid scalar types give the correct result. Results are compared with values from the R-package WienR.

Side Effects

We had to adapt the operands_and_partials routine such that it accepts 8 parameters. We had to implement a new, multi-dimensional integrator: hcubature.

Release notes

Two new functions in the stan::math namespace availabe: wiener_full_lpdf and wiener_full_prec_lpdf. Along with some helper functions in stan::math::internal. New integrator: hcubature. Adaptation of operands_and_partials to 8 parameters.

Checklist

[x] Math issue #2682
[x] Copyright holder: - Franziska Henrich ([email protected]) - Valentin Pratz ([email protected]) - Christoph Klauer ([email protected]) - Raphael Hartmann ([email protected])

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses: - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause) - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
[x] the basic tests are passing
- unit tests pass (to run, use: ./runTests.py test/unit)
- header checks pass, (make test-headers)
- dependencies checks pass, (make test-math-dependencies)
- docs build, (make doxygen)
- code passes the built in C++ standards checks (make cpplint)
[x] the code is written in idiomatic C++ and changes are documented in the doxygen
[x] the new changes are tested

Oct 07 '22 16:10 Franzi2114

I believe you'll want @SteveBronder, @andrjohns, or @rok-cesnovar to have a closer look. I think we'll want to discuss what's the best name for these.

I'm not an expert in these but I'm curious how the 5 parameter one different than the wiener_lpdf already in stan? It's 4 parameters but then we separate in a model whether the outcome hit the lower or upper bound.

Oct 19 '22 13:10 spinkney

I'm not an expert in these but I'm curious how the 5 parameter one different than the wiener_lpdf already in stan? It's 4 parameters but then we separate in a model whether the outcome hit the lower or upper bound.

The 5 parameter model incorporates the variability in the drift rate parameter. This is not possible in wiener_lpdf. You can set this variability to 0, then you obtain the 4 parameter model which has the same output as wiener_lpdf, but since we have all partial derivatives implemented, the whole procedure iscomputationally much more efficient. We also need the wiener5 functions for the calculation of the wiener7 functions.

Oct 19 '22 13:10 Franzi2114

I believe you'll want @SteveBronder, @andrjohns, or @rok-cesnovar to have a closer look. I think we'll want to discuss what's the best name for these.

I'm not an expert in these but I'm curious how the 5 parameter one different than the wiener_lpdf already in stan? It's 4 parameters but then we separate in a model whether the outcome hit the lower or upper bound.

I can review this tomorrow

Oct 20 '22 07:10 andrjohns

The 5 parameter model incorporates the variability in the drift rate parameter. This is not possible in wiener_lpdf. You can set this variability to 0, then you obtain the 4 parameter model which has the same output as wiener_lpdf, but since we have all partial derivatives implemented, the whole procedure is computationally much more efficient. We also need the wiener5 functions for the calculation of the wiener7 functions.

Very cool! When we expose this to Stan we can just use wiener_lpdf and have different signatures for each type. Do we even want to keep the current wiener_lpdf if this one is superior?

If you get up the energy to add the lcdfs and the rng functions, I'd love to have those in. I have a model that needs the lcdf and lccdf I've been wanting to test. I did code up the 4 parameter _rng function (https://github.com/stan-dev/math/issues/2801) in Stan but haven't had the time to put it in.

Oct 20 '22 09:10 spinkney

Very cool! When we expose this to Stan we can just use wiener_lpdf and have different signatures for each type. Do we even want to keep the current wiener_lpdf if this one is superior?

Thanks a lot for your review and your ideas. Putting it together with wiener_lpdf sounds good to me. I think as well that replacing the current wiener_lpdf implementation is worth considering. @Franzi2114 or I could do a comparison to be able to put a number on the performance difference between the implementations.

Oct 20 '22 15:10 valentin12

add the lcdfs and the rng functions,

Yeah, thanks @spinkney! That's true. When setting all variabilities to 0 and calling the function with wiener_full_lpdf(a,v,w,t0,0,0,0) the four parameter model will be computed. Therefore, the wiener_lpdf() gives no advantages anymore.

We have not yet implemented the lcdf- and rng-functions. We have to check how much effort it is to bring them to Stan. But we will check this and may contribute them later in a separate PR. First, we wanted to add the lpdf-functions.

Oct 20 '22 17:10 Franzi2114

We have not yet implemented the lcdf- and rng-functions. We have to check how much effort it is to bring them to Stan. But we will check this and may contribute them later in a separate PR. First, we wanted to add the lpdf-functions.

Yes, I agree! It's best to do these in separate prs.

Oct 20 '22 20:10 spinkney

@Franzi2114 from my understanding this PR comprises three main contributions:

Extending operands_and_partials to 8 parameters
Introducing hcubature integration routine
Introducing wiener_full_lpdf and wiener_full_prec_lpdf

Can you split this into three PRs rather than all together? That will make it much easier to ensure that each component is fully tested, and also gives a bit of a cleaner development history for the repo. Thanks!

Oct 22 '22 11:10 andrjohns

Can you split this into three PRs rather than all together?

Hi @andrjohns, thanks for your input! Yes, we can split our PR into three PRs. I will refer to the same issue in the branchnames (Feature/issue 2682 operands and partials 8, and Feature/issue2682 hcubature).

Up to now, we don't have separate tests for hcubature. The integration is checked in the tests for wiener_full_lpdf(). So we will add separate tests and do a third PR.

PR for operands_and_partials: #2833 PR for hcubature #2838

Oct 24 '22 11:10 Franzi2114

Hey @andrjohns, I am confused. I only merged the develop branch (where the new operands_and_partials_8 routine is integrated now) into this branch and the continuous-integration test fails at a point I do not understand. Did I do anything wrong?

Jan 02 '23 12:01 Franzi2114

@serban-nicusor-toptal would you mind restarting the Threading tests for this Jenkins run?

Is it possible to give my account permissions to restart runs as well? So you don't have to get bothered with pings everytime

Jan 31 '23 06:01 andrjohns

@WardBrian would the Distribution test failures here be resolved by #2869, or is this a different problem?

Jan 31 '23 07:01 andrjohns

Yeah that pr has the fix for what just caused them to fail here

Jan 31 '23 13:01 WardBrian

#2869 has been merged into develop, feel free to merge develop into this branch, that should fix the Jenkins CI. Sorry for all the trouble!

Jan 31 '23 13:01 serban-nicusor-toptal

Hello together,

now the two split-PRs (#2833, #2838) are merged into develop and into this branch and all tests run successfully. Who would review this PR? @andrjohns, @spinkney, @SteveBronder, @rok-cesnovar, @charlesm93, someone else?

We already conducted a simulation recovery study and a simulation-based calibration study. Everything looks fine with the new implementation. Recovery is satisfying and SBC is successful. The paper is still in the review process.

Feb 01 '23 08:02 Franzi2114

I'll take the review for this one

Feb 01 '23 13:02 andrjohns

I'll take the review for this one

Hey @andrjohns, if I can give you any information or background regarding this implementation just let me know ;) Do you have an idea when you can start with the PR?

Feb 14 '23 07:02 Franzi2114

Hey @andrjohns, I just saw that the next release is being planned and that the feature freeze will be on 4th of April, in three weeks. Do you think this PR can make it into the next release? That would be nice, since I am planning to implement the cdf of this function as well and I would like to know which programming style is going to be accepted to make all of us less work.

Mar 14 '23 06:03 Franzi2114

Some fairly major requests for changes, mainly around the function structuring and reducing code duplication

Hey @andrjohns, thank you very much for your response! I hope you are fine again after covid! I have a few questions regarding some of your aspects. I posted them under the corresponding block you highlighted.

Mar 23 '23 11:03 Franzi2114

Was slightly derailed by rstan and rstanarm CRAN prep, I'll have this updated by tomorrow

Apr 06 '23 04:04 andrjohns

I'll have this updated by tomorrow

Hi @andrjohns, is there any news on this PR? Will it be part of this release as @spinkney suggested?

Apr 17 '23 08:04 Franzi2114

I believe it is too late for this distribution to be included in the 2.32 release

Apr 17 '23 14:04 WardBrian

Hello, I plan to also implement the CDF functions. It would be nice to have the PDF functions in the develop-branch before I start, so that I know which style will be accepted. Could we continue with this PR?

Apr 24 '23 09:04 Franzi2114

@Franzi2114 this is just about good to go, but there's a test hang in rev/prob/wiener_full_test that I can't replicate locally in MacOS or linux. Would you mind checking out the latest updates and seeing if the test hangs for you locally as well?

May 10 '23 13:05 andrjohns

@Franzi2114 this is just about good to go, but there's a test hang in rev/prob/wiener_full_test that I can't replicate locally in MacOS or linux. Would you mind checking out the latest updates and seeing if the test hangs for you locally as well?

False alarm sorry, found it

May 10 '23 15:05 andrjohns

@Franzi2114 That's the last of my changes. To summarise, I've:

Updated hcubature to take a tuple of parameters instead of casting structs. There's an additional simplification available by having the function take a parameter pack instead, but that can be changed in the future
Removed wiener_full_prec overload in favour of a default argument for the precision for wiener_full
Consolidated the separate density and derivative functions for wiener5 into a single function to reduce repetition
Added a utility function for handling the process of estimating the density/gradient, checking against an error value, and re-estimating if failed
Added two utility functions to simplify the process of calculating the wiener7 quantities, either directly from the respective wiener5 function or by integrating over it

Overall that reduced the amount of code by a little over half for each of the density functions.

Can you have a look over the changes and let me know if there's anything that doesn't make sense or you'd like changed? Thanks!

May 11 '23 05:05 andrjohns

Can you have a look over the changes

Thanks a lot for all your effort! I'll have a look at this and let you know when I'm through everything.

May 11 '23 07:05 Franzi2114

@Franzi2114 Ah hold on, there's something a bit off with your construction of the wiener_full likelihood here:

  for (size_t i = 0; i < N; i++) {
    // Calculate 4-parameter model without inter-trial variabilities (if
    // sv_vec[i] == 0) or 5-parameter model with inter-trial variability in
    // drift rate (if sv_vec[i] != 0)
    if (sw_vec[i] == 0 && st0_vec[i] == 0) {
      return wiener5_lpdf<propto>(y, a, t0, w, v, sv, precision);
      // Calculate 6-, or 7-parameter model
    } else {
      return wiener7_lpdf<propto>(y, a, t0, w, v, sv, sw, st0, precision);
    }
  }

This loop will only test the first sw and st0 values, and then depending on those will pass all inputs to either wiener5_lpdf or wiener7_lpdf to evaluate.

To simplify this, is the separate wiener_full distribution necessary? We could instead change the wiener7_lpdf to test these values and delegate to wiener5 where appropriate. Let me know if that's ok with you and I'll make the update

May 11 '23 08:05 andrjohns

To simplify this

Yea, thats true. Later we check that every input has length 1. So the user can call the function only for 1 trial and has to loop over all trials in the .stan-file. Therefore, N is always equal to 1. We could also do the sw == 0 and st0 == 0 check in the wiener7 function and redirect to wiener5 at this point. But then, we should rename wiener7 to wiener_full, such that users still call the function with wiener_full_lpdf.

May 11 '23 08:05 Franzi2114

Makes sense! I'll update that now

May 11 '23 09:05 andrjohns