Hierarchical modelling

Open DavAug opened this issue 4 years ago • 8 comments

Hierarchical modelling has been discussed in earlier issues (#1231, #1232, #1134). Here is one more attempt, which I've implemented in my repository: https://erlotinib.readthedocs.io/en/latest/index.html

New features that are going to be implemented with this issue:

  • [ ] PopulationLogPDF: base class for the population model (or top-level distribution)
    • [ ] __call__ evaluates the log-likelihood score (the last n_parameters entries are the population model parameters; the entries before them are the individual parameters, which play the role of the data)
    • [ ] evaluateS1 will not be implemented for now
  • [ ] LogNormalModel: a class that inherits from PopulationLogPDF and distributes the parameters log-normally
  • [ ] HierarchicalLogLikelihood: inherits from LogPDF and takes
    • [ ] a list of LogPDFs (one for each modelled individual, all with the same number of parameters)
    • [ ] a list of PopulationLogPDFs (one for each parameter of the LogPDFs)
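
To make the proposal concrete, here is a rough sketch of how the three classes could fit together. It is not the pints or erlotinib implementation. It is pure Python so that it runs without pints installed; in pints, HierarchicalLogLikelihood would subclass pints.LogPDF. The ToyLogLikelihood stand-in, the (mean of log x, standard deviation of log x) parametrisation, and the flat parameter ordering (per model parameter: all individual values, then that population model's parameters) are assumptions made for illustration.

```python
import math


class PopulationLogPDF:
    """Hypothetical base class for a population model (top-level distribution)."""

    def n_parameters(self):
        """Number of population model parameters."""
        raise NotImplementedError

    def __call__(self, parameters):
        """Log-likelihood score. The last n_parameters() entries are the
        population parameters; the entries before them are the individual
        parameters, which play the role of the data."""
        raise NotImplementedError


class LogNormalModel(PopulationLogPDF):
    """Log-normal population model (assumed parametrisation: mean and
    standard deviation of log x)."""

    def n_parameters(self):
        return 2

    def __call__(self, parameters):
        *values, mu, sigma = parameters
        if sigma <= 0 or any(x <= 0 for x in values):
            return -math.inf
        score = 0.0
        for x in values:
            z = (math.log(x) - mu) / sigma
            score -= math.log(x * sigma * math.sqrt(2 * math.pi)) + 0.5 * z * z
        return score


class ToyLogLikelihood:
    """Stand-in for one individual's LogPDF (unnormalised standard normal)."""

    def __init__(self, n_parameters):
        self._n = n_parameters

    def n_parameters(self):
        return self._n

    def __call__(self, parameters):
        return -0.5 * sum(p * p for p in parameters)


class HierarchicalLogLikelihood:
    """Sum of the individual log-likelihoods and the population model scores."""

    def __init__(self, log_likelihoods, population_models):
        n = log_likelihoods[0].n_parameters()
        if any(ll.n_parameters() != n for ll in log_likelihoods):
            raise ValueError("All log-likelihoods need the same number of parameters.")
        if len(population_models) != n:
            raise ValueError("One population model is needed per parameter.")
        self._log_likelihoods = list(log_likelihoods)
        self._population_models = list(population_models)
        self._n_ids = len(self._log_likelihoods)

    def n_parameters(self):
        return sum(self._n_ids + pop.n_parameters() for pop in self._population_models)

    def __call__(self, parameters):
        score = 0.0
        individual = [[] for _ in self._log_likelihoods]
        start = 0
        for pop in self._population_models:
            # Block for one model parameter: individual values, then population parameters
            end = start + self._n_ids + pop.n_parameters()
            block = list(parameters[start:end])
            score += pop(block)
            for i in range(self._n_ids):
                individual[i].append(block[i])
            start = end
        for log_likelihood, values in zip(self._log_likelihoods, individual):
            score += log_likelihood(values)
        return score


# 10 individuals with 5 parameters each, one log-normal model per parameter
hierarchical = HierarchicalLogLikelihood(
    [ToyLogLikelihood(5) for _ in range(10)],
    [LogNormalModel() for _ in range(5)],
)
```

With 10 individuals, 5 parameters and 2 population parameters per model, hierarchical.n_parameters() gives 5 × (10 + 2) parameters in total.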

DavAug avatar Jan 14 '21 17:01 DavAug

@MichaelClerx @martinjrobins @ben18785 @chonlei

I am happy to implement this solution. I am just wondering what your thoughts on identifying the parameters are, because it can get complicated.

Example: Imagine 10 log-likelihoods with 5 parameters. Each parameter is modelled by a population model with 2 parameters. The final HierarchicalLogLikelihood ends up being 60-dimensional, and the order is [parameter 1 loglikelihood 1, parameter 1 loglikelihood 2, ..., parameter 1 loglikelihood 10, parameter 1 population model 1, parameter 2 population model 1, parameter 2 loglikelihood 1, parameter 2 loglikelihood 2, ..., parameter 2 loglikelihood 10, parameter 1 population model 2, parameter 2 population model 2, ...]

I hope the pattern is somewhat clear, but you probably see my point about potential confusion. And when the population models start to have different numbers of parameters, it becomes more complicated. In my repository I circumvented this by giving each parameter a meaningful name, but pints doesn't support names at the moment. Any thoughts?
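
The ordering in the example can be spelled out with a small helper. This is only an illustrative sketch: the function name hierarchical_parameter_labels and its arguments are hypothetical, and the label strings simply reuse the wording of the example. It takes one population-parameter count per model parameter, so it also covers population models with different numbers of parameters.

```python
def hierarchical_parameter_labels(n_individuals, n_population_parameters):
    """Return one label per entry of the flat parameter vector.

    n_population_parameters holds one entry per model parameter: the number
    of parameters of the population model attached to that model parameter.
    """
    labels = []
    for k, n_pop in enumerate(n_population_parameters, start=1):
        # First the individual values of model parameter k ...
        for i in range(1, n_individuals + 1):
            labels.append(f"parameter {k} loglikelihood {i}")
        # ... then the parameters of population model k
        for j in range(1, n_pop + 1):
            labels.append(f"parameter {j} population model {k}")
    return labels


# 10 log-likelihoods, 5 parameters, 2 population parameters each
labels = hierarchical_parameter_labels(10, [2] * 5)
```

Here len(labels) is 5 × (10 + 2), and labels[10] is the first population parameter, directly after the ten individual values of parameter 1.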

DavAug avatar Jan 14 '21 17:01 DavAug

What are the actions you want to do where the names are used?

MichaelClerx avatar Jan 14 '21 20:01 MichaelClerx

The names are not really used by anything. It's more for someone evaluating the hierarchical loglikelihood to understand which parameter index corresponds to which model parameter, if that makes sense?

DavAug avatar Jan 14 '21 20:01 DavAug

hmmmm... you could have a dict mapping names to indices, I guess?
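
A minimal sketch of that idea, assuming the names are kept in a plain list in the same order as the parameter vector (the names, the values, and the select helper below are made up for illustration and are not part of pints):

```python
# Names in the same order as the flat parameter vector (hypothetical example:
# 2 individuals, 1 model parameter, a population model with 2 parameters)
names = [
    "parameter 1 loglikelihood 1",
    "parameter 1 loglikelihood 2",
    "parameter 1 population model 1",
    "parameter 2 population model 1",
]

# Dict mapping each name to its index in the parameter vector
index_of = {name: index for index, name in enumerate(names)}


def select(parameters, *wanted):
    """Pick entries of a parameter vector by name."""
    return [parameters[index_of[name]] for name in wanted]


parameters = [0.8, 1.3, 0.0, 0.5]
population_values = select(
    parameters,
    "parameter 1 population model 1",
    "parameter 2 population model 1",
)
```

Because the dict is derived from the list, the list of names stays the single source of truth and the two options are interchangeable.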

MichaelClerx avatar Jan 14 '21 20:01 MichaelClerx

Can you point me towards the classes where you've solved this problem in Erlotinib?

(I kept thinking it was a German pun. So glad to find out it's a drug :D )

MichaelClerx avatar Jan 14 '21 21:01 MichaelClerx

Yes a dict or just a list of names?

Hahah yes it's a drug 😂 So here are the population model classes that I've implemented: https://erlotinib.readthedocs.io/en/latest/population_models.html At the bottom you can find the base class (the compute_log_likelihood method would probably be the __call__ method in pints).

And here is the HierarchicalLogLikelihood: https://erlotinib.readthedocs.io/en/latest/log_pdfs.html#erlotinib.HierarchicalLogLikelihood

DavAug avatar Jan 14 '21 21:01 DavAug

That looks quite similar to what we have for parameter names in plots!

See also #1171 :D and the tickets linked therein

MichaelClerx avatar Jan 14 '21 21:01 MichaelClerx

I've implemented this ProblemModellingController class to make my life a little easier, so I wouldn't have to keep track of the parameter names and indices myself (here is a notebook, if you'd like to see it in action: https://nbviewer.jupyter.org/github/DavAug/erlotinib/blob/main/analysis/treating_lung_cancer/control_group_analysis/tgi_koch_2009_reparametrised_model/population_inference.ipynb). But I guess that wouldn't necessarily be something for pints.

DavAug avatar Jan 14 '21 21:01 DavAug