
survival analysis

Open OriolAbril opened this issue 3 years ago • 24 comments

File: https://github.com/pymc-devs/pymc-examples/blob/main/examples/survival_analysis/survival_analysis.ipynb Reviewers:

The sections below may still be pending. If so, the issue is still available; it simply doesn't have specific guidance yet. Please refer to this overview of updates.

Known changes needed

Changes listed in this section should all be done at some point in order to get this notebook to a "Best Practices" state. However, these are probably not enough! Make sure to thoroughly review the notebook and search for other updates.

General updates

  • fix typos
  • fix math rendered incorrectly
  • update the notation to use equal tailed interval (or ETI) instead of HPD (or HDI); both are valid credible intervals, but they are different, so we should not write HPD when using an ETI
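The difference between the two intervals can be seen with a quick numpy-only sketch; the gamma "posterior", the 94% mass, and all variable names below are illustrative, not taken from the notebook:

```python
# Illustrative sketch: ETI vs HDI for a skewed sample, numpy only.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.gamma(shape=2.0, scale=1.0, size=10_000)  # right-skewed draws

# Equal tailed interval: cut 3% of the mass from each tail
eti = np.quantile(samples, [0.03, 0.97])

# Highest density interval: narrowest window holding 94% of the draws
sorted_s = np.sort(samples)
n_kept = int(np.floor(0.94 * len(sorted_s)))
widths = sorted_s[n_kept:] - sorted_s[: len(sorted_s) - n_kept]
start = int(np.argmin(widths))
hdi = np.array([sorted_s[start], sorted_s[start + n_kept]])

print("ETI:", eti, "HDI:", hdi)  # the two intervals do not coincide
```

For a skewed posterior the HDI shifts toward the mode and is never wider than the ETI, which is why the two names should not be used interchangeably.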

ArviZ related

  • Use arviz (or only import? I don't really understand how that notebook runs)
  • Use InferenceData and xarray for exploratory analysis of the results

Changes for discussion

Changes listed in this section are up for discussion: these are ideas on how to improve the notebook, but they may not have a clear implementation, or may only partially fix a known issue.

General updates

ArviZ related

Notes

Exotic dependencies

Computing requirements

OriolAbril avatar Apr 05 '21 01:04 OriolAbril

In addition, should I update the notebook as mentioned in pymc-devs/pymc3#4027?

olgadk7 avatar Apr 05 '21 13:04 olgadk7

@olgadk7 That would be terrific, thanks!

fonnesbeck avatar Apr 05 '21 14:04 fonnesbeck

That would be amazing indeed, thanks! Just to clarify, we don't expect you to fix all the bullet points above; choose which ones you want to address and let us know, so we can review based on that and not spring extra work on you after reviewing.

OriolAbril avatar Apr 05 '21 15:04 OriolAbril

Regarding ArviZ: in the survival_analysis notebook it doesn't do anything that pymc3 couldn't, so does it make sense to only use pymc3 perhaps? Unless I'm missing some rules that govern how to choose between the two (I didn't find this info after a quick search, just that you seemingly always need to set the ArviZ style, as per the Notebook Style Guide).

Related to that, I'm confused about how and why the notebook's source code https://github.com/pymc-devs/pymc-examples/blob/main/examples/survival_analysis/survival_analysis.ipynb calls az. functions (without even importing it), but the published notebook https://docs.pymc.io/notebooks/survival_analysis.html uses pymc3 only.

On another note, can I add the "with model" context to get rid of the corresponding FutureWarning about deprecation?

Regarding the other needed changes: once I deal with the low-hanging fruit, I'd be happy to look into them and report back asap, but most likely I wouldn't trust myself with the more consequential stuff just yet (only if you "force" me) :)

olgadk7 avatar Apr 05 '21 18:04 olgadk7

Regarding ArviZ: in the survival_analysis notebook it doesn't do anything that pymc3 couldn't, so does it make sense to only use pymc3 perhaps?

PyMC3 has ArviZ as a dependency, and delegates plotting, stats and diagnostics to it. However, for convenience most ArviZ functions are also available as part of the pymc3 namespace; e.g., pm.plot_trace and pm.summary are aliases to the corresponding ArviZ functions. While you are free to use those aliases, we want to encourage using ArviZ directly.

Related to that, I'm confused about how and why the notebook's source code https://github.com/pymc-devs/pymc-examples/blob/main/examples/survival_analysis/survival_analysis.ipynb calls az. functions (without even importing it),

I have no idea myself, and what is worse, the notebook looks like it was executed sequentially, unless I counted wrong.

but the published notebook https://docs.pymc.io/notebooks/survival_analysis.html uses pymc3 only.

Docs are built only with releases, and those changes were added to pymc-examples after the last release; they will be published with the next one.

On another note, can I add the "with model" context to get rid of the corresponding FutureWarning about deprecation?

Which FutureWarning? We are currently working on the next major version of PyMC3, so there are several lying around :sweat_smile:

Regarding the other needed changes, once I deal with the low-hanging fruit, I’d be happy to look into it and report back asap but most likely I wouldn’t trust myself with the more consequential stuff just yet (only if you “force” me) :)

If you are interested in working on that, go for it! It's not a problem at all; we'll be here to provide any help necessary.

OriolAbril avatar Apr 05 '21 18:04 OriolAbril

Thank you! And, got it, will use ArviZ directly.

Re FutureWarning: it seems like wherever the trace is used outside the model object's context ("with model:"), you get a FutureWarning saying "Using from_pymc3 without the model will be deprecated in a future release. Not using the model will return less accurate and less useful results. Make sure you use the model argument or call from_pymc3 within a model context." (Adding "with model:" gets rid of the warning.)

olgadk7 avatar Apr 05 '21 20:04 olgadk7

Re FutureWarning: it seems like wherever the trace is used outside the model object's context ("with model:"), you get a FutureWarning saying "Using from_pymc3 without the model will be deprecated in a future release. Not using the model will return less accurate and less useful results. Make sure you use the model argument or call from_pymc3 within a model context." (Adding "with model:" gets rid of the warning.)

Yes, that warning is probably triggered when plotting, right? What is happening is that, to plot, ArviZ first checks if the input is an InferenceData object; if it isn't, it tries to convert the input to InferenceData automatically. Converting a pm.MultiTrace to InferenceData works with only the info in the trace in many cases, but in some others info from the model instance is also needed, so the converter needs both. Calling the plotting functions from within the model context makes the model available to the converter, but this is not ideal.

The recommended solution is to first convert to InferenceData (or directly use return_inferencedata=True in pm.sample) and then use the InferenceData for plotting.

OriolAbril avatar Apr 05 '21 23:04 OriolAbril

That worked, thank you for explaining! And you're right: triggered when plotting. Should I then add return_inferencedata=True in every pm.sample in the notebook so that there aren't any of those FutureWarnings?

olgadk7 avatar Apr 07 '21 02:04 olgadk7

Should I then add return_inferencedata=True in every pm.sample in the notebook so that there aren't any of those FutureWarnings?

Yes. Moreover, return_inferencedata=True will be the default in the next version, so if at some place you wanted to avoid InferenceData, you'd need to be explicit and manually pass False. The docs should show how to use InferenceData and other best practices, though.

OriolAbril avatar Apr 07 '21 07:04 OriolAbril

After a cascade of changes resulting from a simple InferenceData change (including updating pymc3 and theano-pymc, as well as my whole environment, and conda-removing theano), I'm up against a non-descriptive "NotImplementedError" when calling aesara methods listed in its docs. I've looked for solutions everywhere; please advise.

This is the line causing it: lambda_ = pm.Deterministic("lambda_", T.outer(T.exp(beta * df.metastasized), lambda0)), except I replaced "T." with "at." since there was a big messy compilation error after removing theano. Am I off track?

olgadk7 avatar Apr 12 '21 23:04 olgadk7

Which versions do you have installed/are using? You shouldn't be using Aesara yet, the latest pymc3 stable release (3.11.2) runs on theano-pymc only: https://github.com/pymc-devs/pymc3/blob/v3.11.2/requirements.txt

OriolAbril avatar Apr 12 '21 23:04 OriolAbril

Oh ok, I was under the impression that theano-pymc is Aesara, since that's where the theano-pymc link leads on the installation guide, as well as on the "PyMC3 and Theano" page. So to run theano-pymc, I use "import theano"? Versions:

Python implementation: CPython
Python version       : 3.7.10
IPython version      : 7.22.0

arviz             : 0.11.2
cachetools        : 4.2.1
dill              : 0.3.3
numpy             : 1.19.2
pandas            : 1.2.3
patsy             : 0.5.1
scipy             : 1.6.2
theano-pymc       : 1.1.2
typing_extensions : 3.7.4.3
pymc3             : 3.11.1
aesara            : 2.0.6

Thanks!

olgadk7 avatar Apr 12 '21 23:04 olgadk7

So to run theano-pymc, I use "import theano"?

Yes! I think that should be enough for everything to work.

OriolAbril avatar Apr 13 '21 00:04 OriolAbril

So I'm back to that big messy compilation error (below). Everything I read about it is either old, contradictory, or beyond me. Do you have an idea what it's about? Thank you.

INFO (theano.gof.compilelock): Refreshing lock /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/lock_dir/lock

You can find the C code in this temporary file: /var/folders/n9/bmmyc8hd0x505y9fd0hm1mc40000gn/T/theano_compilation_error_8n8o433a
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-16-bceb15e5e724> in <module>
      5     beta = pm.Normal("beta", 0, sd=1000)
      6 
----> 7     lambda_ = pm.Deterministic("lambda_", T.outer(T.exp(beta * df.metastasized), lambda0))
      8     mu = pm.Deterministic("mu", exposure * lambda_)
      9 

~/anaconda3/lib/python3.7/site-packages/pymc3/model.py in Deterministic(name, var, model)
   1431     """
   1432     model = modelcontext(model)
-> 1433     var = var.copy(model.name_for(name))
   1434     model.deterministics.append(var)
   1435     model.add_random_variable(var)

~/anaconda3/lib/python3.7/site-packages/theano/tensor/var.py in copy(self, name)
    620         Does not copy the tags.
    621         """
--> 622         copied_variable = theano.tensor.basic.tensor_copy(self)
    623         copied_variable.name = name
    624         return copied_variable

~/anaconda3/lib/python3.7/site-packages/theano/gof/op.py in __call__(self, *inputs, **kwargs)
    668                 # compute output value once with test inputs to validate graph
    669                 thunk = node.op.make_thunk(node, storage_map, compute_map,
--> 670                                            no_recycling=[])
    671                 thunk.inputs = [storage_map[v] for v in node.inputs]
    672                 thunk.outputs = [storage_map[v] for v in node.outputs]

~/anaconda3/lib/python3.7/site-packages/theano/gof/op.py in make_thunk(self, node, storage_map, compute_map, no_recycling, impl)
    953             try:
    954                 return self.make_c_thunk(node, storage_map, compute_map,
--> 955                                          no_recycling)
    956             except (NotImplementedError, utils.MethodNotDefined):
    957                 # We requested the c code, so don't catch the error.

~/anaconda3/lib/python3.7/site-packages/theano/gof/op.py in make_c_thunk(self, node, storage_map, compute_map, no_recycling)
    856         _logger.debug('Trying CLinker.make_thunk')
    857         outputs = cl.make_thunk(input_storage=node_input_storage,
--> 858                                 output_storage=node_output_storage)
    859         thunk, node_input_filters, node_output_filters = outputs
    860 

~/anaconda3/lib/python3.7/site-packages/theano/gof/cc.py in make_thunk(self, input_storage, output_storage, storage_map, keep_lock)
   1215         cthunk, module, in_storage, out_storage, error_storage = self.__compile__(
   1216             input_storage, output_storage, storage_map,
-> 1217             keep_lock=keep_lock)
   1218 
   1219         res = _CThunk(cthunk, init_tasks, tasks, error_storage, module)

~/anaconda3/lib/python3.7/site-packages/theano/gof/cc.py in __compile__(self, input_storage, output_storage, storage_map, keep_lock)
   1155                                             output_storage,
   1156                                             storage_map,
-> 1157                                             keep_lock=keep_lock)
   1158         return (thunk,
   1159                 module,

~/anaconda3/lib/python3.7/site-packages/theano/gof/cc.py in cthunk_factory(self, error_storage, in_storage, out_storage, storage_map, keep_lock)
   1622                 node.op.prepare_node(node, storage_map, None, 'c')
   1623             module = get_module_cache().module_from_key(
-> 1624                 key=key, lnk=self, keep_lock=keep_lock)
   1625 
   1626         vars = self.inputs + self.outputs + self.orphans

~/anaconda3/lib/python3.7/site-packages/theano/gof/cmodule.py in module_from_key(self, key, lnk, keep_lock)
   1187             try:
   1188                 location = dlimport_workdir(self.dirname)
-> 1189                 module = lnk.compile_cmodule(location)
   1190                 name = module.__file__
   1191                 assert name.startswith(location)

~/anaconda3/lib/python3.7/site-packages/theano/gof/cc.py in compile_cmodule(self, location)
   1525                 lib_dirs=self.lib_dirs(),
   1526                 libs=libs,
-> 1527                 preargs=preargs)
   1528         except Exception as e:
   1529             e.args += (str(self.fgraph),)

~/anaconda3/lib/python3.7/site-packages/theano/gof/cmodule.py in compile_str(module_name, src_code, location, include_dirs, lib_dirs, libs, preargs, py_module, hide_symbols)
   2397             # difficult to read.
   2398             raise Exception('Compilation failed (return status=%s): %s' %
-> 2399                             (status, compile_stderr.replace('\n', '. ')))
   2400         elif config.cmodule.compilation_warning and compile_stderr:
   2401             # Print errors just below the command line.

Exception: ("Compilation failed (return status=1): /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:400:27: error: non-constant-expression cannot be narrowed from type 'npy_intp' (aka 'long') to 'int' in initializer list [-Wc++11-narrowing].     int init_totals[2] = {V3_n0, V3_n1};.                           ^~~~~. /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:400:27: note: insert an explicit cast to silence this issue.     int init_totals[2] = {V3_n0, V3_n1};.                           ^~~~~.                           static_cast<int>( ). /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:400:34: error: non-constant-expression cannot be narrowed from type 'npy_intp' (aka 'long') to 'int' in initializer list [-Wc++11-narrowing].     int init_totals[2] = {V3_n0, V3_n1};.                                  ^~~~~. /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:400:34: note: insert an explicit cast to silence this issue.     int init_totals[2] = {V3_n0, V3_n1};.                                  ^~~~~.                                  static_cast<int>( ). /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:412:9: error: non-constant-expression cannot be narrowed from type 'ssize_t' (aka 'long') to 'int' in initializer list [-Wc++11-narrowing].         V3_stride0, V3_stride1, .         ^~~~~~~~~~. /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:412:9: note: insert an explicit cast to silence this issue.         V3_stride0, V3_stride1, .         ^~~~~~~~~~.         static_cast<int>( ). 
/Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:412:21: error: non-constant-expression cannot be narrowed from type 'ssize_t' (aka 'long') to 'int' in initializer list [-Wc++11-narrowing].         V3_stride0, V3_stride1, .                     ^~~~~~~~~~. /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:412:21: note: insert an explicit cast to silence this issue.         V3_stride0, V3_stride1, .                     ^~~~~~~~~~.                     static_cast<int>( ). /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:413:1: error: non-constant-expression cannot be narrowed from type 'ssize_t' (aka 'long') to 'int' in initializer list [-Wc++11-narrowing]. V1_stride0, V1_stride1. ^~~~~~~~~~. /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:413:1: note: insert an explicit cast to silence this issue. V1_stride0, V1_stride1. ^~~~~~~~~~. static_cast<int>( ). /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:413:13: error: non-constant-expression cannot be narrowed from type 'ssize_t' (aka 'long') to 'int' in initializer list [-Wc++11-narrowing]. V1_stride0, V1_stride1.             ^~~~~~~~~~. /Users/olgakahn/.theano/compiledir_Darwin-19.6.0-x86_64-i386-64bit-i386-3.7.10-64/tmpss108u77/mod.cpp:413:13: note: insert an explicit cast to silence this issue. V1_stride0, V1_stride1.             ^~~~~~~~~~.             static_cast<int>( ). 6 errors generated.. ", '[Elemwise{identity}(<TensorType(float64, matrix)>)]')

olgadk7 avatar Apr 13 '21 17:04 olgadk7

I always start by deleting my ~/.theano directory. That sometimes fixes odd compilation issues (but not always).

fonnesbeck avatar Apr 13 '21 17:04 fonnesbeck

Thank you, that problem was seemingly caused by conda-forge installing an older version.

I have another question, please: what's the best way to realign the trace's shape now that it is an InferenceData object (after being a MultiTrace)? I'm getting a "ValueError: operands could not be broadcast together with shapes" when doing trace.posterior["lambda0"] * np.exp(np.atleast_2d(trace.posterior["beta"]).T).

InferenceData shape:

print(trace.posterior["lambda0"].shape)
print(trace.posterior["beta"].shape)

(4, 1000, 76) (4, 1000)

MultiTrace shape:

print(MultiTrace_trace["lambda0"].shape)
print(MultiTrace_trace["beta"].shape)

(4000, 76) (4000,)

Appreciate your guidance.

olgadk7 avatar Apr 15 '21 01:04 olgadk7

I have another question, please: what's the best way to realign the trace's shape now that it is an InferenceData object (after being a MultiTrace)? I'm getting a "ValueError: operands could not be broadcast together with shapes" when doing trace.posterior["lambda0"] * np.exp(np.atleast_2d(trace.posterior["beta"]).T).

One of the coolest features of InferenceData is that the data is stored in xarray Datasets. xarray can be seen as the pandas equivalent for n-d data; it uses named dimensions and coordinates, which allow it to align and broadcast arrays automatically. All the code above with atleast_2d and the transpose is now unnecessary: you can simply do trace.posterior["lambda0"] * np.exp(trace.posterior["beta"]).
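A quick sketch of that alignment, using the same shapes as the trace above ((4, 1000, 76) for lambda0 and (4, 1000) for beta); the values are random placeholders:

```python
# Sketch of xarray's named-dimension alignment with placeholder values.
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
lambda0 = xr.DataArray(
    rng.random((4, 1000, 76)), dims=("chain", "draw", "lambda0_dim_0")
)
beta = xr.DataArray(rng.random((4, 1000)), dims=("chain", "draw"))

# No atleast_2d or transpose needed: broadcasting follows dimension names.
hazard = lambda0 * np.exp(beta)
print(hazard.dims)  # ('chain', 'draw', 'lambda0_dim_0')
```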

OriolAbril avatar Apr 15 '21 02:04 OriolAbril

Quick question: ArviZ's label guide says all plotting functions support the 'labeller' argument, but I don't see it in plot_hdi. How do we attach labels to the HDI plot's legend otherwise? P.S. It's been nice to discover the world of xarray, thank you!

olgadk7 avatar Apr 17 '21 02:04 olgadk7

plot_hdi doesn't support a legend "by itself", but you can use, for example, fill_kwargs={"label": "var_name"} to get one; it will look like https://docs.pymc.io/notebooks/multilevel_modeling.html. Maybe we should be explicit about that in the label guide. Basically, there are a few plotting functions in ArviZ that are "low level": they don't take InferenceData as input and are generally called by other plotting functions, so they behave a bit differently from the others, especially when it comes to labelling, since they are designed to take array inputs and labels must be passed to them explicitly. The low level functions are plot_kde, plot_dist and plot_hdi (which has no higher level counterpart for now; hopefully we'll get to add regression-specific plots this summer that use it).
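A sketch of that fill_kwargs workaround, assuming arviz >= 0.11; the data and the label text are placeholders:

```python
# Hedged sketch: getting a legend entry out of az.plot_hdi via fill_kwargs.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import numpy as np
import arviz as az

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
# fake (chain, draw, x)-shaped posterior draws of a regression line
y = rng.normal(loc=2 * x, scale=0.5, size=(4, 500, 20))

ax = az.plot_hdi(x, y, fill_kwargs={"label": "94% HDI", "alpha": 0.4})
ax.legend()  # the label passed via fill_kwargs shows up here
```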

Also note that the ArviZ docs are for the development version, not for the release: the labeller feature has not been released yet. To use it you'll need to install the ArviZ development version, which I'd generally recommend (there aren't any differences in testing between releases and commits to main, and we don't use release candidates), except for examples and situations where you want to pin a specific release.

OriolAbril avatar Apr 17 '21 02:04 OriolAbril

Hopefully one last piece I don’t get:

I'm trying to plot on the y-axis the mean of a DataArray, tv_met_hazard.mean(dim=("chain", "draw")), where

tv_met_hazard = time_varying_trace.posterior["lambda0"] * np.exp(time_varying_trace.posterior["beta"])

and has coords chain, draw, lambda0_dim_0, and beta_dim_0.

It returns the same values as its MultiTrace version but with a different mean of means, and in 2-D:

tv_met_hazard.mean(("chain", "draw")).shape  # (76, 76)

and even

tv_met_hazard.mean(("chain", "draw")).values.shape  # (76, 76)

While the similarly constructed DataArray

met_hazard = trace.posterior["lambda0"] * np.exp(trace.posterior["beta"])

has coords chain, draw, and lambda0_dim_0, and returns

met_hazard.mean(("chain", "draw")).shape  # (76,)

The only difference is that the former DataArray's beta parameter is modeled with pm.GaussianRandomWalk and the latter's beta with pm.Normal. I've also tried to include the beta_dim_0 dimension when calculating the mean, but the values are off.

What am I missing? Thank you.

olgadk7 avatar Apr 21 '21 03:04 olgadk7

InferenceData uses named dimensions; if no dimension names are given, default names based on the variable name are used. This is where lambda0_dim_0 and beta_dim_0 come from. When you combine the arrays, the broadcasting and alignment are based on the dimension names, not on the shapes. You are therefore combining arrays with dims chain, draw, lambda0_dim_0 and chain, draw, beta_dim_0, so the result is an array with dims chain, draw, lambda0_dim_0, and beta_dim_0. As these two dimensions are actually the same, you have to use named dimensions when writing the model (recommended) or when converting to InferenceData. I wrote a blogpost about this: https://oriolabril.github.io/oriol_unraveled/python/arviz/pymc3/xarray/2020/09/22/pymc3-arviz.html

In MultiTrace, you have shapes (4, 1000, 76) and (4, 1000, 76) (I guessed the number of chains and draws based on the defaults), so the result is (4, 1000, 76) (you may even have 4000 with chain and draw flattened). In this particular case the operation works automatically, but if you had (4, 1000, 76) and (4, 1000, 35) and wanted a (4, 1000, 76, 35) output, you'd need to do a[:, :, :, None] + b[:, :, None, :], whereas in xarray, with correct dimension names, it is still just a + b and everything works.
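The contrast can be sketched like this; the shapes are shrunk to (4, 100, ...) to keep the example light, and the dimension names a_dim and b_dim are placeholders:

```python
# Sketch contrasting positional numpy broadcasting with xarray's named dims.
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
a = rng.random((4, 100, 76))
b = rng.random((4, 100, 35))

# numpy: the extra axes must be inserted by hand
out_np = a[:, :, :, None] + b[:, :, None, :]

# xarray: distinct dimension names give the same result with a plain +
xa = xr.DataArray(a, dims=("chain", "draw", "a_dim"))
xb = xr.DataArray(b, dims=("chain", "draw", "b_dim"))
out_xr = xa + xb

print(out_np.shape, out_xr.shape)  # both (4, 100, 76, 35)
```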

OriolAbril avatar Apr 21 '21 12:04 OriolAbril

Thank you! Submitted the commit https://github.com/pymc-devs/pymc-examples/pull/96. Happy to hear if any other changes are needed.

olgadk7 avatar Apr 26 '21 18:04 olgadk7

Can I take this one to update it to v4?

cuchoi avatar Jun 05 '22 03:06 cuchoi

Reopening to place in the v4 branch, as it still needs style updates.

OriolAbril avatar Jun 06 '22 21:06 OriolAbril