
pass names from RV dims to change_rv_size

Open flo-schu opened this issue 3 years ago • 3 comments

@canyon289 To resolve #5669, I have made some small changes, mainly to aesaraf.py, to include the dim names in the new_size tensor (https://github.com/pymc-devs/pymc/commit/6585f9021a17a7f0c9c833bf4700fab240d2164b)

This mostly resolves the issue, and test_model runs without errors. There are a few things that probably still need to be addressed:

  • do dims passed to change_rv_size need to be tested more rigorously?
  • is None a good default name for the broadcast tensor dimension?
  • should the name indicate whether the broadcasting comes from observed data or from some other process?
  • things that I'm not aware of :)

Changes in the PR

  • adds a new_size_dims argument to change_rv_size in aesaraf.py
  • checks whether dims is None and wraps it in a tuple if the RV has no dims
  • creates new_size_name by combining the dim names with some decoration
  • passes new_size_name as the name argument to the at.as_tensor call (roughly as sketched below)
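
A rough sketch of that naming logic, pulled out as a standalone helper for illustration (the helper name named_new_size and the exact name format are hypothetical; in the PR the logic lives inside change_rv_size in aesaraf.py):

import aesara.tensor as at

def named_new_size(new_size, new_size_dims=None):
    # Hypothetical helper: `new_size` is a sequence of per-dimension lengths
    # (ints or scalar variables). Build a readable name from the dim labels
    # and attach it to the size tensor so it shows up in graph printouts.
    new_size = tuple(new_size)
    if new_size_dims is None:
        # None stands in for a broadcast dimension without a user-supplied dim label
        new_size_dims = (None,) * len(new_size)
    new_size_name = "size[" + ", ".join(str(d) for d in new_size_dims) + "]"
    return at.as_tensor(new_size, name=new_size_name, ndim=1)

For example, named_new_size((5, 7), ("d1", "d2")) would return a length-2 integer vector named "size[d1, d2]".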

flo-schu avatar Jun 26 '22 13:06 flo-schu

Thanks much for the PR! Are you planning to add tests?

canyon289 avatar Jun 26 '22 13:06 canyon289

Codecov Report

Merging #5931 (6585f90) into main (be048a4) will decrease coverage by 0.92%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5931      +/-   ##
==========================================
- Coverage   89.53%   88.60%   -0.93%     
==========================================
  Files          73       73              
  Lines       13267    13270       +3     
==========================================
- Hits        11878    11758     -120     
- Misses       1389     1512     +123     
Impacted Files                       Coverage Δ
pymc/aesaraf.py                      92.13% <100.00%> (+0.05%) ↑
pymc/distributions/distribution.py   90.83% <100.00%> (ø)
pymc/variational/updates.py          37.93% <0.00%> (-54.19%) ↓
pymc/backends/arviz.py               86.53% <0.00%> (-4.09%) ↓
pymc/step_methods/hmc/base_hmc.py    89.76% <0.00%> (-0.79%) ↓
pymc/model.py                        87.79% <0.00%> (-0.27%) ↓

codecov[bot] avatar Jun 26 '22 13:06 codecov[bot]

This PR should now be a bit more straightforward due to #6063: dims will now always be used when shape is not provided.

ricardoV94 avatar Aug 29 '22 10:08 ricardoV94

Hi folks, I was interested in this so I took a look with the latest codebase. As @flo-schu found, it's not sufficient to simply add the dim name to the call to pytensor.tensor.constant here: https://github.com/pymc-devs/pymc/blob/e1d36cacb2cfb946f56872ee03e8678dc6ccf7f4/pymc/model.py#L1066-L1069

Somewhere along the RV creation call stack, the name of the Variable holding dim length gets lost. I believe this is actually happening in pytensor.tensor.random.utils.normalize_size_parameter(), here https://github.com/pymc-devs/pytensor/blob/a1a805c9119e986be9f3e08591daf3698cede0ef/pytensor/tensor/random/utils.py#L136

With shared variables, the call to as_tensor_variable returns a MakeVector, which preserves the original variable(s) holding dim lengths. But with constant variables, as_tensor_variable squashes the dim length(s) into one TensorConstant and wipes out the names.
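
A minimal sketch of that difference (the data/variable names here are only for illustration):

import numpy as np
import pytensor
import pytensor.tensor as pt

data = pytensor.shared(np.zeros(5), name="data1")
shared_len = data.shape[0]              # symbolic length, still tied to "data1"
const_len = pt.constant(5, name="d1")   # what a non-mutable dim length looks like

pytensor.dprint(pt.as_tensor_variable((shared_len,)))  # MakeVector graph; "data1" survives as an input
print(pt.as_tensor_variable((const_len,)).name)        # None -- folded into an unnamed TensorConstant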

Using the following model,

import pymc as pm

with pm.Model() as m:
    mutable = True
    d1 = pm.Data("data1", range(5), dims="d1", mutable=mutable)
    d2 = pm.Data("data2", range(7), dims="d2", mutable=mutable)
    x = pm.Normal("x", dims=("d1", "d2"))
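
(To reproduce the traces below, one can patch a local pytensor checkout and drop a debug print just before and after that call, along these lines:)

# hypothetical local edit in pytensor/tensor/random/utils.py, for inspection only
from pytensor.printing import debugprint
debugprint(size)  # print the size graph(s) before/after the as_tensor_variable call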

Before that line in normalize_size_parameter, dprint(size) shows

Subtensor{int64} [id A]
 |Shape [id B]
 | |data1 [id C]
 |ScalarConstant{0} [id D]
Subtensor{int64} [id E]
 |Shape [id F]
 | |data2 [id G]
 |ScalarConstant{0} [id H]

and after,

MakeVector{dtype='int64'} [id A]
 |Subtensor{int64} [id B]
 | |Shape [id C]
 | | |data1 [id D]
 | |ScalarConstant{0} [id E]
 |Subtensor{int64} [id F]
   |Shape [id G]
   | |data2 [id H]
   |ScalarConstant{0} [id I]

With pymc patched as described above to pass along name to pytensor.tensor.constant when creating dims, we can try the same model but with mutable=False. That line in normalize_size_parameter changes

d1{5} [id A]
d2{7} [id B]

to

TensorConstant{[5 7]} [id A]

So, what to do? If this issue is still of interest, I'd be glad to work on it given some advice.

covertg avatar Feb 24 '23 05:02 covertg

N.B. the TensorConstant name disappears even if there is only one dim, e.g.

>>> import pytensor.tensor as pt
>>> print(pt.as_tensor_variable((pt.constant(5, name="disappears"),)).name)
None

because normalize_size_parameter will always receive a tuple.

A couple of naive thoughts:

  • We could preempt pytensor's (re)creation of the TensorConstant by always turning the dim length information into a more complex Variable, somewhere close to the line below. For example, call pt.join(create_size) before the call to rv_op so that the dim length is always a MakeVector (see the sketch after this list).
    • https://github.com/pymc-devs/pymc/blob/e1d36cacb2cfb946f56872ee03e8678dc6ccf7f4/pymc/distributions/distribution.py#L388-L389
    • But we might break things if there are parts of the codebase that rely on the shape information being a TensorConstant.
  • Or perhaps allow the TensorConstants to be squashed into one, but preserve the name information by concatenating the original names if available (this would be a pytensor change).
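
To make the first idea a bit more concrete, here is a minimal sketch (using pt.stack rather than pt.join, since the dim lengths are scalar constants; the names are purely illustrative) of how keeping the lengths in a MakeVector preserves the named inputs:

import pytensor
import pytensor.tensor as pt

d1_len = pt.constant(5, name="d1")
d2_len = pt.constant(7, name="d2")

# Current behavior (as described above): the tuple of named constants is folded
# into a single unnamed TensorConstant.
print(pt.as_tensor_variable((d1_len, d2_len)).name)  # None

# Wrapping the lengths in a MakeVector first keeps the named constants as inputs.
size = pt.stack([d1_len, d2_len])
pytensor.dprint(size)  # MakeVector with d1 and d2 as its inputs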

covertg avatar Feb 24 '23 05:02 covertg

@covertg thanks for investigating. This issue was more relevant before, when all dims used SharedVariables and hence names were preserved. With TensorConstants, PyTensor is very liberal and happy to return objects that aren't linked to the original ones, as you found out. I think we can simply close the PR and issue. It was a nice-to-have, but it doesn't seem worth any further trouble at this point.

ricardoV94 avatar Feb 26 '23 12:02 ricardoV94