posterior icon indicating copy to clipboard operation
posterior copied to clipboard

New helper function repair_variable_names

Open jgabry opened this issue 1 year ago • 8 comments

closes #317

Summary

repair_variable_names converts names using periods (theta.1.1) to square brackets (`theta[1,1])

Copyright and Licensing

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

  • Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
  • Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

jgabry avatar Nov 17 '23 18:11 jgabry

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Comparison is base (c6ab463) 95.60% compared to head (0ebe8e9) 95.61%.

:exclamation: Current head 0ebe8e9 differs from pull request most recent head 00d32c6. Consider uploading reports for the commit 00d32c6 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #318   +/-   ##
=======================================
  Coverage   95.60%   95.61%           
=======================================
  Files          47       47           
  Lines        3686     3694    +8     
=======================================
+ Hits         3524     3532    +8     
  Misses        162      162           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Nov 17 '23 18:11 codecov-commenter

Maybe repair isn't the best name actually? @avehtari suggested convert in https://github.com/stan-dev/posterior/issues/317#issuecomment-1816897197. I'm open to other suggestions too.

jgabry avatar Nov 17 '23 18:11 jgabry

I like "repair". Having the dots instead of [] is kind of a thing we don't really want. So we are repairing them to look reasonable again. But I also don't have strong opinions. So completely fine by me if we call it differently.

paul-buerkner avatar Nov 17 '23 23:11 paul-buerkner

Re: function name: is the intention to create a function that specifically converts from Stan names, or is the intention that this function might (in the future) support other non-posterior-friendly naming conventions? I think that should influence the naming; e.g. is this "convert_from_stan_names" (though perhaps less verbose) or the more generic "repair_variable_names" or "convert_variable_names"? Or is it "repair_dotted_indices" or "repair_dotted_index_names" or something like that?

mjskay avatar Nov 18 '23 04:11 mjskay

Good point @mjskay. If this will only ever be used to convert Stan names we could include that in the function name. What do others think?

jgabry avatar Nov 22 '23 00:11 jgabry

I think it would be best to make the function specific for Stan naming conventions, and then add other functions if we add support for other packages. The name "convert_from_stan_names" would be then very descriptive, and I don't know how to make it shorter

Furthermore, in #317 @WardBrian wrote

Complex variables use .real and .imag as their CSV names in Stan. So something like a complex_vector[2] foo has names foo.1.real,foo.1.imag,foo.2.real,foo.2.imag

Tuples use : to separate "slots" in a tuple. So a tuple(real, int) a is printed as a:1,a:2. This stacks naturally with the . for arrays of tuples/tuples of arrays.

If we want to support these, too, I assume that we would like to have something like "foo.1.real" -> "foo_real[1]" and "a:1" -> "a_1"

And the natural complex type support is a separate issue #319 and PR #321

avehtari avatar Nov 22 '23 08:11 avehtari

I agree with Jonah and Aki.

paul-buerkner avatar Nov 22 '23 16:11 paul-buerkner

If you did want to support tuple names, this essentially just needs to split on : and loop over the resulting parts. See the code that does something similar in Stan: https://github.com/stan-dev/stan/blob/667d5b2d7c5ea1072011919a1acd76a9d3061d01/src/stan/io/stan_csv_reader.hpp#L16

WardBrian avatar Nov 27 '23 21:11 WardBrian