nimble icon indicating copy to clipboard operation
nimble copied to clipboard

Derived quantities

Open danielturek opened this issue 7 months ago • 2 comments

Adds derived quantities to the MCMC.

Let's see how many issues the testing turns up.

danielturek avatar May 13 '25 13:05 danielturek

@paciorek Do you have any idea what's going on with the testing failures?

danielturek avatar May 14 '25 12:05 danielturek

Looks like we can't use Ubuntu 20.04 for Linux anymore. I'll need to update that.

Not sure why Windows is having a problem with the voluminous Eigen warnings. I'll check again once I get the Linux runner working.

paciorek avatar May 14 '25 21:05 paciorek

I merged devel into derived_quantities to address a few testing issues that I had fixed in recent weeks. That should fix all but the mcmc and filtering gold file issues.

Let's see what this run of CI shows in terms of the gold file issues. Hopefully those are heisen-failures.

paciorek avatar Jun 20 '25 15:06 paciorek

There's a failure in test-bnp that I can replicate running manually on my machine. It occurs in the MCMC compilation in this particular chunk of code:

  code=nimbleCode(
    {
      xi[1:10] ~ dCRP(1 , size=10)
      thetatilde[1] ~ dnorm(0, 1)
      thetatilde[2] ~ dt(0, 1, 1)
      thetatilde[3] ~ dt(0, 1, 1)
      s2tilde[1] ~ dinvgamma(2, 1)
      s2tilde[2] ~ dgamma(1, 1)
      s2tilde[3] ~ dgamma(1, 1)
      for(i in 1:10){
        y[i] ~ dnorm(thetatilde[xi[i]], var=s2tilde[xi[i]])
      }
    }
  )
  Inits=list(xi=rep(1, 10), thetatilde=rep(0,3), s2tilde=rep(1,3))#
  Data=list(y=rnorm(10, 0,1))
  m <- nimbleModel(code, data=Data, inits=Inits)
  cm <- compileNimble(m)
  mConf <- configureMCMC(m, monitors =  c('thetatilde', 's2tilde', 'xi'))
  expect_message(mMCMC <- buildMCMC(mConf), "The number of clusters")
  cMCMC <- compileNimble(mMCMC, project = m) 
  cMCMC$run(1, reset=FALSE) 
[...snip...]
Compiling
  [Note] This may take a minute.
  [Note] Use 'showCompilerOutput = TRUE' to see C++ compilation details.
terminate called after throwing an instance of 'std::length_error'
  what():  vector::_M_default_append
Aborted

paciorek avatar Jun 20 '25 17:06 paciorek

@paciorek What is the intention of including reset = FALSE in the call

cMCMC$run(1, reset=FALSE) 

?

I believe that's what's causing the problem here, and the MCMC system is not designed to run with reset = FALSE, unless a prior run of the MCMC has already taken place.

danielturek avatar Jun 21 '25 02:06 danielturek

Hmm.

  1. I don't know why I put reset=FALSE in that test.
  2. That said, this test has been fine for years. And to confirm, I just ran it on current nimble and does run without error.
  3. I would hope that even if we don't intend users to use reset=FALSE on first run that it would not cause an error and certainly not a crash. That said, I see that we tell users not to use FALSE on first run in roxygen.

I'm ok with taking out reset=FALSE but before doing so, I think it would be good if we understand why it causes problems now but not before.

paciorek avatar Jun 22 '25 17:06 paciorek

@paciorek I'm away from a computer this week, this is from my phone; forgive briefness. This is caused by some reorganization of the initialization code I did with the setup of the MCMC, wrt mvSamples and related objects. The changes (alongside new initialization code for the derived quantities) is more logical and correct, for how the MCMC is designed to operate. I wouldn't mind also adding an error trap for this case (first run of MCMC using reset = FALSE), which would error trap this case, but I can't do that now. What do you think is best?

danielturek avatar Jun 22 '25 19:06 danielturek

To me the most natural thing is that if one does reset=FALSE on first run of the MCMC it has no effect. I.e. if one does not reset MCMC sampling quantities before even running the MCMC, then the MCMC just runs as it otherwise would with whatever the initial quantities are set to..

But you have probably thought this through/understand this more than I do, so if you prefer to error trap this case, it is ok with me.

Perhaps when you work on this you can just remove the reset=FALSE from that test in test-bnp.R.

No immediate hurry here, but I'll note to myself that we are planning on making this change as part of 1.4.0 release.

paciorek avatar Jun 22 '25 19:06 paciorek

If I'm seeing this correctly, iIt looks like the testing failures are related to the MCMC gold file. Chris, are you able to look more closely at those, at any point?

On Sun, Jun 22, 2025 at 3:59 PM Christopher Paciorek < @.***> wrote:

paciorek left a comment (nimble-dev/nimble#1547) https://github.com/nimble-dev/nimble/pull/1547#issuecomment-2994415603

To me the most natural thing is that if one does reset=FALSE on first run of the MCMC it has no effect. I.e. if one does not reset MCMC sampling quantities before even running the MCMC, then the MCMC just runs as it otherwise would with whatever the initial quantities are set to..

But you have probably thought this through/understand this more than I do, so if you prefer to error trap this case, it is ok with me.

Perhaps when you work on this you can just remove the reset=FALSE from that test in test-bnp.R.

No immediate hurry here, but I'll note to myself that we are planning on making this change as part of 1.4.0 release.

— Reply to this email directly, view it on GitHub https://github.com/nimble-dev/nimble/pull/1547#issuecomment-2994415603, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABCNNYLFRIY74WN55PZ2DZD3E4DJLAVCNFSM6AAAAAB5ATJWNCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDSOJUGQYTKNRQGM . You are receiving this because you authored the thread.Message ID: @.***>

danielturek avatar Jun 25 '25 01:06 danielturek

Sure -- I need to look at the MCMC gold file for another PR, so it may be related.

paciorek avatar Jun 25 '25 14:06 paciorek

Thanks. I made a minor fix for the MCMC reset argument.

On Wed, Jun 25, 2025 at 10:25 AM Christopher Paciorek < @.***> wrote:

paciorek left a comment (nimble-dev/nimble#1547) https://github.com/nimble-dev/nimble/pull/1547#issuecomment-3004989556

Sure -- I need to look at the MCMC gold file for another PR, so it may be related.

— Reply to this email directly, view it on GitHub https://github.com/nimble-dev/nimble/pull/1547#issuecomment-3004989556, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABCNNYJCLGXZWY7V6WQVLBL3FKWL7AVCNFSM6AAAAAB5ATJWNCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZTAMBUHE4DSNJVGY . You are receiving this because you authored the thread.Message ID: @.***>

danielturek avatar Jun 25 '25 17:06 danielturek

@danielturek there's a problem with your fix:

** byte-compile and prepare package for lazy loading
Warning in .checkFieldsInMethod(def, fieldNames, allMethods) :
  non-local assignment to non-field names (possibly misspelled?)
    reset <<- TRUE
( in method "run" for class "MCMC")

paciorek avatar Jun 25 '25 23:06 paciorek

@paciorek I just made a small update.

danielturek avatar Jun 26 '25 01:06 danielturek

So some (not sure if all) of the gold file problems seem to be caused by a new space character inserted in the reporting of sampler assignment.

E.g., in line 2709 of mcmcTestLog_Correct.Rout, we have:

RW sampler: a[2]

The saved gold file has no space after "[2]", but running tests now produces a space after "a[2]".

On first glance, it looks like this is coming from the change to cat() in the show method for samplerConf.

paciorek avatar Jun 26 '25 15:06 paciorek

I think all of the mcmc gold file issues are the extra space. Hopefully the filtering gold issues are too.

paciorek avatar Jun 26 '25 15:06 paciorek

That sounds about right, and yes, I would prefer the space not be there. Is there any chance of re-writing the gold file? I suspect you'll say the correct path is to reintroduce the space, the (theoretically) pass testing, then re-remove the space.

danielturek avatar Jun 26 '25 16:06 danielturek

@danielturek

  • Ok to merge this in? Do you want to do it?
  • I see various roxygen entries, so I am guessing roxygen is more-or-less done, but wanted to check.
  • Were you planning to draft a section for the manual or would you like me to do that?

paciorek avatar Jun 27 '25 17:06 paciorek

@paciorek

  • Yes, I consider this ready to merge. I'll take care of it.
  • Yes, the roxygen is accurate.
  • There is currently no manual entry. If you're willing to draft that I'd appreciate it. It might be reasonable to (in a large part) just reference the vignette; or if not, you could repurpose a lot of information from there.

danielturek avatar Jun 28 '25 10:06 danielturek

@paciorek Looks like I didn't update the NEWS file, however.

danielturek avatar Jun 28 '25 15:06 danielturek

That's fine -- as of late I've just been doing NEWS all in one go right before release by looking through all the commits/closed PRs/closed issues.

I'm not sure how to feel about vignette vs. manual. Modularity can be nice and I know other packages structure docs around vignettes and the manual can be overwhelming, but I also like have all the content in one place, particularly since we're thinking of this as a core part of the MCMC system.

@perrydv @kenkellner do you have any inclinations in terms of how much to say about derived quantities in the manual vs. the material already in the vignette?

paciorek avatar Jun 28 '25 16:06 paciorek

The existing vignette covers everything and is pretty concise, so maybe it just all goes in the manual?

If most of it stays in the vignette, is it OK that the vignette is not actually included with the package but instead is on Daniel's site? Not sure if there are other situations like that in the manual.

kenkellner avatar Jun 29 '25 12:06 kenkellner

I'm working on manual content for derived quantities.

@danielturek (and no need to respond until you have time) in your vignette, you distinguish between (1) "predictive nodes" and (2) "posterior derived quantities". I think by the latter you mean deterministic predictive "quantities", but elsewhere in our docs we refer to both determ and stoch predictive cases as "predictive nodes". So I am going to modify your language and not distinguish between (1) and (2). Please let me know if I'm misunderstanding or you otherwise are opposed to me modifying your language.

paciorek avatar Jul 03 '25 17:07 paciorek

@paciorek Yes, in the text I generally said "predictive nodes" for PP stochastic nodes, and "posterior derived quantities" for PP deterministic nodes. And yes, as you said both of these are just different cases of "predictive nodes". It sounds like we're on the same page, and if you've already modified text according to how you think this should be worded, that's fine with me.

danielturek avatar Jul 07 '25 18:07 danielturek