HydeNet Deterministic nodes with factor parents

Defining deterministic nodes with factor parents can be difficult, since we have to write formulas that reference factor levels. Currently, the only way to do so is by the integer index of the factor level we want.

Rather than fumble around trying to give you a self-contained example of R code that describes what I'm talking about, I invite you to go into the Decision Networks vignette and attempt defining the payoff utility node using setNode(). Or, if that's actually not possible, try modifying the network structure (e.g., by adding nodes) in such a way that the payoff can be calculated.

In the meantime, I'll switch to working on the Setting Nodes and Getting Started vignettes.

Jun 09 '15 18:06 jarrod-dalton

This is another one I'll spend some time thinking about. I'll come up with something.

Jun 16 '15 00:06 nutterb

Consider this example:

net <- setNode(net, payoff, "determ", define=fromFormula(),
         nodeFormula = payoff ~
                         ifelse(playerFinalPoints > 21, -1,
                           ifelse(playerFinalPoints == 21,
                             ifelse(dealerOutcome == 1, 0,
                               ifelse(dealerOutcome == 7, 0, 1)),
                             ifelse(dealerOutcome == 2,
                               ifelse(playerFinalPoints < 22, 1, -1),
                               ifelse(dealerOutcome == 3,
                                 ifelse(playerFinalPoints == 17, 0,
                                 ifelse(playerFinalPoints > 17, 1, -1)),
                                 ifelse(dealerOutcome == 4,
                                   ifelse(playerFinalPoints == 18, 0,
                                     ifelse(playerFinalPoints > 18, 1, -1)),
                                   ifelse(dealerOutcome == 5,
                                     ifelse(playerFinalPoints == 19, 0,
                                       ifelse(playerFinalPoints > 19, 1, -1)),
                                     ifelse(dealerOutcome == 6,
                                       ifelse(playerFinalPoints == 20, 0,
                                         ifelse(playerFinalPoints > 20, 1, -1)),
                                       ifelse(playerFinalPoints == 21, 0, -1)))))))))

Given the current structure, the only thing I can think that would make it feasible to give the factor level would be to use a utility function here. So if I wanted the equivalent of dealerOutcome == 2, I could use a utility such as

dealerOutcome == numericLevel("Bust", BJDealer$dealerOutcome)

The numericLevel function would then return the number 2.

The upside is that you don't have to remember all of the variable codings. The downside is that it has the potential to be much more typing. But the only other place this gets processed is in writing the JAGS code, and there's no good way to tie the numeric coding to a factor variable at that point.

I'll write up the function. You can tell me if you want to use it at all. :)
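For illustration only, a lookup like this could be sketched in base R as below. This is an assumption about how numericLevel might work, not the function that ended up in HydeNet, and the factor levels are made up for the example rather than taken from BJDealer.

```r
# Hypothetical sketch of numericLevel(): return the integer code of a
# factor level. Not the HydeNet implementation.
numericLevel <- function(level, x) {
  code <- match(level, levels(x))
  if (is.na(code)) {
    stop("'", level, "' is not a level of the supplied factor", call. = FALSE)
  }
  code
}

# Made-up factor standing in for BJDealer$dealerOutcome
dealerOutcome <- factor(c("Bust", "Blackjack", "Bust"),
                        levels = c("Blackjack", "Bust", "Seventeen"))

numericLevel("Bust", dealerOutcome)
```

With these made-up levels, "Bust" is the second level, so the comparison dealerOutcome == numericLevel("Bust", dealerOutcome) reduces to dealerOutcome == 2.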

Oct 30 '15 15:10 nutterb

Is there maybe an escape character that we could use instead of quotation marks, like

#Bust#

which would tell HydeNet to call such a function?

Oct 30 '15 15:10 jarrod-dalton

It's possible we could use something like dealerOutcome == "#Bust,BJDealer$dealerOutcome#", but I don't think that saves much typing. The major issue is that the rToJags function, which converts R code into JAGS, takes only a single argument: a formula object. The variable has to be passed along with the variable level.

However, as I think about it, we could create our own handy dandy little intermediary function with a weird syntax. For example:

jagsFunc(formula, ...)

where ... takes named arguments, each giving a factor variable referenced in formula.

jagsFunc(payoff ~ dealerOutcome == "#Bust:dealerOutcome#",
         dealerOutcome = BJDealer$dealerOutcome)

returns a formula object payoff ~ dealerOutcome == 2.
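A rough sketch of that idea, assuming the "#Level:variable#" token syntax proposed above. The body is a guess at one way to do the substitution (deparse, replace tokens, re-parse), not the code that was eventually committed, and the factor levels in the usage lines are invented.

```r
# Hypothetical jagsFunc(): replace "#Level:variable#" tokens in a formula
# with the integer code of that level. Factors are passed by name via `...`.
jagsFunc <- function(formula, ...) {
  factors <- list(...)
  txt <- paste(deparse(formula), collapse = " ")
  pattern <- "\"#([^:#]+):([^:#]+)#\""
  tokens <- regmatches(txt, gregexpr(pattern, txt))[[1]]
  for (token in tokens) {
    parts <- regmatches(token, regexec(pattern, token))[[1]]
    level <- parts[2]
    variable <- parts[3]
    code <- match(level, levels(factors[[variable]]))
    if (is.na(code)) {
      stop("Cannot find level '", level, "' for '", variable, "'", call. = FALSE)
    }
    # Swap the quoted token for its integer code
    txt <- sub(token, as.character(code), txt, fixed = TRUE)
  }
  as.formula(txt, env = environment(formula))
}

# Invented levels standing in for BJDealer$dealerOutcome
dealer <- factor("Bust", levels = c("Blackjack", "Bust"))
jagsFunc(payoff ~ dealerOutcome == "#Bust:dealerOutcome#",
         dealerOutcome = dealer)
```

Working on the deparsed text keeps the sketch short, but it inherits the fragility of string manipulation; walking the expression tree would be sturdier.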

Alternatively, we might have jagsFunc take a character argument, which would allow jagsFunc("payoff ~ #dealerOutcome == 'Bust'#"). I'm a little nervous about this one, however, because I think it will likely fail if someone tries to use it in any way other than the == sense. I can't think of why anyone would do something like dealerOutcome * "Bust" or what that would mean. Perhaps I'm being paranoid?

I'm rambling. That might actually work.

Oct 30 '15 15:10 nutterb

This is now implemented in the current-devel branch. The final function name is factorFormula, and I even used it in the Decision Networks vignette if you'd like to see it in action. If you feel like you can get behind this, let me know.

Oct 30 '15 18:10 nutterb

Beautiful. Now on to the beggar->chooser transition: is it possible to build logic into the formula evaluation such that if it sees any quoted elements it knows to pass it through factorFormula() without the user explicitly calling it?

Jarrod

Oct 30 '15 18:10 jarrod-dalton

Truthfully, yes. It just means passing every formula through factorFormula within setNode and not exporting factorFormula. (Well, we could still export it, we just wouldn't have to, and I would probably opt not to, since there isn't much need for it otherwise.) Would you like to beg and choose that option?
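To make the "under the hood" idea concrete, here is a minimal sketch of a converter that walks a formula and rewrites every comparison of the form variable == "Level". The names convertLevels and convertFormula and the factorLevels list are hypothetical, and the 1-based coding is an assumption that fits dcat-style nodes; this is not the actual factorFormula code.

```r
# Hypothetical sketch: walk an expression and replace quoted factor levels
# with 1-based integer codes. `factorLevels` is an assumed named list of
# level vectors, one entry per factor node.
convertLevels <- function(expr, factorLevels) {
  if (!is.call(expr)) return(expr)
  isComparison <- identical(expr[[1]], as.name("==")) &&
    is.name(expr[[2]]) && is.character(expr[[3]])
  if (isComparison) {
    variable <- as.character(expr[[2]])
    # Leave comparisons on unknown variables untouched in this sketch
    if (variable %in% names(factorLevels)) {
      expr[[3]] <- as.numeric(match(expr[[3]], factorLevels[[variable]]))
    }
    return(expr)
  }
  # Recurse into every element of the call
  as.call(lapply(as.list(expr), convertLevels, factorLevels = factorLevels))
}

convertFormula <- function(formula, factorLevels) {
  # Only the right-hand side of a two-sided formula is rewritten
  formula[[3]] <- convertLevels(formula[[3]], factorLevels)
  formula
}

convertFormula(pe ~ ilogit(-2.94 + 1.56 * (wells == "Medium")
                           + 3.14 * (wells == "High")),
               list(wells = c("Low", "Medium", "High")))
```

Because the conversion only triggers on == with a quoted right-hand side, formulas without quoted levels pass through unchanged, which is what makes it safe to apply to every formula.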

Oct 30 '15 18:10 nutterb

I like that. All under the hood. Thanks!

Oct 30 '15 18:10 jarrod-dalton

I seem to be unable to pass node formulas through factorFormula() when the node is not deterministic. Below, I attempt to manually write a logistic regression equation for pe given wells, where wells is treated as a three-level categorical variable.

# Set up some stuff...
net <- HydeNetwork(~ wells
                   + pe | wells
                   + d.dimer | pregnant*pe
                   + angio | pe
                   + treat | d.dimer*angio
                   + death | pe*treat)

net <- setNode(net, wells,
               nodeType = "dcat",
               pi = vectorProbs(p = c(37, 164, 49), wells),
               factorLevels = c("Low","Medium","High"))

# These two attempts do not work...
net <- setNode(net, "pe", nodeType = "dbern", 
               define = fromFormula(),
               nodeFormula = pe ~ ilogit(-2.94
                                         + 1.56*(wells == "Medium")
                                         + 3.14*(wells == "High")))  

net <- setNode(net, "pe", nodeType = "dbern", 
               p = plogis(-2.94 + 1.56*(wells == "Medium")
                          + 3.14*(wells == "High")))

Nov 23 '15 14:11 jarrod-dalton

I think I got it...

net <- setNode(net, "pe", nodeType = "dbern", 
                p = fromFormula(),
                nodeFormula = pe ~ ilogit(-2.94
                                          + 1.56*(wells == "Medium")
                                          + 3.14*(wells == "High")))
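As a sanity check on the coefficients, the same equation can be evaluated in plain R outside of HydeNet. ilogit is the JAGS inverse-logit function and plogis is its base R equivalent; the variable names below are made up for the sketch.

```r
# Evaluate the logistic equation for each level of wells in plain R.
wellsLevels <- c("Low", "Medium", "High")

linearPredictor <- -2.94 +
  1.56 * (wellsLevels == "Medium") +
  3.14 * (wellsLevels == "High")

# plogis() is the inverse logit, matching ilogit() in JAGS
peProbability <- setNames(plogis(linearPredictor), wellsLevels)
print(round(peProbability, 3))
```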

Nov 23 '15 14:11 jarrod-dalton

Do we want to alert the user to unconverted factor levels? In the example below, we try to use a factor level for node pe in the regression equation for d.dimer before we've used setNode() to define node pe (and told it that the factorLevels are c("No","Yes")).

It is generally a good idea to proceed through the network in topological order (basically starting from the root nodes and populating children only when all parent nodes have been populated). Doing so will avoid issues like this.

Do we want to go so far as disallowing setNode() from working if all parents' models have not yet been specified? This wouldn't catch all possible ways to screw up inputting node distributions via setNode() (as I seem to be adept at demonstrating), but on the other hand I can't seem to think of a good reason not to work under this restriction.

net <- HydeNetwork(~ wells
                   + pe | wells
                   + d.dimer | pregnant*pe)

net <- setNode(network = net, node = pregnant,
               nodeType = "dbern", p=.4,
               factorLevels = c("No","Yes"))

wells.p <- paste("pi.wells[1] <- 0.148",
                 "pi.wells[2] <- 0.656",
                 "pi.wells[3] <- 0.196",
                 sep = "; ")
net <- setNode(net, wells, nodeType = "dcat", pi = wells.p)

# Not run, but it should be...
#
#net <- setNode(net, "pe", nodeType = "dbern", 
#                p = fromFormula(),
#                nodeFormula = pe ~ ilogit(-2.94
#                                          + 1.56*(wells == "Medium")
#                                          + 3.14*(wells == "High")))

net <- setNode(net, d.dimer, nodeType="dnorm",
               mu=fromFormula(), tau=1/30,  #sigma^2 = 30
               nodeFormula = d.dimer ~ 210 + 29*(pregnant=="Yes") + 68*(pe=="Yes"))

net$nodeFormula$d.dimer

d.dimer ~ 210 + 29 * (pregnant == 1) + 68 * (pe == character(0))
<environment: 0x10615f5c0>
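If we did want the stricter rule, the check itself is small. Below is a sketch under the assumption that we can get at a named list of each node's parents and a vector of nodes already specified; undefinedParents, parents, and defined are hypothetical names, not part of HydeNet.

```r
# Hypothetical guard: which parents of `node` have not been specified yet?
# `parents` maps each node to its parent names; `defined` lists the nodes
# already passed through setNode().
undefinedParents <- function(node, parents, defined) {
  setdiff(parents[[node]], defined)
}

parents <- list(wells    = character(0),
                pregnant = character(0),
                pe       = "wells",
                d.dimer  = c("pregnant", "pe"))
defined <- c("pregnant", "wells")

pending <- undefinedParents("d.dimer", parents, defined)
if (length(pending) > 0) {
  message("Define these parents before d.dimer: ",
          paste(pending, collapse = ", "))
}
```

In the example above, pe is the only parent of d.dimer that has not been set, which is exactly the case that produced pe == character(0).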

Nov 23 '15 15:11 jarrod-dalton

I added an error in circumstances where there is no accompanying factorLevels entry for the variable. I think it's important to make this a hard error--the downstream consequences are catastrophic. Let me know if you think the error message is sufficient or if it needs more information.
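For anyone reading along, the behaviour is roughly of the following shape. This is a sketch with a hypothetical function name and made-up message wording; the actual check and message in HydeNet may differ.

```r
# Hypothetical sketch of a hard error on missing factor levels.
# `factorLevels` is an assumed named list of level vectors per node.
levelCode <- function(level, variable, factorLevels) {
  nodeLevels <- factorLevels[[variable]]
  if (is.null(nodeLevels)) {
    stop("No factorLevels entry for node '", variable,
         "'. Define it with setNode() before referencing its levels.",
         call. = FALSE)
  }
  code <- match(level, nodeLevels)
  if (is.na(code)) {
    stop("'", level, "' is not a level of node '", variable, "'.",
         call. = FALSE)
  }
  code
}

knownLevels <- list(wells = c("Low", "Medium", "High"))
levelCode("Medium", "wells", knownLevels)

# pe has no factorLevels entry yet, so this stops instead of
# silently producing character(0)
result <- try(levelCode("Yes", "pe", knownLevels), silent = TRUE)
```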

Dec 11 '15 14:12 nutterb
