HydeNet
Deterministic nodes with factor parents
This can be difficult, since we have to write formulas which reference factor levels. Currently, we have to do so by referencing the integer index of the factor level we want.
Rather than fumble around trying to give you a closed-form example of R code that describes what I'm talking about, I will invite you to go into the Decision Networks vignette and attempt defining the payoff utility node using setNode(). Or, if that's not actually possible, try modifying the network structure (e.g., by adding nodes) in such a way that the payoff can be calculated.
In the meantime, I'll switch to working on the Setting Nodes and Getting Started vignettes.
This is another one I'll spend some time thinking about. I'll come up with something.
Consider this example:
net <- setNode(net, payoff, "determ", define=fromFormula(),
nodeFormula = payoff ~
ifelse(playerFinalPoints > 21, -1,
ifelse(playerFinalPoints == 21,
ifelse(dealerOutcome == 1, 0,
ifelse(dealerOutcome == 7, 0, 1)),
ifelse(dealerOutcome == 2,
ifelse(playerFinalPoints < 22, 1, -1),
ifelse(dealerOutcome == 3,
ifelse(playerFinalPoints == 17, 0,
ifelse(playerFinalPoints > 17, 1, -1)),
ifelse(dealerOutcome == 4,
ifelse(playerFinalPoints == 18, 0,
ifelse(playerFinalPoints > 18, 1, -1)),
ifelse(dealerOutcome == 5,
ifelse(playerFinalPoints == 19, 0,
ifelse(playerFinalPoints > 19, 1, -1)),
ifelse(dealerOutcome == 6,
ifelse(playerFinalPoints == 20, 0,
ifelse(playerFinalPoints > 20, 1, -1)),
ifelse(playerFinalPoints == 21, 0, -1)))))))))
Given the current structure, the only thing I can think of that would make it feasible to give the factor level would be to use a utility function here. So if I wanted the equivalent of dealerOutcome == 2, I could use a utility such as
dealerOutcome == numericLevel("Bust", BJDealer$dealerOutcome)
The numericLevel function would then return the number 2.
The upside is that you don't have to remember all of the variable codings. The downside is that it has the potential to be much more typing. But the only other place this gets processed is in writing the JAGS code, and there's no good way to tie the numeric coding to a factor variable at that point.
I'll write up the function. You can tell me if you want to use it at all. :)
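For readers following along, here is a minimal sketch of what such a helper could look like. It is not the function that ended up in HydeNet; the body is an assumption built on base R's match(), and the stand-in factor below uses placeholder level names (only "Bust" and its position, 2, come from this thread).

```r
# Hypothetical sketch of the helper described above; not HydeNet's actual code.
numericLevel <- function(level, x) {
  # Accept either a factor or a character vector of level names
  lev <- if (is.factor(x)) levels(x) else as.character(x)
  idx <- match(level, lev)
  if (is.na(idx)) {
    stop("'", level, "' is not a level of the supplied variable")
  }
  idx
}

# Stand-in for BJDealer$dealerOutcome. "LevelA" and "LevelC" are placeholders;
# the real level names are not given in this thread.
dealerOutcome <- factor(c("Bust", "LevelA", "LevelC"),
                        levels = c("LevelA", "Bust", "LevelC"))

bustIndex <- numericLevel("Bust", dealerOutcome)
print(bustIndex)
```

The index depends on the order of the levels, not on the order of the data, which is why the user no longer has to remember the coding.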
Is there maybe an escape character that we could use instead of quotation marks, like #Bust#, which would tell HydeNet to call such a function?
It's possible we could use something like dealerOutcome == "#Bust,BJDealer$dealerOutcome#", but I don't think that saves much typing. The major issue is that the rToJags function, which converts R code into JAGS, takes only a single argument: a formula object. The variable has to be passed along with the variable level.
However, as I think about it, we could create our own handy dandy little intermediary function with a weird syntax. For example:
jagsFunc(formula, ...)
where the ... argument takes named arguments, each giving a factor variable referenced in formula.
jagsFunc(payoff ~ dealerOutcome == "#Bust:dealerOutcome#",
dealerOutcome = BJDealer$dealerOutcome)
returns a formula object payoff ~ dealerOutcome == 2.
Alternatively, we might have jagsFunc take a character argument, which would allow jagsFunc("payoff ~ #dealerOutcome == 'Bust'#"). I'm a little nervous about this one, however, because I think it will likely fail if someone tries to use it in any way other than the == sense. I can't think of why anyone would do something like dealerOutcome * "Bust" or what that would mean. Perhaps I'm being paranoid?
I'm rambling. That might work, actually.
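To make the idea concrete, here is a rough sketch of one way the substitution could work on plain quoted levels, restricted to the == and != cases discussed above. The function names (replaceLevels, levelsToIndex) and the levelList argument are hypothetical, the level names other than "Bust" are placeholders, and this is not the implementation that landed in HydeNet.

```r
# Hypothetical sketch: walk an expression and replace `var == "Level"`
# (or `!=`) with the 1-based index of that level. Not HydeNet's code.
replaceLevels <- function(expr, levelList) {
  if (!is.call(expr)) return(expr)
  isComparison <- is.name(expr[[1]]) &&
    as.character(expr[[1]]) %in% c("==", "!=") &&
    length(expr) == 3 &&
    is.name(expr[[2]]) && is.character(expr[[3]])
  if (isComparison) {
    var <- as.character(expr[[2]])
    idx <- match(expr[[3]], levelList[[var]])
    if (is.na(idx)) {
      stop("No level '", expr[[3]], "' known for variable '", var, "'")
    }
    expr[[3]] <- as.numeric(idx)
  } else {
    # Recurse into the arguments of any other call (ifelse, +, *, parentheses)
    for (i in seq_along(expr)[-1]) {
      expr[[i]] <- replaceLevels(expr[[i]], levelList)
    }
  }
  expr
}

# Apply the substitution to the right-hand side of a two-sided formula
levelsToIndex <- function(formula, levelList) {
  formula[[3]] <- replaceLevels(formula[[3]], levelList)
  formula
}

# "LevelA" and "LevelC" are placeholder names; only "Bust" is from the thread.
f <- payoff ~ ifelse(dealerOutcome == "Bust", 1, -1)
converted <- levelsToIndex(f, list(dealerOutcome = c("LevelA", "Bust", "LevelC")))
print(converted)
```

Because only comparisons against a quoted string are rewritten, something like dealerOutcome * "Bust" is left untouched instead of being silently converted.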
This is now implemented in the current-devel branch. The final function name is factorFormula, and I even implemented it in the Decision Networks vignette if you'd like to see it in action. If you feel like you can get behind this, let me know.
Beautiful. Now on to the beggar->chooser transition: is it possible to build logic into the formula evaluation such that if it sees any quoted elements it knows to pass it through factorFormula() without the user explicitly calling it?
- Jarrod
Truthfully, yes. It just means passing every formula through factorFormula within setNode and not exporting factorFormula. (Well, we could still export it; we just wouldn't have to, and I would probably opt not to, since there isn't much need for it otherwise.) Would you like to beg and choose that option?
I like that. All under the hood. Thanks!
I seem to be unable to pass node formulas through factorFormula() when the node is not deterministic. Below, I attempt to manually write a logistic regression equation for pe given wells, where wells is treated as a three-level categorical variable.
# Set up some stuff...
net <- HydeNetwork(~ wells
+ pe | wells
+ d.dimer | pregnant*pe
+ angio | pe
+ treat | d.dimer*angio
+ death | pe*treat)
net <- setNode(net, wells,
nodeType = "dcat",
pi = vectorProbs(p = c(37, 164, 49), wells),
factorLevels = c("Low","Medium","High"))
# These two attempts do not work...
net <- setNode(net, "pe", nodeType = "dbern",
define = fromFormula(),
nodeFormula = pe ~ ilogit(-2.94
+ 1.56*(wells == "Medium")
+ 3.14*(wells == "High")))
net <- setNode(net, "pe", nodeType = "dbern",
p = plogis(-2.94 + 1.56*(wells == "Medium")
+ 3.14*(wells == "High")))
I think I got it...
net <- setNode(net, "pe", nodeType = "dbern",
p = fromFormula(),
nodeFormula = pe ~ ilogit(-2.94
+ 1.56*(wells == "Medium")
+ 3.14*(wells == "High")))
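As a sanity check on the coefficients, the same equation can be evaluated in plain R, outside of setNode(). JAGS's ilogit is the inverse logit, which base R provides as plogis. The wells factor below is a stand-in built just for this illustration; nothing here touches the network object.

```r
# Evaluate the logistic equation for each level of wells in plain R.
# `wells` is a stand-in factor created for this sketch.
wells <- factor(c("Low", "Medium", "High"),
                levels = c("Low", "Medium", "High"))

# Linear predictor: the logical comparisons are coerced to 0/1 indicators
linpred <- -2.94 + 1.56 * (wells == "Medium") + 3.14 * (wells == "High")

# plogis() is R's inverse logit, the counterpart of JAGS's ilogit()
p_pe <- setNames(plogis(linpred), levels(wells))
print(round(p_pe, 3))
```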
Do we want to alert the user to unconverted factor levels? In the example below, we try to use a factor level for node pe in the regression equation for d.dimer before we've used setNode() to define node pe (and told it that the factorLevels are c("No","Yes")).
It is generally a good idea to proceed through the network in topological order (basically starting from the root nodes and populating children only when all parent nodes have been populated). Doing so will avoid issues like this.
Do we want to go so far as disallowing setNode() from working if all parents' models have not yet been specified? This wouldn't catch all possible ways to screw up inputting node distributions via setNode() (as I seem to be adept at demonstrating), but on the other hand I can't seem to think of a good reason not to work under this restriction.
net <- HydeNetwork(~ wells
+ pe | wells
+ d.dimer | pregnant*pe)
net <- setNode(network = net, node = pregnant,
nodeType = "dbern", p=.4,
factorLevels = c("No","Yes"))
wells.p <- paste("pi.wells[1] <- 0.148",
"pi.wells[2] <- 0.656",
"pi.wells[3] <- 0.196",
sep = "; ")
net <- setNode(net, wells, nodeType = "dcat", pi = wells.p)
# Not run, but it should be...
#
#net <- setNode(net, "pe", nodeType = "dbern",
# p = fromFormula(),
# nodeFormula = pe ~ ilogit(-2.94
# + 1.56*(wells == "Medium")
# + 3.14*(wells == "High")))
net <- setNode(net, d.dimer, nodeType="dnorm",
mu=fromFormula(), tau=1/30, #sigma^2 = 30
nodeFormula = d.dimer ~ 210 + 29*(pregnant=="Yes") + 68*(pe=="Yes"))
net$nodeFormula$d.dimer
d.dimer ~ 210 + 29 * (pregnant == 1) + 68 * (pe == character(0))
<environment: 0x10615f5c0>
I added an error for circumstances where there is no accompanying factorLevels entry for the variable. I think it's important to make this a hard error: the downstream consequences are catastrophic. Let me know if you think the error message is sufficient or if it needs more information.