CausalDiscoveryToolbox
Error when using bnlearn packages with integer or character values for variables
I am working on Windows with Python 3.9.1, cdt 0.5.23 and PyTorch 1.10.0.
The variables in the data I'm using to predict a DAG are discrete and can either take a value of -1, 0, 1, 2 or 3. When I use this data with any of the bnlearn algorithms I get the error:
RuntimeError: RProcessError
R Process Error Output
Error in data.type(x) :
variable X0 is not supported in bnlearn (type: integer).
Calls: gs -> bnlearn -> check.data -> data.type
Execution halted
However, when I change the values to -1.1, 0.1, 1.1, 2.1 and 3.1, there are no errors and it can predict a graph. Is it possible to get this to work with discrete integer or character data? And does this mean the algorithm is applying the wrong independence tests, since the CDT documentation says that it can apply either discrete or continuous tests? I couldn't figure out whether I could pass an argument to GS() or GS().predict() to tell the R package what type of data it is and what test should be applied.
Any help or advice would be greatly appreciated!
Hello! Sorry for the delayed answer. Did you try changing the independence test to one suited to discrete values?
Here is the doc on the available tests: https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/causality.html#bnlearn-based-models
discrete case (categorical variables)
– mutual information: an information-theoretic distance measure. It’s proportional to the log-likelihood ratio (they differ by a 2n factor) and is related to the deviance of the tested models. The asymptotic χ2 test (`mi` and `mi-adf`, with adjusted degrees of freedom), the Monte Carlo permutation test (`mc-mi`), the sequential Monte Carlo permutation test (`smc-mi`), and the semiparametric test (`sp-mi`) are implemented.
– shrinkage estimator for the mutual information (`mi-sh`): an improved asymptotic χ2 test based on the James-Stein estimator for the mutual information.
– Pearson’s X2: the classical Pearson’s X2 test for contingency tables. The asymptotic χ2 test (`x2` and `x2-adf`, with adjusted degrees of freedom), the Monte Carlo permutation test (`mc-x2`), the sequential Monte Carlo permutation test (`smc-x2`) and the semiparametric test (`sp-x2`) are implemented.
discrete case (ordered factors)
– Jonckheere-Terpstra trend test for ordinal variables. The asymptotic normal test (`jt`), the Monte Carlo permutation test (`mc-jt`) and the sequential Monte Carlo permutation test (`smc-jt`) are implemented.
Just pass the test you want to the `score` parameter of your algorithm.
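To make the "2n factor" in the mutual information entry concrete, here is a minimal pure-Python sketch of the textbook relation G = 2 · n · MI between the empirical mutual information (in nats) and the log-likelihood ratio statistic. It only illustrates the formula: the function names `mutual_information` and `g_statistic` are made up for this example, it does not call cdt or bnlearn, and it should not be read as how bnlearn computes the test internally. No p-value is computed.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))   # counts of each (x, y) pair
    count_x = Counter(x)         # marginal counts of x
    count_y = Counter(y)         # marginal counts of y
    mi = 0.0
    for (a, b), n_ab in joint.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_x[a] * count_y[b]))
    return mi


def g_statistic(x, y):
    """Return (G, degrees of freedom) with G = 2 * n * MI."""
    n = len(x)
    dof = (len(set(x)) - 1) * (len(set(y)) - 1)
    return 2.0 * n * mutual_information(x, y), dof


# Toy data using the value range from the original question.
x = [-1, 0, 1, 2, 3, -1, 0, 1, 2, 3]
y = [str(v) for v in x]              # a relabelled copy of x: fully dependent
z = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]   # balanced against x: empirically independent

print(mutual_information(x, y))  # equals the entropy of x, log(5)
print(mutual_information(x, z))
print(g_statistic(x, y))
```

Because the computation works on counts of labels, it treats integers and strings the same way, which is the sense in which these are tests for categorical data.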
Thanks for getting back to me
I tried to specify a score previously, but when I did I received this error:
algorithms.append(cdt.causality.graph.bnlearn.GS(score='mi')) # Error! - bnlearn integer issue
TypeError: __init__() got an unexpected keyword argument 'score'
I also wasn't sure from the documentation what specific strings you're meant to input, but it looks like from here that it's `mi`, `mc-mi`, `x2`, etc.?
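For reference, the labels quoted from the documentation earlier in this thread can be collected into a small lookup table. This is only a convenience sketch: `DISCRETE_TESTS` and `is_discrete_test` are hypothetical names, not part of cdt or bnlearn, and the table covers only the discrete-data tests listed above.

```python
# Labels for discrete-data tests, as quoted from the documentation in this thread.
DISCRETE_TESTS = {
    # categorical variables: mutual information
    "mi": "mutual information, asymptotic chi-squared test",
    "mi-adf": "mutual information, adjusted degrees of freedom",
    "mc-mi": "mutual information, Monte Carlo permutation test",
    "smc-mi": "mutual information, sequential Monte Carlo permutation test",
    "sp-mi": "mutual information, semiparametric test",
    "mi-sh": "shrinkage estimator for the mutual information",
    # categorical variables: Pearson's X2
    "x2": "Pearson's X2, asymptotic chi-squared test",
    "x2-adf": "Pearson's X2, adjusted degrees of freedom",
    "mc-x2": "Pearson's X2, Monte Carlo permutation test",
    "smc-x2": "Pearson's X2, sequential Monte Carlo permutation test",
    "sp-x2": "Pearson's X2, semiparametric test",
    # ordered factors: Jonckheere-Terpstra
    "jt": "Jonckheere-Terpstra, asymptotic normal test",
    "mc-jt": "Jonckheere-Terpstra, Monte Carlo permutation test",
    "smc-jt": "Jonckheere-Terpstra, sequential Monte Carlo permutation test",
}


def is_discrete_test(label):
    """Return True if `label` is one of the discrete-data tests listed above."""
    return label in DISCRETE_TESTS


print(is_discrete_test("mi"))
print(is_discrete_test("cor"))
```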
Hello, did you solve that issue? I am trying to run the bnlearn causal discovery algorithms, but I have to change the conditional independence test. I run into an error if I do `GS(score='mi-cg')`: TypeError: __init__() got an unexpected keyword argument 'score'. Any suggestions?
Hello, sorry, there is an error in the implementation! I just checked; I'll fix this ASAP.
Best, Diviyan
Thank you!!
Hi, the fix is pushed to the GitHub repo, but I need to migrate the CI to CircleCI to be able to push to PyPI/Docker Hub. Best, Diviyan
I use `m = GS(score='cor')` and still get the error: TypeError: __init__() got an unexpected keyword argument 'score'
Yes the package isn't updated on pypi, I'll do it asap!
Thank you. I also wonder whether there is a way to extract the test score values from the GS algorithm, so we can know the strength of each dependency. If we use the default mutual information score ('mi'), can any of the methods in `cdt.independence.stats` be used?
I think that it should be okay, but I should check the bnlearn doc. There might be some compatibility issues depending on the method. Sorry, I was really busy; I'll update this soon.
`bnlearn` expects categorical variables to be factors. Just add `dataset[sapply(dataset, is.character)] <- lapply(dataset[sapply(dataset, is.character)], as.factor)` after https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/d0bc352534dcbfac19a84a1bb05f33fe311378d2/cdt/causality/graph/R_templates/bnlearn.R#L25 . (I'm not sure why `bnlearn` doesn't do this internally tbh.)