CausalDiscoveryToolbox

Error when using bnlearn packages with integer or character values for variables

Open OJ4227 opened this issue 3 years ago • 11 comments

I am working on Windows with Python 3.9.1, cdt 0.5.23, and PyTorch 1.10.0.

The variables in the data I'm using to predict a DAG are discrete and can take the values -1, 0, 1, 2 or 3. When I use this data with any of the bnlearn algorithms I get the error:

RuntimeError: RProcessError R Process Error Output

Error in data.type(x) :
  variable X0 is not supported in bnlearn (type: integer).
Calls: gs -> bnlearn -> check.data -> data.type
Execution halted

However, when I change the values to -1.1, 0.1, 1.1, 2.1 and 3.1, there are no errors and it can predict a graph. Is it possible to get this to work with discrete integer or character data? And does this mean the algorithm is applying the wrong independence tests, since the CDT documentation says that it can apply either discrete or continuous tests? I couldn't figure out whether I could pass an argument to GS() or GS().predict() to tell the R package what type of data it is and which test should be applied.

Any help or advice would be greatly appreciated!

OJ4227 avatar Feb 06 '22 20:02 OJ4227

Hello! Sorry for the delayed answer. Did you try changing the independence test to one suited to discrete values?

Here is the doc on the available tests: https://fentechsolutions.github.io/CausalDiscoveryToolbox/html/causality.html#bnlearn-based-models

discrete case (categorical variables)

– mutual information: an information-theoretic distance measure.

It’s proportional to the log-likelihood ratio (they differ by a 2n factor) and is related to the deviance of the tested models. The asymptotic χ2 test (mi and mi-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-mi), the sequential Monte Carlo permutation test (smc-mi), and the semiparametric test (sp-mi) are implemented.

– shrinkage estimator for the mutual information (mi-sh)

An improved asymptotic χ2 test based on the James-Stein estimator for the mutual information.

– Pearson’s X2: the classical Pearson’s X2 test for contingency tables.

The asymptotic χ2 test (x2 and x2-adf, with adjusted degrees of freedom), the Monte Carlo permutation test (mc-x2), the sequential Monte Carlo permutation test (smc-x2) and the semiparametric test (sp-x2) are implemented.

discrete case (ordered factors)

– Jonckheere-Terpstra: a trend test for ordinal variables.

The asymptotic normal test (jt), the Monte Carlo permutation test (mc-jt) and the sequential Monte Carlo permutation test (smc-jt) are implemented.

Just pass the test you want to the score parameter of your algorithm.
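To make the statistics behind the mi and x2 options a bit more concrete, here is a rough standalone sketch in plain Python. It is not bnlearn's or cdt's code: the function names and the example table are made up for illustration, and it only computes the test statistics, not p-values. It shows the "2n factor" mentioned above, i.e. the log-likelihood ratio statistic G equals 2n times the empirical mutual information (in nats).

```python
import math


def _totals(table):
    """Return (n, row totals, column totals) for a 2-D table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return sum(row_totals), row_totals, col_totals


def mutual_information(table):
    """Empirical mutual information (in nats) of a contingency table."""
    n, row_totals, col_totals = _totals(table)
    mi = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing to the sum
                mi += (count / n) * math.log(count * n / (row_totals[i] * col_totals[j]))
    return mi


def g_statistic(table):
    """Log-likelihood ratio statistic: G = 2 * n * MI."""
    n, _, _ = _totals(table)
    return 2 * n * mutual_information(table)


def pearson_x2(table):
    """Pearson's X2 statistic (assumes no empty row or column)."""
    n, row_totals, col_totals = _totals(table)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2


# Hypothetical 2x2 table of counts for two binary variables.
table = [[30, 10], [10, 30]]
print("MI:", mutual_information(table))
print("G :", g_statistic(table))
print("X2:", pearson_x2(table))
```

Both G and X2 are compared against a χ2 distribution with (rows - 1) * (columns - 1) degrees of freedom in the asymptotic versions of the tests.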

diviyank avatar Mar 01 '22 08:03 diviyank

Thanks for getting back to me.

I tried to specify a score previously, but when I pass one I receive the error:

algorithms.append(cdt.causality.graph.bnlearn.GS(score='mi'))  # Error! - bnlearn integer issue
TypeError: __init__() got an unexpected keyword argument 'score'

I also wasn't sure from the documentation which specific strings you're meant to pass, but from your list it looks like they're mi, mc-mi, x2, etc.?
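One way to tell whether an installed version accepts the score keyword, without triggering the TypeError, is to inspect the constructor signature. The sketch below uses only the standard library; OldGS and NewGS are hypothetical stand-ins (not cdt's real classes, and their parameters and defaults are invented) so that the example runs without cdt or R installed.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if calling cls(...) accepts a keyword called `name`."""
    params = inspect.signature(cls).parameters
    if name in params:
        return True
    # A **kwargs catch-all would also swallow the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for a release without and with the `score` argument.
class OldGS:
    def __init__(self, alpha=0.05):
        self.alpha = alpha


class NewGS:
    def __init__(self, score="mi", alpha=0.05):
        self.score = score
        self.alpha = alpha


print(accepts_keyword(OldGS, "score"))
print(accepts_keyword(NewGS, "score"))
```

With cdt installed, the same helper could presumably be pointed at the real class, e.g. accepts_keyword(cdt.causality.graph.bnlearn.GS, "score").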

OJ4227 avatar Mar 02 '22 20:03 OJ4227

Hello, did you solve that issue? I am trying to run the bnlearn causal discovery algorithms, but I have to change the conditional independence test. I run into an error if I do GS(score='mi-cg'): TypeError: __init__() got an unexpected keyword argument 'score'. Any suggestions?

Angela446-lgtm avatar Apr 12 '22 11:04 Angela446-lgtm

Hello, Sorry, this is an error in the implementation! I just checked; I'll fix this ASAP.

Best, Diviyan

diviyank avatar Apr 12 '22 17:04 diviyank

Thank you!!

Angela446-lgtm avatar Apr 12 '22 17:04 Angela446-lgtm

Hi, The fix is pushed to the GitHub repo, but I need to migrate the CI to CircleCI to be able to push to PyPI/Docker Hub. Best, Diviyan

diviyank avatar Apr 12 '22 17:04 diviyank

I used "m = GS(score='cor')" and still got the error: TypeError: __init__() got an unexpected keyword argument 'score'

XMAHA avatar Jun 10 '22 03:06 XMAHA

Yes, the package isn't updated on PyPI yet. I'll do it ASAP!

diviyank avatar Jun 10 '22 05:06 diviyank

Thank you. I also wonder whether there is any way to extract the test score value from the GS algorithm, so we can know the dependency value. If we use the default mutual information score ('mi'), can any of the methods in "cdt.independence.stats" be used?

XMAHA avatar Jun 11 '22 08:06 XMAHA

I think that it should be okay, but I should check the bnlearn doc. There might be some compatibility issues depending on the method. Sorry, I was really busy; I'll update this soon.

diviyank avatar Aug 13 '22 09:08 diviyank

bnlearn expects categorical variables to be factors. Just add dataset[sapply(dataset, is.character)] <- lapply(dataset[sapply(dataset, is.character)], as.factor) after https://github.com/FenTechSolutions/CausalDiscoveryToolbox/blob/d0bc352534dcbfac19a84a1bb05f33fe311378d2/cdt/causality/graph/R_templates/bnlearn.R#L25. (I'm not sure why bnlearn doesn't do this internally tbh.)
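As a quick Python-side diagnostic, the sketch below guesses which columns would reach R as integer or character rather than numeric, which would explain why -1, 0, 1 fails while -1.1, 0.1, 1.1 passes. It rests on an assumption: that cdt hands the data to R as a CSV file and that R infers each column's type from the text. The function names and sample columns are made up. Note that the R line above converts only character columns; integer columns such as X0 from the original report would presumably need a similar as.factor conversion.

```python
def guess_r_type(values):
    """Guess the type R would infer for a column (assumption: CSV round trip)."""
    def is_int(v):
        return isinstance(v, int) and not isinstance(v, bool)

    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(is_int(v) for v in values):
        return "integer"    # e.g. -1, 0, 1: rejected by bnlearn unless a factor
    if all(is_number(v) for v in values):
        return "numeric"    # e.g. -1.1, 0.1: treated as continuous
    return "character"      # e.g. "a", "b": rejected unless a factor


def columns_needing_factor(columns):
    """Names of columns that would probably need as.factor() on the R side."""
    return [name for name, values in columns.items()
            if guess_r_type(values) in ("integer", "character")]


# Hypothetical data in the shape described in the original report.
columns = {
    "X0": [-1, 0, 1, 2, 3],
    "X1": [-1.1, 0.1, 1.1, 2.1, 3.1],
    "X2": ["low", "mid", "high", "mid", "low"],
}
print(columns_needing_factor(columns))
```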

AMabona avatar Nov 20 '22 10:11 AMabona