dodiscover PC algo only working with int data inputs

Open robertness opened this issue 2 years ago • 1 comment

Right now, I believe the PC algorithm requires discrete variables to be encoded as integers rather than strings. I tried running PC on this data:

A	S	T	L	B	E	X	D
no	yes	no	no	yes	no	no	yes
no	yes	no	no	no	no	no	no
no	no	yes	no	no	yes	yes	yes
no	no	no	no	yes	no	no	yes
no	no	no	no	no	no	no	yes

But it threw an error. To get it to work, I had to convert the values to ints:

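def convert_to_int(df):
    # Work on a copy so the original string-valued frame is left untouched.
    df = df.copy()
    for var in df.columns:
        # Map "yes" to 1 and every other value to 0.
        df[var] = [1 if x == "yes" else 0 for x in df[var]]
    return df

data_mod = convert_to_int(data)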

pc.fit(data_mod, context)

I'm calling this a bug: pc.fit(data, context) should work on the original string-valued data.

Dec 09 '22 20:12 robertness

Could the user just call an encoder preprocessing function from scikit-learn? Or should we add that step for them? Either way, good catch. We should document this for all categorical/discrete tests.

Dec 09 '22 22:12 adam2392
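
For reference, the scikit-learn route suggested above might look roughly like the sketch below. It is only one possible workaround, not something dodiscover is confirmed to do internally. It assumes the data is a pandas DataFrame of "yes"/"no" strings like the sample in the issue, and it uses scikit-learn's OrdinalEncoder with an explicit category order so that "no" maps to 0 and "yes" to 1 in every column. The names `columns`, `rows`, and `encoded` are illustrative.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Sample rows copied from the issue description.
columns = ["A", "S", "T", "L", "B", "E", "X", "D"]
rows = [
    ["no", "yes", "no", "no", "yes", "no", "no", "yes"],
    ["no", "yes", "no", "no", "no", "no", "no", "no"],
    ["no", "no", "yes", "no", "no", "yes", "yes", "yes"],
    ["no", "no", "no", "no", "yes", "no", "no", "yes"],
    ["no", "no", "no", "no", "no", "no", "no", "yes"],
]
data = pd.DataFrame(rows, columns=columns)

# Fix the category order per column so the mapping is "no" -> 0, "yes" -> 1
# even for columns (like A) where only one value appears in the sample.
encoder = OrdinalEncoder(categories=[["no", "yes"]] * len(columns), dtype=int)

# fit_transform returns a NumPy array; wrap it back into a DataFrame so the
# column names survive.
encoded = pd.DataFrame(
    encoder.fit_transform(data), columns=data.columns, index=data.index
)

# Assumption: `encoded` would then be passed in place of `data_mod`,
# i.e. pc.fit(encoded, context) as in the issue.
print(encoded)
```

Unlike the hand-written conversion, the encoder keeps the mapping in `encoder.categories_`, so the integer codes can be translated back to the original labels with `encoder.inverse_transform`.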
