random_cids sometimes hangs forever
It's likely the cause of these test failures:
https://github.com/causalincentives/pycid/actions/runs/667798848
https://github.com/causalincentives/pycid/actions/runs/666392113
https://github.com/causalincentives/pycid/actions/runs/666341225
https://github.com/causalincentives/pycid/actions/runs/665993248
Those were from before I split up the test_random functions. When I test locally, it's test_random_cids_create_one that hangs.
Traceback from stopping the test:
  File "/home/eric/dev/pycid/pycid/core/cpd.py", line 219, in initialize_tabular_cpd
    [[complete_dictionary(self.stochastic_function(**i))[t] for i in self.parent_values(cid)] for t in domain]
  File "/home/eric/dev/pycid/pycid/core/cpd.py", line 219, in <listcomp>
    [[complete_dictionary(self.stochastic_function(**i))[t] for i in self.parent_values(cid)] for t in domain]
  File "/home/eric/dev/pycid/pycid/core/cpd.py", line 219, in <listcomp>
    [[complete_dictionary(self.stochastic_function(**i))[t] for i in self.parent_values(cid)] for t in domain]
  File "/home/eric/dev/pycid/pycid/core/cpd.py", line 207, in complete_dictionary
    missing_keys = set(domain) - set(dictionary.keys())
KeyboardInterrupt
I don't know the details of what it's trying to do, but in my tests matrix in initialize_tabular_cpd will sometimes have shapes like (107, 4096) or (32, 5120) and take a long time to generate. When it runs quickly the shapes are more like (9, 80). The first index is card.
Interesting. I've never noticed it failing locally, but I have seen the github actions version time out occasionally.
My guess is that the matrices get really large when a single node has many parents, because the number of possible parent outcomes grows exponentially with the number of parents. We should probably add a max_degree parameter to random_cid and avoid adding edges into nodes that already have many parents.
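To make the growth concrete, here is a rough sketch of both points: the column count of a tabular CPD is the product of the parents' domain sizes, and a cap on in-degree bounds that product. The names num_parent_combinations, random_dag_edges and max_in_degree are hypothetical and only for illustration; this is not pycid's random_cid, and how the cap would fit into the real function is an assumption.

```python
import itertools
import math
import random


def num_parent_combinations(parent_cards):
    """Columns of a tabular CPD: one per joint assignment of the parents."""
    return math.prod(parent_cards)


def random_dag_edges(n_nodes, edge_prob, max_in_degree, rng):
    """Sample edges parent -> child (parent < child) with a cap on parents per node.

    Hypothetical helper for illustration, not part of pycid.
    """
    in_degree = [0] * n_nodes
    edges = []
    pairs = list(itertools.combinations(range(n_nodes), 2))
    rng.shuffle(pairs)  # so the cap does not always favour low-numbered parents
    for parent, child in pairs:
        # Skip the edge if the child already has as many parents as allowed.
        if in_degree[child] < max_in_degree and rng.random() < edge_prob:
            edges.append((parent, child))
            in_degree[child] += 1
    return edges


# For example, twelve binary parents already give 4096 columns.
print(num_parent_combinations([2] * 12))  # 4096

rng = random.Random(0)
edges = random_dag_edges(n_nodes=10, edge_prob=0.8, max_in_degree=3, rng=rng)
```

With max_in_degree=3 and binary parents, no node would need more than 2 ** 3 = 8 columns, whatever the edge probability.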
Alternatively/additionally, we can set the random seed in the test.
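A minimal sketch of what seeding in the test could look like. sample_structure is a hypothetical stand-in for the random generator under test, and the sketch assumes the randomness comes from Python's random module; if the library draws from NumPy instead, that generator would need to be seeded as well.

```python
import random


def sample_structure(n_nodes, edge_prob):
    """Hypothetical stand-in for a random generator such as random_cids."""
    return [
        (i, j)
        for i in range(n_nodes)
        for j in range(i + 1, n_nodes)
        if random.random() < edge_prob
    ]


def test_sample_structure_is_reproducible():
    # Seeding before each call makes the draw deterministic, so a slow or
    # hanging case either always shows up or never does.
    random.seed(0)
    first = sample_structure(6, 0.5)
    random.seed(0)
    second = sample_structure(6, 0.5)
    assert first == second
```

A fixed seed makes the test deterministic, but it can also hide the slow cases rather than fix them, so it probably works best alongside a bound on the number of parents.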
Also, I did just push an improvement to random_cpd which I think should generally lead to smaller matrices. So with a bit of luck, the problem has already been solved.