BOSS algorithm
When I run BOSS() on a continuous dataset, I get the following error about a singular matrix. Is this expected?
Traceback (most recent call last):
  File "/home/min/a/lee4094/projects/bayesian_rcd/experiment_sockshop.py", line 203, in <module>
    cpdag = boss(df_n_without_time.to_numpy(), score_func='local_score_BIC')
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/BOSS.py", line 145, in boss
    gsts[v].trace(order[:i], parents[v])
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/gst.py", line 69, in trace
    return self.root.trace(prefix, available, parents)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/gst.py", line 44, in trace
    if self.branches is None: self.grow(available, parents)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/gst.py", line 20, in grow
    score = -self.tree.score.score_nocache(self.tree.vertex, parents)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/score/LocalScoreFunctionClass.py", line 52, in score_nocache
    return self.local_score_fun((self.cov, self.n), i, PAi, self.parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/score/LocalScoreFunction.py", line 73, in local_score_BIC_from_cov
    H = np.log(cov[i, i] - yX @ np.linalg.inv(XX) @ yX.T)
                                ^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 561, in inv
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 112, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix
I suspect this comes from the covariance-based computation in the BIC score, which fails when the covariance submatrix for some parent set is singular. Could you please try adding a small amount of Gaussian noise to the data and see whether the singularity persists?
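To make the noise check concrete, here is a minimal sketch using only NumPy. The helper names `add_jitter` and `cov_rank`, the noise scale, and the toy data are assumptions made for illustration; none of them are part of causal-learn. The traceback shows the score inverting `XX`, the covariance submatrix of a candidate parent set, so a rank-deficient covariance matrix is what triggers the error.

```python
import numpy as np

def add_jitter(data, scale=1e-3, seed=0):
    """Return a copy of data with small Gaussian noise added to every column.

    The noise standard deviation is `scale` times each column's own standard
    deviation (hypothetical default; tune for your data).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(data, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # columns with zero spread get noise of absolute size `scale`
    return X + rng.normal(size=X.shape) * scale * std

def cov_rank(data, tol=1e-10):
    """Numerical rank of the sample covariance matrix (columns are variables)."""
    return np.linalg.matrix_rank(np.cov(data, rowvar=False), tol=tol)

# Toy data (illustrative only): the third column is an exact copy of the first.
rng = np.random.default_rng(1)
base = rng.normal(size=(500, 2))
toy = np.column_stack([base, base[:, 0]])

rank_before = cov_rank(toy)              # rank-deficient covariance
rank_after = cov_rank(add_jitter(toy))   # jitter restores full rank
```

If the rank of the real data's covariance is below the number of columns before jittering and equal to it afterwards, the jittered array can be passed to `boss(...)` in place of `df_n_without_time.to_numpy()` to see whether the error goes away.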
That does seem to be the cause. Is there a way to avoid it without adding a noise term, though?
Honestly, I'm not sure what the best way to avoid it is. The algorithm does not work when the covariance matrix is singular, which indicates that one or more variables may be redundant. Perhaps you could try removing some variables?
But this is definitely not the optimal solution in practice. We also discussed this in #155, but I still haven't found a good solution. Please let me know if you have any suggestions on this.
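For the variable-removal idea, one way to find candidates is to keep only columns that are not linear combinations of the columns already kept. The sketch below is a simple greedy check in plain NumPy. The function name `independent_columns`, the tolerance, and the toy data are assumptions for illustration, and this is not something causal-learn provides.

```python
import numpy as np

def independent_columns(data, tol=1e-8):
    """Greedily select column indices whose centered columns are linearly independent.

    Constant columns are skipped; a column is dropped when adding it does not
    increase the numerical rank of the columns kept so far.
    """
    raw = np.asarray(data, dtype=float)
    X = raw - raw.mean(axis=0)
    norms = np.linalg.norm(X, axis=0)
    kept = []
    for j in range(X.shape[1]):
        if raw[:, j].max() == raw[:, j].min():
            continue  # constant column: zero variance, always redundant
        cols = kept + [j]
        candidate = X[:, cols] / norms[cols]  # unit-length columns make tol scale-free
        if np.linalg.matrix_rank(candidate, tol=tol) == len(cols):
            kept.append(j)
    return kept

# Toy data (illustrative only): column 3 = column 0 + column 1, column 4 is constant.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
toy = np.column_stack([base, base[:, 0] + base[:, 1], np.full(200, 5.0)])

kept = independent_columns(toy)
reduced = toy[:, kept]  # array with the redundant columns removed
```

The greedy pass favors earlier columns, so which of several mutually dependent variables gets dropped depends on column order. Choosing which one to keep is a modeling decision that this check does not make for you.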