causal-learn

BOSS algorithm

Open kenneth-lee-ch opened this issue 11 months ago • 3 comments

I fed boss() a continuous dataset and got the following error about a singular matrix. Is this expected?

Traceback (most recent call last):
  File "/home/min/a/lee4094/projects/bayesian_rcd/experiment_sockshop.py", line 203, in <module>
    cpdag  = boss(df_n_without_time.to_numpy(), score_func='local_score_BIC')
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/BOSS.py", line 145, in boss
    gsts[v].trace(order[:i], parents[v])
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/gst.py", line 69, in trace
    return self.root.trace(prefix, available, parents)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/gst.py", line 44, in trace
    if self.branches is None: self.grow(available, parents)
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/search/PermutationBased/gst.py", line 20, in grow
    score = -self.tree.score.score_nocache(self.tree.vertex, parents)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/score/LocalScoreFunctionClass.py", line 52, in score_nocache
    return self.local_score_fun((self.cov, self.n), i, PAi, self.parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/causallearn/score/LocalScoreFunction.py", line 73, in local_score_BIC_from_cov
    H = np.log(cov[i, i] - yX @ np.linalg.inv(XX) @ yX.T)
                                ^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 561, in inv
    ainv = _umath_linalg.inv(a, signature=signature, extobj=extobj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/min/a/lee4094/miniconda3/envs/bayes_rcd/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 112, in _raise_linalgerror_singular
    raise LinAlgError("Singular matrix")
numpy.linalg.LinAlgError: Singular matrix

kenneth-lee-ch avatar Jan 22 '25 21:01 kenneth-lee-ch

I guess it is related to the covariance-based calculations in the BIC score, which fail when the covariance matrix is singular. Could you please try adding small Gaussian noise to the data to see if the singularity persists?
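A minimal sketch of the noise suggestion above, using only NumPy. The helper name `add_jitter`, the `scale` value, and the toy data are assumptions for illustration and are not part of causal-learn. The rank check on the centered data is equivalent to checking whether the sample covariance is singular.

```python
import numpy as np

def add_jitter(data, scale=1e-3, seed=0):
    """Return a copy of `data` with small independent Gaussian noise added.

    Hypothetical helper: `scale` is relative to each column's standard
    deviation; constant columns get noise of absolute size `scale`.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    col_std = data.std(axis=0)
    col_std[col_std == 0] = 1.0  # avoid multiplying the noise by zero
    return data + rng.normal(size=data.shape) * scale * col_std

# Toy data (assumed): the third column is an exact linear combination
# of the first two, so the sample covariance is singular.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2))
X = np.column_stack([x, x[:, 0] + x[:, 1]])

def centered_rank(a):
    # Rank of the centered data equals the rank of its sample covariance.
    return np.linalg.matrix_rank(a - a.mean(axis=0))

rank_before = centered_rank(X)
X_jittered = add_jitter(X)
rank_after = centered_rank(X_jittered)
print(rank_before, rank_after)

# The jittered array would then be passed in place of the original, e.g.
# boss(X_jittered, score_func='local_score_BIC') as in the traceback above.
```

Note that a very small `scale` leaves the covariance nearly singular, so the inverse may still be numerically unstable even if no exception is raised.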

kunwuz avatar Jan 22 '25 21:01 kunwuz

That seems to be the case, but how can I avoid it without adding a noise term?

kenneth-lee-ch avatar Jan 22 '25 23:01 kenneth-lee-ch

Honestly, I'm not sure what the best way to avoid it is. The algorithm does not work if the covariance matrix is singular, which indicates that one or more variables might be redundant. Perhaps you could try removing some variables?

But this is definitely not the optimal solution in practice. We also discussed this in #155, but I still haven't found a good solution. Please let me know if you have any suggestions on this.
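For the variable-removal idea, here is a rough sketch of one way to find which columns to keep, assuming the redundancy is an exact linear dependence (or a constant column). The function name `independent_columns` and the toy data are hypothetical, and this is not something causal-learn provides.

```python
import numpy as np

def independent_columns(data, tol=None):
    """Greedily pick column indices whose centered columns are linearly independent.

    A column is kept only if adding it raises the rank of the columns kept
    so far. Constant columns are dropped because they are zero after centering.
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    kept = []
    for j in range(centered.shape[1]):
        candidate = centered[:, kept + [j]]
        if np.linalg.matrix_rank(candidate, tol=tol) == len(kept) + 1:
            kept.append(j)
    return kept

# Toy data (assumed): column 2 = column 0 + column 1, column 3 is constant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + b, np.full(200, 5.0)])

kept = independent_columns(X)
X_reduced = X[:, kept]
print(kept)
```

Which variable gets dropped depends on the column order, and near-dependent (but not exactly dependent) columns pass this check unless `tol` is raised, so the result still needs a manual look before running the search on `X_reduced`.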

kunwuz avatar Jan 23 '25 22:01 kunwuz