Pruning uses the maximum number of lags not the selected number of lags

Open · paullabonne opened this issue 8 months ago · 1 comment

Hello,

Thank you very much for this work.

I believe there is an issue with the pruning in the VARLiNGAM code. Two models that end up with the same number of lags and are trained on the same data should give identical results, regardless of the initial maximum number of lags. This holds if prune=False, but the adjacency matrices can differ if prune=True. See:

import pandas as pd
from lingam import VARLiNGAM

X = pd.read_csv('examples/data/sample_data_var_lingam.csv')

# Maximum of 10 lags; 1 lag is selected
model1 = VARLiNGAM(lags=10)
model1.fit(X)
print(f"model1 has {model1._lags} lags")  # 1 lag

# Maximum of 1 lag; 1 lag is selected
model2 = VARLiNGAM(lags=1)
model2.fit(X)
print(f"model2 has {model2._lags} lags")  # 1 lag

# Should be all zeros, but is not when pruning is enabled
print(model1.adjacency_matrices_ - model2.adjacency_matrices_)

Running the same code with prune=False will give identical adjacency matrices.

This issue arises because the number of lags used in the lasso procedure is the initial maximum number of lags, not the number selected by the criterion.

Moving this line https://github.com/cdt15/lingam/blob/1495ba515024a27d0ea0cabbc2e15d4aee76823a/lingam/var_lingam.py#L109

before this one https://github.com/cdt15/lingam/blob/1495ba515024a27d0ea0cabbc2e15d4aee76823a/lingam/var_lingam.py#L106

should solve the problem.
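
To make the ordering problem concrete, here is a toy sketch of the pattern. It is not the lingam source: the class `ToyLagModel`, its methods, and the `fixed` switch are all hypothetical names made up for illustration, and the "selection" and "pruning" steps are placeholders. It only shows how a pruning step that reads `self._lags` sees the maximum number of lags when the selected value is stored too late.

```python
class ToyLagModel:
    """Toy stand-in for a lag-selecting estimator (NOT the lingam code)."""

    def __init__(self, lags, fixed=False):
        self._lags = lags    # initially the *maximum* number of lags
        self._fixed = fixed  # hypothetical switch: apply the reordering or not

    def _select_lags(self, X):
        # Placeholder for criterion-based selection: pretend 1 lag always wins.
        return min(self._lags, 1)

    def _prune(self, X):
        # Placeholder for pruning: just report how many lags it would use.
        return self._lags

    def fit(self, X):
        lags = self._select_lags(X)
        if self._fixed:
            # Proposed ordering: store the selected lags BEFORE pruning.
            self._lags = lags
        self.lags_used_in_pruning_ = self._prune(X)
        # Original ordering: the selected lags are stored only AFTER pruning.
        self._lags = lags
        return self


buggy = ToyLagModel(lags=10).fit([])
fixed = ToyLagModel(lags=10, fixed=True).fit([])
print(buggy._lags, buggy.lags_used_in_pruning_)
print(fixed._lags, fixed.lags_used_in_pruning_)
```

In this toy, both models report the same final `_lags`, yet only the reordered one prunes with the selected number of lags, which mirrors the mismatch between model1 and model2 above.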

Apologies if I’ve misunderstood something.

Many thanks, Paul

paullabonne, May 13 '25 22:05

Hi, @paullabonne.

Thanks for reporting and analyzing this problem. As you pointed out, we have confirmed that the reference to the number of lags is incorrect when pruning edges. I'll fix it in the next few days (or you can send us a pull request).

ikeuchi-screen, May 15 '25 04:05