scipy.sparse.linalg._eigen.arpack.arpack.ArpackError
🐛 Describe the bug
The T.AddRandomWalkPE() transform works fine when I train the GIN model:

transform = T.Compose([T.RemoveIsolatedNodes(), T.AddSelfLoops(), T.AddRandomWalkPE(20, attr_name='x'), T.ToSparseTensor()])

But T.AddLaplacianEigenvectorPE raises the following error:

transform = T.Compose([T.RemoveIsolatedNodes(), T.AddSelfLoops(), T.AddLaplacianEigenvectorPE(3, attr_name='x'), T.ToSparseTensor()])

Please help me, thank you!
Namespace(dataset='MalNetTiny', batch_size=256, hidden_channels=64, num_layers=5, lr=0.0001, epochs=500, wandb='True', transform='LEPE')
cuda
Use LEPE node feautre!
Traceback (most recent call last):
  File "/home/xxx/Projects/new_ideas/main.py", line 121, in <module>
    model = Net(train_dataset.num_features, args.hidden_channels, train_dataset.num_classes, args.num_layers).to(device)
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/torch_geometric/data/in_memory_dataset.py", line 66, in num_classes
    return super().num_classes
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 159, in num_classes
    data_list = _get_flattened_data_list([data for data in self])
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 159, in <listcomp>
    data_list = _get_flattened_data_list([data for data in self])
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/torch_geometric/data/dataset.py", line 259, in __getitem__
    data = data if self.transform is None else self.transform(data)
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/torch_geometric/transforms/compose.py", line 24, in __call__
    data = transform(data)
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/torch_geometric/transforms/add_positional_encoding.py", line 79, in __call__
    eig_vals, eig_vecs = eig_fn(
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py", line 1354, in eigs
    return params.extract(return_eigenvectors)
  File "/home/xxx/miniconda3/envs/malnet/lib/python3.10/site-packages/scipy/sparse/linalg/_eigen/arpack/arpack.py", line 782, in extract
    raise ArpackError(ierr, infodict=self.extract_infodict)
scipy.sparse.linalg._eigen.arpack.arpack.ArpackError: ARPACK error 1: The Schur form computed by LAPACK routine slahqr could not be reordered by LAPACK routine strsen . Re-enter subroutine dneupd with IPARAM(5)=NCV and increase the size of the arrays DR and DI to have dimension at least dimension NCV and allocate at least NCV columns for Z. NOTE: Not necessary if Z and V share the same space. Please notify the authors if this error occurs.
Environment
No response
Sorry for the late response. Do you have a minimal example to reproduce? This looks dataset-specific, as I cannot reproduce it in a test.
Thank you for your response. Here is a minimal example.
The first Net model (the commented-out block below) gets the following error (see also the small-graph sketch after the script):
TypeError: Cannot use scipy.linalg.eig for sparse A with k >= N - 1. Use scipy.linalg.eig(A.toarray()) or reduce k.
The second Net model gets the following error:
ArpackError: ARPACK error 1: The Schur form computed by LAPACK routine slahqr could not be reordered by LAPACK routine strsen . Re-enter subroutine dneupd with IPARAM(5)=NCV and increase the size of the arrays DR and DI to have dimension at least dimension NCV and allocate at least NCV columns for Z. NOTE: Not necessary if Z and V share the same space. Please notify the authors if this error occurs.
import torch
from torch_geometric.datasets import MalNetTiny
import torch.nn.functional as F
from torch.nn import BatchNorm1d, Linear, ReLU, Sequential
from torch_geometric.loader import DataLoader
from torch_geometric.logging import init_wandb, log
from torch_geometric.nn import MLP, GINConv, global_add_pool
from torch.nn import BatchNorm1d as BatchNorm
import torch_geometric.transforms as T
from torch_scatter import segment_csr
from torch_geometric.data import Data
from torch_geometric.testing import onlyLinux
from torch_geometric.transforms import (
    AddLaplacianEigenvectorPE,
    AddRandomWalkPE,
    LocalDegreeProfile,
)

# AddLaplacianEigenvectorPE does not work:
transform = T.Compose([T.RemoveIsolatedNodes(), T.AddSelfLoops(), T.AddLaplacianEigenvectorPE(3, attr_name='x'), T.ToSparseTensor()])
# AddRandomWalkPE works:
# transform = T.Compose([T.RemoveIsolatedNodes(), T.AddSelfLoops(), T.AddRandomWalkPE(5, attr_name='x'), T.ToSparseTensor()])
train_dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='train', transform=transform)
val_dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='val', transform=transform)
test_dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='test', transform=transform)
train_loader = DataLoader(train_dataset, 256, shuffle=True, pin_memory=True)
val_loader = DataLoader(val_dataset, 256, shuffle=False)
test_loader = DataLoader(test_dataset, 256, shuffle=False)
# class Net(torch.nn.Module):
#     def __init__(self, in_channels, hidden_channels, out_channels, num_layers):
#         super().__init__()
#         self.convs = torch.nn.ModuleList()
#         self.batch_norms = torch.nn.ModuleList()
#         for i in range(num_layers):
#             mlp = Sequential(
#                 Linear(in_channels, 2 * hidden_channels),
#                 BatchNorm(2 * hidden_channels),
#                 ReLU(),
#                 Linear(2 * hidden_channels, hidden_channels),
#             )
#             conv = GINConv(mlp, train_eps=False).jittable()
#             self.convs.append(conv)
#             self.batch_norms.append(BatchNorm(hidden_channels))
#             in_channels = hidden_channels
#         self.lin1 = Linear(hidden_channels, hidden_channels)
#         self.batch_norm1 = BatchNorm(hidden_channels)
#         self.lin2 = Linear(hidden_channels, out_channels)
#
#     def forward(self, x, adj_t, batch):
#         for conv, batch_norm in zip(self.convs, self.batch_norms):
#             x = F.relu(batch_norm(conv(x, adj_t)))
#         # x = global_add_pool(x, batch)
#         x = segment_csr(x, batch)
#         x = F.relu(self.batch_norm1(self.lin1(x)))
#         x = F.dropout(x, p=0.5, training=self.training)
#         x = self.lin2(x)
#         return F.log_softmax(x, dim=-1)
class Net(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels, num_layers):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        for _ in range(num_layers):
            mlp = MLP([in_channels, hidden_channels, hidden_channels])
            self.convs.append(GINConv(nn=mlp, train_eps=False))
            in_channels = hidden_channels
        self.mlp = MLP([hidden_channels, hidden_channels, out_channels],
                       norm=None, dropout=0.5)

    def forward(self, x, edge_index, batch):
        for conv in self.convs:
            x = conv(x, edge_index).relu()
        x = segment_csr(x, batch)
        return self.mlp(x)
num_classes = 5
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net(train_dataset.num_features, 64, num_classes, 5).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
def train():
    model.train()
    total_loss = 0
    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        out = model(data.x, data.adj_t, data.ptr)
        loss = F.cross_entropy(out, data.y)
        loss.backward()
        optimizer.step()
        total_loss += float(loss) * data.num_graphs
    return total_loss / len(train_loader.dataset)

@torch.no_grad()
def test(loader):
    model.eval()
    total_correct = 0
    for data in loader:
        data = data.to(device)
        pred = model(data.x, data.adj_t, data.ptr).argmax(dim=-1)
        total_correct += int((pred == data.y).sum())
    return total_correct / len(loader.dataset)

for epoch in range(1, 5 + 1):
    loss = train()
    test_acc = test(test_loader)
    log(Epoch=epoch, Loss=loss, Test=test_acc)
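For what it's worth, the first error (the TypeError) does not need MalNetTiny at all: scipy's sparse eigs requires the number of requested eigenpairs to be strictly less than N - 1, so the transform fails on any graph that is too small for the chosen k. A minimal sketch with a hypothetical 4-node path graph (not taken from the dataset):

import torch
from torch_geometric.data import Data
import torch_geometric.transforms as T

# Hypothetical 4-node path graph. With k=3, the transform asks scipy's
# sparse solver for more eigenpairs than a 4-node Laplacian allows,
# so eigs() raises the TypeError quoted above.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])
data = Data(edge_index=edge_index, num_nodes=4)

T.AddLaplacianEigenvectorPE(3, attr_name='x')(data)
# TypeError: Cannot use scipy.linalg.eig for sparse A with k >= N - 1. ...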
I also have a PyTorch Lightning version:
import os.path as osp
import pytorch_lightning as pl
import torch
import torch.nn.functional as F
from torchmetrics import Accuracy
import torch_geometric.transforms as T
from torch_geometric.data.lightning import LightningDataset
from torch_geometric.datasets import TUDataset
from torch_geometric.nn import GIN, MLP
from torch_geometric.datasets import MalNetTiny
from torch_scatter import segment_csr
from pytorch_lightning import seed_everything
torch.set_float32_matmul_precision('high')
import warnings
warnings.filterwarnings('ignore', category=UserWarning, message='TypedStorage is deprecated')
from torch.nn import Linear, ReLU, Sequential
from torch.nn import BatchNorm1d as BatchNorm
from torch_geometric.nn import GINConv
class Model(pl.LightningModule):
    def __init__(self, in_channels: int, out_channels: int,
                 hidden_channels: int = 64, num_layers: int = 5,
                 dropout: float = 0.5):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        self.batch_norms = torch.nn.ModuleList()
        for i in range(num_layers):
            mlp = Sequential(
                Linear(in_channels, 2 * hidden_channels),
                BatchNorm(2 * hidden_channels),
                ReLU(),
                Linear(2 * hidden_channels, hidden_channels),
            )
            conv = GINConv(mlp, train_eps=False)  # .jittable()
            self.convs.append(conv)
            self.batch_norms.append(BatchNorm(hidden_channels))
            in_channels = hidden_channels
        self.lin1 = Linear(hidden_channels, hidden_channels)
        self.batch_norm1 = BatchNorm(hidden_channels)
        self.lin2 = Linear(hidden_channels, out_channels)
        self.train_acc = Accuracy(task='multiclass', num_classes=out_channels)
        self.val_acc = Accuracy(task='multiclass', num_classes=out_channels)
        self.test_acc = Accuracy(task='multiclass', num_classes=out_channels)

    def forward(self, x, adj_t, batch):
        for conv, batch_norm in zip(self.convs, self.batch_norms):
            x = F.relu(batch_norm(conv(x, adj_t)))
        # x = global_add_pool(x, batch)
        x = segment_csr(x, batch)
        x = F.relu(self.batch_norm1(self.lin1(x)))
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.lin2(x)
        return F.log_softmax(x, dim=-1)

    def training_step(self, data, batch_idx):
        y_hat = self(data.x, data.adj_t, data.ptr)
        loss = F.cross_entropy(y_hat, data.y)
        self.train_acc(y_hat.softmax(dim=-1), data.y)
        self.log('train_acc', self.train_acc, prog_bar=True, on_step=False,
                 on_epoch=True)
        return loss

    def validation_step(self, data, batch_idx):
        y_hat = self(data.x, data.adj_t, data.ptr)
        self.val_acc(y_hat.softmax(dim=-1), data.y)
        self.log('val_acc', self.val_acc, prog_bar=True, on_step=False,
                 on_epoch=True)

    def test_step(self, data, batch_idx):
        y_hat = self(data.x, data.adj_t, data.ptr)
        self.test_acc(y_hat.softmax(dim=-1), data.y)
        self.log('test_acc', self.test_acc, prog_bar=True, on_step=False,
                 on_epoch=True)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.0001)
if __name__ == '__main__':
    seed_everything(0, workers=True)
    # transform = T.Compose([T.RemoveIsolatedNodes(), T.AddSelfLoops(), T.LocalDegreeProfile(), T.ToSparseTensor()])
    transform = T.Compose([T.RemoveIsolatedNodes(), T.AddSelfLoops(), T.AddLaplacianEigenvectorPE(3, attr_name='x'), T.ToSparseTensor()])
    train_dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='train', transform=transform).shuffle()
    val_dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='val', transform=transform)
    test_dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='test', transform=transform)
    datamodule = LightningDataset(train_dataset, val_dataset, test_dataset,
                                  batch_size=256, num_workers=14)
    model = Model(test_dataset.num_node_features, test_dataset.num_classes)
    print(model)
    devices = torch.cuda.device_count()
    strategy = pl.strategies.DDPStrategy(accelerator='gpu')
    checkpoint = pl.callbacks.ModelCheckpoint(monitor='val_acc', save_top_k=1,
                                              mode='max')
    trainer = pl.Trainer(strategy=strategy, devices=devices, max_epochs=500,
                         log_every_n_steps=5, callbacks=[checkpoint], deterministic=True)
    trainer.fit(model, datamodule)
    trainer.test(ckpt_path='best', datamodule=datamodule)
The error:
scipy.sparse.linalg._eigen.arpack.arpack.ArpackError: ARPACK error 1: The Schur form computed by LAPACK routine slahqr could not be reordered by LAPACK routine strsen . Re-enter subroutine dneupd with IPARAM(5)=NCV and increase the size of the arrays DR and DI to have dimension at least dimension NCV and allocate at least NCV columns for Z. NOTE: Not necessary if Z and V share the same space. Please notify the authors if this error occurs.
Thanks for the example. I can reproduce it on example 957:
Data(edge_index=[2, 109], y=[1], num_nodes=51)
I am not totally sure why this happens, but it seems more related to an issue in scipy?
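For reference, a sketch to isolate that graph (assuming the dataset root from your script and the train split; the index may shift across versions):

import torch_geometric.transforms as T
from torch_geometric.datasets import MalNetTiny

# Load the split without any transform and pull out the failing graph.
dataset = MalNetTiny(root='/home/xxx/Datasets/MalNetTiny', split='train')
data = dataset[957]  # Data(edge_index=[2, 109], y=[1], num_nodes=51)

transform = T.Compose([
    T.RemoveIsolatedNodes(),
    T.AddSelfLoops(),
    T.AddLaplacianEigenvectorPE(3, attr_name='x'),
])
transform(data)  # raises the ArpackError on this graph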
Yes, I think so. I have no idea how to solve this issue.
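One workaround that might be worth trying: the ARPACK message suggests increasing NCV, and AddLaplacianEigenvectorPE forwards extra keyword arguments to the scipy eigensolver, so ncv can be raised from the transform. A hedged sketch (assuming your PyG version forwards these kwargs; not verified to fix this particular graph):

import torch_geometric.transforms as T

# Assumption: the transform passes **kwargs through to
# scipy.sparse.linalg.eigs, so ncv (the number of Arnoldi vectors) can be
# increased, as the ARPACK error message suggests.
transform = T.Compose([
    T.RemoveIsolatedNodes(),
    T.AddSelfLoops(),
    T.AddLaplacianEigenvectorPE(3, attr_name='x', ncv=20),
    T.ToSparseTensor(),
])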
I also get ARPACK error 3: "No shifts could be applied during a cycle of the implicitly restarted Arnoldi iteration."
I reproduced the same error. However, if you convert L to dense, it works, so maybe a try/except could switch to dense matrix computations. It seems like a convergence issue for the Arnoldi iterations and/or the matrix is ill-conditioned.
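A sketch of that try/except fallback, mirroring the transform's which='SR' convention (L is the scipy sparse Laplacian, e.g. from get_laplacian; this is a sketch of the idea, not PyG's actual implementation):

import numpy as np
from scipy.sparse.linalg import eigs, ArpackError

def laplacian_pe(L, k):
    """Compute k non-trivial Laplacian eigenvectors, falling back to a
    dense eigendecomposition when ARPACK fails on small or
    ill-conditioned graphs. L is a scipy sparse matrix."""
    try:
        eig_vals, eig_vecs = eigs(L, k=k + 1, which='SR',
                                  return_eigenvectors=True)
    except (ArpackError, TypeError):
        # Dense fallback: cheap for graphs like the failing 51-node example.
        eig_vals, eig_vecs = np.linalg.eig(L.toarray())
    order = eig_vals.real.argsort()
    eig_vecs = np.real(eig_vecs[:, order])
    return eig_vecs[:, 1:k + 1]  # drop the trivial constant eigenvector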