
Segmentation Fault

Open chituma110 opened this issue 5 years ago • 18 comments

I ran the six scripts in the examples folder. The arma.py, rgcn.py, and sgc.py scripts ran fine; however, cora.py and gat.py failed.

Error: Segmentation fault (core dumped)

System: Python 3.6.3, PyTorch 1.0.1.post2, torch-geometric 1.0.3

Ubuntu 14.04, CUDA 8.0, cuDNN 7.1.3
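
For anyone comparing setups, a small sketch like the one below can collect the same version details in one place. It uses only the standard library's `importlib.metadata` (Python 3.8+); the package list is an assumption based on the libraries named in this thread, and `installed_versions` is a hypothetical helper name.

```python
from importlib import metadata
import platform


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    # Assumed package list: the libraries mentioned in this thread.
    names = ["torch", "torch-scatter", "torch-sparse", "torch-cluster", "torch-geometric"]
    for name, version in installed_versions(names).items():
        print(name, version or "not installed")
```

This only reports what is installed; it does not show which compiler or CUDA toolkit the extensions were built with.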

chituma110 avatar Mar 11 '19 07:03 chituma110

Can you run the test suite of torch-scatter?

rusty1s avatar Mar 11 '19 07:03 rusty1s

Thank you for your reply. I ran the following torch-scatter test code:

input:

from torch_scatter import scatter_max
import torch

src = torch.tensor([[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]])
index = torch.tensor([[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]])
out, argmax = scatter_max(src, index)

output: Segmentation fault (core dumped)

How can I solve the problem?

chituma110 avatar Mar 11 '19 07:03 chituma110

Can you show me the output of a clean torch-scatter installation?

rm -rf build/ && python setup.py install

rusty1s avatar Mar 11 '19 09:03 rusty1s

running install running bdist_egg running egg_info writing torch_scatter.egg-info/PKG-INFO writing dependency_links to torch_scatter.egg-info/dependency_links.txt writing top-level names to torch_scatter.egg-info/top_level.txt reading manifest file 'torch_scatter.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' writing manifest file 'torch_scatter.egg-info/SOURCES.txt' installing library code to build/bdist.linux-x86_64/egg running install_lib running build_py creating build creating build/lib.linux-x86_64-3.6 creating build/lib.linux-x86_64-3.6/test copying test/init.py -> build/lib.linux-x86_64-3.6/test copying test/test_backward.py -> build/lib.linux-x86_64-3.6/test copying test/test_forward.py -> build/lib.linux-x86_64-3.6/test copying test/test_multi_gpu.py -> build/lib.linux-x86_64-3.6/test copying test/test_std.py -> build/lib.linux-x86_64-3.6/test copying test/utils.py -> build/lib.linux-x86_64-3.6/test creating build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/init.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/add.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/div.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/max.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/mean.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/min.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/mul.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/std.py -> build/lib.linux-x86_64-3.6/torch_scatter copying torch_scatter/sub.py -> build/lib.linux-x86_64-3.6/torch_scatter creating build/lib.linux-x86_64-3.6/torch_scatter/utils copying torch_scatter/utils/init.py -> build/lib.linux-x86_64-3.6/torch_scatter/utils copying torch_scatter/utils/ext.py -> build/lib.linux-x86_64-3.6/torch_scatter/utils copying torch_scatter/utils/gen.py -> build/lib.linux-x86_64-3.6/torch_scatter/utils running build_ext building 
'torch_scatter.scatter_cpu' extension creating build/temp.linux-x86_64-3.6 creating build/temp.linux-x86_64-3.6/cpu gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/TH -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/THC -I/home/amax/anaconda2/envs/pytorch10_python3/include/python3.6m -c cpu/scatter.cpp -o build/temp.linux-x86_64-3.6/cpu/scatter.o -Wno-unused-variable -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=scatter_cpu -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 cc1plus: warning: command line option \u2018-Wstrict-prototypes\u2019 is valid for C/ObjC but not for C++ g++ -pthread -shared -L/home/amax/anaconda2/envs/pytorch10_python3/lib -Wl,-rpath=/home/amax/anaconda2/envs/pytorch10_python3/lib,--no-as-needed -L/home/amax/anaconda2/envs/pytorch10_python3/lib -Wl,-rpath=/home/amax/anaconda2/envs/pytorch10_python3/lib,--no-as-needed build/temp.linux-x86_64-3.6/cpu/scatter.o -L/home/amax/anaconda2/envs/pytorch10_python3/lib -lpython3.6m -o build/lib.linux-x86_64-3.6/torch_scatter/scatter_cpu.cpython-36m-x86_64-linux-gnu.so building 'torch_scatter.scatter_cuda' extension creating build/temp.linux-x86_64-3.6/cuda gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/TH 
-I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/amax/anaconda2/envs/pytorch10_python3/include/python3.6m -c cuda/scatter.cpp -o build/temp.linux-x86_64-3.6/cuda/scatter.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=scatter_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 cc1plus: warning: command line option \u2018-Wstrict-prototypes\u2019 is valid for C/ObjC but not for C++ /usr/local/cuda/bin/nvcc -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/torch/csrc/api/include -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/TH -I/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/amax/anaconda2/envs/pytorch10_python3/include/python3.6m -c cuda/scatter_kernel.cu -o build/temp.linux-x86_64-3.6/cuda/scatter_kernel.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=scatter_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11 nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). 
g++ -pthread -shared -L/home/amax/anaconda2/envs/pytorch10_python3/lib -Wl,-rpath=/home/amax/anaconda2/envs/pytorch10_python3/lib,--no-as-needed -L/home/amax/anaconda2/envs/pytorch10_python3/lib -Wl,-rpath=/home/amax/anaconda2/envs/pytorch10_python3/lib,--no-as-needed build/temp.linux-x86_64-3.6/cuda/scatter.o build/temp.linux-x86_64-3.6/cuda/scatter_kernel.o -L/usr/local/cuda/lib64 -L/home/amax/anaconda2/envs/pytorch10_python3/lib -lcudart -lpython3.6m -o build/lib.linux-x86_64-3.6/torch_scatter/scatter_cuda.cpython-36m-x86_64-linux-gnu.so creating build/bdist.linux-x86_64 creating build/bdist.linux-x86_64/egg creating build/bdist.linux-x86_64/egg/test copying build/lib.linux-x86_64-3.6/test/init.py -> build/bdist.linux-x86_64/egg/test copying build/lib.linux-x86_64-3.6/test/test_backward.py -> build/bdist.linux-x86_64/egg/test copying build/lib.linux-x86_64-3.6/test/test_forward.py -> build/bdist.linux-x86_64/egg/test copying build/lib.linux-x86_64-3.6/test/test_multi_gpu.py -> build/bdist.linux-x86_64/egg/test copying build/lib.linux-x86_64-3.6/test/test_std.py -> build/bdist.linux-x86_64/egg/test copying build/lib.linux-x86_64-3.6/test/utils.py -> build/bdist.linux-x86_64/egg/test creating build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/init.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/add.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/div.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/max.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/mean.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/min.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/mul.py -> build/bdist.linux-x86_64/egg/torch_scatter copying 
build/lib.linux-x86_64-3.6/torch_scatter/scatter_cpu.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/scatter_cuda.cpython-36m-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/std.py -> build/bdist.linux-x86_64/egg/torch_scatter copying build/lib.linux-x86_64-3.6/torch_scatter/sub.py -> build/bdist.linux-x86_64/egg/torch_scatter creating build/bdist.linux-x86_64/egg/torch_scatter/utils copying build/lib.linux-x86_64-3.6/torch_scatter/utils/init.py -> build/bdist.linux-x86_64/egg/torch_scatter/utils copying build/lib.linux-x86_64-3.6/torch_scatter/utils/ext.py -> build/bdist.linux-x86_64/egg/torch_scatter/utils copying build/lib.linux-x86_64-3.6/torch_scatter/utils/gen.py -> build/bdist.linux-x86_64/egg/torch_scatter/utils byte-compiling build/bdist.linux-x86_64/egg/test/init.py to init.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/test/test_backward.py to test_backward.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/test/test_forward.py to test_forward.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/test/test_multi_gpu.py to test_multi_gpu.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/test/test_std.py to test_std.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/test/utils.py to utils.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/init.py to init.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/add.py to add.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/div.py to div.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/max.py to max.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/mean.py to mean.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/min.py to min.cpython-36.pyc byte-compiling 
build/bdist.linux-x86_64/egg/torch_scatter/mul.py to mul.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/std.py to std.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/sub.py to sub.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/utils/init.py to init.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/utils/ext.py to ext.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/utils/gen.py to gen.cpython-36.pyc creating stub loader for torch_scatter/scatter_cpu.cpython-36m-x86_64-linux-gnu.so creating stub loader for torch_scatter/scatter_cuda.cpython-36m-x86_64-linux-gnu.so byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/scatter_cpu.py to scatter_cpu.cpython-36.pyc byte-compiling build/bdist.linux-x86_64/egg/torch_scatter/scatter_cuda.py to scatter_cuda.cpython-36.pyc creating build/bdist.linux-x86_64/egg/EGG-INFO copying torch_scatter.egg-info/PKG-INFO -> build/bdist.linux-x86_64/egg/EGG-INFO copying torch_scatter.egg-info/SOURCES.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying torch_scatter.egg-info/dependency_links.txt -> build/bdist.linux-x86_64/egg/EGG-INFO copying torch_scatter.egg-info/top_level.txt -> build/bdist.linux-x86_64/egg/EGG-INFO writing build/bdist.linux-x86_64/egg/EGG-INFO/native_libs.txt zip_safe flag not set; analyzing archive contents... 
torch_scatter.pycache.scatter_cpu.cpython-36: module references file torch_scatter.pycache.scatter_cuda.cpython-36: module references file creating 'dist/torch_scatter-1.1.2-py3.6-linux-x86_64.egg' and adding 'build/bdist.linux-x86_64/egg' to it removing 'build/bdist.linux-x86_64/egg' (and everything under it) Processing torch_scatter-1.1.2-py3.6-linux-x86_64.egg removing '/home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch_scatter-1.1.2-py3.6-linux-x86_64.egg' (and everything under it) creating /home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch_scatter-1.1.2-py3.6-linux-x86_64.egg Extracting torch_scatter-1.1.2-py3.6-linux-x86_64.egg to /home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages torch-scatter 1.1.2 is already the active version in easy-install.pth

Installed /home/amax/anaconda2/envs/pytorch10_python3/lib/python3.6/site-packages/torch_scatter-1.1.2-py3.6-linux-x86_64.egg Processing dependencies for torch-scatter==1.1.2 Finished processing dependencies for torch-scatter==1.1.2

chituma110 avatar Mar 11 '19 09:03 chituma110

I created a FAQ for common installation errors. Let me know if this fixes your issues.

rusty1s avatar Mar 11 '19 20:03 rusty1s

I ran into the same error "Segmentation fault (core dumped)" for some of the examples.

jzhou316 avatar Mar 12 '19 19:03 jzhou316

The error "Segmentation fault" occurred for gat.py

ez4lionky avatar Mar 15 '19 09:03 ez4lionky

I ran into the same error: some examples worked fine and some did not. I also tried reinstalling the torch-scatter module, but still got a segmentation fault. How can I solve this problem?

weitianxin avatar Apr 08 '19 08:04 weitianxin

I believe all of the segfaults are caused by building with an older version of gcc. I'm using a conda environment, so I ran conda install gcc_linux-64 gxx_linux-64 and then reinstalled pytorch and each of the pytorch geometric libraries.

dblakely avatar Apr 27 '19 21:04 dblakely

I found that the "Segmentation Fault" happens specifically with the scatter_max function. That's why gat.py, etc. failed.

jzhou316 avatar May 21 '19 04:05 jzhou316

Hi, I hit this error on a server with gcc version 4.8.5. After I updated gcc using conda install -c omgarcia gcc-6 and reinstalled only torch-scatter, torch-sparse, torch-cluster, and torch-geometric, everything worked.
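
To see which gcc a build would pick up, a sketch along these lines may help. It calls `gcc -dumpversion`, which prints only the version number. The thread reports 4.8.5 failing and gcc 6 working but does not state an exact minimum, so no threshold is hard-coded here; `parse_dumpversion` and `gcc_version` are hypothetical helper names.

```python
import shutil
import subprocess


def parse_dumpversion(text):
    """Turn `gcc -dumpversion` output such as "4.8.5" or "9" into a tuple of ints."""
    stripped = text.strip()
    if not stripped:
        return None
    return tuple(int(part) for part in stripped.split("."))


def gcc_version(executable="gcc"):
    """Return the version of `executable`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-dumpversion"], capture_output=True, text=True)
    return parse_dumpversion(result.stdout)


if __name__ == "__main__":
    print("gcc on PATH:", gcc_version())
```

Tuples compare element by element, so `parse_dumpversion("4.8.5") < (6,)` is true, which matches the two data points reported above.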

dragen1860 avatar May 22 '19 06:05 dragen1860

I believe all of the segfaults are caused by building with an older version of gcc. I'm using a conda environment, so I ran conda install gcc_linux-64 gxx_linux-64 and then reinstalled pytorch and each of the pytorch geometric libraries.

Thanks a lot! This really helps.

ChandlerBang avatar Jun 05 '19 03:06 ChandlerBang

After running conda install gcc_linux-64 gxx_linux-64, gcc is not found. Can someone tell me how to activate the installed gcc?

yrf1 avatar Oct 11 '19 00:10 yrf1

You can try CC=... CXX=... pip install torch-scatter to explicitly set your path to gcc.
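
A possible reason gcc "is not found" after the conda install is that the conda compiler packages install binaries under prefixed names (for example `x86_64-conda_cos6-linux-gnu-gcc`) rather than a plain `gcc`. The sketch below looks for such binaries and passes them as CC and CXX to pip. The glob pattern and the helper names are assumptions, not something confirmed in this thread.

```python
import glob
import os
import subprocess
import sys


def find_conda_compilers(prefix):
    """Look for prefixed gcc/g++ binaries under <prefix>/bin; return CC/CXX or {}."""
    bin_dir = os.path.join(prefix, "bin")
    # Assumed naming scheme: <triplet>-linux-gnu-gcc and <triplet>-linux-gnu-g++.
    gcc = sorted(glob.glob(os.path.join(bin_dir, "*-linux-gnu-gcc")))
    gxx = sorted(glob.glob(os.path.join(bin_dir, "*-linux-gnu-g++")))
    if not gcc or not gxx:
        return {}
    return {"CC": gcc[0], "CXX": gxx[0]}


def install_with_compilers(package, compilers):
    """Run `pip install <package>` with CC and CXX set in the child environment."""
    env = dict(os.environ, **compilers)
    command = [sys.executable, "-m", "pip", "install", "--no-cache-dir", package]
    return subprocess.run(command, env=env)


if __name__ == "__main__":
    found = find_conda_compilers(os.environ.get("CONDA_PREFIX", sys.prefix))
    print(found or "no prefixed conda compilers found")
```

Calling `install_with_compilers("torch-scatter", found)` would then be the scripted equivalent of the CC=... CXX=... command above; `--no-cache-dir` is there so pip does not reuse a wheel built earlier with the old compiler.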

rusty1s avatar Oct 11 '19 04:10 rusty1s

After running conda install gcc_linux-64 gxx_linux-64, gcc is not found. Can someone tell me how to activate the installed gcc?

I ran into the same error, how did you fix it?

prokia avatar Nov 01 '19 09:11 prokia

I found that the "Segmentation Fault" happens specifically with the scatter_max function. That's why gat.py, etc. failed.

@jzhou316 I ran into the same problem. Do you know how to address it? Thank you!

bbjy avatar Oct 31 '21 08:10 bbjy

Can you ensure that scatter_max works for you?

import torch
from torch_scatter import scatter_max

src = torch.tensor([[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]]).cuda()
index = torch.tensor([[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]]).cuda()

out, argmax = scatter_max(src, index, dim=-1)
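
Since a segfault kills the interpreter that triggers it, it can help to run this check in a child process and inspect the exit status. The sketch below does that for both the CPU and the CUDA variant. `run_isolated` is a hypothetical helper, and the segfault detection relies on POSIX behaviour, where a process killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

# The check from this thread; {move} is "" for CPU or ".cuda()" for GPU.
CHECK = """
import torch
from torch_scatter import scatter_max

src = torch.tensor([[2, 0, 1, 4, 3], [0, 2, 1, 3, 4]]){move}
index = torch.tensor([[4, 5, 4, 2, 3], [0, 0, 2, 2, 1]]){move}
out, argmax = scatter_max(src, index, dim=-1)
print(out)
"""


def run_isolated(code):
    """Run `code` in a fresh interpreter and classify how it ended."""
    result = subprocess.run([sys.executable, "-c", code], capture_output=True, text=True)
    if result.returncode == 0:
        return "ok"
    if result.returncode == -signal.SIGSEGV:
        return "segfault"
    return "error"  # e.g. ImportError or no CUDA device


if __name__ == "__main__":
    print("cpu:", run_isolated(CHECK.format(move="")))
    print("cuda:", run_isolated(CHECK.format(move=".cuda()")))
```

If only one of the two variants reports "segfault", that narrows the problem to the corresponding compiled extension.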

rusty1s avatar Nov 02 '21 07:11 rusty1s

In my case, I had to add /usr/local/cuda/bin to my PATH and reinstall torch-scatter and torch-sparse. After that, GATConv worked.
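
A quick way to check whether nvcc is visible before reinstalling is sketched below. The default directory is the one from this comment and may differ on other machines; `with_cuda_on_path` is a hypothetical helper that returns a modified copy of the environment without touching the current process.

```python
import os
import shutil


def with_cuda_on_path(env, cuda_bin="/usr/local/cuda/bin"):
    """Return a copy of `env` whose PATH starts with `cuda_bin` (no duplicates)."""
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p and p != cuda_bin]
    updated = dict(env)
    updated["PATH"] = os.pathsep.join([cuda_bin] + entries)
    return updated


if __name__ == "__main__":
    print("nvcc before:", shutil.which("nvcc"))
    env = with_cuda_on_path(os.environ)
    print("nvcc after: ", shutil.which("nvcc", path=env["PATH"]))
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking pip, so the reinstall sees the CUDA toolchain.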

thunderock avatar Sep 14 '22 16:09 thunderock

Issue:

After installing pytorch and pyg in a fresh conda environment following the quickstart docs, I got a segmentation fault.

System: Ubuntu with CUDA 11.7, pytorch=1.13.1

Solved by the following script in another freshly created environment:

conda create -n pmg-2023 python=3.8
conda activate pmg-2023
conda install gcc_linux-64 gxx_linux-64
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
pip install pyg-lib torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.13.0+cu117.html

Note that I did not modify the pip install command from the quickstart page. It might be better for it to say 1.13.1, since that is the pytorch version I installed.
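
The mismatch mentioned here can be spotted mechanically. The sketch below pulls the torch version and CUDA tag out of a wheel index URL of the form used above and compares the version with an installed version string. Both function names are hypothetical, and whether a patch-level difference actually matters for these wheels is not established in this thread, so a mismatch is only a hint.

```python
import re


def wheel_index_target(url):
    """Extract (torch_version, cuda_tag) from a URL ending in torch-<ver>+<tag>.html."""
    match = re.search(r"torch-(\d+\.\d+\.\d+)\+(\w+)\.html$", url)
    if match is None:
        return None
    return match.group(1), match.group(2)


def matches_installed(url, installed):
    """Compare the URL's torch version with an installed version string."""
    target = wheel_index_target(url)
    if target is None:
        return False
    # Drop a local suffix such as "+cu117" before comparing.
    return installed.split("+")[0] == target[0]


if __name__ == "__main__":
    url = "https://data.pyg.org/whl/torch-1.13.0+cu117.html"
    print(wheel_index_target(url), matches_installed(url, "1.13.1"))
```

In a live environment, `torch.__version__` would be the natural value to pass as `installed`.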

Updating gcc via conda does not fix my broken old environment with segfault.

conda install gcc_linux-64 gxx_linux-64

Then

conda uninstall pytorch
conda uninstall pyg

Finally, reinstall again

conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
conda install pyg -c pyg

The segmentation fault still shows up in the same conda environment.

dummyindex avatar Jan 24 '23 09:01 dummyindex

Also experiencing this

kylesargent avatar May 28 '24 02:05 kylesargent