
Segmentation fault in shuffle

Open WasifurRahman opened this issue 8 years ago • 4 comments

I am getting the following message on Ubuntu 14.04:

5877 Segmentation fault (core dumped) $BUILDDIR/shuffle -memory $MEMORY -verbose $VERBOSE < $COOCCURRENCE_FILE > $COOCCURRENCE_SHUF_FILE

WasifurRahman avatar Sep 30 '16 01:09 WasifurRahman

Interesting. To help debugging, can you modify the demo.sh file on that line to instead read:

gdb -ex=r --args $BUILDDIR/shuffle -memory $MEMORY -verbose $VERBOSE < $COOCCURRENCE_FILE > $COOCCURRENCE_SHUF_FILE

And when you get the segfault, type bt and paste the results below?


ghost avatar Sep 30 '16 03:09 ghost

@Russell91 Hi. I started using GloVe today. I'm getting the same error. After modifying the demo.sh file to what you wrote above, I'm getting this output.

grimangel@HP-Pavilion-g6-Notebook-PC:~/Documents/GloVe-master$ ./demo.sh python
mkdir -p build
gcc src/glove.c -o build/glove -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
gcc src/shuffle.c -o build/shuffle -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
gcc src/cooccur.c -o build/cooccur -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
gcc src/vocab_count.c -o build/vocab_count -lm -pthread -Ofast -march=native -funroll-loops -Wno-unused-result
$ build/vocab_count -min-count 5 -verbose 2 < text8 > vocab.txt
BUILDING VOCABULARY
Processed 17005207 tokens.
Counted 253854 unique words.
Truncating vocabulary at min count 5.
Using vocabulary of size 71290.

$ build/cooccur -memory 4.0 -vocab-file vocab.txt -verbose 2 -window-size 15 < text8 > cooccurrence.bin
COUNTING COOCCURRENCES
window size: 15
context: symmetric
max product: 13752509
overflow length: 38028356
Reading vocab from file "vocab.txt"...loaded 71290 words.
Building lookup table...table contains 94990279 elements.
Processed 17005206 tokens.
Writing cooccurrences to disk.........2 files in total.
Merging cooccurrence files: processed 60666466 lines.

$ build/shuffle -memory 4.0 -verbose 2 < cooccurrence.bin > cooccurrence.shuf.bin
SHUFFLING COOCCURRENCES
array size: 255013683
Shuffling by chunks: processed 0 lines.124 ../sysdeps/x86_64/multiarch/../memcpy.S: No such file or directory.
Undefined command: "". Try "help".
Undefined command: "fI9". Try "help".
[... many more "Undefined command" lines of the same kind ...]
Invalid character '�' in expression.
A syntax error in expression, near `}@'.
Undefined info command: "@". Try "help info".
Undefined command: "z". Try "help".
TUI mode not allowed
warning: bad breakpoint number at or near '��V�@'
Ambiguous command "w@": .
Undefined info command: "׋/s@". Try "help info".
Ambiguous command "o@": .
SHUFFLING COOCCURRENCES
array size: 127506841
Shuffling by chunks: processed 60664008 lines.
Wrote 1 temporary file(s).
Merging temp files: processed 0 lines.bt
Merging temp files: processed 60664008 lines.

$ build/glove -save-file vectors -threads 8 -input-file cooccurrence.shuf.bin -x-max 10 -iter 15 -vector-size 50 -binary 2 -vocab-file vocab.txt -verbose 2
TRAINING MODEL
Read 60664181 lines.
Initializing parameters...done.
vector size: 50
vocab size: 71290
x_max: 10.000000
alpha: 0.750000
./demo.sh: line 41: 6599 Segmentation fault (core dumped) $BUILDDIR/glove -save-file $SAVE_FILE -threads $NUM_THREADS -input-file $COOCCURRENCE_SHUF_FILE -x-max $X_MAX -iter $MAX_ITER -vector-size $VECTOR_SIZE -binary $BINARY -vocab-file $VOCAB_FILE -verbose $VERBOSE

vermaarjun7 avatar Nov 06 '16 16:11 vermaarjun7

If you use gdb directly, i.e.

gdb -ex=r build/shuffle -memory 4.0 -verbose 2 < cooccurrence.bin > cooccurrence.shuf.bin

and then type bt after the crash, what line did it fail on? Also, what are your system specs?
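A note on the redirection, since it explains the stream of "Undefined command" errors earlier in this thread: when `<` and `>` appear on the shell command line, the shell attaches them to gdb itself, so gdb reads cooccurrence.bin as if it were typed commands. Putting the redirection inside gdb's own `run` command applies it to the debugged program instead. A minimal sketch, assuming a POSIX shell and paths without spaces; `gdb_backtrace` is a hypothetical helper name, not something shipped with GloVe:

```shell
#!/bin/sh
# Hypothetical helper (not part of GloVe): run a program under gdb in batch
# mode and print a backtrace once it stops.
# Usage: gdb_backtrace PROGRAM INPUT_FILE OUTPUT_FILE [PROGRAM_ARGS...]
gdb_backtrace() {
    prog=$1
    infile=$2
    outfile=$3
    shift 3
    # The redirections are part of gdb's `run` command, so they apply to the
    # debugged program; gdb's own stdin is left alone.
    # Note: paths or arguments containing spaces are not handled here.
    gdb -batch -ex "run $* < $infile > $outfile" -ex "bt" "$prog"
}
```

For the case in this issue it would be called as `gdb_backtrace build/shuffle cooccurrence.bin cooccurrence.shuf.bin -memory 4.0 -verbose 2`. The backtrace goes to the terminal while the program's output still lands in cooccurrence.shuf.bin.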

ghost avatar Nov 20 '16 00:11 ghost

I also see this error.

To get the bt output per @Russell91's request, I ran gdb build/shuffle, passed the arguments at the (gdb) prompt using r as shown below to reproduce the segfault, and then ran bt at the next (gdb) prompt:

$ gdb build/shuffle
GNU gdb (GDB) Fedora 7.10.1-31.fc23
[...]
Reading symbols from build/shuffle...done.

(gdb) r -memory 4.0 -verbose 2 < cooccurrence.bin > cooccurrence.shuf.bin

Starting program: /home/user/projs/3rd/GloVe/build/shuffle -memory 4.0 -verbose 2 < cooccurrence.bin > cooccurrence.shuf.bin
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
SHUFFLING COOCCURRENCES
array size: 255013683
Shuffling by chunks: processed 0 lines.
Program received signal SIGSEGV, Segmentation fault.
__memcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:124
124		movq	%rcx,  (%rdi)

(gdb) bt
#0  __memcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:124
#1  0x00007ffff7574563 in __GI__IO_file_xsgetn (fp=0x7ffff78b6900 <_IO_2_1_stdin_>, data=<optimized out>, n=16) at fileops.c:1383
#2  0x00007ffff7569926 in __GI__IO_fread (buf=<optimized out>, size=16, count=1, fp=0x7ffff78b6900 <_IO_2_1_stdin_>) at iofread.c:42
#3  0x0000000000401d94 in shuffle_by_chunks ()
#4  0x00007ffff751b580 in __libc_start_main (main=0x400a80 <main>, argc=5, argv=0x7fffffffe518, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7fffffffe508) at libc-start.c:289
#5  0x0000000000400c69 in _start ()

(the output from bt is at the end)
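One thing that may help whoever picks this up: the crash happens on the very first 16-byte fread (frame #2), before any lines are processed, so the size of the buffer being filled is worth a look. The sketch below is an assumption inferred from the numbers in the logs, not taken from the shuffle source: it guesses that the record count is 0.95 × memory (GiB) × 2^30 / 16 bytes per record, which does reproduce the reported array size of 255013683 for -memory 4.0. The helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not part of GloVe).
# ASSUMPTION: record count = 0.95 * memory_in_GiB * 2^30 / record_bytes,
# with 16-byte records (the fread size shown in frame #2 of the backtrace).

# Usage: shuffle_array_size MEMORY_GIB [RECORD_BYTES]
shuffle_array_size() {
    awk -v mem="$1" -v rec="${2:-16}" \
        'BEGIN { printf "%.0f\n", int(0.95 * mem * 1073741824 / rec) }'
}

# Usage: shuffle_array_bytes MEMORY_GIB [RECORD_BYTES]
# Bytes the buffer would need under the same assumption.
shuffle_array_bytes() {
    awk -v n="$(shuffle_array_size "$1" "${2:-16}")" -v rec="${2:-16}" \
        'BEGIN { printf "%.0f\n", n * rec }'
}
```

Under that assumption, `shuffle_array_size 4.0` gives 255013683 records and `shuffle_array_bytes 4.0` gives 4080218928 bytes, i.e. roughly 4.08 GB in one allocation. If that allocation fails on a machine with less free memory and the result is not checked, the first read into the buffer would fault, which would be consistent with the backtrace. That is only a guess; rerunning with a smaller -memory value would be a cheap way to test it.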

willy-b avatar Jan 19 '17 07:01 willy-b