Prodigy and version of sense2vec - process is constantly killed

Open kuatroka opened this issue 5 years ago • 3 comments

Hi, when I follow this tutorial on how to combine Prodigy and the 2019 version of sense2vec, I constantly get the CLI message "Killed", with no further description of what to do to correct it. This only happens with the s2v_reddit_2019_lg/s2v_reddit_2019_lg version. The s2v_reddit_2015_md/s2v_old version works perfectly with the same parameters.

In the CLI I run prodigy sense2vec.teach ner-client-dataset ./assets/s2v_reddit_2019_lg/s2v_reddit_2019_lg --seeds "Walmart, Apple"

and I get Killed

When I use prodigy sense2vec.teach ner-client-dataset ./assets/s2v_reddit_2015_md/s2v_old --seeds "Walmart, Apple", everything works fine.

Thanks

kuatroka avatar Dec 15 '20 20:12 kuatroka

Hey, it most likely gets killed due to memory issues. The 2015 edition is just a gig, while the 2019 version is 3.9 GB in size alone. That means a lot more memory usage, and when resources are exhausted the system terminates the process.
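
One way to check this before launching Prodigy is to compare the size of the vectors directory on disk with the memory the machine currently has free. The snippet below is only a rough sketch: the helper names are made up for illustration, the `/proc/meminfo` lookup assumes Linux, and the `headroom` factor of 2.0 is a guess, since the actual in-memory footprint of the loaded vectors is not stated anywhere in this thread.

```python
from pathlib import Path


def dir_size_bytes(path):
    """Total size in bytes of all regular files under `path`."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def available_memory_bytes(meminfo_path="/proc/meminfo"):
    """Read MemAvailable from /proc/meminfo (Linux only); None if not found."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    # The kernel reports this value in kB.
                    return int(line.split()[1]) * 1024
    except OSError:
        pass
    return None


def likely_fits(model_bytes, available_bytes, headroom=2.0):
    """Heuristic (assumption): require `headroom` x the on-disk size to be free."""
    if available_bytes is None:
        return False
    return model_bytes * headroom <= available_bytes


if __name__ == "__main__":
    # Path taken from the command in the original report.
    model_dir = Path("./assets/s2v_reddit_2019_lg/s2v_reddit_2019_lg")
    if model_dir.is_dir():
        size = dir_size_bytes(model_dir)
        avail = available_memory_bytes()
        print(f"on disk: {size / 1e9:.1f} GB, available RAM (bytes): {avail}")
        print("likely fits" if likely_fits(size, avail) else "may run out of memory")
```

If the check reports that the vectors may not fit, that is consistent with the process being terminated by the system, but it does not prove it; the kernel log is the place to confirm an out-of-memory kill.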

abishekvashok avatar May 07 '21 12:05 abishekvashok

I have the same problem! I have trained my own S2V, but as soon as I run it, it kills the kernel.

myeghaneh avatar Oct 15 '21 14:10 myeghaneh

This is essentially a RAM-related issue. You need lots of RAM. We were having the same problem and we tackled it using a dedicated server from Hetzner. They have some 512 GB RAM boxes in their "Server Auction" section which are pretty cost-effective.

corradofiore avatar Jul 21 '23 04:07 corradofiore