neuralcoref
neuralcoref copied to clipboard
Makes prediction work on GPUs
When you use numpy.zeros
to create the embed_arr
, you can't later add it to embed_vector
, because embed_vector
might not be a numpy array. This re-works the code such that the array type of embed_vector
is preserved all the way through.
I stole this approach from https://github.com/explosion/spaCy/pull/3362. Thanks, @danielkingai2!
To help those who Google, the error message you get ends with
File "neuralcoref.pyx", line 596, in neuralcoref.neuralcoref.NeuralCoref.__call__
File "neuralcoref.pyx", line 723, in neuralcoref.neuralcoref.NeuralCoref.predict
File "neuralcoref.pyx", line 913, in neuralcoref.neuralcoref.NeuralCoref.get_mention_embeddings
File "neuralcoref.pyx", line 904, in neuralcoref.neuralcoref.NeuralCoref.get_average_embedding
File "cupy/core/core.pyx", line 1238, in cupy.core.core.ndarray.__array_ufunc__
File "cupy/core/_kernel.pyx", line 816, in cupy.core._kernel.ufunc.__call__
File "cupy/core/_kernel.pyx", line 99, in cupy.core._kernel._preprocess_args
TypeError: Unsupported type <class 'numpy.ndarray'>
Though it does seem there are other problems too when you run on GPU.
I fixed all the other problems I am aware of at this point. On my machine, it runs about 8x faster on one GPU.
@thomwolf Can you take a look at this PR? I am trying to use neuralcoref
on a large corpus but w/o GPU it's taking too much of time.
I could get it to work on GPU using this fix, thanks @dirkgr!
Hi guys this is awesome work is it possible to merge the PR ?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi guys this is awesome work is it possible to merge the PR ?
Looks like the answer is "no".
Sorry for the late response to this. The PR closed automatically but I think this is valuable work so I reopened. We're probably going to work on a closer integration with spaCy
& thinc
too.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Any updates on this?
Would love some updates on this!
I want to keep this PR open as the code may be useful for those who want to build from source and try this out.
However moving forward, when spaCy v.3 will be released, we'll update this code significantly to be compatible with Thinc 8. At that point, GPU support will be automatic...
For those trying to make the pipeline work with GPU support on spaCy > 2.1, here's the additional step for patching prior to installing from source (after activating your venv and pip install
ing spaCy):
git clone https://github.com/huggingface/neuralcoref.git
cd neuralcoref
git fetch origin pull/149/head:gpufix
git checkout gpufix
pip install -r requirements.txt
pip install -e .
Any progress on spaCy v.3 integration?
You mean the v3 that was released yesterday? ;-)
It's definitely on our roadmap, but it's not the only thing we're working on ;-)
@svlandeg Friendly ping :)
Hi! Please refer to https://github.com/huggingface/neuralcoref/issues/295#issuecomment-859005189 for more info :-)
@svlandeg Thx for the update, I'm curious wether the updated implementation target the latest state of the art (cf https://github.com/huggingface/neuralcoref/issues/334 ) AKA 81% accuracy on Ontonotes (or at least 79% since the second paper is quite old already)