datashare
[POC] Implement a spaCy NLP pipeline for Datashare in Python
Is your feature request related to a problem? Please describe. Add a new NLP pipeline for spaCy, which is a powerful library.
This requires a significant architecture decision: spaCy is written in Python, so it cannot be called directly from Java. The options are: call a shell script from Java (and either assume that spaCy is installed on the system or provide a way to install it); use a JNI interface with C++ bindings (if spaCy uses any); or call a server through some API (HTTP, for example) and operate that server (i.e. install, start, stop...).
For example, https://github.com/manzurola/spaCy4j relies on a spaCy server, which is an HTTP server. spaCy4j uses the HTTP API to turn Java API calls into HTTP requests.
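To make the server option more concrete, here is a minimal sketch of what the Python side could look like, using only the standard library for the HTTP part. The `/ner` route, the JSON request and response shapes, and the helper names (`spacy_extractor`, `make_handler`, `serve`) are assumptions made for illustration; they are not the API of spaCy4j, of its spaCy server, or of Datashare. The sketch also assumes that spaCy and the `en_core_web_sm` model are installed, and it does no input validation.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def spacy_extractor(model_name="en_core_web_sm"):
    """Build a text -> entities function backed by spaCy.

    Assumption: spaCy and the named model are installed. The import is
    lazy so the HTTP code below can be exercised without spaCy.
    """
    import spacy

    nlp = spacy.load(model_name)

    def extract(text):
        doc = nlp(text)
        return [
            {"text": ent.text, "label": ent.label_,
             "start": ent.start_char, "end": ent.end_char}
            for ent in doc.ents
        ]

    return extract


def make_handler(extract):
    """Create a request handler class that delegates NER to `extract`."""

    class NerHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Hypothetical route: POST /ner with a body like {"text": "..."}
            if self.path != "/ner":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            entities = extract(payload.get("text", ""))
            body = json.dumps({"entities": entities}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            # Silence the default per-request logging on stderr.
            pass

    return NerHandler


def serve(extract, host="127.0.0.1", port=0):
    """Return an HTTPServer bound to host:port (port 0 picks a free port)."""
    return HTTPServer((host, port), make_handler(extract))
```

With spaCy installed, `serve(spacy_extractor(), port=8080).serve_forever()` would start the server, and the Java side would then only need an HTTP client that POSTs text to `/ner`. Installing, starting, and stopping this process would still have to be handled on the Datashare side, which is the operational cost mentioned above.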
This issue is stale because it has been open for 40 days with no activity.
Planned for Q4/2022 - Q1/2023.
This issue is stale because it has been open for 40 days with no activity.
This issue was closed because it has been inactive for 20 days since being marked as stale.