
[POC] implement spaCy nlp pipeline to Datashare in python

Open mvanzalu opened this issue 3 years ago • 1 comments

Is your feature request related to a problem? Please describe. Add a new NLP pipeline for spaCy, which is a powerful NLP library.

mvanzalu avatar Sep 15 '22 13:09 mvanzalu

This requires a significant architecture decision: spaCy is written in Python. So you can either call a shell script from Java (and assume that spaCy is installed on the system, or provide a way to install it), use a JNI interface with C++ bindings (if spaCy uses any), or call a server through some API (HTTP, for example) and operate that server (installing, starting, stopping it, and so on).
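To make the first option more concrete, here is a minimal sketch of the Python side of a shell-out integration: a script that reads text on stdin and writes the named entities as JSON on stdout, which the Java side could then parse. The function names, the JSON field names, and the `en_core_web_sm` model are assumptions for illustration, not anything Datashare defines.

```python
import json
import sys


def extract_entities(text, nlp):
    """Run an NLP callable on text and return its named entities as plain dicts.

    `nlp` is expected to behave like a loaded spaCy pipeline: calling it
    returns a document whose `.ents` expose text, label_, start_char, end_char.
    """
    doc = nlp(text)
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,
            "end": ent.end_char,
        }
        for ent in doc.ents
    ]


def main():
    # Imported lazily so extract_entities can be used without spaCy installed.
    import spacy

    # Assumption: this model was downloaded beforehand on the host system.
    nlp = spacy.load("en_core_web_sm")
    json.dump(extract_entities(sys.stdin.read(), nlp), sys.stdout)
```

Calling `main()` from a script entry point would let the Java process pipe a document in and read JSON back. The cost of this approach is that the model is reloaded on every invocation, which is one reason a long-running server may be preferable.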

For example, https://github.com/manzurola/spaCy4j relies on a spaCy server, which is an HTTP server. spaCy4j turns Java API calls into HTTP requests against that server's HTTP API.
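The server option could look roughly like the sketch below, which uses only the Python standard library for the HTTP part. It is not the API of the server used by spaCy4j: the POST-a-plain-text-body convention, the JSON response shape, and the names `make_handler`, `make_server`, and `spacy_extractor` are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(extract):
    """Build a handler class that applies `extract` (text -> list of dicts)."""

    class NerHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # The request body is the raw UTF-8 text to analyse (an assumption).
            length = int(self.headers.get("Content-Length", 0))
            text = self.rfile.read(length).decode("utf-8")
            body = json.dumps(extract(text)).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # silence per-request logging in this sketch

    return NerHandler


def spacy_extractor(model_name="en_core_web_sm"):
    """Load a spaCy model once and return a text -> entities function."""
    import spacy  # lazy import: only needed when spaCy is actually used

    nlp = spacy.load(model_name)

    def extract(text):
        return [{"text": e.text, "label": e.label_} for e in nlp(text).ents]

    return extract


def make_server(extract, host="127.0.0.1", port=0):
    """Create the server; port 0 lets the OS choose a free port."""
    return HTTPServer((host, port), make_handler(extract))
```

Something like `make_server(spacy_extractor(), port=8080).serve_forever()` would start it, and the Java side would then only need an HTTP client. The model is loaded once at startup, but the server still has to be installed, started, and stopped alongside the Java application.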

bamthomas avatar Sep 19 '22 13:09 bamthomas

This issue is stale because it has been open for 40 days with no activity.

github-actions[bot] avatar Nov 11 '22 00:11 github-actions[bot]

Planned for Q4/2022 - Q1/2023.

pirhoo avatar Nov 14 '22 10:11 pirhoo

This issue is stale because it has been open for 40 days with no activity.

github-actions[bot] avatar Dec 25 '22 00:12 github-actions[bot]

This issue was closed because it has been inactive for 20 days since being marked as stale.

github-actions[bot] avatar Jan 15 '23 00:01 github-actions[bot]