spaCy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Error on Windows 11 with Python 3.14 and the latest downloadable spaCy: D:\work\unir\materias\Procesamiento Lenguaje Natural>python -m spacy download es_core_news_sm C:\Users\Jim\AppData\Roaming\Python\Python314\site-packages\confection\__init__.py:38: UserWarning: Core Pydantic V1 functionality isn't compatible with Python...
## Problem `displacy.render` seems to be broken when used in Jupyter mode with a current IPython version. The method raises an `ImportError: cannot import name 'display' from 'IPython.core.display'` when in...
According to the IPython docs (https://ipython.readthedocs.io/en/stable/whatsnew/version7.html#pending-deprecated-imports), `IPython.core.display` was deprecated in IPython 7.14 and should be replaced by `IPython.display`. ## How to reproduce the behaviour Example code in Jupyter: ```python import spacy...
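The reproduction snippet above is cut off; a fuller version might look like the sketch below. The model name and example sentence are assumptions, not taken from the report.

```python
import spacy
from spacy import displacy

# Hypothetical reproduction of the report above: rendering inside a notebook
# with jupyter=True goes through spaCy's IPython integration, which is where
# the deprecated `IPython.core.display` import is triggered.
nlp = spacy.load("en_core_web_sm")  # model choice is an assumption
doc = nlp("Autonomous cars shift insurance liability toward manufacturers.")
displacy.render(doc, style="dep", jupyter=True)
```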
This PR enhances support for Spanish (es) and Portuguese (pt) in their respective `spacy/lang` modules by updating the `lex_attrs.py` files. Each change is accompanied by regression tests in their `test_text.py`...
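As a rough illustration of what such a regression test can look like (the specific number words below are assumptions, not necessarily the ones added in this PR):

```python
# Hypothetical test in the style of spacy/tests/lang/es/test_text.py; the
# parametrized words are illustrative examples only.
import pytest
from spacy.lang.es.lex_attrs import like_num

@pytest.mark.parametrize("word", ["dos", "veinte", "mil"])
def test_es_like_num(word):
    assert like_num(word)
```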
Noticed the following error on the CI runs; it might be due to a pydantic incompatibility. ## How to reproduce the behaviour ```python from spacy.language import Language ``` ### Traceback ```sh...
## Description This PR resolves issue #13883, in which sentence segmentation was incorrect for text containing quoted dialogue, particularly with guillemets (`« »`). ### The Bug The `Sentencizer` did not track...
## Description ### Types of change ## Checklist - [x] I confirm that I have the right to submit this contribution under the project's MIT license. - [x] I ran...
Sentence segmentation doesn't seem to handle guillemets `«` / `»`. I end up with very long sentences merged together when there is dialogue. I see an old PR that added...
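A minimal way to inspect the behaviour described above, using the rule-based `sentencizer` on a blank French pipeline; the example text is an assumption.

```python
import spacy

# Hypothetical sketch: run the sentencizer over dialogue wrapped in
# guillemets and print the resulting sentence boundaries, to check whether
# quoted lines end up merged into one long sentence as reported.
nlp = spacy.blank("fr")
nlp.add_pipe("sentencizer")

text = "« Bonjour, dit-il. Comment allez-vous ? » « Très bien, merci. » répondit-elle."
doc = nlp(text)
for sent in doc.sents:
    print(repr(sent.text))
```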
## Description ### Types of change ```bash codespell --skip="./spacy/lang/*,./spacy/tests/lang/*,./spacy/util.py,*.json*,*.pyx,*.svg" \ --ignore-words-list=bu,fo,fpr,ines,ist,nam,nd,noo,notin,oder,pres,sie,te,teh,testin,uner,varius \ --write-changes ``` https://pypi.org/project/codespell Unfortunately, the required `black` and `isort` formatting and `mypy` fixes make the diff more difficult...
As it says in the title, the lemmatisation of "whitelisting" is wrong. ## How to reproduce the behaviour ``` >>> import spacy >>> nlp = spacy.load("en_core_web_sm") >>> sent = nlp("I...
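The snippet above is truncated; a complete reproduction along those lines might look like this (the example sentence is a guess, only the claim that the lemma of "whitelisting" comes out wrong is from the report):

```python
import spacy

# Hypothetical completion of the truncated snippet above: print each token's
# lemma so the lemmatisation of "whitelisting" can be inspected.
nlp = spacy.load("en_core_web_sm")
doc = nlp("I am whitelisting the new domains.")
for token in doc:
    print(token.text, token.lemma_)
```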