
💫 Industrial-strength Natural Language Processing (NLP) in Python

Results 210 spaCy issues

## Description Remove default stop words Stop words are task-specific and attempting to maintain "general-purpose" stop word lists for many different languages is not feasible. None of the underlying functionality...

enhancement
🔜 v4.0
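To illustrate why the stop-word proposal above treats stop words as task-specific, here is a minimal sketch in plain Python. It does not use spaCy's API; the helper `content_tokens` and both stop-word sets are hypothetical names chosen for illustration. The same sentence is filtered with two caller-supplied lists, and the two tasks disagree on whether "not" should be dropped.

```python
import re


def content_tokens(text, stop_words):
    """Lowercase the text, split it into word tokens, and drop the
    caller-supplied stop words (hypothetical helper, not a spaCy API)."""
    tokens = re.findall(r"\w+", text.lower())
    return [t for t in tokens if t not in stop_words]


# Hypothetical task-specific lists: a search index may discard "not",
# while a sentiment task has to keep it because it flips the meaning.
SEARCH_STOPS = frozenset({"the", "a", "is", "of", "not"})
SENTIMENT_STOPS = frozenset({"the", "a", "is", "of"})

text = "The plot is not good"
search = content_tokens(text, SEARCH_STOPS)
sentiment = content_tokens(text, SENTIMENT_STOPS)
print(search)
print(sentiment)
```

Passing the stop list as an argument, rather than relying on a built-in default, keeps the choice visible at the call site.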

## Description Refactor pipe(as_tuples) into a separate method ### Types of change ? ## Checklist - [x] I confirm that I have the right to submit this contribution under the...

🔜 v4.0
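For context on the refactor above: `nlp.pipe(..., as_tuples=True)` accepts `(text, context)` pairs and yields `(doc, context)` pairs. The sketch below shows that pairing behaviour as a standalone function in plain Python. The name `pipe_as_tuples` and the `process` callable are assumptions made for illustration; the truncated description does not say what the new method is called or how it is implemented.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple


def pipe_as_tuples(
    process: Callable[[str], Any],
    pairs: Iterable[Tuple[str, Any]],
) -> Iterator[Tuple[Any, Any]]:
    """Apply `process` to each text and pass its context through unchanged.

    Hypothetical stand-in for a dedicated method; `process` plays the
    role of the pipeline that turns a text into a processed object.
    """
    for text, context in pairs:
        yield process(text), context


# Usage: str.upper stands in for a real pipeline.
data = [("first text", {"id": 1}), ("second text", {"id": 2})]
results = list(pipe_as_tuples(str.upper, data))
print(results)
```

A separate function gives the tuple variant its own return type instead of switching it on a boolean flag.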

Hello, I've recently upgraded the spaCy pretrained models from v3.2 to v3.4, but I found that tagger and lemmatizer performance dropped significantly for Italian and Spanish. I've prepared a...

lang / it
feat / tagger
feat / lemmatizer
perf / accuracy

## How to reproduce the behaviour Download https://www.gutenberg.org/files/1342/1342-0.txt (Pride & Prejudice, about 0.8MB). Then run:
```python
import spacy
nlp = spacy.load("en_core_web_sm")
with open("./1342-0.txt") as f:
    book = f.read()
result...
```

feat / tokenizer
feat / doc
perf / memory
🔜 v4.0

I want to create a custom NER tag using GPT2. I want to use [this model](https://huggingface.co/openai-community/gpt2). I am familiar with spaCy's custom training framework. I formatted the config.cfg file as...

**Description** Build a custom component to:
1. identify coordinations in a document
2. split the coordinations
3. return a new `Doc` object with the split coordinations

enhancement
feat / pipeline

I'm using version 1.3.4 of spacy-transformers, but it is incompatible with the latest version of transformers (4.37.2). Is an update planned? Thanks

feat / transformer

## Description Modify EL batching to work doc-based instead of mention-based. For prior discussion as to why this is useful, see https://github.com/explosion/spaCy/pull/11669#issuecomment-1283666113. Review and merge after https://github.com/explosion/spaCy/pull/12341. Split off...

feat / nel
🔜 v4.0

## Description Adds Azure API key example for LLM configuration. This is useful as it is not the same as the expected OpenAI Azure client variable (`AZURE_OPENAI_API_KEY`) ### Types of...

docs
feat/llm

Extended list of abbreviations in Faroese language extension's tokenizer exceptions. ## Checklist - [x] I confirm that I have the right to submit this contribution under the project's MIT license....

feat / tokenizer
lang / fo