
💫 Industrial-strength Natural Language Processing (NLP) in Python

Results 210 spaCy issues

## Error message ValueError: [E102] Can't merge non-disjoint spans. 'opvlamming' is already part of tokens to merge. If you want to find the longest non-overlapping spans, you can use the...

bug
lang / nl
feat / doc
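
The E102 error above is raised when spans passed to a merge share tokens. As a rough illustration of the "longest non-overlapping spans" idea the message refers to, here is a minimal pure-Python sketch. It does not use spaCy, the function name `filter_overlapping` is hypothetical, and spans are assumed to be `(start, end)` token offsets with an exclusive end.

```python
def filter_overlapping(spans):
    """Keep the longest spans, dropping any that overlap an already kept one.

    Assumption: each span is a (start, end) pair of token offsets,
    with `end` exclusive. This is an illustration, not spaCy's own code.
    """
    # Longest first; ties go to the span that starts earlier.
    ordered = sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0]))
    kept = []
    taken = set()  # token indices already claimed by a kept span
    for start, end in ordered:
        if any(i in taken for i in range(start, end)):
            continue  # shares at least one token with a kept span
        kept.append((start, end))
        taken.update(range(start, end))
    return sorted(kept)


# Hypothetical candidate spans; the first three overlap each other.
candidates = [(0, 2), (1, 3), (1, 4), (5, 6)]
disjoint = filter_overlapping(candidates)
print(disjoint)
```

Only disjoint spans remain after filtering, so a subsequent merge would no longer see a token claimed twice.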

## How to reproduce the behaviour I'm trying to get the part-of-speech from some sentences to use it in an ML model. According to the [documentation](https://spacy.io/api/token#attributes) the possible values should...

models

For some context, here was the master issue for problems in lemmatization for the lookup-based lemmatizer for German: https://github.com/explosion/spaCy/issues/2486 And here was the announcement that German would be prioritized for...

lang / de
feat / lemmatizer

## Description Enable the specification of a group of commands within a spaCy project workflow that are to be executed in parallel. ### Features 1. Each spaCy projects command is...

enhancement
feat / cli
scaling

Currently, URLs under private networks won't be matched as URLs by the tokenizer, e.g.
```
https://10.140.12.13/foo/bar?arg1=baz&arg2=taz # not a URL
https://142.140.12.13/foo/bar?arg1=baz&arg2=taz # A URL
```
From my understanding, at some...

enhancement
feat / tokenizer
🔜 v4.0
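
To make the distinction between the two example hosts in the tokenizer issue above concrete, the sketch below checks them with Python's standard `ipaddress` module. It only shows that `10.140.12.13` falls in a private range while `142.140.12.13` does not. It says nothing about how spaCy's URL pattern is implemented, and the helper name `host_is_private` is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_private(url):
    """Return True if the URL's host is an IP address in a private range.

    Assumption: the host is a literal IP address; ip_address() raises
    ValueError for ordinary domain names.
    """
    host = urlsplit(url).hostname
    return ipaddress.ip_address(host).is_private


# The two URLs quoted in the issue.
private_url = "https://10.140.12.13/foo/bar?arg1=baz&arg2=taz"
public_url = "https://142.140.12.13/foo/bar?arg1=baz&arg2=taz"

results = {url: host_is_private(url) for url in (private_url, public_url)}
for url, is_private in results.items():
    print(url, "private" if is_private else "public")
```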

**WIP**: This PR adds a Softmax scorer for the spancat component that should be used in cases where classes are exclusive. ## Description Spancat currently has a `LinearLogistic` layer that...

enhancement
⚠️ wip
feat / spancat
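
The motivation for a softmax scorer when classes are exclusive can be shown with plain arithmetic. The sketch below assumes that a logistic layer scores each class independently, which is an assumption about the truncated description above rather than something it confirms. It is not spancat's implementation, and the logits are made up.

```python
import math


def sigmoid(x):
    """Independent per-class score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """Scores over mutually exclusive classes; they sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three classes of one candidate span.
logits = [2.0, 1.0, -1.0]

independent = [sigmoid(x) for x in logits]  # each class judged on its own
exclusive = softmax(logits)                 # classes compete with each other

print("logistic:", [round(p, 3) for p in independent])
print("softmax: ", [round(p, 3) for p in exclusive])
```

With independent scores, several classes can exceed a threshold at once. The softmax scores form a single distribution, so picking the highest one yields exactly one class per span.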

## Feature description Here is the background: in order to use a spacy document it is necessary to have the correct Vocab/StringStore. But when documents are created/processed in a distributed/multiprocessing...

enhancement
feat / serialize
scaling

## Description This is a work-in-progress PR with a proposal on how to split up the `universe.json` file. This PR is nowhere near ready to be merged, but I made...

docs

This issue is related to spaCy projects. ## How to reproduce the behaviour Add a non-Python command in the script section of the `project.yml`. For example, in the snippet below,...

windows
projects

Feature request: SentencePiece is the tokenizer used in XLNet. I think that if `Language` tokenized text with `sentencepiece`, the alignment process could be skipped, which would make the model more efficient.

enhancement
feat / tokenizer
feat / transformer
new language