RedditScore issues

Problem importing CrazyTokenizer in Colab

Importing CrazyTokenizer in Colab worked without problems until February, but it now fails with a "Protocol not found" error. To reproduce, run: !pip install git+https://github.com/crazyfrogspb/RedditScore.git...

yoonwonj

Spacy's nlp_maxlength

With the CrazyTokenizer (excellent results, btw, thanks!) I am running into spaCy's maximum character length: "[E088] Text of length 3029371 exceeds maximum of 1000000." You...

D0cRandom
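
One possible workaround for the length limit above, sketched under assumptions: the input is split into pieces shorter than the 1,000,000-character limit quoted in the error, and each piece is tokenized separately. The helper `split_into_chunks` and its `max_chars` parameter are hypothetical names, not part of RedditScore. spaCy's own `nlp.max_length` attribute can also be raised, but the snippet does not say whether CrazyTokenizer exposes its spaCy pipeline, so this sketch avoids relying on that.

```python
SPACY_DEFAULT_MAX_LENGTH = 1_000_000  # limit quoted in the E088 message


def split_into_chunks(text, max_chars=SPACY_DEFAULT_MAX_LENGTH):
    """Yield pieces of `text` no longer than `max_chars` characters.

    Hypothetical helper (not part of RedditScore). Cuts at the last
    space or newline inside each window when one exists, otherwise
    falls back to a hard cut at `max_chars`.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Prefer the last whitespace inside the current window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the left chunk
        yield text[start:end]
        start = end


# Assumed usage with the tokenizer from the issue (left commented out so
# the block runs without RedditScore installed):
# tokens = [tok
#           for chunk in split_into_chunks(long_text)
#           for tok in tokenizer.tokenize(chunk)]
```

Cutting at whitespace keeps words intact across chunk boundaries, but anything the tokenizer treats as a multi-word unit could still be split, so results near the boundaries may differ slightly from tokenizing the whole text at once.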

[Bug] Incorrect results with some numbers

## Steps

```py
from redditscore.tokenizer import CrazyTokenizer

tokenizer = CrazyTokenizer(hashtags='split')
tokenizer.tokenize("#20yearsago")
```

## Actual Result

```py
['2', '0', 'y', 'e', 'a', 'r', 's', 'a', 'g', 'o']
```

## Expected Result...

OlehOnyshchak

[Bug]: Tutorial example isn't working properly

## Installation

```sh
pip install git+https://github.com/crazyfrogspb/RedditScore.git
```

## Steps to reproduce

```py
from redditscore.tokenizer import CrazyTokenizer

tokenizer = CrazyTokenizer(hashtags=False)
text = "Let's #makeamericagreatagain#americafirst"
print(tokenizer.tokenize(text))
```

## Expected output

```py
["let's",...
```

OlehOnyshchak
