Joel Grus
probably someone has done this, but if so I don't know about it. if you want to contribute the notebooks back to this repo, let me know. or if you...
some day I'll learn my lesson about relying on the O'Reilly website not to change and break things ☹
btw, it seems like pylance fixes this
I don't know much about Flair embeddings, but I took a quick look at their paper and it looks like they're just doing character-level embeddings and then taking the last...
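to make that concrete, here's a rough sketch of the idea (plain PyTorch; `char_vocab_size`, `char_dim`, and `hidden_dim` are made-up placeholders, not Flair's actual configuration): run a character-level LSTM over the whole string and read out the hidden state at each word's last character:

```
import torch
import torch.nn as nn

char_vocab_size, char_dim, hidden_dim = 100, 32, 64
embed = nn.Embedding(char_vocab_size, char_dim)
lstm = nn.LSTM(char_dim, hidden_dim, batch_first=True)

# pretend these are the character ids for "go ." (4 characters)
char_ids = torch.randint(0, char_vocab_size, (1, 4))
states, _ = lstm(embed(char_ids))              # shape (1, 4, hidden_dim)

# a word's embedding is the LSTM state at its last character
word_end_offsets = [1, 3]                      # 'o' ends "go", '.' ends "."
word_embeddings = states[0, word_end_offsets]  # shape (2, hidden_dim)
```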
wouldn't you just use the character tokenizer (which would keep spaces) and then compute the offsets in the token indexer?
are the rules for word boundaries so complicated that you couldn't just include them in the token indexer?
what does "originally tokenized" mean here? say I have the sentence "go." I feed that to the character tokenizer and get ["g", "o", "."]. if the sentence were "go .",...
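here's a minimal sketch of what I have in mind (plain Python, no AllenNLP; `character_tokenize` and `word_end_offsets` are hypothetical names, not real library functions): keep the spaces in the character stream, and compute each word's last-character offset separately:

```
def character_tokenize(text: str) -> list[str]:
    return list(text)  # spaces are kept as tokens

def word_end_offsets(text: str) -> list[int]:
    """Offset of the last character of each whitespace-delimited word."""
    offsets = []
    for i, ch in enumerate(text):
        at_word_end = i + 1 == len(text) or text[i + 1].isspace()
        if not ch.isspace() and at_word_end:
            offsets.append(i)
    return offsets

print(character_tokenize("go."))  # ['g', 'o', '.']
print(word_end_offsets("go."))    # [2]
print(word_end_offsets("go ."))   # [1, 3]
```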
ok, I think I get it now. but the spacy tokenizer is already returning the offsets as `token.idx`:

```
In [11]: t = WordTokenizer()

In [12]: tokens = t.tokenize("This isn't...
```
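for reference, the same offset behavior from spacy directly (a minimal sketch; it assumes the `en_core_web_sm` model is installed and reuses the "go ." example from above):

```
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("go .")

for token in doc:
    # token.idx is the character offset where the token starts
    print(token.text, token.idx)  # go 0, then . 3
```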
if your text is pre-tokenized you're out of luck in any case. I am extremely comfortable enforcing "if you want to use flair embeddings, you must use a tokenizer that...
in this case your DatasetReader must be (I assume) somehow creating `Token` objects to populate a `TextField`? in which case I'd say that yes, it's the dataset reader's job to...
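something like this, for example (a rough sketch; it assumes the AllenNLP `Token(text=..., idx=...)` constructor and guesses that the original words were joined by single spaces):

```
from allennlp.data.tokenizers import Token

def tokens_with_offsets(words):
    """Rebuild `Token` objects with character offsets for pre-tokenized text."""
    tokens, offset = [], 0
    for word in words:
        tokens.append(Token(text=word, idx=offset))
        offset += len(word) + 1  # assumes a single space joined the words
    return tokens

print(tokens_with_offsets(["go", "."]))  # tokens at offsets 0 and 3
```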