Benedikt Fuchs
Hi @mishraaditya595 this looks like it is failing to install the tokenizers dependency, I suppose the contributors and maintainers of that repository will be able to help you with that...
About Question 1: A `Sentence` represents a textual unit that you want to classify. Its length does not matter, as long as you do not exceed the subtoken...
Hi @stefan-it thanks, I am also excited to hear about it. It's important to note that the authors of ACE achieved their best results by concatenating transformer models that were...
Hi @lukasgarbas thank you for testing it! I noticed one implementation detail where I deviated from the original: I made it impossible to use the same configuration twice, while the...
About **4.**: there is [this](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331690/) and, more generally, [this](https://arxiv.org/pdf/1603.01360.pdf).
About **1.**: this is done by the tagger itself, you don't need to add it.
Hi @dobbersc, since you introduce some kind of special tokens, have you tried adding them explicitly to the vocabulary of the transformer embeddings? You could do this by adding something...
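Something along these lines, sketched on the underlying Hugging Face tokenizer and model (the `[H-LOC]`/`[T-LOC]` markers are just the ones from your example; with Flair's `TransformerWordEmbeddings`, the same objects should be reachable as `embeddings.tokenizer` and `embeddings.model`):

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# register the marker tokens as special tokens so the tokenizer
# never splits them into subtokens
num_added = tokenizer.add_tokens(["[H-LOC]", "[T-LOC]"], special_tokens=True)

# grow the embedding matrix so the new ids get (randomly
# initialized) vectors that can then be fine-tuned
model.resize_token_embeddings(len(tokenizer))
```

After this, `tokenizer.tokenize("[H-LOC] and [T-LOC]")` keeps each marker as a single token instead of splitting it into `'['`, `'h'`, `'-'`, `'lo'`, `'##c'`, `']'`.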
Hi @dobbersc interesting and surprising results. Looking at the tokenization without special tokens: ``` '[', 'h', '-', 'lo', '##c', ']', 'and', '[', 't', '-', 'lo', '##c', ']', ``` we see...
Hi again, I did some testing and basically all my ideas led to a decrease in scores. Here are my runs, all with some adjustments to the tokens: with...
Hi @miwieg, TransformerEmbeddings provide a parameter `allow_long_sentences`. If that parameter is set to `True`, the embeddings will use some overlap to compute the token embeddings. (E.g. "This is a very...
I don't know if that works; I would rather add it to the constructor: ```document_embeddings = TransformerDocumentEmbeddings(..., allow_long_sentences=True, cls_pooling="mean")```