Benedikt Fuchs

120 comments by Benedikt Fuchs

Hi @mishraaditya595, this looks like it is failing to install the tokenizers dependency. I suppose the contributors and maintainers of that repository will be able to help you with that...

About Question 1: a `Sentence` represents a textual unit that you want to classify. The length does not matter, as long as you are not over the subtoken...

Hi @stefan-it, thanks, I am also excited to hear about it. It's important to note that the authors of ACE achieved the best results by concatenating transformer models that were...

Hi @lukasgarbas, thank you for testing it! I noticed one implementation detail where I deviated from the original: I made it impossible to use the same configuration twice, while the...

About **4.**: there is [this](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4331690/) and, more generally, [this](https://arxiv.org/pdf/1603.01360.pdf). About **1.**: this is done by the tagger itself; you don't need to add it.

Hi @dobbersc, since you introduce some kind of special tokens, have you tried adding them specifically to the vocabulary of the transformer embeddings? You could do this by adding something...
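(In Flair, this would typically mean calling `add_tokens` on the underlying Hugging Face tokenizer and resizing the model's token embeddings; the snippet below is only a dependency-free sketch of that idea, with toy token names like `[H-LOC]` chosen for illustration.)

```python
import random

def add_special_tokens(vocab, embedding_matrix, new_tokens, dim):
    """Append new special tokens to a toy vocabulary and grow the
    embedding matrix with a randomly initialised row per new token,
    mimicking tokenizer.add_tokens + model.resize_token_embeddings."""
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # next free id
            embedding_matrix.append(
                [random.gauss(0.0, 0.02) for _ in range(dim)]
            )
    return vocab, embedding_matrix

# toy vocabulary with 4-dimensional embeddings
vocab = {"[CLS]": 0, "[SEP]": 1, "and": 2}
emb = [[0.0] * 4 for _ in vocab]
vocab, emb = add_special_tokens(vocab, emb, ["[H-LOC]", "[T-LOC]"], dim=4)
print(len(vocab), len(emb))  # both grown from 3 to 5
```

The point is that the marker then maps to a single vocabulary entry with its own trainable embedding, instead of being split into many subtokens.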

Hi @dobbersc, interesting and surprising results. Looking at the tokenization without special tokens: ``` '[', 'h', '-', 'lo', '##c', ']', 'and', '[', 't', '-', 'lo', '##c', ']', ``` we see...

Hi again, I did some testing, and basically all my ideas led to a decrease in scores. Here are my runs, all with some adjustments to the tokens: with...

Hi @miwieg, `TransformerEmbeddings` provide a parameter `allow_long_sentences`; if that parameter is set to True, the embeddings will use some overlap to compute the token embeddings. (E.g. "This is a very...
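The windowing idea behind `allow_long_sentences` can be sketched in plain Python: a sentence longer than the model's limit is split into overlapping chunks, so every token appears in at least one window with context on both sides (the exact stride and limit used by Flair are not shown here; the values below are illustrative).

```python
def overlapping_windows(tokens, max_len, stride):
    """Split a token sequence into windows of at most ``max_len`` tokens,
    each starting ``stride`` tokens after the previous one, so that
    consecutive windows overlap by ``max_len - stride`` tokens."""
    if len(tokens) <= max_len:
        return [tokens]  # fits in one window, no overlap needed
    windows = []
    start = 0
    while start < len(tokens):
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # last window already reaches the end
        start += stride
    return windows

tokens = [f"t{i}" for i in range(10)]
wins = overlapping_windows(tokens, max_len=4, stride=2)
# windows cover t0-t3, t2-t5, t4-t7, t6-t9
```

Each token's final embedding is then taken from a window where it sits away from the edge, which is why long sentences still get usable contextual embeddings.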

I don't know if that works; I would rather add it to the constructor: ```document_embeddings = TransformerDocumentEmbeddings(..., allow_long_sentences=True, cls_pooling="mean")```
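For intuition, `cls_pooling="mean"` averages the subtoken embeddings into one document vector instead of using only the `[CLS]` vector; a minimal pure-Python sketch of that pooling step (not Flair's actual implementation, which operates on tensors):

```python
def mean_pooling(token_embeddings):
    """Average a list of token embedding vectors into a single
    document vector, as cls_pooling='mean' does conceptually."""
    dim = len(token_embeddings[0])
    n = len(token_embeddings)
    return [sum(vec[d] for vec in token_embeddings) / n for d in range(dim)]

doc = mean_pooling([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# → [3.0, 4.0]
```

Mean pooling is often more robust for long documents, since the `[CLS]` token alone may not summarize many overlapping windows well.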