Thibault Clérice
So, digging into `--pad`, as I had missed it: https://github.com/mittagessen/kraken/blob/daf39d8023a8f6013caaef21937cf89f25c4ed9e/kraken/lib/dataset/utils.py#L97-L100 It seems that fixing both dimensions disables padding. Moreover, the padding is [constant, not dynamic](https://pytorch.org/vision/stable/auto_examples/plot_transforms.html#pad): it means...
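To make the constant vs. dynamic distinction concrete, here is a minimal sketch in plain Python. It is not kraken's or torchvision's code: `pad_constant` and `pad_to_target` are hypothetical helpers, and the assumption is that a constant pad adds the same number of pixels on the left and right of every line image, whatever its width.

```python
def pad_constant(width: int, padding: int) -> int:
    """Width after adding a fixed `padding` on the left and on the right.

    Assumption: this mirrors a constant pad, where the amount does not
    depend on the size of the input image.
    """
    return width + 2 * padding


def pad_to_target(width: int, target: int) -> int:
    """Hypothetical dynamic pad: fill each image up to a common `target` width."""
    return max(width, target)


widths = [120, 300, 512]

# Constant padding: every line grows by the same amount, widths still differ.
constant = [pad_constant(w, 16) for w in widths]

# Dynamic padding: the amount added depends on the input, widths become equal.
dynamic = [pad_to_target(w, 512) for w in widths]

print(constant)
print(dynamic)
```

With a constant pad the line images keep different widths, so it cannot by itself bring every line to a common size.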
Hello, which Pyrrha instance are you using? As far as I know, the ENC one has problems with its mail server, but https://pyrrha.huma-num.fr/ works.
Hi :) Are you using your own installation of Pyrrha or one that is available on the internet?
This issue is also present in the JS version: https://github.com/digitalbazaar/jsonld.js/issues/424
I love this contribution, it's exactly what I was missing in the tool :) :+1: Could I recommend changing `Remove duplicate fields` to `Treat URL as a duplicate of DOI`...
I replicated the thing with French training data:
```py
>>> print(fast_tokenizer.tokenize("Macquart a appelé cet insecte dacus oleæ"))
['M', 'ac', 'qu', 'art', 'Ġa', 'Ġap', 'pe', 'le', 'Ìģ', 'Ġcet', 'Ġinsecte', 'Ġdacus', 'Ġoleæ']
```
...
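One possible reading of the `'le', 'Ìģ'` pair in that output, as a sketch rather than a diagnosis: if `fast_tokenizer` is a byte-level BPE tokenizer in the GPT-2 style (an assumption, suggested by the `Ġ` prefixes), then `Ìģ` is how the two UTF-8 bytes of the combining acute accent (U+0301) are displayed, which would mean the `é` reached the tokenizer in decomposed (NFD) form. The `bytes_to_unicode` and `show` helpers below are written for this illustration and reproduce the GPT-2 style byte-to-character table; they are not taken from the thread.

```python
import unicodedata


def bytes_to_unicode() -> dict:
    """Byte -> printable character table in the GPT-2 byte-level BPE style."""
    # Bytes that are already printable keep their own code point.
    bs = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    # Every other byte is shifted to a code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


table = bytes_to_unicode()


def show(text: str) -> str:
    """Render the UTF-8 bytes of `text` the way a byte-level tokenizer displays them."""
    return "".join(table[b] for b in text.encode("utf-8"))


nfd = unicodedata.normalize("NFD", "appelé")  # 'e' followed by combining U+0301
nfc = unicodedata.normalize("NFC", "appelé")  # single precomposed U+00E9

print(show(nfd))  # appeleÌģ
print(show(nfc))  # appelÃ©
```

If that reading holds, comparing the tokenization of the NFC and NFD forms of the same sentence would show whether normalization is the variable that matters.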
I have the same issue:
```
{
  "@context": {
    "lang": "@language",
    "value": "@value",
    "dublinCore": {
      "@id": "http://foo.bar/dc",
      "@context": {
        "title": "http://purl.org/dc/terms/title"
      }
    },
    "title": "http://foo.bar/title"
  },
  "@id": "http://foo.bar/obj/test",
  "title": "test",
  ...
```
@jbarth-ubhd I think you are seeing the limit of the dataset behind CATMuS Print Large ( https://hal.science/hal-04557457v1 ). No German and nearly no English, and not a lot of typed...