Tatu Ylonen

48 comments by Tatu Ylonen

I think it is important for downstream usability of the data that the editions be as consistent as possible - same fields, same parts-of-speech, same tags - as much as...

I added code in wiktextract to limit the number of errors, warnings, and debug messages collected to 100000 each. The huge errors file was crashing web site generation, as it...
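A minimal sketch of the capping idea described above, assuming a per-category limit of 100000. The class name `CappedMessages` and its attributes are hypothetical and are not taken from wiktextract; the actual code may be structured quite differently.

```python
from collections import defaultdict

MAX_MESSAGES = 100_000  # per-category cap mentioned in the comment


class CappedMessages:
    """Hypothetical collector that keeps at most `limit` messages per kind."""

    def __init__(self, limit=MAX_MESSAGES):
        self.limit = limit
        self.messages = defaultdict(list)  # kind -> messages that were kept
        self.dropped = defaultdict(int)    # kind -> number of messages discarded

    def add(self, kind, text):
        # Keep the message only while this kind is still under its cap.
        if len(self.messages[kind]) < self.limit:
            self.messages[kind].append(text)
        else:
            self.dropped[kind] += 1


collector = CappedMessages(limit=2)  # tiny limit just for illustration
for i in range(3):
    collector.add("error", f"error {i}")
collector.add("warning", "warning 0")
```

Counting the dropped messages, rather than silently discarding them, is a design choice of this sketch so that a report can still say how much was omitted.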

Looks like I was wrong here and the "tags" field really is a list of strings now (it was once a space-separated string). One of the reasons for it being...

My original idea was that there should be a core set of grammatical/semantic/typographic tags that would be the same for all languages. For example, "transitive", "intransitive", "infinitive". Generally, these tags...

I just fixed the check for the "tags" field to require that it is a list of strings. I also fixed the test for "forms"/"form" to not check it if...
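As an illustration only, a check that "tags" is a list of strings could look roughly like the following. The function name `tags_field_is_valid` is hypothetical, and treating a missing "tags" key as valid is an assumption of this sketch, not something the comment states.

```python
def tags_field_is_valid(entry):
    """Return True if `entry` has no "tags" key or its value is a list of str."""
    if "tags" not in entry:
        return True  # assumption: the field is optional
    tags = entry["tags"]
    return isinstance(tags, list) and all(isinstance(t, str) for t in tags)


ok = tags_field_is_valid({"tags": ["transitive", "infinitive"]})
old_style = tags_field_is_valid({"tags": "transitive infinitive"})  # space-separated string
mixed = tags_field_is_valid({"tags": ["transitive", 3]})
```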

Also, xlat_head_map is completely specific to the English Wiktionary, and only serves to convert text from Wiktionary into one or more tags or to ignore it. One should draw no...

Tagsets are represented as lists of lists of tags internally (or-of-ands) and generate multiple entries in the data if there are multiple alternatives. We discussed this with Kristian and I'll...
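The or-of-ands representation can be sketched as follows: the outer list holds the alternatives, and each inner list holds the tags that apply together. The helper `expand_tagsets` and the sample entry are hypothetical; how wiktextract actually generates the multiple entries is not shown in the comment.

```python
def expand_tagsets(base, tagsets):
    """Return one copy of `base` per alternative in `tagsets` (or-of-ands)."""
    # Each alternative is a list of tags that all hold at once ("and");
    # the alternatives themselves are mutually exclusive readings ("or").
    return [{**base, "tags": list(alternative)} for alternative in tagsets]


tagsets = [["masculine", "singular"], ["feminine", "singular"]]
entries = expand_tagsets({"form": "example"}, tagsets)
```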

I was having the same issue. 259 OSDs on 15 hosts, running Reef (18.2.4). It looks like the problem PG belongs to a 2.4PB 8+2 erasure-coded pool and is active+undersized+degraded+remapped+backfill_toofull....