Matyáš Kopp
@RePierre your sample with annotated files is too large. Can you please remove some pairs of TEI and TEI.ana files (and also `
Maybe you could use an additional setting, something like this (I haven't tested it):

```python
import csv

import pandas as pd

# QUOTE_NONE makes the parser treat quote characters as ordinary data
current_df = pd.read_csv(file, sep="\t", index_col=False, quoting=csv.QUOTE_NONE, escapechar=None)
```
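To illustrate why the quoting setting matters, here is a minimal sketch that uses only the standard-library `csv` reader rather than pandas, so it shows the same `csv.QUOTE_NONE` constant but not necessarily pandas' exact behaviour. The sample data and the helper name `read_rows` are made up for the example.

```python
import csv
import io

# Hypothetical TSV content with an unbalanced double quote in the second row.
SAMPLE = 'id\ttext\n1\t"unbalanced quote\n2\tplain\n'


def read_rows(text, quoting):
    """Parse tab-separated text with the given csv quoting mode."""
    return list(csv.reader(io.StringIO(text), delimiter="\t", quoting=quoting))


# Default quoting: the stray quote opens a quoted field that swallows the next line.
default_rows = read_rows(SAMPLE, csv.QUOTE_MINIMAL)

# QUOTE_NONE: the quote is kept as a literal character, one row per line.
literal_rows = read_rows(SAMPLE, csv.QUOTE_NONE)

print(len(default_rows), len(literal_rows))
```

With default quoting the row count no longer matches the line count, which is the kind of mismatch the setting above is meant to avoid.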
This looks like a Java issue: https://stackoverflow.com/questions/76327/how-can-i-prevent-java-from-creating-hsperfdata-files Changing the Java settings and/or adding further validation is needed. However, I have no idea how this error ends up in the file, and...
Sorry, I missed this. I will implement this in the future; together with it, a kind of derived-format validation can be implemented, e.g. checking that the TSV is valid.
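As a rough idea of what such a TSV check could look like, here is a minimal sketch. It only assumes that every row should have the same number of tab-separated columns as the header; the function name `validate_tsv` and the error wording are hypothetical and not part of any existing ParlaMint script.

```python
def validate_tsv(text: str) -> list[str]:
    """Return a list of problems found in tab-separated text (empty if none)."""
    lines = text.splitlines()
    if not lines:
        return ["empty file"]

    # The header row defines the expected number of columns.
    expected = len(lines[0].split("\t"))
    errors = []
    for number, line in enumerate(lines[1:], start=2):
        found = len(line.split("\t"))
        if found != expected:
            errors.append(
                f"line {number}: expected {expected} columns, found {found}"
            )
    return errors


if __name__ == "__main__":
    print(validate_tsv("id\ttext\n1\thello\n2\n"))
```

A real check would probably also need to look at encoding and column names, which this sketch leaves out.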
@TomazErjavec I have inserted new taxonomies and reinserted taxonomies with missing translations (the checklist is up to date)
> This could be included in our metadata files (*-meta.tsv). However, since SI will be the only corpus containing this additional information, the other corpora would be missing this information...
I did something very similar in a separate branch; I am now testing it before merging it into develop... https://github.com/clarin-eric/ParlaMint/pull/894
@TomazErjavec I had to trigger the action again with an empty commit; now it seems to work
> > @TomazErjavec I had to trigger the action again with empty commit, now it seems to work
>
> @matyaskopp, indeed, it did finish now, now checks are failing...
I checked the language tag documentation: https://www.rfc-editor.org/rfc/rfc5646.html#page-5, and it contains a more detailed structure than I expected. I can see a problem with using the _-region_ part of the language...
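For reference, here is a small sketch of how the main subtags could be split apart. It deliberately covers only a simplified subset of RFC 5646 (language, optional script, optional region) and ignores extlang, variants, extensions, private-use and grandfathered tags; the names `LangTag` and `parse_simple_tag` are made up for the example.

```python
import re
from typing import NamedTuple, Optional


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


# Simplified subset: 2-3 letter language, optional 4-letter script,
# optional region (2 letters or 3 digits).
_SIMPLE_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def parse_simple_tag(tag: str) -> Optional[LangTag]:
    """Split a tag into language/script/region, or return None if it does not fit."""
    match = _SIMPLE_TAG.fullmatch(tag)
    if match is None:
        return None
    return LangTag(match["language"], match["script"], match["region"])


if __name__ == "__main__":
    for example in ("sl", "en-US", "sr-Latn-RS"):
        print(example, parse_simple_tag(example))
```

Tags that use any of the ignored parts are simply rejected here, so this is only useful for seeing where the region subtag sits.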