Indig Balázs

13 comments by Indig Balázs

I've just realized it parses one reference at a time. The length of the input was the key: smaller input works, larger input triggers the error. It would be nice if it...

I was recently bitten by this problem in multiple ways:

- Headers could end with a space
- Content can use \r\n line endings and end with an empty line

Both...
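The two pitfalls named in this comment can be illustrated with a minimal sketch. It assumes HTTP-style `Name: value` header lines; the comment does not say which parser is involved, so `parse_headers` is a hypothetical helper and not part of any library discussed here.

```python
def parse_headers(raw: bytes) -> dict[str, str]:
    """Parse 'Name: value' lines up to the first blank line.

    Hypothetical helper for illustration only; not taken from any library.
    """
    headers = {}
    # splitlines() accepts both \r\n and \n line endings.
    for line in raw.decode("utf-8", errors="replace").splitlines():
        line = line.rstrip()  # tolerate trailing spaces after the value
        if not line:
            break  # a blank line ends the header block
        name, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not 'Name: value'
        headers[name.strip()] = value.strip()
    return headers
```

Stripping each line before checking for emptiness means a line holding only whitespace is also treated as the end of the header block, which may or may not be the desired behaviour for a given format.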

@wumpus

> If I can see the input file, it would be helpful.

The input file is attached to the OP and the bug should be reproducible with it: https://github.com/webrecorder/warcio/files/5721385/input.warc.gz...

Can we expect any improvement on this topic in the future? This is still a major headache. The worst form is when I want to pass around multiple select records...

> @dlazesz your solution does not work in a streaming environment. The solution at the top works for both streaming and non-streaming. I think your solution also breaks the checksum...

@GregoryMorse I would be glad if you stepped up to maintain this nice code. Feel free (by the license) to take my patch and integrate it in your fork. Personally,...

The source code is rather old and not maintained. It compiles well on Debian stable and OpenJDK 11. In your case increasing [the source and target versions](https://github.com/ppke-nlpg/purepos/blob/master/pom.xml#L25-L26) may help, but...

The output of PurePOS:

```
form	lemma	xpostag
A	a	[/Det|Art.Def]
spanyol	spanyol	[/Adj][Nom]
nagydíj	nagydíj	[/N][Nom]
harmadik	három	[/Num][_Ord/Adj][Nom]
szabadedzését	szabadedzés	[/N][Poss.3Sg][Acc]
május	május	[/N][Nom]
21-én	21-én	[/N][Nom]
,...
```

I don't have access to the corpus version that emDep was trained on, but in the public UD Szeged corpus the problem is fixed and simply changing the dependency parser...

CoNLL-U comments need to be explicitly enabled with the [conllu-comments parameter](https://github.com/nytud/emtsv#client). We may flip the default behaviour to enabled in some future release. I agree that the documentation is very coarse...