Vilém Zouhar comments

Results: 48 comments of Vilém Zouhar

Typst template

cc @gsarti for interest and @mjpost from Twitter discussion.

seq_len gives generator has no len error

Hi Anna! Thanks for reporting this issue. I just pushed `v1.1.2` that fixes it. :slightly_smiling_face: Could you update (`pip3 install -U tokenization-scorer`) and also confirm it on your end? Before:...

seq_len gives generator has no len error

Actually, hold on, this introduced another bug.

seq_len gives generator has no len error

Should return correct values now with `v1.1.4`:
```
$ python3 -c "import tokenization_scorer; print(tokenization_scorer.score('Hello there', metric='seq_len'))"
-2.0
$ python3 -c "import tokenization_scorer; print(tokenization_scorer.score('Hell @@o the @re', metric='seq_len'))"
-4.0
$ python3 -c...
```

seq_len gives generator has no len error

Oh no. Let me look into that.

seq_len gives generator has no len error

The reason this simplest metric keeps breaking is that it tries to compute the average number of tokens per line. But what counts as a line is kinda tough to define...
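To make the failure mode concrete, here is a minimal sketch of an "average tokens per line" computation that never calls `len()` on its input, so it also works when the lines arrive as a generator. This is not the package's actual code: the function name `neg_avg_tokens_per_line`, the whitespace tokenization, the skipping of blank lines, and the negative sign (chosen only to match the `-2.0` / `-4.0` outputs quoted in this thread) are all assumptions for illustration.

```python
from typing import Iterable, Union


def neg_avg_tokens_per_line(lines: Union[str, Iterable[str]]) -> float:
    """Hypothetical helper: negative mean token count per non-empty line."""
    # Assumption: a bare string is treated as text and split on newlines.
    if isinstance(lines, str):
        lines = lines.splitlines()

    total_tokens = 0
    total_lines = 0
    # Count while iterating instead of calling len(lines),
    # which would raise TypeError for a generator.
    for line in lines:
        tokens = line.split()  # assumption: whitespace-separated tokens
        if not tokens:
            continue  # assumption: blank lines do not count as lines
        total_tokens += len(tokens)
        total_lines += 1

    if total_lines == 0:
        return 0.0  # assumption: empty input scores 0.0
    return -total_tokens / total_lines


print(neg_avg_tokens_per_line("Hello there"))       # -2.0
print(neg_avg_tokens_per_line("Hell @@o the @re"))  # -4.0
print(neg_avg_tokens_per_line(l for l in ["a b", "c d e f"]))  # -3.0
```

Every choice marked as an assumption above (what a blank line means, whether a bare string is one line or many) is exactly the kind of "what is a line" decision that can change the result.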

seq_len gives generator has no len error

Resolved with your merge request in v1.1.6. :slightly_smiling_face:

seq_len gives generator has no len error

I finally added automatic tests to the package to catch this. The result should be visible as a badge in the README.