Leandro von Werra

160 comments by Leandro von Werra

Sounds good, do you want to take a stab at it?

Also regarding perplexity: The currently implemented version aims at being a data measurement tool if I understand correctly - you provide text and model. Although one could evaluate a model...

## Postprocessing

IIRC we discussed for the postprocessing step that a more flexible alternative is to add a method to the class such that we could do:

```Python
>>> metric...
```

Would these be the metrics that you call measurement? Could we set `references=None` in these cases so they don't throw an error if a reference is passed? That way they...
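To make the `references=None` idea concrete, here is a minimal sketch of what a reference-free measurement could look like. The function name `text_length_measurement` and the `mean_length` key are hypothetical and not part of any library; the only point is that `references` is accepted but ignored, so passing one does not raise.

```python
from typing import Dict, List, Optional


def text_length_measurement(
    predictions: List[str],
    references: Optional[List[str]] = None,
) -> Dict[str, float]:
    """Hypothetical reference-free measurement: mean whitespace-token count."""
    # `references` is accepted only for interface compatibility with
    # reference-based metrics; it is deliberately ignored here.
    if not predictions:
        raise ValueError("predictions must not be empty")
    lengths = [len(text.split()) for text in predictions]
    return {"mean_length": sum(lengths) / len(lengths)}


# Both calls succeed and give the same result.
without_refs = text_length_measurement(["a b", "c d e f"])
with_refs = text_length_measurement(["a b", "c d e f"], references=["x", "y"])
```

With a default of `None`, callers that loop over metrics and measurements with the same arguments would not need a special case for the reference-free ones.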

I have been thinking about the scalar vs. dict question. Having a dict across all metrics, at least internally, is nice as it allows us to treat them all the same...

I think @lhoestq's point was that for many metrics the user might just want a single value but always needs to access the dict: `load_metric("accuracy").compute()["accuracy"]`. When we combine metrics we...
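One way to picture the trade-off is a small helper that keeps dicts internally and only unwraps at the edge. This is a sketch under assumptions: `unwrap_single` is a hypothetical name, not an existing API, and the result dicts below are made-up placeholders rather than real metric outputs.

```python
from typing import Any, Dict


def unwrap_single(result: Dict[str, Any]) -> Any:
    """Return the bare value for single-entry results, else the dict unchanged."""
    if len(result) == 1:
        # Exactly one key: hand back the value so the user can skip the lookup.
        return next(iter(result.values()))
    # Several keys (e.g. combined metrics): keep the dict so nothing is lost.
    return result


single = unwrap_single({"accuracy": 0.5})
combined = unwrap_single({"accuracy": 0.5, "f1": 0.25})
```

The downside of this kind of unwrapping is that the return type depends on the number of keys, which is the inconsistency a dict-everywhere convention avoids.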

Could this be similar to this issue: https://github.com/pandas-dev/pandas/issues/27532?

For reference: https://github.com/vrama91/cider cc @sashavor

Yes, this is a current limitation of `combine`: you can't pass any settings to `compute`, only the features. Rather than fixing this in `combine` we aim to enable changing the...

Hi @conceptofmind, 1. `num_of_sequences` (number of sequences preprocessed at a time) is an estimate of how many sequences of length `seq_length` (length of sequence fed into model) we want to...