Lars Hillebrand

Results 11 comments of Lars Hillebrand

@zyLiu6707 In the meantime, I developed my own solution since I was not satisfied with addict and other comparable projects. It's called [MetaDict](https://github.com/LarsHill/metadict) and behaves exactly like a `dict` with...

Hey, are there any updates on this matter? After checking `metric.py` I don't see a particular reason for this restriction, but maybe I overlooked something. Also, the `reset` method could...

Thanks for the input. I checked the source code of `collections.namedtuple` and they essentially create an entire class object with its namespace on the fly, which is probably why PyCharm...
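To illustrate the point about `collections.namedtuple` building a class at runtime, here is a minimal sketch using only the standard library. The `Point` name and its fields are made up for the example and do not come from the original discussion.

```python
from collections import namedtuple

# Hypothetical example type; the name and fields are illustrative only.
Point = namedtuple("Point", ["x", "y"])

# The factory call returns a brand-new class (a tuple subclass) created at runtime,
# so there is no class statement in the source for static tooling to read.
is_class = isinstance(Point, type)
is_tuple_subclass = issubclass(Point, tuple)

# Instances support both attribute access and positional indexing.
p = Point(x=1, y=2)
fields = Point._fields
```

Because the class only exists after the call has run, an IDE has to special-case the `namedtuple` factory to know which attributes it produces.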

As an additional argument: [simplejson](https://github.com/simplejson/simplejson), the externally maintained development version of the built-in `json` library, supports `Decimal` serialization out of the box and serializes to string. Since new features in...

I came across this issue as well. Besides large data payloads being created on every run, it is annoying that there is no way to create the data only after...

@lucidrains Given that PyTorch 2.0 released the new `F.scaled_dot_product_attention`, would it make sense to include it in this repo as an optional parameter? One could think about some asserts to...

I actually encountered a similar scenario. The standard Hugging Face [bert-base-cased](https://huggingface.co/bert-base-cased/blob/main/config.json) model trained with 16-bit mixed precision (using pytorch-lightning), a vocab size of 100K and a seq len of 1024...

Hi @danthe3rd, thanks for the very quick response! 1. Indeed, it is a training iteration including the backward pass, optimizer step, etc. (see the linked `train.py` script with the training loop)....

@danthe3rd Thanks for the added information. I added autocast and gradient scaling to the training and validation loop and tested the performance again. Unfortunately, the overall picture is still the...