Arthur

795 comments by Arthur

On it 🤗

Okay, the `1b lyrics` and `5b lyrics` match the original code. Just need refactoring to have better variable names and wrap the sampling kwargs for easy use.

Hi, awesome work, thank you very much! I am also wondering whether you have any plans to release the Effective Receptive Field visualization code? There's pretty much none out there...

Hey, before diving a bit deeper, sorry for the long delay, and thanks for the PR. Would you mind adding a test? 🤗 I can take care of it otherwise!

I am having a look RN, will tell you when I know more 👍🏻

Hey @miguelwon, it seems you are right that training does not converge at all with the current version. However, since loading a trained model in the new versions does not...

Hey! Little update on this: the problem comes from the previously introduced "hack":
```python
return tf.Variable(emb, trainable=False, name="model.embed_positions.weights")
```
This appears [here](https://github.com/huggingface/transformers/blob/main/src/transformers/models/xglm/modeling_tf_xglm.py#L86). This hack can also be seen...

My intuition is that it should still be deterministic, let me have a look

Just a small comment: in terms of performance, I think the decorator can be improved a little to run only on the model's forward and not on every single...