Sylvain Gugger
This is not a prioritized feature as you can already use TPUs for generation in Flax and TensorFlow. Since you can easily convert a model from one framework to the...
The above PR has been merged, so this should be solved :-)
Could you fix the conflicts with main first please?
@jianan-gu You do not need write access to make a rebase/merge with the main branch.
The `parallelize` API is going to be deprecated in the coming days. The way to parallelize the model is now:

```py
model = AutoModelForXxx.from_pretrained(checkpoint, device_map="auto")
```

or passing an explicit...
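For the explicit form, a `device_map` is just a dict mapping module names to devices. A minimal sketch, assuming a GPT-2-style module layout — the layer names and device indices below are illustrative, not from the original comment:

```python
# Hypothetical explicit device_map splitting a model across two GPUs.
# Module names are illustrative (GPT-2-style); real names depend on the
# checkpoint's architecture.
device_map = {
    "transformer.wte": 0,   # token embeddings on GPU 0
    "transformer.h.0": 0,   # first transformer block on GPU 0
    "transformer.h.1": 1,   # second transformer block on GPU 1
    "transformer.ln_f": 1,  # final layer norm on GPU 1
    "lm_head": 1,           # language-modeling head on GPU 1
}

# It would then be passed in place of "auto":
# model = AutoModelForXxx.from_pretrained(checkpoint, device_map=device_map)
```

Every submodule not listed explicitly must be covered by a parent entry, otherwise loading fails with an error listing the unassigned modules.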
It's the same naive model parallelism, and this is all for inference only, where the speed gain is going to be minimal. For training we recommend the use of DeepSpeed.
From what I gather of the `mup` repository, it's not general enough (yet?) to be integrated into Accelerate as it seems to be very targeted toward Transformer models, whereas Accelerate...
You're right, I should have said that the adaptations you mention seem very targeted toward Transformer models (in particular point 3 above).
You can create a randomly initialized model with `AutoModel.from_config`, with the config pulled with `AutoConfig.from_pretrained`:

```py
from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained(checkpoint_name)
model = AutoModel.from_config(config)
```

As for...
> I had to add an argument _configuration_file to the model init but this is only required in transformers post v16.0 (inclusive), works without in v15.0 (think this is related...