Nicolas Patry comments

Results 978 comments of


                                            Nicolas Patry

Reverting Deta cloning mecanism.

> There is however test_can_use_safetensors failing after this PR. Is this test still relevant (at least while we keep the changes in this PR) The new code should fix everything....

Reverting Deta cloning mecanism.

> So we tried it your way and it doesn't work. Can we try to use Accelerate to detect the tied weights instead as suggested initially? Because `find_tied_weights` looks at...

Reverting Deta cloning mecanism.

> Why ? It seems you're using the hash (via is) in accelerate, I will switch to that since we want entirely shared tensors like in accelerate. So actually `hash`...

Reverting Deta cloning mecanism.

> makes it unusable in practice Why are we even caring about `_keys_to_ignore` and `tie_weights` if it's so inconvenient ? Why are we trying to even find tied weights in...

Reverting Deta cloning mecanism.

In order to help with ease of use of `safetensors` by itself I created this PR: https://github.com/huggingface/safetensors/pull/236 which sorts of mimics what is done here. However I still think this...

Reverting Deta cloning mecanism.

> Thanks for considering shared weights in `safetensors` directly. I agree it would still be cleaner to have the same kind of mechanism in Transformers. Could you please explain to...

Reverting Deta cloning mecanism.

Seeing the rebase, `hash` doesn't work on tensors unfortunately: ```python import torch A = torch.zeros((10, 10)) B = A[1] A.untyped_storage().data_ptr() == B.untyped_storage().data_ptr() hash(A) != hash(B) ```

Reverting Deta cloning mecanism.

> (which will become the default utlimately) Hurray !!!

Reverting Deta cloning mecanism.

Failing tests seem to be linked to newly release huggingface_hub==0.14.0 @sgugger Merge if you think it's OK, I'm going to not merge given this PR affects core modeling.

[`pipeline`] Add conditional text support for `ImageToTextPipeline`

I have doubts about this: My biggest issue is with the expected I/O. Pipeline are **defined** by I/O. In the particular case it's (image) -> (text) And here we're modifying...