NielsRogge comments

Results: 388 comments by NielsRogge

Unable to use BLIP2 with caption_coco_opt6.7b at HEAD via salesforce-lavis (also HEAD)

I have a PR here which aims to further verify equivalence: https://github.com/huggingface/transformers/pull/24854. The conversion script can be found [here](https://github.com/NielsRogge/transformers/blob/improve_blip2/src/transformers/models/blip_2/convert_blip_2_original_to_pytorch.py) and can be run as follows: ``` pip install -U git+https://github.com/nielsrogge/LAVIS.git@blip2_float32...

fix LayoutLMv3TokenizerFast subword label after 'Ġ' token

Thanks a lot for this fix, would you be able to take into account my comment such that we can merge it? 🙏 Thanks! Btw the same fix could then...

Mask2Former - ValueError: cost matrix is infeasible

Cc @alaradirik

Add BioGptForSequenceClassification

Hi, it seems the CI didn't run properly. Could you push an empty commit to trigger it?

Add/fix documentation around VideoMAEForPretraining's `bool_masked_pos` argument

Thanks for raising this issue! VideoMAE indeed uses the same mask ratio (number of masked patches) per video to make batching possible. See [this class](https://github.com/MCG-NJU/VideoMAE/blob/main/masking_generator.py) which the authors use to...
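To make the batching point concrete, here is a minimal sketch of building a `bool_masked_pos`-style mask in which every video in the batch has exactly the same number of masked positions. It is not the authors' masking generator linked above. The helper name `make_bool_masked_pos`, the 0.9 mask ratio, and the sequence length of 1568 are illustrative assumptions, and plain Python lists stand in for tensors.

```python
import random


def make_bool_masked_pos(batch_size, seq_length, mask_ratio=0.9, seed=0):
    """Hypothetical helper: one boolean mask per video, same mask count each."""
    rng = random.Random(seed)
    # The number of masked patches is fixed by the ratio, so it is identical
    # for every video; only the positions differ between videos.
    num_masked = int(mask_ratio * seq_length)
    masks = []
    for _ in range(batch_size):
        row = [True] * num_masked + [False] * (seq_length - num_masked)
        rng.shuffle(row)  # randomize which positions are masked
        masks.append(row)
    return masks


masks = make_bool_masked_pos(batch_size=2, seq_length=1568)
# Each row keeps the same number of visible patches, so the visible tokens
# of all videos can be stacked into one rectangular batch.
print([sum(row) for row in masks])
```

Because every row masks the same count, the visible patches of each video form sequences of equal length, which is what allows them to be batched without padding.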

Add/fix documentation around VideoMAEForPretraining's `bool_masked_pos` argument

Reopening as it's not resolved yet

[RFC] Add variant to transformers

cc @sgugger would it be possible to add this feature to `push_to_hub` as well? I'd like to use it for BLIP-2. For the moment it seems the only way to...

Image Classification Pipeline returns score= 1.0

Yes, that's why the pipeline is called classification rather than regression. We would need an `ImageRegressionPipeline` for this use case ;)
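A small sketch of why the score comes out as 1.0, assuming the pipeline applies a softmax over the logits and the model has a single output (as a regression head would). The `softmax` function below is a plain-Python stand-in written for illustration, not the pipeline's own code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# With a single logit, the normalization divides the value by itself,
# so the "probability" is 1.0 regardless of what the model predicted.
print(softmax([3.7]))
print(softmax([-12.0]))
```

Under that assumption the regression value itself is lost after normalization, which is why a separate pipeline that returns the raw output would be needed.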

Image Classification Pipeline returns score= 1.0

Closing this issue as it seems resolved.

how to fine tune BlipForImageTextRetrieval?

I'd recommend fine-tuning CLIP if you want to do image-text retrieval using this script: https://github.com/huggingface/transformers/tree/main/examples/pytorch/contrastive-image-text. Fine-tuning BLIP might be harder as it involves some very specific loss functions.