NielsRogge

Results: 79 issues by NielsRogge

# What does this PR do? This PR adds FocalNet to the library. To do: - [x] remove print_values - [ ] transfer checkpoints, update integration tests

# 🚀 Feature request Currently, the `EncoderDecoderModel` class in PyTorch automatically creates the `decoder_input_ids` based on the `labels` provided by the user (similar to how this is done for T5/BART)....

Good First Issue
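
For readers unfamiliar with the behaviour the `EncoderDecoderModel` feature request above refers to, here is a rough sketch of how `decoder_input_ids` can be derived from `labels` by shifting them one position to the right. It is written in plain Python for illustration and is not the library's code: the function name `shift_labels_right` and the example token ids are hypothetical, and it assumes the common convention that label positions set to `-100` are ignored by the loss.

```python
from typing import List

# Assumption: -100 marks label positions that the loss should ignore.
IGNORE_INDEX = -100


def shift_labels_right(
    labels: List[List[int]],
    decoder_start_token_id: int,
    pad_token_id: int,
) -> List[List[int]]:
    """Build decoder inputs from labels (hypothetical helper, illustration only).

    Each row is shifted one step to the right, the decoder start token is
    placed in front, and any ignored label positions are replaced by padding
    so that the result only contains valid token ids.
    """
    shifted = []
    for row in labels:
        # Drop the last label and prepend the start token.
        new_row = [decoder_start_token_id] + list(row[:-1])
        # Ignored positions are not real tokens, so map them to padding.
        new_row = [pad_token_id if tok == IGNORE_INDEX else tok for tok in new_row]
        shifted.append(new_row)
    return shifted


# Made-up token ids: start token 0, pad token 1.
example_labels = [[5, 6, 7, IGNORE_INDEX]]
example_decoder_input_ids = shift_labels_right(
    example_labels, decoder_start_token_id=0, pad_token_id=1
)
```

With this kind of shift, the decoder sees the token at position `t - 1` when it is trained to predict the label at position `t`, which is why the user only has to supply `labels`.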

### Model description Microsoft just open-sourced BEiTv3: https://github.com/microsoft/unilm/tree/master/beit3 This is a very powerful vision-language model that can be used as a backbone for a variety of downstream tasks, from image classification...

New model

### Model description DFFT is a new fully Transformer-based object detector. The model doesn't require a decoder, unlike DETR. ### Open source status - [X] The model implementation is available...

New model

# What does this PR do? This PR adds the H3 model by Hazy Research (Stanford University). I've removed the Flash Attention dependency, and main author @DanFu09 has removed the...

# What does this PR do? This PR adds UDOP as described in [Unifying Vision, Text, and Layout for Universal Document Processing](https://arxiv.org/abs/2212.02623). The model can be seen as an encoder-decoder...

### Feature request BLIP and GIT are two recent additions to the library, providing state-of-the-art performance for tasks like image captioning and visual question answering (VQA). GIT is even capable...

Good First Issue

# What does this PR do? This PR adds the classic Mask R-CNN framework for object detection and instance segmentation. To do/to be discussed: - [ ] where to place...

**Is your feature request related to a problem? Please describe.** Hi, thanks for open-sourcing the algorithm behind "For You". So recently, a lot of people in the AI community (including...

Hi folks, Thanks for this amazing work. I have a question regarding the fine-tuning of Guanaco. Specifically, this model was trained on this dataset: https://huggingface.co/datasets/timdettmers/openassistant-guanaco, which only contains a "text"...