Arto
Hello. Thanks for your interest. The CLIP model cannot be used to generate captions directly; it scores image-text pairs rather than producing text. We have plans to add a fine-tuned prefix GPT or a similar model at some point...
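To illustrate why CLIP is a retrieval model rather than a caption generator, here is a minimal sketch: CLIP embeds the image and each candidate caption into a shared space and picks the caption with the highest cosine similarity. The vectors below are made-up toy values standing in for real CLIP embeddings, not outputs of the actual model.

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def best_caption(image_emb, candidate_embs):
    # Retrieval, not generation: rank a fixed pool of candidate
    # captions by similarity to the image embedding.
    scores = {caption: cosine_similarity(image_emb, emb)
              for caption, emb in candidate_embs.items()}
    return max(scores, key=scores.get)

# Toy embeddings (hypothetical values, for illustration only).
image_emb = [0.9, 0.1, 0.2]
candidates = {
    "an airport with planes": [0.8, 0.2, 0.1],
    "a dense forest": [0.1, 0.9, 0.3],
}
print(best_caption(image_emb, candidates))  # → "an airport with planes"
```

Generating a free-form caption instead would require a decoder (e.g. a prefix GPT conditioned on the CLIP image embedding), which is what the planned addition refers to.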
Hi. You can find links to the wandb runs on the model cards. Here is the one with the best performance: [model card](https://huggingface.co/flax-community/clip-rsicd-v2), [logs](https://wandb.ai/wandb/hf-flax-clip-rsicd/runs/2dj1exsw). Here is a small [report highlighting the impact of augmentations on...
@iejMac and I are going to give this a shot. Our current understanding of what needs to be done as a starting point can be summarized as follows: Create a version...
Update on the current state of things: we have training working with a pre-trained HF model, so the basic building blocks should be in place. We branched off our current work [here](https://github.com/iejMac/open_clip/tree/dev-integration). This branch...
Hi, may I ask a couple of questions: 1. Is there an estimated time for merging this PR? 2. Are there any experiments/examples of training FastSpeech2 using this implementation?...
Hello. You can find the training details in the [readme](https://github.com/arampacha/CLIP-rsicd#training-details). The training script used to fine-tune the presented models is [run_clip_flax_tv.py](https://github.com/arampacha/CLIP-rsicd/blob/master/run_clip_flax_tv.py). The exact training command used to get our best-performing model can be...