Li Dong comments

Results: 47 comments of Li Dong

[Kosmos-2] Unable to start the demo

```bash ##################### # # Use this with or without the .gitattributes snippet with this Gist # create a fixle.sh file, paste this in and run it. # Why do you...

[Kosmos-2] Unable to start the demo

I see. The error might be caused by using WSL. I am unsure whether Gradio is supported under WSL.

TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType), but expected one of: * (tuple of ints size, *, tuple of names names, torch.memory_format memory_format, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad) * (tuple of ints size, *, torch.memory_format memory_format, Tensor out, torch.dtype dtype, torch.layout layout, torch.device device, bool pin_memory, bool requires_grad)

Could you also attach the command that produces the error?

embed_tokens

I found one blog post (in Japanese) that might be useful: https://zenn.dev/selllous/articles/retnet_tutorial

ModuleNotFoundError: No module named 'examples.simultaneous_translation'

```bash
FAIRSEQ_DIR=$(pip list -v | grep 'fairseq' | awk '{print $3}')
export PYTHONPATH=$PYTHONPATH:$FAIRSEQ_DIR
```

retnet training config

> Hi, Is there any resolution to this question for the initialization and recommended training configs to reproduce the paper results? I am also seeing some instability with the default...

Beit3 Training Batch Procedure

The code and pre-trained models of BEiT-3 can be found at [aka.ms/beit3](https://aka.ms/beit3).

extending VLMO with MIM (Masked Image Modeling) loss

@jinxixiang Could you also post the loss curves (such as tensorboard screenshots) of the run `using MIM + MLM + contrastive loss: (does not converge)`?

extending VLMO with MIM (Masked Image Modeling) loss

You could try https://github.com/microsoft/torchscale if the issue is training stability (i.e., loss divergence). The Multiway architecture can be enabled by multiway=True. https://github.com/microsoft/torchscale#key-features

extending VLMO with MIM (Masked Image Modeling) loss

The code and pre-trained models of BEiT-3 can be found at [aka.ms/beit3](https://aka.ms/beit3).