Eunkwang Jeon
I strongly agree! We can imagine open-assistant working in multiple languages by automating en->x conversion through machine translation.
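For illustration, a minimal sketch of such en->x automation using the Hugging Face `transformers` MarianMT API; the specific en->ko model id here is a hypothetical choice, not one prescribed in the thread:

```python
# Sketch: machine-translate English prompts into a target language.
# The model id below is a hypothetical example, not a vetted recommendation.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-ko"  # hypothetical en->ko model id
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def translate(texts):
    # Tokenize a batch of English sentences and generate translations.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

print(translate(["What is the capital of France?"]))
```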
Hi, @hyunwoongko. Currently, I am working on converting the public Korean data into an "instruction-fulfillment" format. I don't know what type of data you can provide, but any data you...
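As a rough illustration of the target shape, a sketch of mapping a public QA record into an "instruction-fulfillment" style example; the field names are hypothetical, not the actual Open-Assistant schema:

```python
# Sketch: convert a QA record into an instruction-fulfillment example.
# All field names here are hypothetical placeholders.
import json

def to_instruction_format(record):
    return {
        "instruction": record["question"],
        "fulfillment": record["answer"],
        "lang": "ko",
    }

sample = {"question": "한국의 수도는 어디인가요?", "answer": "서울입니다."}
print(json.dumps(to_instruction_format(sample), ensure_ascii=False, indent=2))
```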
We plan to add more models in the near future. The models to be added are as follows:

* imagenet21k
* R26+ViT-B_32.npz
* R50+ViT-L_32.npz
* ViT-B_8.npz
...
Transfer learning was performed on a V100. Check the relative time in TensorBoard for the training time.
Distributed training on a single node can be executed as follows.

```shell
python3 -m torch.distributed.launch --nproc_per_node=NUM_OF_GPU train.py --train_batch_size BATCH_SIZE_PER_GPU --name cifar10-100_500 --dataset cifar10 --model_type ViT-B_16 --pretrained_dir checkpoint/ViT-B_16.npz
```
> @jeonsworld Hello, how can I use multi-GPUs?

There are [DataParallel](https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html#dataparallel) and [Distributed](https://pytorch.org/docs/stable/distributed.html#distributed-communication-package-torch-distributed) approaches to using multiple GPUs in PyTorch. The current code supports distributed training and uses the following command....
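For reference, a minimal `DistributedDataParallel` sketch of what runs inside each process started by `torch.distributed.launch` (as in the command above); the tiny `nn.Linear` stands in for the actual ViT model:

```python
# Sketch: one process of a single-node DistributedDataParallel run.
# torch.distributed.launch injects --local_rank into each process.
import argparse
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)
dist.init_process_group(backend="nccl", init_method="env://")

model = nn.Linear(10, 10).cuda(args.local_rank)  # placeholder for the ViT model
model = DDP(model, device_ids=[args.local_rank])

x = torch.randn(4, 10).cuda(args.local_rank)
model(x).sum().backward()  # gradients are all-reduced across processes
```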
For CNNs, weight standardization is suggested in [Big Transfer (BiT): General Visual Representation Learning](https://arxiv.org/abs/1912.11370). See Section 4.3 of the paper.
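A minimal sketch of a weight-standardized convolution in PyTorch, along the lines of BiT's `StdConv2d`; this is an illustration of the technique, not necessarily identical to the paper's implementation:

```python
# Sketch: Conv2d with Weight Standardization (cf. BiT, Section 4.3).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StdConv2d(nn.Conv2d):
    """Conv2d whose weights are standardized per output filter before use."""
    def forward(self, x):
        w = self.weight
        # Zero mean, unit variance over each filter's (in_channels, kH, kW) axes.
        mean = w.mean(dim=[1, 2, 3], keepdim=True)
        var = w.var(dim=[1, 2, 3], keepdim=True, unbiased=False)
        w = (w - mean) / torch.sqrt(var + 1e-5)
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

conv = StdConv2d(3, 16, kernel_size=3, padding=1)
print(conv(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])
```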