Train a new model without using the pretrained model weights

Open ishita1995 opened this issue 6 years ago • 5 comments

I am trying to train a new model using the torchMoji architecture without loading the pretrained weights. In the torchmoji_transfer function, the TorchMoji class is instantiated but the weights are not loaded. The model output comes out as nan, so the loss is nan as well. Could you please help me understand where I am going wrong? I am pretty new to deep learning. Sorry in advance for the inconvenience.

ishita1995 avatar Dec 07 '18 11:12 ishita1995

Hi, can you post a simple, self-contained code example that shows what is not working?

thomwolf avatar Dec 07 '18 22:12 thomwolf

from __future__ import print_function
import example_helper
import json
import torch
from torchmoji.model_def import torchmoji_transfer
from torchmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH, ROOT_PATH
from torchmoji.finetuning import (
     load_benchmark,
     finetune)


DATASET_PATH = '{}/data/emotion_data/raw.pickle'.format(ROOT_PATH)
nb_classes = 4

with open(VOCAB_PATH, 'r') as f:
    vocab = json.load(f)

# Load dataset.
data = load_benchmark(DATASET_PATH, vocab)

# Set up model and finetune
model = torchmoji_transfer(nb_classes, None, extend_embedding=1412)
print(model)
model, acc = finetune(model, data['texts'], data['labels'], nb_classes, data['batch_size'], method='chain-thaw')
print('Acc: {}'.format(acc))

In this example, if I don't load the pretrained weights and just instantiate the TorchMoji class, the output comes out as nan, even when I initialize the weights in __init_weights(). What am I doing wrong here?
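One way to narrow down a nan output like the one described above is to find which submodule first produces a non-finite value and which parameters are already non-finite before any training step. The sketch below is a generic PyTorch debugging aid, not part of torchMoji: the helper names find_first_nonfinite and nonfinite_parameters are hypothetical, and the toy nn.Sequential model only stands in for the real one. It relies on standard PyTorch calls only (register_forward_hook, named_modules, named_parameters, torch.isfinite).

```python
import torch
import torch.nn as nn


def find_first_nonfinite(model, *inputs):
    """Run one forward pass and return the name of the first submodule whose
    output contains nan or inf, or None if every checked output is finite.
    (Hypothetical helper, not a torchMoji function.)"""
    offenders = []
    handles = []

    def make_hook(name):
        def hook(module, args, output):
            # Recurrent layers return tuples; check each top-level tensor.
            outs = output if isinstance(output, (tuple, list)) else (output,)
            for out in outs:
                if torch.is_tensor(out) and not torch.isfinite(out).all():
                    offenders.append(name)
                    break
        return hook

    for name, module in model.named_modules():
        if name:  # skip the root module, whose name is the empty string
            handles.append(module.register_forward_hook(make_hook(name)))
    try:
        with torch.no_grad():
            model(*inputs)
    finally:
        # Always remove the hooks so the model is left unchanged.
        for handle in handles:
            handle.remove()
    return offenders[0] if offenders else None


def nonfinite_parameters(model):
    """Return the names of parameters that already contain nan or inf."""
    return [name for name, param in model.named_parameters()
            if not torch.isfinite(param).all()]


# Toy stand-in model (an assumption for illustration, not TorchMoji itself).
toy = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))
with torch.no_grad():
    toy[2].weight.fill_(float('nan'))  # simulate a badly initialized layer

first_bad = find_first_nonfinite(toy, torch.randn(3, 4))
bad_params = nonfinite_parameters(toy)
print('first non-finite output in:', first_bad)
print('non-finite parameters:', bad_params)
```

If nonfinite_parameters reports entries right after the model is constructed, the problem lies in weight initialization and not in the training loop; if it reports nothing but find_first_nonfinite does, the inputs or the forward computation of that submodule are the place to look.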

ishita1995 avatar Dec 10 '18 09:12 ishita1995

Hello, I am also trying to train torchMoji without the pretrained model, but my problem is that the loss does not decrease well.

graykode avatar Jan 28 '19 05:01 graykode

@graykode I ultimately wrote my own script with the same architecture to replicate the results.

ishita1995 avatar Feb 08 '19 12:02 ishita1995

Hi @ishita1995, can you share your implementation?

DanielJuravski avatar Jan 24 '20 15:01 DanielJuravski