torchMoji
Train a new model without using the pretrained model weights
I am trying to train a new model using the torchMoji architecture without loading the pre-trained weights: in the `torchmoji_transfer` function I call the `TorchMoji` class but do not load the weights. The model output comes out as NaN, and so the loss is NaN as well. Can you please help me understand where I am going wrong? I am pretty new to deep learning. Apologies in advance for the inconvenience.
Hi, can you post a simple, self-contained example of the code that is not working?
```python
from __future__ import print_function
import example_helper
import json
import torch
from torchmoji.model_def import torchmoji_transfer
from torchmoji.global_variables import PRETRAINED_PATH, VOCAB_PATH, ROOT_PATH
from torchmoji.finetuning import (
    load_benchmark,
    finetune)

DATASET_PATH = '{}/data/emotion_data/raw.pickle'.format(ROOT_PATH)
nb_classes = 4

with open(VOCAB_PATH, 'r') as f:
    vocab = json.load(f)

# Load dataset.
data = load_benchmark(DATASET_PATH, vocab)

# Set up model and finetune
model = torchmoji_transfer(nb_classes, None, extend_embedding=1412)
print(model)
model, acc = finetune(model, data['texts'], data['labels'], nb_classes,
                      data['batch_size'], method='chain-thaw')
print('Acc: {}'.format(acc))
```
In this example, if I don't load the pretrained weights and just instantiate the `TorchMoji` class, the output comes out as NaN even when I initialize the weights in `__init_weights()`. What am I doing wrong here?
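One generic way to narrow down where the NaNs first appear (a PyTorch debugging sketch, not torchMoji-specific; `find_first_nan` and the toy model below are made up for illustration) is to register a forward hook on every submodule and report the first layer whose output contains NaN:

```python
import torch
import torch.nn as nn

def find_first_nan(model, sample_input):
    """Run one forward pass and return the name of the first layer
    whose output contains NaN, or None if all outputs are finite."""
    nan_layers = []

    def make_hook(name):
        def hook(module, inputs, output):
            out = output[0] if isinstance(output, tuple) else output
            if torch.is_tensor(out) and torch.isnan(out).any():
                nan_layers.append(name)
        return hook

    # Register a hook on every named submodule (skip the root module itself).
    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules() if n]
    with torch.no_grad():
        model(sample_input)
    for h in handles:
        h.remove()
    return nan_layers[0] if nan_layers else None

# Toy model whose last layer's weights are deliberately set to NaN:
toy = nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 2))
toy[2].weight.data.fill_(float('nan'))
print(find_first_nan(toy, torch.randn(1, 4)))  # prints: 2
```

Running this on your model with one batch should tell you whether the NaNs originate in the (extended) embedding layer or further downstream.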
Hello, I am also trying to train torchMoji without the pre-trained model, but my problem is that the loss does not decrease well...
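Not torchMoji-specific, but two generic measures that often help when training from scratch are a smaller learning rate and gradient clipping. A minimal sketch on a toy classifier (all model and variable names here are illustrative, not from torchMoji):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy token classifier standing in for the real model.
model = nn.Sequential(nn.Embedding(100, 16), nn.Flatten(), nn.Linear(16 * 8, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # smaller than Adam's default 1e-3
criterion = nn.CrossEntropyLoss()

tokens = torch.randint(0, 100, (32, 8))   # toy batch of token ids
labels = torch.randint(0, 4, (32,))       # toy labels, 4 classes

losses = []
for step in range(200):
    optimizer.zero_grad()
    loss = criterion(model(tokens), labels)
    loss.backward()
    # Keep gradient norms bounded to avoid blow-ups early in training.
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    losses.append(loss.item())

print(losses[0], '->', losses[-1])  # loss should decrease on this fixed batch
```

If the loss still plateaus, checking the data pipeline (label distribution, tokenization) is usually the next step before touching the architecture.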
@graykode I ultimately wrote my own script with the same architecture to replicate the results.
Hi @ishita1995, can you share your implementation?