How to train with gpu?

Open kdrkdrkdr opened this issue 1 year ago • 5 comments

Is there any solution with Colab?

kdrkdrkdr avatar Sep 09 '22 16:09 kdrkdrkdr

Hi @kdrkdrkdr. Thank you for using nagisa!

I am very sorry, but nagisa does not support GPU training. Nagisa is optimized to run its neural networks quickly on CPU, so a GPU will not make training faster. Please run the training on CPU in Colab.

taishi-i avatar Sep 09 '22 18:09 taishi-i

Thank you for your reply! I'd like to ask you one more question... Is there a way to do additional learning with the already pretrained files?

kdrkdrkdr avatar Sep 10 '22 03:09 kdrkdrkdr

Is there a way to do additional learning with the already pretrained files?

Yes, it is possible! The internal function nagisa.train._start can be used to retrain a pretrained model, as follows.

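# This is pseudo-code! Be careful.
import nagisa

# Load the pretrained hyperparameters and build the model from the saved parameters
pretrained_hp = nagisa.utils.load_data("pretrained.hp")
pretrained_params = "pretrained.params"
pretrained_model = nagisa.model.Model(pretrained_hp, pretrained_params)

# train_data, test_data, and dev_data still need to be prepared beforehand
nagisa.train._start(pretrained_hp, pretrained_model, train_data, test_data, dev_data)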

The code above is only pseudo-code. I would like to provide sample code that actually works. Could you wait a day or two?

taishi-i avatar Sep 11 '22 19:09 taishi-i

Hi @kdrkdrkdr. I have written working code to help you retrain a pretrained model. It does so by using nagisa's internal methods.

Because the example uses the sample data, please clone the nagisa repository and move into the working folder.

$ git clone https://github.com/taishi-i/nagisa
$ cd nagisa/test
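
Before retraining on your own files, it may help to confirm that they follow the layout the DELIMITER and NEWLINE settings in the script describe: one word and one tag per line separated by a tab, with an EOS line after each sentence. The following is a minimal sketch using only the standard library. The function name check_dataset, the file name my.train, and the tags in the tiny example are hypothetical, and the check is only an approximation of what nagisa's own loader accepts.

```python
import os
import tempfile

DELIMITER = "\t"  # separates the word from its tag on each line
NEWLINE = "EOS"   # marks the end of a sentence


def check_dataset(path, delimiter=DELIMITER, newline=NEWLINE):
    """Return (sentence_count, problems) for a word<delimiter>tag file."""
    sentences = 0
    tokens_in_sentence = 0
    problems = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            line = line.rstrip("\r\n")
            if line == newline:
                # Count a sentence only if it contained at least one token
                if tokens_in_sentence > 0:
                    sentences += 1
                tokens_in_sentence = 0
                continue
            if not line:
                continue  # blank lines are ignored by this sketch
            fields = line.split(delimiter)
            if len(fields) != 2 or not all(fields):
                problems.append(
                    f"line {lineno}: expected two non-empty fields, got {line!r}"
                )
                continue
            tokens_in_sentence += 1
    if tokens_in_sentence > 0:
        problems.append("file does not end with the sentence marker")
    return sentences, problems


# Hypothetical two-sentence dataset written to a temporary file
sample = "私\t代名詞\nは\t助詞\nEOS\n元気\t名詞\nです\t助動詞\nEOS\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my.train")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    sentence_count, problems = check_dataset(path)

print(sentence_count, problems)
```

If your files use a different separator or sentence marker, pass them as the delimiter and newline arguments, and set the same values in the retraining script.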

Please create the following file in the working folder and run it with Python. See the comments for an explanation of each step.

import nagisa

# First, create the pretrained model
nagisa.fit(
    train_file="../nagisa/data/sample_datasets/sample.train",
    dev_file="../nagisa/data/sample_datasets/sample.dev",
    test_file="../nagisa/data/sample_datasets/sample.test",
    model_name="sample",
)


# Load the pretrained model files
pretrained_hp = nagisa.utils.load_data("sample.hp")
pretrained_params = "sample.params"
pretrained_model = nagisa.model.Model(pretrained_hp, pretrained_params)
vocabs = nagisa.utils.load_data("sample.vocabs")


# Change to the format of your data set
DELIMITER = "\t"
NEWLINE = "EOS"


# Load files to retrain
train_data = nagisa.train.prepro.from_file(
    filename="../nagisa/data/sample_datasets/sample.train",
    window_size=pretrained_hp['WINDOW_SIZE'],
    vocabs=vocabs,
    delimiter=DELIMITER,
    newline=NEWLINE
)

test_data = nagisa.train.prepro.from_file(
    filename="../nagisa/data/sample_datasets/sample.test",
    window_size=pretrained_hp['WINDOW_SIZE'],
    vocabs=vocabs,
    delimiter=DELIMITER,
    newline=NEWLINE
)

dev_data = nagisa.train.prepro.from_file(
    filename="../nagisa/data/sample_datasets/sample.dev",
    window_size=pretrained_hp['WINDOW_SIZE'],
    vocabs=vocabs,
    delimiter=DELIMITER,
    newline=NEWLINE
)


# To avoid overwriting the existing model files, change the output model name for retraining
retrained_model_name = "retrained_sample"
pretrained_hp["MODEL"] = f"{retrained_model_name}.params"
pretrained_hp["HYPERPARAMS"] = f"{retrained_model_name}.hp"


# Retrain the pretrained model
# This saves retrained_sample.hp and retrained_sample.params
nagisa.train._start(pretrained_hp, pretrained_model, train_data, test_data, dev_data)
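
Once retraining finishes, the new files can presumably be loaded like any other custom nagisa model. The sketch below rests on two assumptions: that nagisa.Tagger accepts vocabs, params, and hp file paths, as in the README's custom-model example, and that the vocabulary file from the original training (sample.vocabs) is reused, since the script above does not rename it. The helper names are hypothetical.

```python
from pathlib import Path


def retrained_model_files(pretrained_name, retrained_name):
    """Return the three file paths a tagger needs after retraining."""
    return {
        # Assumption: the vocabulary is not rewritten by retraining
        "vocabs": f"{pretrained_name}.vocabs",
        "params": f"{retrained_name}.params",
        "hp": f"{retrained_name}.hp",
    }


def missing_files(files):
    """List the paths in `files` that do not exist on disk."""
    return [path for path in files.values() if not Path(path).is_file()]


def load_retrained_tagger(pretrained_name="sample",
                          retrained_name="retrained_sample"):
    """Build a tagger from the retrained files, failing early if any is absent."""
    files = retrained_model_files(pretrained_name, retrained_name)
    missing = missing_files(files)
    if missing:
        raise FileNotFoundError(f"Missing model files: {missing}")
    # Imported lazily so the path helpers can be used without nagisa installed
    import nagisa
    return nagisa.Tagger(**files)
```

With the files in place, calling load_retrained_tagger() and then the tagger's tagging method on a test sentence should show whether the retrained model behaves as expected.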

If you have any questions, please feel free to ask. Thanks!

taishi-i avatar Sep 12 '22 17:09 taishi-i

Thank you! Can I ask you one more question? Can I use a different dataset for the pretrained model?

kdrkdrkdr avatar Sep 19 '22 16:09 kdrkdrkdr