nni How to use cream.CreamSupernetTrainer() correctly?

Describe the issue:

When I use CreamSupernetTrainer(), training fails with NotImplementedError. It seems to be because Cream does not implement a corresponding CreamMutator. Is there any solution?

Environment:

NNI version: 2.3
Training service (local|remote|pai|aml|etc):
Client OS: win10
Python version: 3.8
PyTorch/TensorFlow version: PyTorch 1.8.1
Is conda/virtualenv/venv used?: conda
Is running in Docker?:

Code:

import torch
from torch import nn
from nni.nas.pytorch import mutables
from torchvision import transforms
import torchvision
from collections import OrderedDict
import os
import ops  # module providing DilConv and SepConv, used below
from nni.nas.pytorch.mutator import Mutator
from nni.algorithms.nas.pytorch import cream

class AlexNet(nn.Module):
    def __init__(self):
        super(AlexNet, self).__init__()
        self.con1 = nn.Sequential(
            nn.Conv2d(1, 96, kernel_size=11, stride=4, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        # First searchable layer: one of four candidate convolutions.
        self.con2 = mutables.LayerChoice(OrderedDict([
            ('55', nn.Conv2d(96, 256, kernel_size=5, padding=2)),
            ('33', nn.Conv2d(96, 256, kernel_size=3, padding=1)),
            ('33dilsep', ops.DilConv(96, 256, 3, 1, 2, 2)),
            ('33sep', ops.SepConv(96, 256, 3, 1, 1))]), key='con2layer_key')
        self.con3 = nn.Sequential(
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1),
            nn.ReLU())
        # Second searchable layer: one of three candidate convolutions.
        self.con4 = mutables.LayerChoice(OrderedDict([
            ('33', nn.Conv2d(384, 384, kernel_size=3, padding=1)),
            ('33dilsep', ops.DilConv(384, 384, 3, 1, 2, 2)),
            ('3*3sep', ops.SepConv(384, 384, 3, 1, 1))]), key='con4layer_key')
        self.con5 = nn.Sequential(
            nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Flatten(),
            nn.Linear(6400, 4096),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 10))

    def forward(self, x):
        x = self.con1(x)
        x = self.con2(x)
        x = self.con3(x)
        x = self.con4(x)
        x = self.con5(x)
        return x

def test():
    model = AlexNet()
    model.train()

    # Resize FashionMNIST images to 224x224 before converting to tensors.
    trans = [transforms.ToTensor()]
    resize = 224
    if resize:
        trans.insert(0, transforms.Resize(resize))
    trans = transforms.Compose(trans)
    dataset_train = torchvision.datasets.FashionMNIST(
        root=r'E:\liefeng\Pytest\data', train=True, transform=trans, download=True)
    dataset_test = torchvision.datasets.FashionMNIST(
        root=r'E:\liefeng\Pytest\data', train=False, transform=trans, download=True)

    criterion = nn.CrossEntropyLoss()
    criterion_val = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), 0.05, momentum=0.9, weight_decay=1.0E-4)

    train_loader = torch.utils.data.DataLoader(dataset_train,
                                               batch_size=64,
                                               shuffle=True,
                                               num_workers=4,
                                               pin_memory=True)
    valid_loader = torch.utils.data.DataLoader(dataset_test,
                                               batch_size=64,
                                               shuffle=False,
                                               num_workers=4,
                                               pin_memory=True)

    trainer_cream = cream.CreamSupernetTrainer(model,
                                               loss=criterion,
                                               val_loss=criterion_val,
                                               optimizer=optimizer,
                                               num_epochs=10,
                                               train_loader=train_loader,
                                               valid_loader=valid_loader,
                                               mutator=Mutator(model),
                                               batch_size=64,
                                               log_frequency=40,
                                               meta_sta_epoch=5,
                                               update_iter=200,
                                               slices=2,
                                               pool_size=10,
                                               pick_method='meta',
                                               choice_num=6,
                                               sta_num=(4, 4, 4, 4, 4),
                                               acc_gap=5,
                                               flops_dict=None,
                                               flops_fixed=0,
                                               local_rank=0,
                                               callbacks=None)

    trainer_cream.enable_visualization()
    trainer_cream.train()  # training

    # Create the output directory if it does not exist yet.
    if not os.path.isdir('model_dir'):
        os.makedirs('model_dir')
        print('create model_dir!')
    trainer_cream.export(file="model_dir/final_AlexNetNAS_cream.json")


if __name__ == '__main__':
    test()

Log message:

[2021-08-31 23:55:35] INFO (nni.nas.pytorch.trainer/MainThread) Creating graph json, writing to logs\1630425332.8086371. Visualization enabled.
[2021-08-31 23:55:35] WARNING (nni.nas.pytorch.mutator/MainThread) Graph is only tested with PyTorch 1.4. Other versions might not work.
[2021-08-31 23:55:37] INFO (nni.nas.pytorch.trainer/MainThread) Epoch 1 Training
Traceback (most recent call last):
  File "E:/Pytest/NAS/AlexNetNAS-cream-nni.py", line 113, in <module>
    test()
  File "E:/Pytest/NAS/AlexNetNAS-cream-nni.py", line 102, in test
    trainer_cream.train() # training
  File "D:\application\anaconda\anaconda3\envs\YUE_PYTHON\lib\site-packages\nni\nas\pytorch\trainer.py", line 154, in train
    loss = self.train_one_epoch(epoch)
  File "D:\application\anaconda\anaconda3\envs\YUE_PYTHON\lib\site-packages\nni\algorithms\nas\pytorch\cream\trainer.py", line 356, in train_one_epoch
    self.mutator.reset()
  File "D:\application\anaconda\anaconda3\envs\YUE_PYTHON\lib\site-packages\nni\nas\pytorch\mutator.py", line 52, in reset
    self._cache = self.sample_search()
  File "D:\application\anaconda\anaconda3\envs\YUE_PYTHON\lib\site-packages\nni\nas\pytorch\mutator.py", line 33, in sample_search
    raise NotImplementedError
NotImplementedError

Aug 31 '21 16:08 yuezhuang1387

Adding @penghouwen (Cream's author) in case @jonsnows and @yuezhuang1387 need further help here.

Sep 26 '21 08:09 scarlett2018

Hi,

Thanks for your interest in Cream!

You could refer to this line for the correct usage of mutator.

mutator = RandomMutator(model) # instead of mutator=Mutator(model),

Best,

Hao.

Sep 28 '21 01:09 Z7zuqer
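
For readers landing here: the traceback above shows reset() calling sample_search(), which the base Mutator leaves unimplemented, so any concrete mutator has to override it. The sketch below reproduces that pattern with stand-in classes only. BaseMutator, RandomChoiceMutator and the choices dictionary are hypothetical names made up for illustration, not NNI's classes, and the sampling here is plain uniform choice rather than NNI's actual logic. In NNI 2.x the real RandomMutator appears to be importable from nni.algorithms.nas.pytorch.random; treat that path as an assumption and check your installed version.

```python
import random


class BaseMutator:
    """Stand-in for a generic mutator base class (NOT NNI's real class)."""

    def __init__(self, choices):
        # choices: hypothetical mapping of layer key -> list of candidate op names
        self.choices = choices
        self._cache = None

    def sample_search(self):
        # The base class defines no sampling strategy, mirroring the traceback.
        raise NotImplementedError

    def reset(self):
        # A trainer calls reset() during training to draw a new architecture.
        self._cache = self.sample_search()
        return self._cache


class RandomChoiceMutator(BaseMutator):
    """Stand-in subclass that picks one candidate per layer uniformly at random."""

    def __init__(self, choices, seed=None):
        super().__init__(choices)
        self._rng = random.Random(seed)

    def sample_search(self):
        return {key: self._rng.choice(ops) for key, ops in self.choices.items()}


# Keys and candidate names copied from the LayerChoice definitions in the question.
choices = {
    "con2layer_key": ["55", "33", "33dilsep", "33sep"],
    "con4layer_key": ["33", "33dilsep", "3*3sep"],
}

# Passing the bare base class fails exactly like mutator=Mutator(model) did.
try:
    BaseMutator(choices).reset()
    base_failed = False
except NotImplementedError:
    base_failed = True

# A subclass that implements sample_search() works.
sampled = RandomChoiceMutator(choices, seed=0).reset()
print(base_failed, sampled)
```

The same reasoning applies to the trainer call in the question: replacing mutator=Mutator(model) with a concrete mutator such as RandomMutator(model) gives reset() an implemented sample_search() to call.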

Hi @yuezhuang1387, do you still have this problem? Hao has replied to the issue, and NNI v2.9 has been released. You are welcome to test this issue with the latest version. I really hope your problem has been resolved.

Sep 13 '22 03:09 Lijiaoa

Closed because there has been no reply for a long time.

Dec 21 '22 02:12 Lijiaoa
