
Training with `MultiThreadedAugmenter` stops too early?

Open petteriTeikari opened this issue 4 years ago • 0 comments

Thanks for the nice package!

It might be a very simple problem; being new to batchgenerators, I cannot figure out why my training does not proceed beyond 2 epochs here (training code modified from SimonKohl/probabilistic_unet; TensorFlow 1.14, CUDA 10, and batchgenerators installed from a git clone).

epoch only takes the values 0 and 1 and then training stops, so I am a bit unsure where that 2 as the "number of epochs" is coming from. When I instead used a fixed train_batch, without calling next(train_gen) inside the loop (sketch below, after the training loop), training ran through the full number of epochs that I wanted.

What is it that I could have missed?

batch_size = 2
cf.no_of_epochs = 10
cf.n_training_batches = 1
# thus, samples in this mini-debug dataset = 2

with tf.train.MonitoredTrainingSession(hooks=[saver_hook], config=config) as sess:
    for epoch in range(cf.no_of_epochs):
        for b in range(cf.n_training_batches):
            train_batch = next(train_gen)  # pull the next batch from the MultiThreadedAugmenter
            _, train_summary = sess.run([optimizer, training_summary_op],
                                        feed_dict={x: train_batch['data'],
                                                   y: train_batch['seg'],
                                                   mask: train_batch['mask']})
            sess.run(global_step)
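
For reference, the control run mentioned above, where training does complete all the epochs I ask for, was structured roughly like this (a sketch; the fixed batch itself was assembled outside the generator and is omitted here):

fixed_batch = ...  # a single batch assembled once, outside the generator (details omitted)

with tf.train.MonitoredTrainingSession(hooks=[saver_hook], config=config) as sess:
    for epoch in range(cf.no_of_epochs):
        for b in range(cf.n_training_batches):
            _, train_summary = sess.run([optimizer, training_summary_op],
                                        feed_dict={x: fixed_batch['data'],
                                                   y: fixed_batch['seg'],
                                                   mask: fixed_batch['mask']})
            sess.run(global_step)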

My dataloader is:

import numpy as np
import nibabel as nib
from batchgenerators.dataloading.data_loader import DataLoader


class Loader(DataLoader):

    def __init__(self, cf, data, batch_size, patch_size, data_split='train',
                 num_threads_in_multithreaded=1, seed_for_shuffle=1234,
                 subset='test_network', n_classes=None):
        super().__init__(data, batch_size, num_threads_in_multithreaded, seed_for_shuffle)
        self.cf = cf
        self.patch_size = patch_size
        self.data_split = data_split
        self.subset = subset
        self.n_classes = n_classes
        self._batches_generated = 0
        self.indices = list(range(len(data)))

    def load_data_per_code(self, patient_per_code, hot_encode_label_numpy=True):

        # img_path (and the rest of the per-patient bookkeeping, omitted below)
        # is derived from patient_per_code
        nib_img = nib.load(img_path)
        img = nib_img.get_data()
        metadata = nib_img.header

        ...

        return img, seg, mask, metadata, ...

    def generate_train_batch(self, debug_verbose = False):

        # DataLoader has its own methods for selecting which patients to use next; see its documentation
        codes_in = list(self._data.keys())
        idx = self.get_indices()
        codes_per_idx = [codes_in[i] for i in idx]
        patients_for_batch = [self._data[i] for i in codes_per_idx]

        # initialize empty array for data and seg
        data = np.zeros((self.cf.image_shape), dtype=np.float32)
        seg = np.zeros((self.cf.segmentation_shape), dtype=np.float32)
        mask = np.zeros((self.cf.loss_mask_shape), dtype=np.float32)

        metadata = []
        patient_codes = []
        seg_names = []

        # iterate over patients_for_batch and include them in the batch
        for i, patient_per_code in enumerate(patients_for_batch):
            img_i, seg_i, mask_i, metadata_i, code, seg_name = self.load_data_per_code(patient_per_code)
            data[i] = img_i
            seg[i] = seg_i
            mask[i] = mask_i
            metadata.append(metadata_i)
            patient_codes.append(code)
            seg_names.append(seg_name)

        self._batches_generated += 1
        
        return {'data': data, 'seg': seg, 'mask': mask,
                'metadata': metadata, 'codes': patient_codes}
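
Is the loader itself expected to stop yielding at some point? A quick check I could run on it directly, without the MultiThreadedAugmenter, would look something like this (a sketch; train_data stands for my data dict and cf.patch_size for my patch size, neither shown above):

loader = Loader(cf, train_data, batch_size=2, patch_size=cf.patch_size)
for i in range(5):
    try:
        batch = next(loader)  # DataLoader subclasses are iterators; __next__ calls generate_train_batch()
        print(i, batch['data'].shape)
    except StopIteration:
        print('loader raised StopIteration after', i, 'batches')
        break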

My `train_gen` is:

    tr_transforms = ...  # a couple of transformations
    train_gen = MultiThreadedAugmenter(dataloader_train, tr_transforms,
                                       num_processes=1,
                                       num_cached_per_queue=num_cached_per_queue)
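
A related question: am I expected to re-create (or otherwise reset) the augmenter at the start of every epoch so that the underlying loader begins a fresh pass over the data? Something along these lines (just a sketch of what I mean, not tested):

    for epoch in range(cf.no_of_epochs):
        # re-create the augmenter so the underlying DataLoader starts a fresh pass
        train_gen = MultiThreadedAugmenter(dataloader_train, tr_transforms,
                                           num_processes=1,
                                           num_cached_per_queue=num_cached_per_queue)
        for b in range(cf.n_training_batches):
            train_batch = next(train_gen)
            # ... sess.run(...) exactly as in the loop above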

petteriTeikari · Jun 10 '20 19:06