
For faster NMS

naoe1999 opened this issue 3 years ago · 11 comments

First, thank you for sharing your code; it has helped me a lot.

To make your code even better, I would like to suggest the much faster NMS code below. It replaces both the nms() and iou() functions. I based it on the NMS algorithm by Malisiewicz et al. and modified it to fit LUNA16's 3D data format.

The speedup is achieved by:

  1. Limiting the maximum number of candidates before NMS (to 3000, for example). This is extremely helpful when the detector is not yet well trained, since it can then produce millions of predictions above a given threshold (such as -0.5). This should not happen once the detector is properly trained.

  2. Removing the inner loop and using numpy vectorization, which is much faster than a Python loop.



import numpy as np

def nms(output, nms_th=0.1):
    # output rows: [confidence, coord, coord, coord, diameter]
    if len(output) == 0:
        return output

    # sort by descending confidence and cap the number of candidates
    output = output[np.argsort(-output[:, 0])]
    output = output[:3000]  # limit max number of candidates

    # corner coordinates of each cubic bounding box (side = diameter)
    r = output[:, 4] / 2.
    x1 = output[:, 1] - r
    y1 = output[:, 2] - r
    z1 = output[:, 3] - r
    x2 = output[:, 1] + r
    y2 = output[:, 2] + r
    z2 = output[:, 3] + r
    volume = output[:, 4] ** 3  # (x2 - x1) * (y2 - y1) * (z2 - z1)

    pick = []
    idxs = np.arange(len(output))

    while len(idxs) > 0:
        # keep the highest-confidence remaining box
        i = idxs[0]
        pick.append(i)

        # vectorized intersection of box i with all remaining boxes
        xx1 = np.maximum(x1[i], x1[idxs[1:]])
        yy1 = np.maximum(y1[i], y1[idxs[1:]])
        zz1 = np.maximum(z1[i], z1[idxs[1:]])
        xx2 = np.minimum(x2[i], x2[idxs[1:]])
        yy2 = np.minimum(y2[i], y2[idxs[1:]])
        zz2 = np.minimum(z2[i], z2[idxs[1:]])

        w = np.maximum(0.0, xx2 - xx1)
        h = np.maximum(0.0, yy2 - yy1)
        d = np.maximum(0.0, zz2 - zz1)
        intersection = w * h * d

        iou = intersection / (volume[i] + volume[idxs[1:]] - intersection)

        # drop the picked box and every remaining box overlapping it above the threshold
        idxs = np.delete(idxs, np.concatenate(([0], np.where(iou >= nms_th)[0] + 1)))

    return output[pick]
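
For example, here is a quick sanity check (the rows below are made-up dummy predictions in the [confidence, coord, coord, coord, diameter] format; the coordinate order does not matter here since all three axes are treated the same way):

    preds = np.array([
        [0.9, 50., 50., 50., 10.],   # kept: highest confidence
        [0.8, 51., 50., 50., 10.],   # suppressed: IoU with the first box is about 0.82 >= 0.1
        [0.7, 90., 90., 90., 10.],   # kept: no overlap with the others
    ])
    print(nms(preds, nms_th=0.1))    # returns rows 0 and 2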

naoe1999 avatar Jun 28 '21 05:06 naoe1999

Yes. Thank you so much for the contribution!

wentaozhu avatar Jun 28 '21 17:06 wentaozhu


Hello naoe1999

I have tried to follow up on training with this repository: I tried to train res18 without 064.ckpt, and I found that the training loss is not decreasing. I think it is because of torch version changes, so I also changed some parts of the data.py and main.py files based on my torch version, without any success.

Kindly, can you share your data.py and main.py files with me? My email is [email protected]. Thanks in advance.

SirMwan avatar Jul 16 '21 03:07 SirMwan


This is the part of the main.py file I have changed:

def train(data_loader, net, loss, epoch, optimizer, get_lr, save_freq, save_dir):
    start_time = time.time()

    net.train()
    lr = get_lr(epoch)
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr

    metrics = []

    for i, (data, target, coord) in enumerate(data_loader):
        if torch.cuda.is_available():
            data = Variable(data.cuda())
            target = Variable(target.cuda())
            coord = Variable(coord.cuda())
        data = data.float()
        target = target.float()
        coord = coord.float()

        optimizer.zero_grad()
        output = net(data, coord)
        loss_output = loss(output, target)
        loss_output[0].backward()
        optimizer.step()

        loss_output[0] = loss_output[0].item()  # changed: .data[0] -> .item()
        metrics.append(loss_output)

    if epoch % args.save_freq == 0:
        state_dict = net.state_dict()
        for key in state_dict.keys():
            state_dict[key] = state_dict[key].cpu()

        torch.save({
            'epoch': epoch,
            'save_dir': save_dir,
            'state_dict': state_dict,
            'args': args},
            os.path.join(save_dir, '%03d.ckpt' % epoch))

    end_time = time.time()
    metrics = np.asarray(metrics, np.float32)
    print('Epoch %03d (lr %.5f)' % (epoch, lr))
    print('Train:      tpr %3.2f, tnr %3.2f, total pos %d, total neg %d, time %3.2f' % (
        100.0 * np.sum(metrics[:, 6]) / np.sum(metrics[:, 7]),
        100.0 * np.sum(metrics[:, 8]) / np.sum(metrics[:, 9]),
        np.sum(metrics[:, 7]),
        np.sum(metrics[:, 9]),
        end_time - start_time))
    print('loss %2.4f, classify loss %2.4f, regress loss %2.4f, %2.4f, %2.4f, %2.4f' % (
        np.mean(metrics[:, 0]),
        np.mean(metrics[:, 1]),
        np.mean(metrics[:, 2]),
        np.mean(metrics[:, 3]),
        np.mean(metrics[:, 4]),
        np.mean(metrics[:, 5])))
    print()

Besides the changes above, I also changed the following parts of the layer.py file:

def hard_mining(neg_output, neg_labels, num_hard):
    # keep only the num_hard highest-scoring negatives (hard negative mining)
    _, idcs = torch.topk(neg_output, min(num_hard, len(neg_output)))
    neg_output = torch.index_select(neg_output, 0, idcs)
    neg_labels = torch.index_select(neg_labels, 0, idcs)
    return neg_output, neg_labels

class Loss(nn.Module):
    def __init__(self, num_hard=2):
        super(Loss, self).__init__()
        self.sigmoid = nn.Sigmoid()
        self.classify_loss = nn.BCELoss()
        self.regress_loss = nn.SmoothL1Loss()
        self.num_hard = num_hard

    def forward(self, output, labels, train=True):
        batch_size = labels.size(0)
        output = output.view(-1, 5)
        labels = labels.view(-1, 5)

        pos_idcs = labels[:, 0] > 0.5
        pos_idcs = pos_idcs.unsqueeze(1).expand(pos_idcs.size(0), 5)
        pos_output = output[pos_idcs].view(-1, 5)
        pos_labels = labels[pos_idcs].view(-1, 5)

        neg_idcs = labels[:, 0] < -0.5
        neg_output = output[:, 0][neg_idcs]
        neg_labels = labels[:, 0][neg_idcs]

        if self.num_hard > 0 and train:
            neg_output, neg_labels = hard_mining(neg_output, neg_labels, self.num_hard * batch_size)
        neg_prob = self.sigmoid(neg_output)

        # classify_loss = self.classify_loss(
        #     torch.cat((pos_prob, neg_prob), 0),
        #     torch.cat((pos_labels[:, 0], neg_labels + 1), 0))
        if len(pos_output) > 0:
            pos_prob = self.sigmoid(pos_output[:, 0])
            pz, ph, pw, pd = pos_output[:, 1], pos_output[:, 2], pos_output[:, 3], pos_output[:, 4]
            lz, lh, lw, ld = pos_labels[:, 1], pos_labels[:, 2], pos_labels[:, 3], pos_labels[:, 4]

            regress_losses = [
                self.regress_loss(pz, lz),
                self.regress_loss(ph, lh),
                self.regress_loss(pw, lw),
                self.regress_loss(pd, ld)]
            regress_losses_data = [loz.item() for loz in regress_losses]  # changed: .data[0] -> .item()
            classify_loss = 0.5 * self.classify_loss(
                pos_prob, pos_labels[:, 0]) + 0.5 * self.classify_loss(
                neg_prob, neg_labels + 1)  # negative labels are -1, so +1 maps them to 0
            pos_correct = (pos_prob.data >= 0.5).sum()
            pos_total = len(pos_prob)

        else:
            regress_losses = [0, 0, 0, 0]
            classify_loss = 0.5 * self.classify_loss(
                neg_prob, neg_labels + 1)
            pos_correct = 0
            pos_total = 0
            regress_losses_data = [0, 0, 0, 0]
        classify_loss_data = classify_loss.item()  # changed: .data[0] -> .item()
        loss = classify_loss
        for regress_loss in regress_losses:
            loss += regress_loss
        neg_correct = (neg_prob.data < 0.5).sum()
        neg_total = len(neg_prob)
        return [loss, classify_loss_data] + regress_losses_data + [pos_correct, pos_total, neg_correct, neg_total]

And in the data.py file, there are a lot of int issues.

SirMwan avatar Jul 16 '21 04:07 SirMwan

Dear @SirMwan,

I think the changes you made (.data to .item()) are fine. I didn't change that part myself, because it worked with my PyTorch version.
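
For reference, a minimal sketch of that migration (a toy tensor, not the repo's code):

    import torch

    loss = (torch.tensor([0.42], requires_grad=True) ** 2).sum()

    # old (PyTorch <= 0.3): extract the scalar via .data[0]
    # value = loss.data[0]   # raises an IndexError on 0-dim tensors in modern versions
    # new (PyTorch >= 0.4): .item() returns a plain Python float
    value = loss.item()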

I suspect some of the other changes had an effect. You mentioned "a lot of int issues", and you probably made changes there to get things working. If I remember correctly, those are Python version issues:

  • In Python 2.x, (int) / (int) -> (int), but (float) / (int), (int) / (float), and (float) / (float) -> (float).
  • In Python 3.x, all of them -> (float).

So when you make the change, you have to review the code line by line. If the original code is meant to compute (int) / (int), change "/" to "//". If either operand is a float, do not change it. That was the trickiest part of dealing with the version issue.
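
For example, a minimal sketch of the pattern (the variable names are made up for illustration):

    stride = 4                # int
    size = 24                 # int

    # Python 2: 24 / 4 -> 6 (int).  Python 3: 24 / 4 -> 6.0 (float),
    # which breaks anything that expects an int (array sizes, indices, ...).
    n = size // stride        # floor division: 6 (int) in both versions

    spacing = 1.5 / stride    # a float operand: keep "/", the result is float either way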

After those changes, my data.py has 20 "//"s, and split_combine.py has 2 "//"s.

I hope this helps. If you have already done it correctly but still have problems, that would be a different issue.

naoe1999 avatar Jul 18 '21 06:07 naoe1999


Dear @naoe1999

Kindly help me cross-check the lines I have changed below. I marked each change with # followed by its number:

class DataBowl3Detector(Dataset):
    def __init__(self, data_dir, split_path, config, phase='train', split_comber=None):
        assert(phase == 'train' or phase == 'val' or phase == 'test')
        self.phase = phase
        self.max_stride = config['max_stride']
        self.stride = config['stride']
        sizelim = config['sizelim'] / config['reso']
        sizelim2 = config['sizelim2'] // config['reso']  #1
        sizelim3 = config['sizelim3'] // config['reso']  #2

.....
        else:
            imgs = np.load(self.filenames[idx])
            bboxes = self.sample_bboxes[idx]
            nz, nh, nw = imgs.shape[1:]
            pz = int(np.ceil(float(nz) / self.stride)) * self.stride
            ph = int(np.ceil(float(nh) / self.stride)) * self.stride
            pw = int(np.ceil(float(nw) / self.stride)) * self.stride
            imgs = np.pad(imgs, [[0,0], [0, pz - nz], [0, ph - nh], [0, pw - nw]], 'constant', constant_values=self.pad_value)

            xx, yy, zz = np.meshgrid(np.linspace(-0.5, 0.5, imgs.shape[1]//self.stride),
                                     np.linspace(-0.5, 0.5, imgs.shape[2]//self.stride),
                                     np.linspace(-0.5, 0.5, imgs.shape[3]//self.stride), indexing='ij')  #3,4,5
            coord = np.concatenate([xx[np.newaxis,...], yy[np.newaxis,...], zz[np.newaxis,:]], 0).astype('float32')
            imgs, nzhw = self.split_comber.split(imgs)
            coord2, nzhw2 = self.split_comber.split(coord,
                                                    side_len=self.split_comber.side_len//self.stride,
                                                    max_stride=self.split_comber.max_stride//self.stride,
                                                    margin=self.split_comber.margin//self.stride)  #6,7,8
            assert np.all(nzhw == nzhw2)
            imgs = (imgs.astype(np.float32) - 128) / 128
            return torch.from_numpy(imgs), bboxes, torch.from_numpy(coord2), np.array(nzhw)

    def __len__(self):
        if self.phase == 'train':
            return len(self.bboxes) // (1 - self.r_rand)  #9
        elif self.phase == 'val':
            return len(self.bboxes)
        else:
            return len(self.sample_bboxes)

.....
    start = []
    for i in range(3):
        if not isRand:
            r = target[3] // 2  #10
            s = np.floor(target[i] - r) + 1 - bound_size
            e = np.ceil(target[i] + r) + 1 + bound_size - crop_size[i]
        else:
            s = np.max([imgs.shape[i+1] - crop_size[i]//2, imgs.shape[i+1]//2 + bound_size])
            e = np.min([crop_size[i]//2, imgs.shape[i+1]//2 - bound_size])  #11,12,13,14
            target = np.array([np.nan, np.nan, np.nan, np.nan])
        if s > e:
            start.append(np.random.randint(e, s))  #!
        else:
            start.append(int(target[i]) - crop_size[i]//2 + np.random.randint(-bound_size//2, bound_size//2))  #15,16,17

    normstart = np.array(start).astype('float32') / np.array(imgs.shape[1:]) - 0.5
    normsize = np.array(crop_size).astype('float32') / np.array(imgs.shape[1:])
    xx, yy, zz = np.meshgrid(np.linspace(normstart[0], normstart[0]+normsize[0], self.crop_size[0]//self.stride),
                             np.linspace(normstart[1], normstart[1]+normsize[1], self.crop_size[1]//self.stride),
                             np.linspace(normstart[2], normstart[2]+normsize[2], self.crop_size[2]//self.stride), indexing='ij')  #18,19,20
    coord = np.concatenate([xx[np.newaxis,...], yy[np.newaxis,...], zz[np.newaxis,:]], 0).astype('float32')

.....
    def __call__(self, input_size, target, bboxes, filename):
        stride = self.stride
        num_neg = self.num_neg
        th_neg = self.th_neg
        anchors = self.anchors
        th_pos = self.th_pos

        output_size = []
        for i in range(3):
            if input_size[i] % stride != 0:
                print(filename)
            # assert(input_size[i] % stride == 0)
            output_size.append(input_size[i] // stride)  #21

SirMwan avatar Jul 22 '21 06:07 SirMwan

Then, when I tried to visualize images with bounding boxes after prepare.py (preprocessing):

  1. Some images displayed well, i.e., images with bounding boxes at the nodules.
  2. But some images only printed, for example, "Groundtruth (1, 291, 180, 266) (1, 4)" without an image. Is the second case a problem?

SirMwan avatar Jul 23 '21 00:07 SirMwan

Dear @SirMwan ,

I checked them as below:

#1 ~ #2: I changed the previous line as well: sizelim = config['sizelim'] // config['reso']. I don't think it's critical whether the result is float or int.

#3 ~ #5: OK #6 ~ #8: OK

#9: int(len(self.bboxes) / (1 - self.r_rand)) # this works as intended, without any error or warning.
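
The reason is the return type: with a float divisor, "//" still returns a float, and __len__ must return an int. For example:

    r_rand = 0.3
    print(10 // (1 - r_rand))       # 14.0 -> float; len() on the dataset would raise TypeError
    print(int(10 / (1 - r_rand)))   # 14   -> int, as intended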

#10: I didn't change this one (I kept r = target[3] / 2); "s" and "e" become integral through the np.floor / np.ceil in the following lines.

#11 ~ #14: OK #15 ~ #17: OK #18 ~ #20: OK #21: OK

The not-displaying-image problem can happen for a number of reasons. You'd better look into the values of those images. If the values do not make sense, that would be a critical reason for training to fail.
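
For example, a minimal value check on a preprocessed volume (the file name here is hypothetical):

    import numpy as np

    img = np.load('preprocessed/some_series_clean.npy')  # hypothetical path
    print(img.shape, img.dtype)
    print(img.min(), img.max(), img.mean())  # all-zero, constant, or NaN values indicate a preprocessing problem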

naoe1999 avatar Jul 23 '21 01:07 naoe1999


Dear @naoe1999, thank you very much. Now it is going well.

Then let me ask one last question. In the res18 model there are two MaxUnpool layers, but they are not used anywhere in the code:

self.unmaxpool1 = nn.MaxUnpool3d(kernel_size=2, stride=2)
self.unmaxpool2 = nn.MaxUnpool3d(kernel_size=2, stride=2)

Thank you very much.

SirMwan avatar Jul 23 '21 06:07 SirMwan

Dear @SirMwan,

Congrats!!! Glad to hear it's going well.

As for your question about self.unmaxpool1, I didn't find its usage either. Instead, nn.ConvTranspose3d() is used for the upscaling operation (the decoder part of the U-Net structure).
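
For illustration, a minimal sketch of that kind of upscaling (the channel counts are chosen for the example, not taken from the res18 model): a ConvTranspose3d with kernel_size=2 and stride=2 doubles each spatial dimension, so no unpooling indices are needed.

    import torch
    import torch.nn as nn

    up = nn.ConvTranspose3d(64, 64, kernel_size=2, stride=2)
    x = torch.randn(1, 64, 8, 8, 8)   # (batch, channels, D, H, W)
    print(up(x).shape)                # torch.Size([1, 64, 16, 16, 16])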

naoe1999 avatar Jul 23 '21 09:07 naoe1999


I also encountered the same problem. Could you share your code with me at [email protected]? Thank you very much.

Yuke2021 avatar Sep 28 '21 03:09 Yuke2021
