Total3DUnderstanding

One Different Image Layout Estimation and Drawing 3D Layout

Open: emirhanKural opened this issue 3 years ago • 1 comment

Hi, I really appreciate the project and hope it can be developed more :)

Now, I'm trying to do only layout estimation. My goal is to give an image as input and get an image of its 3D layout as output, like this: image image

The first problem is how cam_K can be estimated. I have checked all the code samples. I can estimate the layout and cam_R, but in all your samples you use the cam_K from the data to draw the 3D layout. How can I predict it, or is there any way to draw the 3D layout without cam_K?

The second problem: I don't know whether I am doing this correctly, but when I tried to estimate the layouts of the demo data, my results were really bad. I followed the steps in demo.py to predict the layout points. For the weights, I first used your pretrained model, then I trained for 100 epochs and tried the resulting weights. The results were the same in both cases.

I used @chengzhag's layout_estimation.yaml here:

def estimate(img_path):
    # Load the layout-estimation config and set up the checkpoint handler.
    cfg = CONFIG("configs/layout_estimation.yaml")
    checkpoint = CheckpointIO(cfg)
    cfg = mount_external_config(cfg)
    device = load_device(cfg)
    cfg.config["mode"] = "demo"

    # Build the network and register it with the checkpoint handler.
    net = load_model(cfg, device=device)
    checkpoint.register_modules(net=net)

    # Load the demo sample found in img_path.
    cfg.config['demo_path'] = img_path
    data = load_demo_data(cfg.config['demo_path'], device)

    # Run inference without tracking gradients.
    with torch.no_grad():
        est_data = net(data)

    # Decode the predicted 3D layout bounding box.
    lo_bdb3D_out = get_layout_bdb_sunrgbd(cfg.bins_tensor, est_data['lo_ori_reg_result'],
                                          torch.argmax(est_data['lo_ori_cls_result'], 1),
                                          est_data['lo_centroid_result'],
                                          est_data['lo_coeffs_result'])
    layout = lo_bdb3D_out[0, :, :].cpu().numpy()

    # Decode the camera rotation from the predicted pitch and roll.
    cam_R_out = get_rotation_matix_result(cfg.bins_tensor,
                                          torch.argmax(est_data['pitch_cls_result'], 1), est_data['pitch_reg_result'],
                                          torch.argmax(est_data['roll_cls_result'], 1), est_data['roll_reg_result'])
    pre_cam_R = cam_R_out[0, :, :].cpu().numpy()

    pre_layout = format_layout(layout)

    return pre_layout, pre_cam_R
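
Getting identical results from two different sets of weights can be a sign that neither checkpoint actually reached the network. The snippet above registers net with the checkpoint handler but shows no explicit step that loads weights into it, so this may be worth checking. Below is a minimal sanity-check sketch, not part of the repository: the helper `fingerprint` and the `before`/`after` dictionaries are hypothetical, and the parameters are assumed to be available as a plain mapping from names to lists of floats.

```python
import hashlib
import struct


def fingerprint(params):
    """Return a short hash of a mapping {name: iterable of floats}."""
    h = hashlib.sha256()
    for name in sorted(params):
        h.update(name.encode("utf-8"))
        for value in params[name]:
            # Pack each value as a little-endian double so the hash is stable.
            h.update(struct.pack("<d", float(value)))
    return h.hexdigest()[:16]


# Hypothetical stand-ins for the parameters before and after loading a checkpoint.
before = {"layer.weight": [0.1, -0.2, 0.3], "layer.bias": [0.0]}
after = {"layer.weight": [0.5, 0.1, -0.7], "layer.bias": [0.2]}

# If loading worked, the two fingerprints should differ.
changed = fingerprint(before) != fingerprint(after)
print("weights changed by loading:", changed)
```

With a PyTorch module, such a mapping could be built with `{k: v.flatten().tolist() for k, v in net.state_dict().items()}`, taken once right after the model is constructed and once after the checkpoint is supposed to have been loaded.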

To draw the 3D layout (I take cam_K from the sample; that part is not shown here):

img_path = "./demo/inputs/1"
sequence_id = img_path[-1]

rgb_image = np.asarray(Image.open(img_path + "/img.jpg").convert('RGB'))
pre_layout, pre_cam_R = estimate(img_path)
scene_box = Box(rgb_image, None, cam_K, None, pre_cam_R, None,
                pre_layout, None, None, 'prediction', None)

scene_box.draw3D(if_save=True, save_path='./demo/sunrgbd/%s_recon.png' % sequence_id)

I got results like this: image

It should look like this: image

I hope I have expressed myself clearly. Thank you very much ^^

emirhanKural, Jul 01 '21 18:07

Hi,

In the paper, we actually require the camera intrinsics (i.e., cam_K) as input; otherwise, the problem would be extremely ambiguous.
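
To illustrate the ambiguity, here is a minimal sketch of pinhole projection, not the repository's code. It assumes the common computer-vision camera frame (x right, y down, z forward), square pixels, zero skew, and a principal point at the image centre; the repository's own conventions may differ. The helpers `intrinsics_from_fov` and `project` are hypothetical names, and guessing cam_K from an assumed field of view is only a rough fallback when the true intrinsics are unknown.

```python
import math


def intrinsics_from_fov(width, height, hfov_deg=60.0):
    """Build a 3x3 pinhole intrinsic matrix from an ASSUMED horizontal field of view."""
    f = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return [[f, 0.0, width / 2.0],
            [0.0, f, height / 2.0],
            [0.0, 0.0, 1.0]]


def project(K, point_cam):
    """Project a 3D point given in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    u = K[0][0] * x / z + K[0][2]
    v = K[1][1] * y / z + K[1][2]
    return u, v


# The same 3D corner lands on different pixels for different assumed intrinsics.
corner = (1.0, 0.5, 4.0)
for hfov in (50.0, 60.0, 70.0):
    K = intrinsics_from_fov(640, 480, hfov)
    print(hfov, project(K, corner))
```

Because the pixel position of a fixed 3D point shifts with the focal length, a layout drawn with a guessed cam_K can look wrong even when the predicted layout and cam_R are reasonable.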

For the layout estimation in our demo, our prediction is here. The figure is produced exactly by our demo code.

Best, Yinyu

yinyunie, Jul 18 '21 19:07