
Question about pipeline for Inference with `InferEngine`

Open connor-john opened this issue 3 years ago • 0 comments

How does data need to be prepared for use with `InferEngine`? Suppose my inference script looked something like this:

def read_video(video):
    '''Read video prepare video_metas'''
    pass

def prepare(cfg, checkpoint):
    engine = build_engine(cfg.infer_engine)
    load_weights(engine.model, checkpoint, map_location='cpu')

    device = torch.cuda.current_device()
    engine = MMDataParallel(
        engine.to(device), device_ids=[torch.cuda.current_device()])

    data_pipeline = Compose(cfg.data_pipeline)

    return engine, data_pipeline

def main():
    args = parse_args()
    cfg = Config.fromfile(args.config)

    engine, data_pipeline = prepare(cfg, args.checkpoint)

    imgs, video_metas = read_video(args.video)

    data = data_pipeline(imgs)
    
    # scatter here

    results = engine.infer(data['imgs'], video_metas)

    print(results)

I will likely need to change the pipeline from the default, but to what? The default is:

data_pipeline=[
    dict(typename='LoadMetaInfo'),  # probably don't need
    dict(typename='Time2Frame'),  # probably don't need
    dict(
        typename='OverlapCropAug',
        num_frames=num_frames,
        overlap_ratio=overlap_ratio,
        transforms=[
            dict(typename='TemporalCrop'),
            dict(typename='LoadFrames', to_float32=True),  # probably don't need
            dict(typename='SpatialCenterCrop', crop_size=img_shape),
            dict(typename='Normalize', **img_norm_cfg),
            dict(typename='Pad', size=(num_frames, *img_shape)),
            dict(typename='DefaultFormatBundle'),
            dict(typename='Collect', keys=['imgs'])
    ])
]
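
To make the role of `num_frames` and `overlap_ratio` in the config above concrete, here is a minimal sketch of how overlapping temporal windows could be computed for frames that are already in memory. It is an illustration under stated assumptions, not vedatad's actual `OverlapCropAug` code: the function name `overlapping_windows` and the stride rule `int(num_frames * (1 - overlap_ratio))` are hypothetical choices made for this example.

```python
def overlapping_windows(total_frames, num_frames, overlap_ratio):
    """Return (start, end) frame-index pairs that cover [0, total_frames).

    Hypothetical helper for illustration only; the stride rule below is an
    assumption and may differ from what vedatad's OverlapCropAug does.
    """
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")
    if total_frames <= 0:
        return []

    # Distance between consecutive window starts; at least 1 so the loop advances.
    stride = max(1, int(num_frames * (1 - overlap_ratio)))

    windows = []
    start = 0
    while True:
        # The last window may be shorter than num_frames and would need padding.
        end = min(start + num_frames, total_frames)
        windows.append((start, end))
        if end >= total_frames:
            break
        start += stride
    return windows


if __name__ == "__main__":
    print(overlapping_windows(10, 4, 0.5))  # [(0, 4), (2, 6), (4, 8), (6, 10)]
```

Under these assumptions, each window would then go through the per-window transforms (crop, normalize, pad), and a window shorter than `num_frames` is the case the `Pad` step exists for.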

I imagine `ImageToTensor` is needed as a last step before `Collect`, and loading the frames will need to be different. Any clues or help would be appreciated.

connor-john, Jan 24 '22 01:01