Training the CenterNet 4-Frame Backbone for MPPNet Gives Poor Performance

Open christofel04 opened this issue 1 year ago • 0 comments

Hello, and thank you for developing open-source tools for SOTA 3D object detection. I am trying to reproduce MPPNet's performance for multi-frame 3D object detection on the Waymo Open Dataset. As described in the documentation, I first trained the CenterNet 4-frame backbone for MPPNet (NAME: CenterPoint in the config) using the default parameters from the configuration, but with more epochs (72):

CLASS_NAMES: ['Vehicle', 'Pedestrian', 'Cyclist']

DATA_CONFIG:
    _BASE_CONFIG_: cfgs/dataset_configs/waymo_dataset_multiframe.yaml

    SAMPLED_INTERVAL: {
        'train': 1,
        'test': 1
    }

MODEL:
    NAME: CenterPoint

    VFE:
        NAME: MeanVFE

    BACKBONE_3D:
        NAME: VoxelResBackBone8x

    MAP_TO_BEV:
        NAME: HeightCompression
        NUM_BEV_FEATURES: 256

    BACKBONE_2D:
        NAME: BaseBEVBackbone

        LAYER_NUMS: [5, 5]
        LAYER_STRIDES: [1, 2]
        NUM_FILTERS: [128, 256]
        UPSAMPLE_STRIDES: [1, 2]
        NUM_UPSAMPLE_FILTERS: [256, 256]

    DENSE_HEAD:
        NAME: CenterHead
        CLASS_AGNOSTIC: False

        CLASS_NAMES_EACH_HEAD: [
            ['Vehicle', 'Pedestrian', 'Cyclist']
        ]

        SHARED_CONV_CHANNEL: 64
        USE_BIAS_BEFORE_NORM: True
        NUM_HM_CONV: 2
        SEPARATE_HEAD_CFG:
            HEAD_ORDER: ['center', 'center_z', 'dim', 'rot', 'vel']
            HEAD_DICT: {
                'center': {'out_channels': 2, 'num_conv': 2},
                'center_z': {'out_channels': 1, 'num_conv': 2},
                'dim': {'out_channels': 3, 'num_conv': 2},
                'rot': {'out_channels': 2, 'num_conv': 2},
                'vel': {'out_channels': 2, 'num_conv': 2},
            }

        TARGET_ASSIGNER_CONFIG:
            FEATURE_MAP_STRIDE: 8
            NUM_MAX_OBJS: 500
            GAUSSIAN_OVERLAP: 0.1
            MIN_RADIUS: 2

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2]
            }

        POST_PROCESSING:
            SCORE_THRESH: 0.1
            POST_CENTER_LIMIT_RANGE: [-75.2, -75.2, -2, 75.2, 75.2, 4]
            MAX_OBJ_PER_SAMPLE: 500
            NMS_CONFIG:
                NMS_TYPE: nms_gpu
                NMS_THRESH: 0.7
                NMS_PRE_MAXSIZE: 4096
                NMS_POST_MAXSIZE: 500

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]

        EVAL_METRIC: kitti #waymo


OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 2
    NUM_EPOCHS: 72 #72

    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10
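
For reference, the learning-rate curve that these OPTIMIZATION settings describe can be sketched as below. This is only a minimal illustration, assuming a cosine-annealed one-cycle schedule that rises from LR / DIV_FACTOR to LR over the first PCT_START of training and then decays; the helper names and the final divisor `final_div` are assumptions for this sketch and are not taken from the config or confirmed against the OpenPCDet source.

```python
import math


def cosine_anneal(start, end, pct):
    """Cosine interpolation from `start` (pct=0) to `end` (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


def one_cycle_lr(progress, lr_max=0.003, div_factor=10.0,
                 pct_start=0.4, final_div=1e4):
    """Learning rate at a training progress in [0, 1].

    lr_max, div_factor and pct_start mirror LR, DIV_FACTOR and PCT_START
    from the config; final_div is an assumed value, not from the config.
    """
    lr_low = lr_max / div_factor
    if progress < pct_start:
        # Warm-up phase: lr_low -> lr_max.
        return cosine_anneal(lr_low, lr_max, progress / pct_start)
    # Decay phase: lr_max -> lr_low / final_div.
    decay_pct = (progress - pct_start) / (1.0 - pct_start)
    return cosine_anneal(lr_max, lr_low / final_div, decay_pct)


if __name__ == "__main__":
    num_epochs = 72
    for epoch in (0, 18, 29, 36, 54, 72):
        print(epoch, one_cycle_lr(epoch / num_epochs))
```

Under these assumptions, with NUM_EPOCHS: 72 and PCT_START: 0.4, the learning rate peaks at roughly epoch 28.8 and decays for the rest of training.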

I trained the model for 200 hours, but when I evaluated it, the CenterNet performance was very bad. Because I have limited disk space, I generated only a limited-size ground-truth point database; that is, ground-truth points were not extracted for all samples. The performance of the CenterNet 4-frame model is shown below.

Screenshot from 2024-01-25 14-46-39

Almost all ground-truth objects end up as Level 2 samples, because I could not extract all ground-truth points given my limited disk space. The training loss for the CenterNet 4-frame model is also still high, around 1.4 at epoch 72.

Screenshot from 2024-01-25 14-49-19

Can anyone give me advice on how to train the CenterNet 4-frame model on the Waymo Open Dataset? I use 20% of the Waymo Open Dataset training set and did not extract all ground-truth points because of limited disk space.

Is my CenterNet 4-frame model performing poorly because I did not extract all ground-truth points?

Thank you for any advice and suggestions.

christofel04, Jan 25 '24 05:01