yolov7_d2
Can't train with solov2 config
Hi, thanks for this repo :-) I am trying to train the network (as an overfitting test) using the solov2 config. When I start the training process, an image is displayed, but the masks are wrong (the image is flipped, but the masks are not). When I close the image window, the training crashes. The log is attached. Thanks
python3 train_net.py --config-file configs/coco-instance/solov2_lite.yaml
Install mish-cuda to speed up training and inference. More importantly, replace the naive Mish with MishCuda will give a ~1.5G memory saving during training.
Command Line Args: Namespace(config_file='configs/coco-instance/solov2_lite.yaml', dist_url='tcp://127.0.0.1:50152', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False)
[05/09 13:50:05 detectron2]: Rank of current process: 0. World size: 1
[05/09 13:50:06 detectron2]: Environment info:
---------------------- ---------------------------------------------------------------------
sys.platform linux
Python 3.6.9 (default, Mar 15 2022, 13:55:28) [GCC 8.4.0]
numpy 1.19.2
detectron2 0.6 @/home/ws/.local/lib/python3.6/site-packages/detectron2
Compiler GCC 7.3
CUDA compiler CUDA 10.2
detectron2 arch flags 3.7, 5.0, 5.2, 6.0, 6.1, 7.0, 7.5
DETECTRON2_ENV_MODULE <not set>
PyTorch 1.10.0+cu102 @/home/ws/.local/lib/python3.6/site-packages/torch
PyTorch debug build False
GPU available Yes
GPU 0 GeForce RTX 2080 Ti (arch=7.5)
Driver version 450.57
CUDA_HOME /usr/local/cuda-10.2
Pillow 6.2.2
torchvision 0.11.0+cu102 @/home/ws/.local/lib/python3.6/site-packages/torchvision
torchvision arch flags 3.5, 5.0, 6.0, 7.0, 7.5
fvcore 0.1.5.post20220414
iopath 0.1.9
cv2 4.5.5
---------------------- ---------------------------------------------------------------------
PyTorch built with:
- GCC 7.3
- C++ Version: 201402
- Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 10.2
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
- CuDNN 7.6.5
- Magma 2.5.2
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,
[05/09 13:50:06 detectron2]: Command line arguments: Namespace(config_file='configs/coco-instance/solov2_lite.yaml', dist_url='tcp://127.0.0.1:50152', eval_only=False, machine_rank=0, num_gpus=1, num_machines=1, opts=[], resume=False)
[05/09 13:50:06 detectron2]: Contents of args.config_file=configs/coco-instance/solov2_lite.yaml:
MODEL:
META_ARCHITECTURE: "SOLOv2"
MASK_ON: True
BACKBONE:
NAME: "build_resnet_fpn_backbone"
RESNETS:
OUT_FEATURES: ["res2", "res3", "res4", "res5"]
FPN:
IN_FEATURES: ["res2", "res3", "res4", "res5"]
SOLOV2:
FPN_SCALE_RANGES: ((1, 56), (28, 112), (56, 224), (112, 448), (224, 896))
NUM_GRIDS: [40, 36, 24, 16, 12]
NUM_INSTANCE_CONVS: 2
NUM_KERNELS: 256
INSTANCE_IN_CHANNELS: 256
INSTANCE_CHANNELS: 128
MASK_IN_CHANNELS: 256
MASK_CHANNELS: 128
NORM: "SyncBN"
DATASETS:
TRAIN: ("nets_kinneret_only24",)
TEST: ("nets_kinneret_only24",)
SOLVER:
IMS_PER_BATCH: 8
BASE_LR: 0.01
WARMUP_FACTOR: 0.01
WARMUP_ITERS: 1000
STEPS: (60000, 80000)
MAX_ITER: 90000
INPUT:
MIN_SIZE_TRAIN: (640, 672, 704, 736, 768, 800)
MASK_FORMAT: "bitmask"
VERSION: 2
[05/09 13:50:06 detectron2]: Running with full config:
CUDNN_BENCHMARK: false
DATALOADER:
ASPECT_RATIO_GROUPING: true
FILTER_EMPTY_ANNOTATIONS: true
NUM_WORKERS: 4
REPEAT_THRESHOLD: 0.0
SAMPLER_TRAIN: TrainingSampler
DATASETS:
CLASS_NAMES: []
PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
PROPOSAL_FILES_TEST: []
PROPOSAL_FILES_TRAIN: []
TEST:
- nets_kinneret_only24
TRAIN:
- nets_kinneret_only24
GLOBAL:
HACK: 1.0
INPUT:
COLOR_JITTER:
BRIGHTNESS: false
LIGHTING: false
SATURATION: false
CROP:
ENABLED: false
SIZE:
- 0.9
- 0.9
TYPE: relative_range
DISTORTION:
ENABLED: false
EXPOSURE: 1.5
HUE: 0.1
SATURATION: 1.5
FORMAT: BGR
GRID_MASK:
ENABLED: false
MODE: 1
PROB: 0.3
USE_HEIGHT: true
USE_WIDTH: true
INPUT_SIZE:
- 640
- 640
JITTER_CROP:
ENABLED: false
JITTER_RATIO: 0.3
MASK_FORMAT: bitmask
MAX_SIZE_TEST: 1333
MAX_SIZE_TRAIN: 1333
MIN_SIZE_TEST: 800
MIN_SIZE_TRAIN:
- 640
- 672
- 704
- 736
- 768
- 800
MIN_SIZE_TRAIN_SAMPLING: choice
MOSAIC:
DEBUG_VIS: false
ENABLED: false
MIN_OFFSET: 0.2
MOSAIC_HEIGHT: 640
MOSAIC_WIDTH: 640
NUM_IMAGES: 4
POOL_CAPACITY: 1000
MOSAIC_AND_MIXUP:
DEBUG_VIS: false
DEGREES: 10.0
DISABLE_AT_ITER: 120000
ENABLED: false
ENABLE_MIXUP: true
MOSAIC_HEIGHT_RANGE:
- 512
- 800
MOSAIC_WIDTH_RANGE:
- 512
- 800
MSCALE:
- 0.5
- 1.5
NUM_IMAGES: 4
PERSPECTIVE: 0.0
POOL_CAPACITY: 1000
SCALE:
- 0.5
- 1.5
SHEAR: 2.0
TRANSLATE: 0.1
RANDOM_FLIP: horizontal
RESIZE:
ENABLED: false
SCALE_JITTER:
- 0.8
- 1.2
SHAPE:
- 640
- 640
TEST_SHAPE:
- 608
- 608
SHIFT:
SHIFT_PIXELS: 32
MODEL:
ANCHOR_GENERATOR:
ANGLES:
- - -90
- 0
- 90
ASPECT_RATIOS:
- - 0.5
- 1.0
- 2.0
NAME: DefaultAnchorGenerator
OFFSET: 0.0
SIZES:
- - 32
- 64
- 128
- 256
- 512
BACKBONE:
CHANNEL: 0
FREEZE_AT: 2
NAME: build_resnet_fpn_backbone
SIMPLE: false
STRIDE: 1
BIFPN:
NORM: GN
NUM_BIFPN: 6
NUM_LEVELS: 5
OUT_CHANNELS: 160
SEPARABLE_CONV: false
DARKNET:
DEPTH: 53
DEPTH_WISE: false
NORM: BN
OUT_FEATURES:
- dark3
- dark4
- dark5
RES5_DILATION: 1
STEM_OUT_CHANNELS: 32
WEIGHTS: ''
WITH_CSP: true
DETR:
ATTENTION_TYPE: DETR
BBOX_EMBED_NUM_LAYERS: 3
CENTERED_POSITION_ENCODIND: false
CLS_WEIGHT: 1.0
DECODER_BLOCK_GRAD: true
DEC_LAYERS: 6
DEEP_SUPERVISION: true
DEFORMABLE: false
DIM_FEEDFORWARD: 2048
DROPOUT: 0.1
ENC_LAYERS: 6
FROZEN_WEIGHTS: ''
GIOU_WEIGHT: 2.0
HIDDEN_DIM: 256
L1_WEIGHT: 5.0
NHEADS: 8
NO_OBJECT_WEIGHT: 0.1
NUM_CLASSES: 80
NUM_FEATURE_LEVELS: 1
NUM_OBJECT_QUERIES: 100
NUM_QUERY_PATTERN: 3
NUM_QUERY_POSITION: 300
PRE_NORM: false
SPATIAL_PRIOR: learned
TWO_STAGE: false
USE_FOCAL_LOSS: false
WITH_BOX_REFINE: false
DEVICE: cuda
EFFICIENTNET:
FEATURE_INDICES:
- 1
- 4
- 10
- 15
NAME: efficientnet_b0
OUT_FEATURES:
- stride4
- stride8
- stride16
- stride32
PRETRAINED: true
FBNET_V2:
ARCH: default
ARCH_DEF: []
NORM: bn
NORM_ARGS: []
OUT_FEATURES:
- trunk3
SCALE_FACTOR: 1.0
STEM_IN_CHANNELS: 3
WIDTH_DIVISOR: 1
FPN:
FUSE_TYPE: sum
IN_FEATURES:
- res2
- res3
- res4
- res5
NORM: ''
OUT_CHANNELS: 256
OUT_CHANNELS_LIST:
- 256
- 512
- 1024
REPEAT: 2
KEYPOINT_ON: false
LOAD_PROPOSALS: false
MASK_ON: true
META_ARCHITECTURE: SOLOv2
NMS_TYPE: normal
ONNX_EXPORT: false
PADDED_VALUE: 114.0
PANOPTIC_FPN:
COMBINE:
ENABLED: true
INSTANCES_CONFIDENCE_THRESH: 0.5
OVERLAP_THRESH: 0.5
STUFF_AREA_LIMIT: 4096
INSTANCE_LOSS_WEIGHT: 1.0
PIXEL_MEAN:
- 103.53
- 116.28
- 123.675
PIXEL_STD:
- 1.0
- 1.0
- 1.0
PROPOSAL_GENERATOR:
MIN_SIZE: 0
NAME: RPN
REGNETS:
OUT_FEATURES:
- s2
- s3
- s4
TYPE: x
RESNETS:
DEFORM_MODULATED: false
DEFORM_NUM_GROUPS: 1
DEFORM_ON_PER_STAGE:
- false
- false
- false
- false
DEPTH: 50
NORM: FrozenBN
NUM_GROUPS: 1
OUT_FEATURES:
- res2
- res3
- res4
- res5
R2TYPE: res2net50_v1d
RES2_OUT_CHANNELS: 256
RES5_DILATION: 1
STEM_OUT_CHANNELS: 64
STRIDE_IN_1X1: true
WIDTH_PER_GROUP: 64
RETINANET:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_WEIGHTS:
- 1.0
- 1.0
- 1.0
- 1.0
FOCAL_LOSS_ALPHA: 0.25
FOCAL_LOSS_GAMMA: 2.0
IN_FEATURES:
- p3
- p4
- p5
- p6
- p7
IOU_LABELS:
- 0
- -1
- 1
IOU_THRESHOLDS:
- 0.4
- 0.5
NMS_THRESH_TEST: 0.5
NORM: ''
NUM_CLASSES: 80
NUM_CONVS: 4
PRIOR_PROB: 0.01
SCORE_THRESH_TEST: 0.05
SMOOTH_L1_LOSS_BETA: 0.1
TOPK_CANDIDATES_TEST: 1000
ROI_BOX_CASCADE_HEAD:
BBOX_REG_WEIGHTS:
- - 10.0
- 10.0
- 5.0
- 5.0
- - 20.0
- 20.0
- 10.0
- 10.0
- - 30.0
- 30.0
- 15.0
- 15.0
IOUS:
- 0.5
- 0.6
- 0.7
ROI_BOX_HEAD:
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS:
- 10.0
- 10.0
- 5.0
- 5.0
CLS_AGNOSTIC_BBOX_REG: false
CONV_DIM: 256
FC_DIM: 1024
NAME: ''
NORM: ''
NUM_CONV: 0
NUM_FC: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
SMOOTH_L1_BETA: 0.0
TRAIN_ON_PRED_BOXES: false
ROI_HEADS:
BATCH_SIZE_PER_IMAGE: 512
IN_FEATURES:
- res4
IOU_LABELS:
- 0
- 1
IOU_THRESHOLDS:
- 0.5
NAME: Res5ROIHeads
NMS_THRESH_TEST: 0.5
NUM_CLASSES: 80
POSITIVE_FRACTION: 0.25
PROPOSAL_APPEND_GT: true
SCORE_THRESH_TEST: 0.05
ROI_KEYPOINT_HEAD:
CONV_DIMS:
- 512
- 512
- 512
- 512
- 512
- 512
- 512
- 512
LOSS_WEIGHT: 1.0
MIN_KEYPOINTS_PER_IMAGE: 1
NAME: KRCNNConvDeconvUpsampleHead
NORMALIZE_LOSS_BY_VISIBLE_KEYPOINTS: true
NUM_KEYPOINTS: 17
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
ROI_MASK_HEAD:
CLS_AGNOSTIC_MASK: false
CONV_DIM: 256
NAME: MaskRCNNConvUpsampleHead
NORM: ''
NUM_CONV: 0
POOLER_RESOLUTION: 14
POOLER_SAMPLING_RATIO: 0
POOLER_TYPE: ROIAlignV2
RPN:
BATCH_SIZE_PER_IMAGE: 256
BBOX_REG_LOSS_TYPE: smooth_l1
BBOX_REG_LOSS_WEIGHT: 1.0
BBOX_REG_WEIGHTS:
- 1.0
- 1.0
- 1.0
- 1.0
BOUNDARY_THRESH: -1
CONV_DIMS:
- -1
HEAD_NAME: StandardRPNHead
IN_FEATURES:
- res4
IOU_LABELS:
- 0
- -1
- 1
IOU_THRESHOLDS:
- 0.3
- 0.7
LOSS_WEIGHT: 1.0
NMS_THRESH: 0.7
POSITIVE_FRACTION: 0.5
POST_NMS_TOPK_TEST: 1000
POST_NMS_TOPK_TRAIN: 2000
PRE_NMS_TOPK_TEST: 6000
PRE_NMS_TOPK_TRAIN: 12000
SMOOTH_L1_BETA: 0.0
SEM_SEG_HEAD:
COMMON_STRIDE: 4
CONVS_DIM: 128
IGNORE_VALUE: 255
IN_FEATURES:
- p2
- p3
- p4
- p5
LOSS_WEIGHT: 1.0
NAME: SemSegFPNHead
NORM: GN
NUM_CLASSES: 54
SOLOV2:
FPN_INSTANCE_STRIDES:
- 8
- 8
- 16
- 32
- 32
FPN_SCALE_RANGES:
- - 1
- 56
- - 28
- 112
- - 56
- 224
- - 112
- 448
- - 224
- 896
INSTANCE_CHANNELS: 128
INSTANCE_IN_CHANNELS: 256
INSTANCE_IN_FEATURES:
- p2
- p3
- p4
- p5
- p6
LOSS:
DICE_WEIGHT: 3.0
FOCAL_ALPHA: 0.25
FOCAL_GAMMA: 2.0
FOCAL_USE_SIGMOID: true
FOCAL_WEIGHT: 1.0
MASK_CHANNELS: 128
MASK_IN_CHANNELS: 256
MASK_IN_FEATURES:
- p2
- p3
- p4
- p5
MASK_THR: 0.5
MAX_PER_IMG: 100
NMS_KERNEL: gaussian
NMS_PRE: 500
NMS_SIGMA: 2
NMS_TYPE: matrix
NORM: SyncBN
NUM_CLASSES: 80
NUM_GRIDS:
- 40
- 36
- 24
- 16
- 12
NUM_INSTANCE_CONVS: 2
NUM_KERNELS: 256
NUM_MASKS: 256
PRIOR_PROB: 0.01
SCORE_THR: 0.1
SIGMA: 0.2
TYPE_DCN: DCN
UPDATE_THR: 0.05
USE_COORD_CONV: true
USE_DCN_IN_INSTANCE: false
SPARSE_INST:
CLS_THRESHOLD: 0.005
DATASET_MAPPER: SparseInstDatasetMapper
DECODER:
GROUPS: 4
INST:
CONVS: 4
DIM: 256
KERNEL_DIM: 128
MASK:
CONVS: 4
DIM: 256
NAME: BaseIAMDecoder
NUM_CLASSES: 80
NUM_MASKS: 100
OUTPUT_IAM: false
SCALE_FACTOR: 2.0
ENCODER:
IN_FEATURES:
- res3
- res4
- res5
NAME: FPNPPMEncoder
NORM: ''
NUM_CHANNELS: 256
LOSS:
CLASS_WEIGHT: 2.0
ITEMS:
- labels
- masks
MASK_DICE_WEIGHT: 2.0
MASK_PIXEL_WEIGHT: 5.0
NAME: SparseInstCriterion
OBJECTNESS_WEIGHT: 1.0
MASK_THRESHOLD: 0.45
MATCHER:
ALPHA: 0.8
BETA: 0.2
NAME: SparseInstMatcher
MAX_DETECTIONS: 100
SWIN:
DEPTHS:
- 2
- 2
- 6
- 2
OUT_FEATURES:
- 1
- 2
- 3
PATCH: 4
TYPE: tiny
WEIGHTS: ''
WINDOW: 7
VT_FPN:
HEADS: 16
IN_FEATURES:
- res2
- res3
- res4
- res5
LAYERS: 3
MIN_GROUP_PLANES: 64
NORM: BN
OUT_CHANNELS: 256
POS_HWS: []
POS_N_DOWNSAMPLE: []
TOKEN_C: 1024
TOKEN_LS:
- 16
- 16
- 8
- 8
WEIGHTS: ''
YOLO:
ANCHORS:
- - - 116
- 90
- - 156
- 198
- - 373
- 326
- - - 30
- 61
- - 62
- 45
- - 42
- 119
- - - 10
- 13
- - 16
- 30
- - 33
- 23
ANCHOR_MASK: []
BRANCH_DILATIONS:
- 1
- 2
- 3
CLASSES: 80
CONF_THRESHOLD: 0.01
DEPTH_MUL: 1.0
IGNORE_THRESHOLD: 0.07
IN_FEATURES:
- dark3
- dark4
- dark5
IOU_TYPE: ciou
LOSS:
ANCHOR_RATIO_THRESH: 4.0
BUILD_TARGET_TYPE: default
LAMBDA_CLS: 1.0
LAMBDA_CONF: 1.0
LAMBDA_IOU: 1.1
LAMBDA_WH: 1.0
LAMBDA_XY: 1.0
USE_L1: true
LOSS_TYPE: v4
MAX_BOXES_NUM: 100
NECK:
TYPE: yolov3
WITH_SPP: false
NMS_THRESHOLD: 0.5
NUM_BRANCH: 3
ORIEN_HEAD:
UP_CHANNELS: 64
TEST_BRANCH_IDX: 1
VARIANT: yolov3
WIDTH_MUL: 1.0
OUTPUT_DIR: ./output
SEED: -1
SOLVER:
AMP:
ENABLED: false
AMSGRAD: false
AUTO_SCALING_METHODS:
- default_scale_d2_configs
- default_scale_quantization_configs
BACKBONE_MULTIPLIER: 0.1
BASE_LR: 0.01
BIAS_LR_FACTOR: 1.0
CHECKPOINT_PERIOD: 5000
CLIP_GRADIENTS:
CLIP_TYPE: value
CLIP_VALUE: 1.0
ENABLED: false
NORM_TYPE: 2.0
GAMMA: 0.1
IMS_PER_BATCH: 8
LR_MULTIPLIER_OVERWRITE: []
LR_SCHEDULER:
GAMMA: 0.1
MAX_EPOCH: 500
MAX_ITER: 40000
NAME: WarmupMultiStepLR
STEPS:
- 30000
WARMUP_FACTOR: 0.001
WARMUP_ITERS: 1000
WARMUP_METHOD: linear
LR_SCHEDULER_NAME: WarmupMultiStepLR
MAX_ITER: 90000
MOMENTUM: 0.9
NESTEROV: false
OPTIMIZER: ADAMW
REFERENCE_WORLD_SIZE: 8
STEPS:
- 60000
- 80000
WARMUP_FACTOR: 0.01
WARMUP_ITERS: 1000
WARMUP_METHOD: linear
WEIGHT_DECAY: 0.0001
WEIGHT_DECAY_BIAS: null
WEIGHT_DECAY_EMBED: 0.0
WEIGHT_DECAY_NORM: 0.0
TEST:
AUG:
ENABLED: false
FLIP: true
MAX_SIZE: 4000
MIN_SIZES:
- 400
- 500
- 600
- 700
- 800
- 900
- 1000
- 1100
- 1200
DETECTIONS_PER_IMAGE: 100
EVAL_PERIOD: 0
EXPECTED_RESULTS: []
KEYPOINT_OKS_SIGMAS: []
PRECISE_BN:
ENABLED: false
NUM_ITER: 200
VERSION: 2
VIS_PERIOD: 0
[05/09 13:50:06 detectron2]: Full config saved to ./output/config.yaml
[05/09 13:50:06 d2.utils.env]: Using a generated random seed 6550842
[05/09 13:50:06 d2.engine.defaults]: Auto-scaling the config to batch_size=1, learning_rate=0.00125, max_iter=720000, warmup=8000.
13:50:06 05.09 INFO solov2.py:83]: instance_shapes: [ShapeSpec(channels=256, height=None, width=None, stride=4), ShapeSpec(channels=256, height=None, width=None, stride=8), ShapeSpec(channels=256, height=None, width=None, stride=16), ShapeSpec(channels=256, height=None, width=None, stride=32), ShapeSpec(channels=256, height=None, width=None, stride=64)]
[05/09 13:50:08 d2.data.datasets.coco]: Loaded 87 images in COCO format from /home/ws/data/dataset/nets_kinneret_only24_2/train_coco.json
[05/09 13:50:08 d2.data.build]: Removed 0 images with no usable annotations. 87 images left.
[05/09 13:50:08 d2.data.build]: Distribution of instances among all 3 categories:
| category | #instances | category | #instances | category | #instances |
|:----------:|:-------------|:----------:|:-------------|:----------:|:-------------|
| car | 1305 | bus | 0 | truck | 0 |
| | | | | | |
| total | 1305 | | | | |
[05/09 13:50:08 d2.data.build]: Using training sampler TrainingSampler
[05/09 13:50:08 d2.data.common]: Serializing 87 elements to byte tensors and concatenating them all ...
[05/09 13:50:08 d2.data.common]: Serialized dataset takes 0.82 MiB
[05/09 13:50:08 fvcore.common.checkpoint]: No checkpoint found. Initializing model from scratch
[05/09 13:50:08 d2.engine.train_loop]: Starting training from iteration 0
(15, 768, 768)
/home/ws/.local/lib/python3.6/site-packages/detectron2/structures/image_list.py:88: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
max_size = (max_size + (stride - 1)) // stride * stride
[(768, 768)]
torch.Size([1, 3, 768, 768])
/home/ws/.local/lib/python3.6/site-packages/torch/nn/functional.py:3635: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode)
/home/ws/.local/lib/python3.6/site-packages/torch/nn/functional.py:3680: UserWarning: The default behavior for interpolate/upsample with float scale_factor changed in 1.6.0 to align with other frameworks/libraries, and now uses scale_factor directly, instead of relying on the computed output size. If you wish to restore the old behavior, please set recompute_scale_factor=True. See the documentation of nn.Upsample for details.
"The default behavior for interpolate/upsample with float scale_factor changed "
/home/ws/.local/lib/python3.6/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
/home/ws/PycharmProjects/yolov7/yolov7/modeling/meta_arch/solov2.py:300: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
(center_w / upsampled_size[1]) // (1. / num_grid))
/home/ws/PycharmProjects/yolov7/yolov7/modeling/meta_arch/solov2.py:302: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
(center_h / upsampled_size[0]) // (1. / num_grid))
/home/ws/PycharmProjects/yolov7/yolov7/modeling/meta_arch/solov2.py:306: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
0, int(((center_h - half_h) / upsampled_size[0]) // (1. / num_grid)))
/home/ws/PycharmProjects/yolov7/yolov7/modeling/meta_arch/solov2.py:308: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
num_grid - 1, int(((center_h + half_h) / upsampled_size[0]) // (1. / num_grid)))
/home/ws/PycharmProjects/yolov7/yolov7/modeling/meta_arch/solov2.py:310: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
0, int(((center_w - half_w) / upsampled_size[1]) // (1. / num_grid)))
/home/ws/PycharmProjects/yolov7/yolov7/modeling/meta_arch/solov2.py:312: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
num_grid - 1, int(((center_w + half_w) / upsampled_size[1]) // (1. / num_grid)))
ERROR [05/09 13:50:12 d2.engine.train_loop]: Exception during training:
Traceback (most recent call last):
File "/home/ws/.local/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 149, in train
self.run_step()
File "train_net.py", line 58, in run_step
self._trainer.run_step()
File "/home/ws/.local/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 285, in run_step
losses.backward()
File "/home/ws/.local/lib/python3.6/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/ws/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 156, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 128, 192, 192]], which is output 0 of ReluBackward0, is at version 3; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
[05/09 13:50:12 d2.engine.hooks]: Total training time: 0:00:03 (0:00:00 on hooks)
[05/09 13:50:12 d2.utils.events]: iter: 0 lr: N/A max_mem: 710M
Traceback (most recent call last):
File "train_net.py", line 133, in <module>
args=(args,),
File "/home/ws/.local/lib/python3.6/site-packages/detectron2/engine/launch.py", line 82, in launch
main_func(*args)
File "train_net.py", line 121, in main
return trainer.train()
File "/home/ws/.local/lib/python3.6/site-packages/detectron2/engine/defaults.py", line 484, in train
super().train(self.start_iter, self.max_iter)
File "/home/ws/.local/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 149, in train
self.run_step()
File "train_net.py", line 58, in run_step
self._trainer.run_step()
File "/home/ws/.local/lib/python3.6/site-packages/detectron2/engine/train_loop.py", line 285, in run_step
losses.backward()
File "/home/ws/.local/lib/python3.6/site-packages/torch/_tensor.py", line 307, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
File "/home/ws/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 156, in backward
allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [1, 128, 192, 192]], which is output 0 of ReluBackward0, is at version 3; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
Can you try to find out which op is operating in place? I haven't run into this before.
Try changing this line https://github.com/jinfagang/yolov7/blob/f9c0b723be90bc3fbf7955f1d2c0344d5f52c5e1/yolov7/modeling/head/solov2_head.py#L264 to feature_add_all_level = feature_add_all_level + self.convs_all_levels[i](mask_feat)
@acai66 Can you make a PR for solov2 if this works?
@sdimantsd Can you please pull and try again? It should be fixed.
@jinfagang Now the masks are OK and the training doesn't crash, but it only displays the images and doesn't start training.
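The RuntimeError in the log reports a ReLU output that "is at version 3; expected version 0". PyTorch keeps a version counter on each tensor and compares it during backward against the version recorded when the tensor was saved. The following is only a toy model of that check in plain Python, not PyTorch code: ToyTensor and toy_relu are hypothetical names invented for illustration. It shows why an out-of-place add of the kind suggested above leaves the saved value intact while an in-place add does not.

```python
class ToyTensor:
    """Toy stand-in for a tensor that tracks in-place edits with a version counter."""

    def __init__(self, values):
        self.values = list(values)
        self.version = 0  # bumped on every in-place modification

    def add_(self, other):
        """In-place add: mutates this object and bumps its version."""
        self.values = [a + b for a, b in zip(self.values, other.values)]
        self.version += 1
        return self

    def add(self, other):
        """Out-of-place add: returns a new object and leaves this one untouched."""
        return ToyTensor(a + b for a, b in zip(self.values, other.values))


def toy_relu(x):
    """Return (output, backward_fn); backward needs the unmodified output."""
    out = ToyTensor(max(v, 0.0) for v in x.values)
    saved_version = out.version  # version at the time the output was saved

    def backward(grad):
        # Refuse to compute gradients if the saved output was edited afterwards.
        if out.version != saved_version:
            raise RuntimeError(
                f"saved output is at version {out.version}; "
                f"expected version {saved_version} instead"
            )
        return [g if v > 0 else 0.0 for g, v in zip(grad, out.values)]

    return out, backward


x = ToyTensor([-1.0, 2.0, 3.0])
branch = ToyTensor([10.0, 10.0, 10.0])

# Out-of-place accumulation: the ReLU output stays intact, so backward works.
feat, backward = toy_relu(x)
total = feat.add(branch)
grads_ok = backward([1.0, 1.0, 1.0])

# In-place accumulation: the saved ReLU output is overwritten, so backward fails.
feat, backward = toy_relu(x)
feat.add_(branch)
try:
    backward([1.0, 1.0, 1.0])
    inplace_error = None
except RuntimeError as exc:
    inplace_error = str(exc)
```

In real PyTorch code, the hint in the error message (torch.autograd.set_detect_anomaly(True)) is the usual way to locate the offending operation.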
If I change the line in solov2.py from:
im = visualize_det_cv2_part(im, None, clss, bboxes, is_show=True)
to
im = visualize_det_cv2_part(im, None, clss, bboxes, is_show=False)
(i.e., is_show=False)
it works. But I think all of this:
for a in batched_inputs:
    img = a["image"].cpu().permute(1, 2, 0).numpy().astype(np.uint8)
    ins = a['instances']
    bboxes = ins.gt_boxes.tensor.cpu().numpy().astype(int)
    clss = ins.gt_classes.cpu().numpy()
    im = img.copy()
    bit_masks = ins.gt_masks.tensor.cpu().numpy()
    print(bit_masks.shape)
    # img = vis_bitmasks_with_classes(img, clss, bit_masks)
    im = vis_bitmasks(im, bit_masks)
    im = visualize_det_cv2_part(im, None, clss, bboxes, is_show=False)
and this:
print(images.image_sizes)
print(images.tensor.shape)
is unnecessary during training.
@sdimantsd You are right, these lines are for debugging whether the GT is right or not. Can you verify whether the GT coming from the dataloader is right?
You can send me a PR if you verify the GT is right; just comment out these lines.
Hi @jinfagang Thanks! It looks like the GT is good, and the training started. I will let you know if the overfitting works. Thanks
Hi @jinfagang I am trying to overfit SOLOv2, but it's not working. I changed the dataset to a custom dataset with only 3 labels (car, bus, truck). These are the last lines in the log:
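Instead of commenting the debug code in and out, the ground-truth visualization could be kept behind a switch that is off by default. This is a minimal sketch of that idea, not code from the repo: the environment variable SOLOV2_DEBUG_VIS and the helpers debug_vis_enabled and maybe_visualize are hypothetical, and draw_fn stands in for whatever drawing routine is used.

```python
import os


def debug_vis_enabled(environ=None):
    """Return True only when the (hypothetical) SOLOV2_DEBUG_VIS variable is "1"."""
    environ = os.environ if environ is None else environ
    return environ.get("SOLOV2_DEBUG_VIS", "0") == "1"


def maybe_visualize(batched_inputs, draw_fn, enabled):
    """Call draw_fn on each input when enabled; return how many were drawn."""
    if not enabled:
        return 0
    drawn = 0
    for item in batched_inputs:
        draw_fn(item)
        drawn += 1
    return drawn


# Stand-in for real drawing helpers: it only records what it was given.
seen = []
batch = [{"image": "img0"}, {"image": "img1"}]

# Default: nothing is drawn, so training is never blocked by a window.
skipped = maybe_visualize(batch, seen.append, enabled=debug_vis_enabled({}))

# Opt in explicitly when the ground truth needs to be inspected.
shown = maybe_visualize(
    batch, seen.append, enabled=debug_vis_enabled({"SOLOV2_DEBUG_VIS": "1"})
)
```

With a switch like this, the GT check stays available for debugging without blocking normal training runs.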
[05/11 11:27:11 d2.utils.events]: eta: 0:01:00 iter: 719399 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0019 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:13 d2.utils.events]: eta: 0:00:58 iter: 719419 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0019 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:16 d2.utils.events]: eta: 0:00:56 iter: 719439 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0019 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:18 d2.utils.events]: eta: 0:00:54 iter: 719459 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:20 d2.utils.events]: eta: 0:00:52 iter: 719479 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:22 d2.utils.events]: eta: 0:00:50 iter: 719499 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:24 d2.utils.events]: eta: 0:00:48 iter: 719519 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:26 d2.utils.events]: eta: 0:00:46 iter: 719539 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:28 d2.utils.events]: eta: 0:00:44 iter: 719559 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:30 d2.utils.events]: eta: 0:00:42 iter: 719579 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:32 d2.utils.events]: eta: 0:00:40 iter: 719599 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:34 d2.utils.events]: eta: 0:00:38 iter: 719619 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:36 d2.utils.events]: eta: 0:00:36 iter: 719639 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:38 d2.utils.events]: eta: 0:00:34 iter: 719659 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:40 d2.utils.events]: eta: 0:00:32 iter: 719679 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:42 d2.utils.events]: eta: 0:00:30 iter: 719699 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0019 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:44 d2.utils.events]: eta: 0:00:28 iter: 719719 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:46 d2.utils.events]: eta: 0:00:26 iter: 719739 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:48 d2.utils.events]: eta: 0:00:24 iter: 719759 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:50 d2.utils.events]: eta: 0:00:22 iter: 719779 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:52 d2.utils.events]: eta: 0:00:20 iter: 719799 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:54 d2.utils.events]: eta: 0:00:18 iter: 719819 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:56 d2.utils.events]: eta: 0:00:16 iter: 719839 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:27:58 d2.utils.events]: eta: 0:00:14 iter: 719859 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:28:00 d2.utils.events]: eta: 0:00:12 iter: 719879 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:28:02 d2.utils.events]: eta: 0:00:10 iter: 719899 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:28:04 d2.utils.events]: eta: 0:00:08 iter: 719919 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:28:06 d2.utils.events]: eta: 0:00:06 iter: 719939 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
[05/11 11:28:08 d2.utils.events]: eta: 0:00:04 iter: 719959 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0020 lr: 1.25e-05 max_mem: 861M
[05/11 11:28:10 d2.utils.events]: eta: 0:00:02 iter: 719979 total_loss: 3.028 loss_ins: 2.591 loss_cate: 0.4368 time: 0.1018 data_time: 0.0018 lr: 1.25e-05 max_mem: 861M
The loss is not going down.
And when I run demo.py on the image, it does not detect any objects. Can you help me please?
@sdimantsd Can you please provide your machine config? I think this is gradient explosion. Clearly it won't learn anything.
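The iteration count (around 720000) and the learning rate (1.25e-05) in the log above can be traced back to the config dump earlier in the thread. A rough sketch of the arithmetic, assuming linear scaling by num_gpus / REFERENCE_WORLD_SIZE and a multi-step decay with GAMMA 0.1, is shown below. The function names auto_scale and multistep_lr are made up for this illustration; the input values are taken from the posted config.

```python
def auto_scale(ims_per_batch, base_lr, max_iter, warmup_iters, steps,
               reference_world_size, num_workers):
    """Linearly rescale solver settings for a different number of workers."""
    scale = num_workers / reference_world_size
    return {
        "ims_per_batch": int(round(ims_per_batch * scale)),
        "base_lr": base_lr * scale,
        "max_iter": int(round(max_iter / scale)),
        "warmup_iters": int(round(warmup_iters / scale)),
        "steps": tuple(int(round(s / scale)) for s in steps),
    }


def multistep_lr(iteration, base_lr, steps, gamma):
    """Post-warmup rate: base_lr multiplied by gamma once per step already passed."""
    passed = sum(1 for s in steps if iteration >= s)
    return base_lr * gamma ** passed


# Values from the posted config: IMS_PER_BATCH 8, BASE_LR 0.01, MAX_ITER 90000,
# WARMUP_ITERS 1000, STEPS (60000, 80000), REFERENCE_WORLD_SIZE 8, one GPU.
scaled = auto_scale(
    ims_per_batch=8,
    base_lr=0.01,
    max_iter=90000,
    warmup_iters=1000,
    steps=(60000, 80000),
    reference_world_size=8,
    num_workers=1,
)

# Learning rate near the end of training, after both decay steps.
final_lr = multistep_lr(719399, scaled["base_lr"], scaled["steps"], gamma=0.1)
```

Under these assumptions the sketch gives a batch size of 1, a base rate of 0.00125, 720000 iterations and an 8000-iteration warmup, which agrees with the "Auto-scaling the config" line in the first log. The decayed rate of 1.25e-05 agrees with the lr column in the lines above.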
Regarding the suggestion above to change is_show=True to is_show=False in solov2.py and to drop the debug visualization code: when I turn off the visualization, the loss becomes NaN during training. While the visualization was on, the displayed images and masks were all correct.
f"Loss became infinite or NaN at iteration={storage.iter}!\n"
FloatingPointError: Loss became infinite or NaN at iteration=1! loss_dict = {'loss_ins': 2.8118667602539062, 'loss_cate': nan}
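The error message above already shows that only loss_cate is NaN while loss_ins is still finite. A small helper along the following lines can make that explicit when there are more loss terms; it is a sketch in plain Python, and non_finite_losses and check_losses are hypothetical names, not detectron2 functions.

```python
import math


def non_finite_losses(loss_dict):
    """Return the names of loss terms whose value is NaN or infinite."""
    return [name for name, value in loss_dict.items() if not math.isfinite(value)]


def check_losses(loss_dict, iteration):
    """Raise FloatingPointError naming the offending terms, else return the total."""
    bad = non_finite_losses(loss_dict)
    if bad:
        raise FloatingPointError(
            f"Non-finite loss terms at iteration={iteration}: {bad}"
        )
    return sum(loss_dict.values())


# Values copied from the error message above.
reported = {"loss_ins": 2.8118667602539062, "loss_cate": float("nan")}
bad_terms = non_finite_losses(reported)
```

Knowing which term goes non-finite first narrows the search to the branch that produces it.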
After applying the step mentioned in this thread (changing True to False), I'm also getting the same error. If anyone was able to solve it (@sdimantsd @visionKinger), could you please help? Thanks!
Please try lowering the learning rate.
Hi @jinfagang, thanks for your answer. I have been experimenting with solov2_lite, and your suggestions have been really helpful. I was able to run it for roughly 22 epochs; here is part of the training log:
[09/10 17:34:31 d2.utils.events]: eta: 0:13:22 iter: 6679 total_loss: 3.368 loss_ins: 2.948 loss_cate: 0.4127 time: 1.6192 data_time: 0.0553 lr: 1.25e-09 max_mem: 9312M
[09/10 17:35:03 d2.utils.events]: eta: 0:12:51 iter: 6699 total_loss: 3.385 loss_ins: 2.951 loss_cate: 0.4373 time: 1.6192 data_time: 0.0565 lr: 1.25e-09 max_mem: 9312M
[09/10 17:35:36 d2.utils.events]: eta: 0:12:20 iter: 6719 total_loss: 3.346 loss_ins: 2.945 loss_cate: 0.4056 time: 1.6192 data_time: 0.0437 lr: 1.25e-09 max_mem: 9312M
[09/10 17:36:09 d2.utils.events]: eta: 0:11:50 iter: 6739 total_loss: 3.374 loss_ins: 2.945 loss_cate: 0.428 time: 1.6194 data_time: 0.0484 lr: 1.25e-09 max_mem: 9312M
[09/10 17:36:42 d2.utils.events]: eta: 0:11:19 iter: 6759 total_loss: 3.361 loss_ins: 2.945 loss_cate: 0.416 time: 1.6194 data_time: 0.0525 lr: 1.25e-09 max_mem: 9312M
[09/10 17:37:14 d2.utils.events]: eta: 0:10:48 iter: 6779 total_loss: 3.382 loss_ins: 2.95 loss_cate: 0.4243 time: 1.6193 data_time: 0.0508 lr: 1.25e-09 max_mem: 9312M
[09/10 17:37:46 d2.utils.events]: eta: 0:10:16 iter: 6799 total_loss: 3.348 loss_ins: 2.948 loss_cate: 0.4043 time: 1.6193 data_time: 0.0551 lr: 1.25e-09 max_mem: 9312M
[09/10 17:38:19 d2.utils.events]: eta: 0:09:45 iter: 6819 total_loss: 3.369 loss_ins: 2.946 loss_cate: 0.4279 time: 1.6193 data_time: 0.0545 lr: 1.25e-09 max_mem: 9312M
[09/10 17:38:52 d2.utils.events]: eta: 0:09:14 iter: 6839 total_loss: 3.379 loss_ins: 2.94 loss_cate: 0.4413 time: 1.6195 data_time: 0.0488 lr: 1.25e-09 max_mem: 9312M
[09/10 17:39:23 d2.utils.events]: eta: 0:08:43 iter: 6859 total_loss: 3.367 loss_ins: 2.944 loss_cate: 0.4301 time: 1.6192 data_time: 0.0518 lr: 1.25e-09 max_mem: 9312M
[09/10 17:39:55 d2.utils.events]: eta: 0:08:13 iter: 6879 total_loss: 3.366 loss_ins: 2.945 loss_cate: 0.4096 time: 1.6193 data_time: 0.0510 lr: 1.25e-09 max_mem: 9312M
[09/10 17:40:26 d2.utils.events]: eta: 0:07:42 iter: 6899 total_loss: 3.372 loss_ins: 2.946 loss_cate: 0.4224 time: 1.6190 data_time: 0.0594 lr: 1.25e-09 max_mem: 9312M
[09/10 17:40:58 d2.utils.events]: eta: 0:07:11 iter: 6919 total_loss: 3.374 loss_ins: 2.944 loss_cate: 0.4191 time: 1.6189 data_time: 0.0555 lr: 1.25e-09 max_mem: 9312M
[09/10 17:41:30 d2.utils.events]: eta: 0:06:40 iter: 6939 total_loss: 3.394 loss_ins: 2.95 loss_cate: 0.4397 time: 1.6189 data_time: 0.0476 lr: 1.25e-09 max_mem: 9312M
[09/10 17:42:02 d2.utils.events]: eta: 0:06:09 iter: 6959 total_loss: 3.368 loss_ins: 2.948 loss_cate: 0.4201 time: 1.6189 data_time: 0.0553 lr: 1.25e-09 max_mem: 9312M
[09/10 17:42:37 d2.utils.events]: eta: 0:05:38 iter: 6979 total_loss: 3.375 loss_ins: 2.947 loss_cate: 0.4291 time: 1.6191 data_time: 0.0514 lr: 1.25e-09 max_mem: 9312M
[09/10 17:43:10 d2.utils.events]: eta: 0:05:08 iter: 6999 total_loss: 3.362 loss_ins: 2.948 loss_cate: 0.4324 time: 1.6194 data_time: 0.0621 lr: 1.25e-09 max_mem: 9312M
[09/10 17:43:44 d2.utils.events]: eta: 0:04:37 iter: 7019 total_loss: 3.407 loss_ins: 2.947 loss_cate: 0.4632 time: 1.6195 data_time: 0.0589 lr: 1.25e-09 max_mem: 9312M
[09/10 17:44:18 d2.utils.events]: eta: 0:04:06 iter: 7039 total_loss: 3.375 loss_ins: 2.948 loss_cate: 0.4275 time: 1.6197 data_time: 0.0561 lr: 1.25e-09 max_mem: 9312M
[09/10 17:44:50 d2.utils.events]: eta: 0:03:36 iter: 7059 total_loss: 3.372 loss_ins: 2.945 loss_cate: 0.4346 time: 1.6197 data_time: 0.0505 lr: 1.25e-09 max_mem: 9312M
[09/10 17:45:22 d2.utils.events]: eta: 0:03:05 iter: 7079 total_loss: 3.376 loss_ins: 2.948 loss_cate: 0.4245 time: 1.6197 data_time: 0.0525 lr: 1.25e-09 max_mem: 9312M
[09/10 17:45:54 d2.utils.events]: eta: 0:02:34 iter: 7099 total_loss: 3.354 loss_ins: 2.943 loss_cate: 0.4098 time: 1.6196 data_time: 0.0537 lr: 1.25e-09 max_mem: 9312M
[09/10 17:46:28 d2.utils.events]: eta: 0:02:03 iter: 7119 total_loss: 3.386 loss_ins: 2.949 loss_cate: 0.4338 time: 1.6198 data_time: 0.0537 lr: 1.25e-09 max_mem: 9312M
[09/10 17:47:00 d2.utils.events]: eta: 0:01:32 iter: 7139 total_loss: 3.362 loss_ins: 2.947 loss_cate: 0.4166 time: 1.6197 data_time: 0.0488 lr: 1.25e-09 max_mem: 9312M
[09/10 17:47:33 d2.utils.events]: eta: 0:01:01 iter: 7159 total_loss: 3.36 loss_ins: 2.946 loss_cate: 0.4138 time: 1.6198 data_time: 0.0530 lr: 1.25e-09 max_mem: 9312M
and logs for the validation set
`[09/10 17:50:53 d2.evaluation.coco_evaluation]: Evaluation results for segm:
| AP | AP50 | AP75 | APs | APm | APl |
|---|---|---|---|---|---|
| 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
[09/10 17:50:53 d2.evaluation.coco_evaluation]: Per-category segm AP:
| category | AP | category | AP | category | AP |
| :----------- | :------ | :----------- | :------ | :----------- | :------ |
| circle | 0.000 | poly | 0.000 | line | 0.000 |
| parabola | 0.000 | | | | |
[09/10 17:50:53 d2.engine.defaults]: Evaluation results for val in csv format:
[09/10 17:50:53 d2.evaluation.testing]: copypaste: Task: bbox
[09/10 17:50:53 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[09/10 17:50:53 d2.evaluation.testing]: copypaste: 0.0356,0.2245,0.0039,0.6089,1.0985,0.0605
[09/10 17:50:53 d2.evaluation.testing]: copypaste: Task: segm
[09/10 17:50:53 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[09/10 17:50:53 d2.evaluation.testing]: copypaste: 0.0000,0.0000,0.0000,0.0000,0.0000,0.0000`
Could you please suggest where I should look, or why it is not giving any results? Thanks!
You should register your custom dataset and train with the custom data script.
Yes, I did that already; only then was I able to train. Here is how I registered the data:
# registering the data set
register_coco_instances("train", {}, "/content/yolov7_d2/dataset/math_data/train/train.json", "/content/yolov7_d2/dataset/math_data")
register_coco_instances("val", {}, "/content/yolov7_d2/dataset/math_data/val/val.json", "/content/yolov7_d2/dataset/math_data")
I also made the corresponding changes in the train_inseg file, adding these few lines at the start:
from detectron2.data.datasets.coco import load_coco_json, register_coco_instances
from train_det import Trainer, setup


def register_custom_datasets():
    # facemask dataset
    DATASET_ROOT = "./dataset/math_data"
    ANN_ROOT = DATASET_ROOT
    TRAIN_PATH = os.path.join(ANN_ROOT, "train")
    VAL_PATH = os.path.join(ANN_ROOT, "val")
    TRAIN_JSON = os.path.join(TRAIN_PATH, "train.json")
    VAL_JSON = os.path.join(VAL_PATH, "val.json")
    register_coco_instances("train", {}, TRAIN_JSON, TRAIN_PATH)
    register_coco_instances("val", {}, VAL_JSON, VAL_PATH)


register_custom_datasets()
Thanks for the quick response. Could you please suggest anything else that I should look at?
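The two registration snippets above pass different image roots (the dataset root in one, the per-split folder in the other), so it may be worth checking that every `file_name` in the JSON resolves under the root actually used, and that every annotation carries a mask. The following is a minimal sketch using only the standard library; `check_coco_json` is a hypothetical helper, not part of detectron2 or this repo, and it assumes the files follow the usual COCO layout (`images`, `annotations`, `categories`).

```python
import json
import os


def check_coco_json(json_path, image_root):
    """Summarize a COCO-format annotation file and flag common problems.

    Hypothetical helper for illustration; assumes the standard COCO keys.
    """
    with open(json_path) as f:
        coco = json.load(f)

    images = coco.get("images", [])
    annotations = coco.get("annotations", [])
    categories = coco.get("categories", [])

    # Images whose file_name does not resolve under image_root.
    missing_files = [
        img["file_name"]
        for img in images
        if not os.path.isfile(os.path.join(image_root, img["file_name"]))
    ]

    # Annotations without a usable mask (missing or empty "segmentation").
    empty_masks = [a["id"] for a in annotations if not a.get("segmentation")]

    # Annotations that point at a category id the file never declares.
    known_ids = {c["id"] for c in categories}
    unknown_category = [
        a["id"] for a in annotations if a.get("category_id") not in known_ids
    ]

    return {
        "num_images": len(images),
        "num_annotations": len(annotations),
        "num_categories": len(categories),
        "missing_files": missing_files,
        "empty_masks": empty_masks,
        "unknown_category": unknown_category,
    }


if __name__ == "__main__":
    # Paths taken from the snippet above; skipped if the files are absent.
    for split in ("train", "val"):
        root = os.path.join("./dataset/math_data", split)
        json_path = os.path.join(root, split + ".json")
        if os.path.isfile(json_path):
            print(split, check_coco_json(json_path, root))
```

Non-empty `missing_files`, `empty_masks`, or `unknown_category` lists would each point at a data problem to rule out before tuning the model; empty lists do not prove the masks are correct, only that they are present.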
Can you try training on COCO first? For a tiny dataset, I think the lr is very hard to adjust. You can join our Discord for further guidance.
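On the lr point: in the training log pasted earlier, every entry reports `lr: 1.25e-09` while `total_loss` stays between roughly 3.35 and 3.41. A small parser can make that visible across a whole log file. This is a sketch under the assumption that the log lines keep the `iter: ... total_loss: ... lr: ...` layout shown above; `parse_events` and `summarize` are hypothetical helper names, and reading a flat loss at a tiny lr as "the schedule has decayed too far" is an interpretation the thread does not confirm.

```python
import re

# Matches one d2.utils.events entry; DOTALL lets an entry wrap across lines.
LOG_PATTERN = re.compile(
    r"iter:\s*(?P<iter>\d+)\s+total_loss:\s*(?P<loss>[\d.eE+-]+)"
    r".*?lr:\s*(?P<lr>[\d.eE+-]+)",
    re.DOTALL,
)


def parse_events(log_text):
    """Return a list of (iteration, total_loss, lr) tuples found in log_text."""
    return [
        (int(m.group("iter")), float(m.group("loss")), float(m.group("lr")))
        for m in LOG_PATTERN.finditer(log_text)
    ]


def summarize(events):
    """Report the iteration span, the spread of total_loss, and the lr range."""
    losses = [loss for _, loss, _ in events]
    lrs = [lr for _, _, lr in events]
    return {
        "first_iter": events[0][0],
        "last_iter": events[-1][0],
        "loss_range": max(losses) - min(losses),
        "min_lr": min(lrs),
        "max_lr": max(lrs),
    }


# Two entries copied from the log in this thread.
SAMPLE = (
    "[09/10 17:34:31 d2.utils.events]: eta: 0:13:22 iter: 6679 total_loss: 3.368 "
    "loss_ins: 2.948 loss_cate: 0.4127 time: 1.6192 data_time: 0.0553 "
    "lr: 1.25e-09 max_mem: 9312M "
    "[09/10 17:47:33 d2.utils.events]: eta: 0:01:01 iter: 7159 total_loss: 3.36 "
    "loss_ins: 2.946 loss_cate: 0.4138 time: 1.6198 data_time: 0.0530 "
    "lr: 1.25e-09 max_mem: 9312M"
)

if __name__ == "__main__":
    print(summarize(parse_events(SAMPLE)))
```

Running the same functions over the full log would show how the lr changed over the run, which is a useful thing to have in hand before adjusting the solver settings.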
Sure, will try and update. Thanks !!!