
[Bug] MMOCR inference cannot use the visualizer configs

Open Daisy5296 opened this issue 1 year ago • 3 comments

Prerequisite

Task

I'm using the official example scripts/configs for the officially supported tasks/models/datasets.

Branch

main branch https://github.com/open-mmlab/mmocr

Environment

sys.platform: linux
Python: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:10) [GCC 10.3.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1: Tesla V100-PCIE-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.7, V11.7.99
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
PyTorch: 1.13.0a0+340c412
PyTorch compiling details: PyTorch built with:

  • GCC 9.4
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.4 Product Build 20200917 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.6.0 (Git Hash N/A)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX512
  • CUDA Runtime 11.7
  • NVCC architecture flags: -gencode;arch=compute_52,code=sm_52;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_86,code=compute_86
  • CuDNN 8.4.1 (built against CUDA 11.6)
  • Magma 2.6.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.4.1, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS=-fno-gnu-unique -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.13.0a0
OpenCV: 4.2.0
MMEngine: 0.7.2
MMOCR: 1.0.0+d7c59f3

Reproduces the problem - code sample

Here is the visualization-related config that I used in the detection config file:

    default_hooks = dict(
        timer=dict(type='IterTimerHook'),
        logger=dict(type='LoggerHook', interval=5),
        param_scheduler=dict(type='ParamSchedulerHook'),
        checkpoint=dict(type='CheckpointHook', interval=20),
        sampler_seed=dict(type='DistSamplerSeedHook'),
        sync_buffer=dict(type='SyncBuffersHook'),
        visualization=dict(
            type='VisualizationHook',
            interval=1,
            enable=True,
            show=False,
            draw_gt=True,
            draw_pred=True))
    ...
    visualizer = dict(
        type='TextDetLocalVisualizer',
        name='visualizer',
        vis_backends=[dict(type='LocalVisBackend')],
        save_dir='/dfs/data/mmocr/imgs')

Here is the visualization-related config that I used in the recognition config file:

    default_hooks = dict(
        timer=dict(type='IterTimerHook'),
        logger=dict(type='LoggerHook', interval=50),
        param_scheduler=dict(type='ParamSchedulerHook'),
        checkpoint=dict(type='CheckpointHook', interval=5),
        sampler_seed=dict(type='DistSamplerSeedHook'),
        sync_buffer=dict(type='SyncBuffersHook'),
        visualization=dict(
            type='VisualizationHook',
            interval=1,
            enable=True,
            show=False,
            draw_gt=False,
            draw_pred=True))
    ....
    vis_backends = [dict(type='LocalVisBackend')]
    visualizer = dict(
        type='TextRecogLocalVisualizer',
        name='visualizer',
        vis_backends=[dict(type='LocalVisBackend')],
        font_properties='/dfs/data/mmocr/fonts/simsun.ttc',
        save_dir='/dfs/data/mmocr/imgs')
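As a quick sanity check on the two `visualizer` dicts above, here is a minimal sketch that treats them as plain Python dicts and reports any missing keys. The helper `check_visualizer_cfg` is hypothetical (it is not part of MMOCR or MMEngine), and the set of keys it checks is my assumption about what matters for saving images, not a documented requirement.

```python
# Hypothetical helper, not part of MMOCR or MMEngine: it only inspects a
# plain-dict visualizer config for the keys used in this report.
def check_visualizer_cfg(cfg):
    """Return a list of human-readable problems found in `cfg`."""
    problems = []
    if not cfg.get('type'):
        problems.append('missing "type"')
    if not cfg.get('vis_backends'):
        problems.append('no vis_backends configured')
    if cfg.get('save_dir') is None:
        problems.append('save_dir is None')
    return problems


# The two visualizer sections copied from the configs in this report.
det_visualizer = dict(
    type='TextDetLocalVisualizer',
    name='visualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    save_dir='/dfs/data/mmocr/imgs')

rec_visualizer = dict(
    type='TextRecogLocalVisualizer',
    name='visualizer',
    vis_backends=[dict(type='LocalVisBackend')],
    font_properties='/dfs/data/mmocr/fonts/simsun.ttc',
    save_dir='/dfs/data/mmocr/imgs')

for name, cfg in [('det', det_visualizer), ('rec', rec_visualizer)]:
    print(name, check_visualizer_cfg(cfg) or 'looks complete')
```

Both dicts pass this check, which is why I believe the config files themselves are not the problem.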

Reproduces the problem - command or script

    from mmocr.apis import MMOCRInferencer

    ocr = MMOCRInferencer(
        det='dbnet_resnet18_fpnc_1200e_rects_icdar2015_test.py',
        det_weights='epoch_80_new.pth',
        rec='satrn_shallow_5e_st_mj_rects_icdar2015_lsvt_art_rctw_icdar2013_ttt.py',
        rec_weights='epoch_20.pth')
    ocr('demo/circle.jpeg', show=False, print_result=True, save_vis=True, out_dir='imgs/')

Reproduces the problem - error message

Hi! I used the MMOCR inference script with the configs above, but it cannot visualize Chinese text. The following warning was displayed:

    mmengine - WARNING - Visualizer backend is not initialized because save_dir is None

However, I believe I've set the config files correctly, since they work with tools/test.py. Is it true that MMOCR inference does not use the visualization configs? If it does use them, how should I set them correctly?
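To illustrate what the warning text seems to describe, here is a small stand-alone sketch. `ToyVisualizer` is a simplified stand-in that I wrote for illustration, not MMEngine's actual implementation; it only assumes that a visualizer skips its backends when it is constructed without a `save_dir`, which is what the warning message suggests.

```python
import warnings


class ToyVisualizer:
    """Simplified stand-in for a visualizer; NOT MMEngine's real class."""

    def __init__(self, vis_backends=None, save_dir=None):
        self.backends = []
        if save_dir is None:
            # Same wording as the warning quoted in this report.
            warnings.warn('Visualizer backend is not initialized '
                          'because save_dir is None')
        elif vis_backends:
            # Attach the output directory to every configured backend.
            self.backends = [dict(b, save_dir=save_dir) for b in vis_backends]


cfg = dict(
    vis_backends=[dict(type='LocalVisBackend')],
    save_dir='/dfs/data/mmocr/imgs')

# Case 1: the whole config, including save_dir, reaches the visualizer.
with_dir = ToyVisualizer(**cfg)

# Case 2: save_dir is dropped somewhere on the way to the visualizer.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    without_dir = ToyVisualizer(vis_backends=cfg['vis_backends'])

print(len(with_dir.backends), len(without_dir.backends), len(caught))
```

If the inferencer behaves like the second case, the `save_dir` from my config never reaches the visualizer, which would match the warning I see.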

Additional information

No response

Daisy5296 • May 09 '23 03:05