
Is there any way to turn off the progress bar when doing evaluation?

Open darcula1993 opened this issue 2 years ago • 7 comments

darcula1993 avatar Apr 13 '22 11:04 darcula1993

You can comment these lines:

https://github.com/open-mmlab/mmpose/blob/4853b4bcd1238ef559c0f341ed2402c5a3605316/mmpose/apis/test.py#L38-L39

if you use a single GPU.

Or you can comment these lines: https://github.com/open-mmlab/mmpose/blob/4853b4bcd1238ef559c0f341ed2402c5a3605316/mmpose/apis/test.py#L76-L77

if you use multiple GPUs.

BTW, is there any inconvenience you have using the progress bar tool?

liqikai9 avatar Apr 13 '22 11:04 liqikai9

@liqikai9 The training server I use does not have an interactive terminal, so we have to save the printed log to disk to check it. Here is the log: the progress bar prints a separate line for each step, even with multiple GPUs. Should I update mmpose? The current version is 0.21.0.

[ ] 0/104125, elapsed: 0s, ETA:
[ ] 1/104125, 0.0 task/s, elapsed: 29s, ETA: 3049871s
[ ] 2/104125, 0.1 task/s, elapsed: 29s, ETA: 1524937s
[ ] 3/104125, 0.1 task/s, elapsed: 29s, ETA: 1016617s
[ ] 4/104125, 0.1 task/s, elapsed: 29s, ETA: 762456s
[ ] 5/104125, 0.2 task/s, elapsed: 29s, ETA: 609960s
[ ] 6/104125, 0.2 task/s, elapsed: 29s, ETA: 508296s
[ ] 7/104125, 0.2 task/s, elapsed: 29s, ETA: 435678s
[ ] 8/104125, 0.3 task/s, elapsed: 29s, ETA: 381215s
[ ] 9/104125, 0.3 task/s, elapsed: 29s, ETA: 338855s
[ ] 10/104125, 0.3 task/s, elapsed: 29s, ETA: 304967s
[ ] 11/104125, 0.4 task/s, elapsed: 29s, ETA: 277241s
[ ] 12/104125, 0.4 task/s, elapsed: 29s, ETA: 254135s
[ ] 13/104125, 0.4 task/s, elapsed: 29s, ETA: 234584s
[ ] 14/104125, 0.5 task/s, elapsed: 29s, ETA: 217826s
[ ] 15/104125, 0.5 task/s, elapsed: 29s, ETA: 203303s
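For context, a log that looks like the above is expected when a progress bar's output is redirected to a file. A carriage-return bar only appears as a single updating line on an interactive terminal; in a saved log, every update survives as its own chunk. A minimal sketch of that behavior (an illustration of the general technique, not mmcv's actual code):

```python
import io

def progress(task_num, stream):
    # Minimal sketch of a carriage-return progress bar
    # (mmcv.ProgressBar behaves along these lines; this is not its code).
    for i in range(1, task_num + 1):
        # '\r' moves the cursor to the start of the line, so an interactive
        # terminal overwrites the previous state in place.
        stream.write('\r[%s%s] %d/%d'
                     % ('#' * i, ' ' * (task_num - i), i, task_num))
    stream.write('\n')

# When stdout is a file rather than a terminal, nothing is overwritten:
# every '\r'-prefixed update is stored verbatim, one entry per step.
buf = io.StringIO()
progress(5, buf)
assert buf.getvalue().count('\r') == 5
```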

darcula1993 avatar Apr 14 '22 05:04 darcula1993

If you use multiple GPUs, try commenting out these lines:

https://github.com/open-mmlab/mmpose/blob/2a0a2d2fb4b5bf5d8620c6bd04a70c6a940b98ba/mmpose/apis/test.py#L66-L67

and https://github.com/open-mmlab/mmpose/blob/2a0a2d2fb4b5bf5d8620c6bd04a70c6a940b98ba/mmpose/apis/test.py#L73-L77

so that, in the end, the code looks like this:

import mmcv
import torch
from mmcv.runner import get_dist_info

model.eval()
results = []
dataset = data_loader.dataset
rank, world_size = get_dist_info()
# if rank == 0:
    # prog_bar = mmcv.ProgressBar(len(dataset))
for data in data_loader:
    with torch.no_grad():
        result = model(return_loss=False, **data)
    results.append(result)

    # if rank == 0:
        # # use the first key as main key to calculate the batch size
        # batch_size = len(next(iter(data.values())))
        # for _ in range(batch_size * world_size):
            # prog_bar.update()

This disables the mmcv.ProgressBar tool and stops it from printing to the log during evaluation. Try it and see whether the problem persists.
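As an alternative to commenting out lines in the installed package, the bar can also be silenced from the caller's side by redirecting the process-level stdout file descriptor to the null device around the test call. This is a generic POSIX/Python technique, not an mmpose feature, and the `multi_gpu_test(model, data_loader)` call below is a hypothetical usage sketch based on the names in this thread:

```python
import contextlib
import os

@contextlib.contextmanager
def silence_stdout():
    # Redirect file descriptor 1 (stdout) to os.devnull, which discards
    # writes even from code that captured sys.stdout earlier (e.g. as a
    # default argument). stderr is left untouched.
    saved_fd = os.dup(1)
    devnull_fd = os.open(os.devnull, os.O_WRONLY)
    try:
        os.dup2(devnull_fd, 1)
        yield
    finally:
        os.dup2(saved_fd, 1)  # restore the original stdout
        os.close(saved_fd)
        os.close(devnull_fd)

# Hypothetical usage (names taken from this thread, not verified here):
# with silence_stdout():
#     outputs = multi_gpu_test(model, data_loader)
```

The advantage of the fd-level redirect over `contextlib.redirect_stdout` is that it also catches writers that bound `sys.stdout` before the redirect took effect.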

liqikai9 avatar Apr 16 '22 09:04 liqikai9

Commenting out these lines works:

for _ in range(batch_size * world_size):
    prog_bar.update()

It seems that the `if rank == 0` condition does not work.
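One plausible explanation (an assumption, not confirmed in this thread): mmcv's `get_dist_info` falls back to `(0, 1)` when no `torch.distributed` process group has been initialized, so every worker believes it is rank 0 and the guard fires in all of them. A sketch of that fallback logic (mimicking `mmcv.runner.get_dist_info`, not its actual code):

```python
def get_dist_info_sketch(dist_initialized, rank=0, world_size=1):
    """Mimic get_dist_info's fallback behavior (sketch, not mmcv code).

    When no distributed process group is initialized, every process is
    reported as rank 0 in a world of size 1 -- so an `if rank == 0`
    guard passes in *all* workers, and each prints its own progress bar.
    """
    if dist_initialized:
        return rank, world_size
    return 0, 1
```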

darcula1993 avatar Apr 18 '22 07:04 darcula1993

It seems that the `if rank == 0` condition does not work.

Maybe this has something to do with your server. Which platform did you use for the multi-GPU evaluation?

liqikai9 avatar Apr 18 '22 09:04 liqikai9

It is a local server with 8× 2080 Ti GPUs on an Inspur training platform: CentOS 7, PyTorch 1.7.0 with CUDA 11.0, driver version 450.102.04.

darcula1993 avatar Apr 19 '22 04:04 darcula1993

2022-04-14 09:10:34,807 - mmpose - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.6.9 (default, Oct  8 2020, 12:12:24) [GCC 8.4.0]
CUDA available: True
GPU 0,1,2,3,4,5,6: GeForce RTX 2080 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.0_bu.TC445_37.28845127_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.7.0+cu110
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.0
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80
  - CuDNN 8.0.4
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.1+cu110
OpenCV: 4.5.4
MMCV: 1.4.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
MMPose: 0.21.0+228747c

darcula1993 avatar Apr 19 '22 04:04 darcula1993