Hard to run eval_video_official.py & some training questions

Open Learningm opened this issue 3 years ago • 14 comments

Hi, I ran into some problems when using the code. I followed data/Readme.md, downloaded the chair data, and preprocessed it. Afterwards, the output folder (e.g. outf_all) contained nothing except an empty folder named 'chair_train'.

I also got a 'bug_lists.txt' file, which seems to include every 'chair' video name after preprocessing.

I see there is also a 'bug_list.txt' in the 'label' folder. What do you mean by 'bug_lists'?

Please help me figure out what is wrong. Thank you very much!

Learningm avatar Jul 20 '22 08:07 Learningm

I found that the annotation_path was the problem.

I was running preprocess.py inside the data/ folder, but annotation_path itself starts with the data/ folder.
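For anyone hitting the same thing, one way to avoid depending on the working directory is to resolve the path against the repository root. This is only a minimal sketch: `resolve_annotation_path` is a hypothetical helper, not something from the repository, and it assumes the script sits one level below the repository root (as in data/preprocess.py). The `data/label` path is just an example.

```python
from pathlib import Path


def resolve_annotation_path(script_file: str, annotation_path: str) -> Path:
    """Resolve a repo-relative path regardless of the current working directory.

    Hypothetical helper: assumes the script lives one level below the
    repository root, e.g. <repo>/data/preprocess.py.
    """
    repo_root = Path(script_file).resolve().parent.parent
    return repo_root / annotation_path


# A path written as "data/..." still resolves under the repository root,
# even if the script is launched from inside data/.
print(resolve_annotation_path("/repo/data/preprocess.py", "data/label"))
```

Inside a real script one would pass `__file__` as the first argument. Simply running the script from the repository root should have the same effect.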

Learningm avatar Jul 21 '22 07:07 Learningm

I followed the tutorial in the Readme inside the objectron_eval folder. However, the prepare_test_video.py script didn't work for the 'bike' category because the partition() function returned an empty result.

After I modified prepare_test_video.py to evaluate the bike category, I ran python eval_video_official.py --eval_c bike --arch dlav1_34 inside ${CenterPose}/src/tools/objectron_eval, and it output nothing.

I figured out that I needed to put the tfrecord data folder inside the path mentioned above. It then runs a batch of videos, some debug windows appear, and it gets stuck.

The evaluation scripts are hard to run. Please add more detail to the tutorial readme. Thank you very much!

Learningm avatar Jul 27 '22 07:07 Learningm

> I followed the tutorial in the Readme inside the objectron_eval folder. However, the prepare_test_video.py script didn't work for the 'bike' category because the partition() function returned an empty result.
>
> After I modified prepare_test_video.py to evaluate the bike category, I ran python eval_video_official.py --eval_c bike --arch dlav1_34 inside ${CenterPose}/src/tools/objectron_eval, and it output nothing.
>
> I figured out that I needed to put the tfrecord data folder inside the path mentioned above. It then runs a batch of videos, some debug windows appear, and it gets stuck.
>
> The evaluation scripts are hard to run. Please add more detail to the tutorial readme. Thank you very much!

Sorry for the inconvenience. I may update the tutorial to make it more user-friendly if I am more available later.

Uio96 avatar Jul 29 '22 20:07 Uio96

> Hi, I ran into some problems when using the code. I followed data/Readme.md, downloaded the chair data, and preprocessed it. Afterwards, the output folder (e.g. outf_all) contained nothing except an empty folder named 'chair_train'.
>
> I also got a 'bug_lists.txt' file, which seems to include every 'chair' video name after preprocessing.
>
> I see there is also a 'bug_list.txt' in the 'label' folder. What do you mean by 'bug_lists'?
>
> Please help me figure out what is wrong. Thank you very much!

We ran into some problems (inconsistent annotations or missing videos) when processing the raw data from the Objectron dataset for training purposes, so we skip those clips. You can find the related code in the data folder: https://github.com/NVlabs/CenterPose/blob/main/data/preprocess.py#L83-L95.
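As a rough illustration of the skipping idea (not the actual code at the link above), a list of known-bad clip names can be used to filter the clips before processing. The function name and the clip names below are hypothetical.

```python
def filter_clips(video_names, bug_list):
    """Return the clips to process, skipping any name recorded in the bug list."""
    skipped = set(bug_list)
    return [name for name in video_names if name not in skipped]


# Hypothetical clip names, for illustration only.
all_clips = ["chair_batch-1_0", "chair_batch-1_1", "chair_batch-1_2"]
known_bad = ["chair_batch-1_1"]
print(filter_clips(all_clips, known_bad))
```

If every clip of a category ends up in the list, as described earlier in this thread, that probably points to a path problem rather than to bad data.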

Uio96 avatar Jul 29 '22 20:07 Uio96

Thanks for the reply. I ran into some other problems while studying your code.

The variable names confused me even though you added some comments. I still wonder why 'hp' stands for 'keypoints' instead of 'kp'.

My second question is about the channel count when I train with main_CenterPose.py. The 'hp_offset' entry in the opt heads is 2, but the paper says this should be 16 channels? https://github.com/NVlabs/CenterPose/blob/main/src/lib/opts.py#L403

paper: image

Learningm avatar Aug 02 '22 06:08 Learningm

Hi, I have another question: https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/networks/pose_dla_dcn.py#L253

What are opt.pre_img / opt.pre_hm / opt.pre_hm_hp? I cannot find any comments about these parameters.

Learningm avatar Aug 04 '22 06:08 Learningm

> Hi, I have another question: https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/networks/pose_dla_dcn.py#L253
>
> What are opt.pre_img / opt.pre_hm / opt.pre_hm_hp? I cannot find any comments about these parameters.

That's for CenterPoseTrack. Its pipeline plot may give you some ideas.

pre_img is for the image; pre_hm is for the filtered center heatmap; pre_hm_hp is for the filtered keypoint heatmaps.

[image]

Uio96 avatar Aug 05 '22 19:08 Uio96

> The variable names confused me even though you added some comments. I still wonder why 'hp' stands for 'keypoints' instead of 'kp'.

Sorry for the confusion. I had some other thoughts about the name before but used "keypoints" in the end.

> My second question is about the channel count when I train with main_CenterPose.py. The 'hp_offset' entry in the opt heads is 2, but the paper says this should be 16 channels?

We have 8 keypoints, and each of them has 2 heads. The total channel count is 16.
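The arithmetic behind that count can be sketched as follows. This is illustration only: reading the two values per keypoint as an (x, y) pair, and the interleaved channel order, are assumptions rather than something confirmed in this thread.

```python
NUM_KEYPOINTS = 8         # from the reply above
VALUES_PER_KEYPOINT = 2   # assumed here to be an (x, y) pair per keypoint


def group_per_keypoint(channels):
    """Group a flat list of channel values into one pair per keypoint.

    The interleaved x0, y0, x1, y1, ... layout is an assumption made for
    illustration; check the repository for the real channel order.
    """
    expected = NUM_KEYPOINTS * VALUES_PER_KEYPOINT
    if len(channels) != expected:
        raise ValueError(f"expected {expected} channels, got {len(channels)}")
    return [tuple(channels[i:i + VALUES_PER_KEYPOINT])
            for i in range(0, expected, VALUES_PER_KEYPOINT)]


pairs = group_per_keypoint(list(range(16)))
print(len(pairs), pairs[0], pairs[-1])
```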

Uio96 avatar Aug 05 '22 19:08 Uio96

Thanks for the reply.

I have another problem: if I remove the 'wh' head for training, I get errors at inference.

https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L102 There will be no kps_heatmap_std / kps_heatmap_mean / kps_heatmap_height output, unlike when the 'wh' head is present.

As you can see, if I don't have the 'wh' head, I run into https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L258

However, https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L358 asks for the missing variables above.

Could you explain the decoding process in more detail, or point to some reference papers? I got confused trying to understand the decoding part. Thank you very much!

Learningm avatar Aug 12 '22 07:08 Learningm

More questions:

I tried to remove the 'scale' head and set opt.obj_scale = False. demo.py then crashes at the following line because it asks for the scale variable: https://github.com/NVlabs/CenterPose/blob/main/src/lib/utils/pnp/cuboid_pnp_shell.py#L12

I tried setting the scale to a constant 1, but the visualization just becomes a square cuboid, which is not correct.

Could I get the pose without predicting the scale? Is it possible to use PnP to get the output 3D points and infer the scale from the predicted 3D points?

Learningm avatar Aug 12 '22 08:08 Learningm

> Thanks for the reply.
>
> I have another problem: if I remove the 'wh' head for training, I get errors at inference.
>
> https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L102 There will be no kps_heatmap_std / kps_heatmap_mean / kps_heatmap_height output, unlike when the 'wh' head is present.
>
> As you can see, if I don't have the 'wh' head, I run into https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L258
>
> However, https://github.com/NVlabs/CenterPose/blob/main/src/lib/models/decode.py#L358 asks for the missing variables above.
>
> Could you explain the decoding process in more detail, or point to some reference papers? I got confused trying to understand the decoding part. Thank you very much!

Sorry about the confusion. I once tried to see the impact of removing the 'wh' option for CenterPose. I found that it might not be a good choice, so I did not go further. Later, when I developed CenterPoseTrack with some more parameters, e.g., kps_heatmap_std / kps_heatmap_mean / kps_heatmap_height, I assumed the 'wh' option was already enabled. In your case, you probably have to give kps_heatmap_std / kps_heatmap_mean / kps_heatmap_height some default values there (if you do not care about tracking).
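A rough sketch of what "some default values" could look like, assuming the decoder output is handled as a dictionary. `with_tracking_defaults` is a hypothetical helper, and the placeholder `0.0` is an assumption: the real code works with tensors, so the placeholders would need matching shapes.

```python
def with_tracking_defaults(decoded, default=0.0):
    """Return a copy of a decoder output dict with tracking-only keys filled in.

    Hypothetical helper: keys already present are left untouched, and missing
    ones get a placeholder value.
    """
    tracking_keys = ("kps_heatmap_mean", "kps_heatmap_std", "kps_heatmap_height")
    filled = dict(decoded)
    for key in tracking_keys:
        filled.setdefault(key, default)
    return filled


# Output from a model trained without the 'wh' head (illustrative keys only).
out = with_tracking_defaults({"kps": [1, 2, 3]})
print(sorted(out))
```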

As for the reference, our implementation is based on CenterNet. I do not think there are detailed instructions available now, as most of the code is just too detailed to be put in a paper.

Uio96 avatar Sep 19 '22 14:09 Uio96

> More questions:
>
> I tried to remove the 'scale' head and set opt.obj_scale = False. demo.py then crashes at the following line because it asks for the scale variable: https://github.com/NVlabs/CenterPose/blob/main/src/lib/utils/pnp/cuboid_pnp_shell.py#L12
>
> I tried setting the scale to a constant 1, but the visualization just becomes a square cuboid, which is not correct.
>
> Could I get the pose without predicting the scale? Is it possible to use PnP to get the output 3D points and infer the scale from the predicted 3D points?

In my implementation, the 'scale' head predicts the ratio between width/height/length. If the absolute height is not given from somewhere else, its prediction is width/1/length (we call it relative scale).

PnP is used to get the pose given the 2D keypoints on the image and the 3D keypoints (prior information). In our case, width/1/length or width/height/length can be used to get the 3D keypoints (as we work on a 3D bounding box).

I think your so-called "predicted 3d points" is more like transforming the 3d key points (prior information) into the camera space with the pose calculated by pnp. In other words, you cannot get the pose without predicting the scale.
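To make the role of the scale concrete, here is a minimal sketch of how the 3D keypoints of a bounding box can be built from width/height/length. The axis assignment (x = width, y = height, z = length), the origin-centred box, and the vertex order are all assumptions for illustration and may not match the convention in cuboid_pnp_shell.py.

```python
from itertools import product


def cuboid_keypoints(width, height, length):
    """Return the 8 corners of an origin-centred cuboid as (x, y, z) tuples.

    Axis assignment and vertex order are assumptions for illustration only.
    """
    half = (width / 2.0, height / 2.0, length / 2.0)
    return [tuple(sign * h for sign, h in zip(signs, half))
            for signs in product((-1, 1), repeat=3)]


# Relative scale: the height is fixed to 1, as in width/1/length.
relative = cuboid_keypoints(0.8, 1.0, 1.2)

# With every dimension set to 1 the box degenerates to a cube, which
# matches the "square cuboid" seen when the scale is forced to 1.
cube = cuboid_keypoints(1.0, 1.0, 1.0)
print(relative[0], cube[0])
```

These 3D points, together with the predicted 2D keypoints, are the kind of input a PnP solver needs, which is why the box dimensions have to come from somewhere.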

Uio96 avatar Sep 19 '22 14:09 Uio96

Thanks for the clarification of the questions above.

I have another question: how do I train the cup category? In this implementation, the evaluation script uses two checkpoints, one for mug and one for cup. However, it is not clear how to train these two checkpoints. With the --mug flag it seems to train the mug category; without this flag it seems to train mug and cup together instead of only the cup data.

Learningm avatar Sep 26 '22 03:09 Learningm

Did you succeed in running the test? Some errors occur when I run the test code "python eval_video_official.py".

YC0315 avatar Oct 13 '22 07:10 YC0315