Feature: add support for multiple (non-fused) camera views
The current package only supports multiview setups in which the views from all cameras are fused into a single frame. This does not scale well past 2-4 views.
We're interested in the multiview feature for separated data streams. We would like to track monkey hands in 3D from 6 camera views.
@YitingChang thanks for reaching out! We've wrapped up this feature and are working on docs now, I'll contact you when they're ready.
Hello, we are also highly interested in the multiview feature for separate videos of different camera angles. Is there any update about the feature and documentation by now?
Hello, has there been any update on the feature at this point? We would like to use the separate multi-view feature for four different views.
Hi everyone, thanks for your interest in this feature (and your patience). I wanted to let you know that we have a beta version up and running and documented. The current version will allow you to fit supervised and context models on multiple synchronized camera views.
In order to use this feature, you will need to pull updates and switch to the dynamic_crop branch of the repo:
git fetch
git checkout dynamic_crop
The documentation describing the data layout and how to run training and inference can be found here: https://lightning-pose.readthedocs.io/en/dynamic_crop/source/user_guide_advanced/multiview_separate.html
Please let us know if you start using this feature! We are eager to support you all through bug fixes, feature requests, guidance on training, etc. Reach out to us on discord if you just want to chat, or if you run into a bug please raise a separate github issue.
We are currently working on several new features that you all will be interested in as well:
- implement unsupervised losses for the multiview case
- in many of these multiview setups the animal is moving around in an environment much larger than itself. It therefore makes sense to have a two-stage pipeline, where the animal is first localized within each view, and then a region around the animal is cropped out in 3 dimensions. In the second stage a standard pose estimator is run on this cropped version, leading to increased precision in the pose estimates.
I will keep everyone updated on our progress with these.
@danbider @farzad-ziaie @Selmaan @YitingChang @mariusosswald @NgD-Huy @ankitvishnu23
Thank you for the update!
Thanks for the update! @themattinthehatt I have a 4 camera setup of an arena with a mouse. Currently we use DeepLabCut to track the animal across the different cameras and then triangulate the landmarks in 3D afterwards. If the current beta is supervised only, how does it compare to existing supervised models like DeepLabCut? Thanks in advance. Also, when customising the config file for multiview, for the csv_file section, do I need to list every csv for every labelled video, or can I leave it as CollectedData.csv? The latter, which was created during the DLC conversion, seems to have rows accurately corresponding to each of the labelled frames across all the videos. See attached a photo of my labelled_data directory (converted from DeepLabCut labels). See also attached my config file for this data so far.
LPmk1_config_COPY.txt
@DavidGill159
how does it compare to existing supervised models like DeepLabCut?
Right now we train a single LP network across all available views. This network then processes videos from one view at a time. Afterwards you can do the triangulation - so in this sense the workflow is the same as DLC. We have yet to benchmark the supervised-only version vs DLC or SLEAP in this multiview setup, but we are working on that. My guess, based on our experiments in the LP paper, is that the supervised performance will be on par with DLC. The current multiview LP implementation does allow for context frames, so if you have a dataset that contains a lot of brief occlusions I think the LP context model (TCN) would do much better on those frames than DLC.
for the csv_file section, do I need to list every csv for every labelled video?
For now the expected format is actually a hybrid of the two options: there should be one csv per view that contains the labels from multiple videos. So if you have 4 views you should have 4 csv files: CollectedData_camera_1.csv, ... CollectedData_camera_4.csv (though the names themselves are arbitrary).
Did you use the DLC->LP conversion script from the repo? We haven't updated that to work with multiview datasets, so for now you'll have to manually split your CollectedData.csv file into the 4 separate files. The important thing to verify is that row n in each file corresponds to the same video/frame number.
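In case it's useful, here's a minimal sketch of that manual split (not part of the LP tooling; the file names, view names, and the assumption that each image path contains its view name are all hypothetical and may not match your layout):

# Sketch: split a DLC-style CollectedData.csv into one labels file per view.
# Assumes the standard 3-row DLC header (scorer/bodyparts/coords) and image
# paths in the index that contain the view name -- adjust to your data.
import pandas as pd

views = ["camera_1", "camera_2", "camera_3", "camera_4"]
df = pd.read_csv("CollectedData.csv", header=[0, 1, 2], index_col=0)

for view in views:
    df_view = df[df.index.str.contains(view)]
    df_view.to_csv(f"CollectedData_{view}.csv")

After splitting, spot-check that the files all have the same number of rows and that row n in each refers to the same frame.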
I'll also mention that technically speaking you could train a "singleview" model using your current CollectedData.csv file. However, once the unsupervised losses become available then it will be required to have the csv split by view so that during training we can ensure that corresponding views from the same video/frame are handled properly.
Please let me know if you have any other questions!
Thank you for the prompt reply! I reformatted everything as you recommended. When running training I get this error returned (see attached screenshot).
haven't seen that one before. can you send/show a screenshot of one of your csv files (with the header)? also, did you update the config file to list all 4 csv files?
haven't seen that one before. can you send/show a screenshot of one of your csv files (with the header)?
Sure, here is a screenshot for 1 of the cameras:
did you update the config file to list all 4 csv files?
Yep, I did.
It seems the DLC->LP conversion didn't succeed: you can see in column B that what should be the snout x labels are instead the camera names (and similarly for column C). And because of that, the final columns (X and Y) don't have any entries. My guess is that this is why you're getting that error (the error you got is consistent with a string appearing where a number is expected).
The DLC->LP converter should be able to handle this type of data but clearly something went wrong. If you send me the labels from one of your DLC labeled data folders I can try to fix it. Do you have both csv and h5 files for the labeled data?
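As a generic sanity check in the meantime (not part of the LP tooling; the file name and 3-row header are assumptions about your setup), something like this will flag any non-numeric entries left over from a bad conversion:

# Flag label cells that are not numeric (strings left behind by a bad
# DLC->LP conversion will show up here). File name and header layout are
# assumptions -- adjust to your data.
import pandas as pd

df = pd.read_csv("CollectedData_camera_1.csv", header=[0, 1, 2], index_col=0)
numeric = df.apply(pd.to_numeric, errors="coerce")
bad = numeric.isna() & df.notna()  # non-empty cells that fail to parse
print(df[bad.any(axis=1)])         # rows containing non-numeric entries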
@DavidGill159 did you post a follow-up question here? I got an email but for some reason I'm not seeing any new activity in this thread.
Hi yes I thought I had resolved it but I am still having an issue:
I created a new network to produce 3D data from 4 separate camera streams, following our previous discussion above, i.e. there are 4 CSVs, 1 for each camera, containing the labelled frames from the different videos; the same frames were extracted for each camera view. However, I am getting this error when attempting to train a supervised model.
My data is organised as per the attached screenshot:
Based on the error traceback the code is attempting to create a HeatmapDataset when it should actually be building a MultiviewHeatmapDataset. There are two potential reasons for this off the top of my head:
- you are on the wrong branch of the LP repo - are you on dynamic_crop?
- you did not include the data.view_names field in your config file (see here: https://lightning-pose.readthedocs.io/en/dynamic_crop/source/user_guide_advanced/multiview_separate.html#the-configuration-file); a minimal sketch is below. For now this is the way the code recognizes your data as a multiview dataset.
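For reference, a minimal sketch of the two fields in question (hypothetical view/file names; in your yaml they live under the data: section):

# Minimal sketch (hypothetical names) of the config fields that mark a
# dataset as multiview; shown via OmegaConf since the configs are Hydra-based.
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "data": {
        "csv_file": ["camera_1.csv", "camera_2.csv", "camera_3.csv", "camera_4.csv"],
        "view_names": ["camera_1", "camera_2", "camera_3", "camera_4"],
    }
})
print(len(cfg.data.view_names) > 1)  # True -> recognized as multiview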
Hi, yes I am using the dynamic_crop branch.
Here is my config, I believe it is the format required:
Maybe run git pull and try again? You don't seem to be on the latest dynamic_crop branch - your stack trace shows that dataset = HeatmapDataset(...) is being called from line 91 of .../utils/scripts.py.
Line 91 of that file is indeed dataset = HeatmapDataset(...) in the current main branch, but it is something different in the current dynamic_crop branch
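A generic way to confirm which copy of the code your environment is actually importing (independent of which clone you updated):

# Print the path of the scripts module Python actually imports; if this is
# not inside the clone where you ran `git checkout dynamic_crop`, you are
# editing one copy of the code and running another.
import lightning_pose.utils.scripts as scripts

print(scripts.__file__)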
Hi, tried that but still getting the same error when trying to train.
@DavidGill159 whenever you start training there should be a printout of the various config options - can you copy/paste that here?
Our Hydra config file:
data parameters
image_orig_dims: {'height': 1024, 'width': 1280}
image_resize_dims: {'height': 384, 'width': 384}
data_dir: /mnt/c/Users/X48823DG/lightning-pose/lp_mk3_3Dmodel_DG/lp_mk3/
video_dir: /mnt/c/Users/X48823DG/lightning-pose/lp_mk3_3Dmodel_DG/lp_mk3/videos/
csv_file: ['camera_1.csv', 'camera_2.csv', 'camera_3.csv', 'camera_4.csv']
view_names: ['camera_1', 'camera_2', 'camera_3', 'camera_4']
downsample_factor: 2
num_keypoints: 11
keypoint_names: ['Snout', 'Left ear', 'Right ear', 'Left implant back', 'Right implant back', 'White cable part', 'Neck base', 'Body midpoint', 'Tail base', 'Neck-body midpoint', 'Body-tail midpoint']
mirrored_column_matches: []
columns_for_singleview_pca: [5, 7, 8]

training parameters
imgaug: dlc
train_batch_size: 16
val_batch_size: 32
test_batch_size: 32
train_prob: 0.95
val_prob: 0.05
train_frames: 1
num_gpus: 1
num_workers: 4
early_stop_patience: 3
unfreezing_epoch: 20
min_epochs: 300
max_epochs: 750
log_every_n_steps: 10
check_val_every_n_epoch: 5
gpu_id: 0
rng_seed_data_pt: 0
rng_seed_model_pt: 0
lr_scheduler: multisteplr
lr_scheduler_params: {'multisteplr': {'milestones': [150, 200, 250], 'gamma': 0.5}}

model parameters
losses_to_use: []
backbone: resnet50_animal_ap10k
model_type: heatmap
heatmap_loss_type: mse
model_name: test
checkpoint: None

dali parameters
general: {'seed': 123456}
base: {'train': {'sequence_length': 32}, 'predict': {'sequence_length': 96}}
context: {'train': {'batch_size': 8}, 'predict': {'sequence_length': 96}}

losses parameters
pca_multiview: {'log_weight': 5.0, 'components_to_keep': 3, 'epsilon': None}
pca_singleview: {'log_weight': 5.0, 'components_to_keep': 0.99, 'epsilon': None}
temporal: {'log_weight': 5.0, 'epsilon': 20.0, 'prob_threshold': 0.05}

callbacks parameters
anneal_weight: {'attr_name': 'total_unsupervised_importance', 'init_val': 0.0, 'increase_factor': 0.01, 'final_val': 1.0, 'freeze_until_epoch': 0}
Very strange, I don't understand why you're not able to build a MultiviewHeatmapDataset. I would suggest debugging by putting some print statements into the actual code itself to get a better idea of what's going wrong. The relevant function is get_dataset - based on your error traceback this function is returning a HeatmapDataset. So at the top of this function put a print(cfg), or otherwise try to understand which part of the if/else statement you're entering and why. You can see that a MultiviewHeatmapDataset will be returned if cfg.data contains the field view_names and the length of that field is greater than 1 (meaning multiple views), which does seem to be the case based on your config above.
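If it helps, the branch being described looks roughly like this (a paraphrase for debugging purposes, not the actual lightning-pose source):

# Paraphrase of the logic described above, NOT the actual source: a
# multiview dataset should be built whenever cfg.data has a view_names
# field with more than one entry. `cfg` is the Hydra/OmegaConf config.
def which_dataset(cfg):
    print(cfg)  # debug: confirm what the function actually receives
    view_names = cfg.data.get("view_names", None)
    if view_names is not None and len(view_names) > 1:
        return "MultiviewHeatmapDataset"  # expected path for your config
    return "HeatmapDataset"               # the path your traceback shows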
Ok I have figured out what is going wrong.
When calling training, my PowerShell is using the scripts.py from this path: /mnt/c/Users/X48823DG/lightning-pose/Pose-app/lightning-pose/lightning_pose/utils/scripts.py
Updating to the dynamic_crop branch in my LP environment is instead updating the scripts.py at this path: /mnt/c/Users/X48823DG/lightning-pose/lightning_pose/utils/scripts.py
As to why, I have no idea. I have been running all previous training in the following directory: /mnt/c/Users/X48823DG/lightning-pose
ah ok - are you trying to run the multiview training from the command line or from the app?
@themattinthehatt Thank you so much for the updates!
I'm trying to run the multiview training from the command line but get this error.
Here is my configuration:
data parameters
image_orig_dims: {'height': 960, 'width': 960}
image_resize_dims: {'height': 256, 'width': 256}
data_dir: /home/yiting/Documents/LP_projects/LP_240613
video_dir: /home/yiting/Documents/LP_projects/LP_240613/videos_checked
csv_file: ['camBL.csv', 'camBo.csv', 'camBR.csv', 'camTL.csv', 'camTo.csv', 'camTR.csv']
view_names: ['camBL', 'camBo', 'camBR', 'camTL', 'camTo', 'camTR']
downsample_factor: 2
num_keypoints: 12
keypoint_names: ['index_Tip', 'index_DIP', 'index_PIP', 'middle_Tip', 'middle_DIP', 'middle_PIP', 'ring_Tip', 'ring_DIP', 'ring_PIP', 'small_Tip', 'small_DIP', 'small_PIP']
mirrored_column_matches: None
columns_for_singleview_pca: [0, 1, 3, 4, 5, 6, 7, 8]

training parameters
imgaug: dlc
train_batch_size: 12
val_batch_size: 24
test_batch_size: 24
train_prob: 0.8
val_prob: 0.1
train_frames: 1
num_gpus: 1
num_workers: 4
early_stop_patience: 3
unfreezing_epoch: 20
min_epochs: 300
max_epochs: 300
log_every_n_steps: 10
check_val_every_n_epoch: 5
gpu_id: 0
rng_seed_data_pt: 0
rng_seed_model_pt: 0
lr_scheduler: multisteplr
lr_scheduler_params: {'multisteplr': {'milestones': [150, 200, 250], 'gamma': 0.5}}

model parameters
losses_to_use: ['temporal', 'pca_singleview', 'pca_multiview']
backbone: resnet50_animal_ap10k
model_type: heatmap_mhcrnn
heatmap_loss_type: mse
model_name: test
checkpoint: None

dali parameters
general: {'seed': 123456}
base: {'train': {'sequence_length': 12}, 'predict': {'sequence_length': 48}}
context: {'train': {'batch_size': 6}, 'predict': {'sequence_length': 48}}

losses parameters
pca_multiview: {'log_weight': 5.0, 'components_to_keep': 3, 'epsilon': None}
pca_singleview: {'log_weight': 5.0, 'components_to_keep': 0.99, 'epsilon': None}
temporal: {'log_weight': 5.0, 'epsilon': 20.0, 'prob_threshold': 0.05}

callbacks parameters
anneal_weight: {'attr_name': 'total_unsupervised_importance', 'init_val': 0.0, 'increase_factor': 0.01, 'final_val': 1.0, 'freeze_until_epoch': 0}
@YitingChang we're still working on getting the unsupervised losses working for the multiview case, so for now you'll have to just train a fully supervised model by setting model.losses_to_use: [] in the config file
ah ok - are you trying to run the multiview training from the command line or from the app?
Hi, It goes through the command line. I solved that issue by re-installing LP without the app. However, I am now facing a new error:
I'm getting the same error.
ok not sure why this is popping up now since I haven't updated this branch in a few months - I'll look into this as soon as possible (might take a couple days) and get back to you. Thanks for your patience!
@DavidGill159 @YitingChang I fixed the error and pushed the code to the dynamic_crop branch (5829fbe910c2cd3eac845dc5bd50f4750e5e7063). I added some new tests and everything looks to be working fine, but please let me know if you run into any other issues.
@YitingChang I also merged in all the newest updates from main, including the hand backbone, so you should be set to use that backbone to train a multiview model now.
@DavidGill159 @YitingChang I fixed the error and pushed the code to the dynamic_crop branch (5829fbe). I added some new tests and everything looks to be working fine, but please let me know if you run into any other issues.
Great all working now! 👍
