Single-cell representation learning
Accumulated changes for single-cell representation learning.
@edyoshikun this PR includes breaking API changes for image translation (#145).
Pending before merging this to main:
- #159
- #160
- #164
#159 and #168 were still a bit broken. I'm merging now to prepare the branch point, but they should be fixed before merging to main.
To merge this branch and release candidate 0.3.0-rc1, we need to test the following:
- [ ] Demos and notebooks that illustrate robust virtual staining: @ziw-liu @edyoshikun
- [ ] Try training models with updated example configs: @ziw-liu
- [ ] Fix any remaining bugs in the representation learning code path.
We decided that the configs and checkpoints posted for the preprint will continue to depend on release 0.2.0.
Things I have tested with the current HEAD of this branch:
- Train VS model with:
  /hpc/projects/intracellular_dashboard/viral-sensor/infection_classification/models/phase-to-sensor/2024_08_14_ZIKV_pal17_48h/fit.yml
- Predict with VSCyto2D (reported in the VS preprint) with:
  /hpc/projects/intracellular_dashboard/ops/2024_09_19_tracking_accuracy_test/2-VS/tta/predict.yml
- Import paths in the example VS notebooks and configs are correct
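For context, fit.yml and predict.yml configs like the ones above follow a LightningCLI-style layout with trainer/model/data sections. A minimal sketch of that shape, with hypothetical class paths and placeholder values (not the actual contents of the files listed above):

```yaml
# Illustrative LightningCLI-style fit config; class paths and values
# are placeholders, not the real VisCy configs referenced above.
trainer:
  max_epochs: 50
  accelerator: gpu
model:
  class_path: viscy.models.VSUNet        # hypothetical class path
  init_args:
    architecture: UNeXt2
data:
  class_path: viscy.data.HCSDataModule   # hypothetical class path
  init_args:
    data_path: /path/to/dataset.zarr
    batch_size: 32
```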
Things that need to be tested before release:
@edyoshikun or @mattersoflight:
- [x] End-to-end testing of the VS example notebooks.
- [ ] (Optional) update the HF demo
@Soorya19Pradeep:
- [x] Training of a new contrastive model
- [x] Prediction using the model checkpoint we report in the paper.
@mattersoflight I'm still working on https://github.com/mehta-lab/VisCy/pull/181 which will also introduce user interface changes. Should we do comprehensive release candidate testing after that?
@ziw-liu I suggest merging #181 (CLI interface) in this branch and then doing the tests you outlined so that everyone builds familiarity with the revised CLI.
Since these two PRs make multiple breaking (and welcome) changes to the codebase, I suggest tagging the current head of main as 0.2.1-rc1 or similar. We don't need to push this to PyPI; it is just for us to check out the current state of main if the need arises.
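For reference, the tagging suggested above would just be an annotated tag plus an explicit push (no PyPI release). Demonstrated below in a throwaway repo so the commands run end to end; on the real repo, only the `git tag` (and optional push) on the head of main is needed, and the tag name follows the suggestion above:

```shell
# Illustrative sketch: create an annotated rc tag marking the branch point.
tmp=$(mktemp -d) && cd "$tmp" && git init -q .
git -c user.name=dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "stand-in for the head of main"
git -c user.name=dev -c user.email=dev@example.com \
    tag -a 0.2.1-rc1 -m "branch point before 0.3 breaking changes"
git tag -l
# On the real repo, tags must be pushed explicitly (not pushed by default):
# git push origin 0.2.1-rc1
```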
@mattersoflight Compared to the latest stable release (v0.2.1), the current HEAD of main adds a visualization script (#144) and a link to the demo (#172), so there should be no behavior change. If you still think we need a tag, I'm comfortable with just tagging v0.2.2 stable.
I have done one round of testing. The training is underway and the prediction using an earlier model checkpoint was completed.
As discussed, the HF model is pinned to use <0.3 versions and the Gradio code is not exposed to the user, so we don't need to update this for now. The model weights are posted separately and point to this GitHub repository.
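As a concrete (illustrative) example of such a pin, the Space's dependency file could constrain the package to the pre-0.3 series; the file name and exact bounds here are assumptions, not the actual HF demo configuration:

```
# requirements.txt sketch (illustrative): keep the HF demo on the 0.2.x API
viscy>=0.2,<0.3
```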
@ziwen, I'm done testing the virtual staining end-to-end. I didn't run any of the representation learning. I really appreciate the new config file structure and CLI. This will work well with any type of custom dataloaders and models. Thank you!