Extraction of particles from kmeans clusters in star format
Hi @zhonge
- Could you please suggest how to extract the particles for a given kmeans cluster and then convert them to star format for traditional refinement? Note: with the lasso tool, the selection is done manually and is therefore not precise. Since the kmeans clustering algorithm assigns a label to every particle, there must be a way to get the indices for a given cluster directly. Kindly suggest.
- Could we view the latent space in the Jupyter notebook colored by the kmeans cluster labels?
Thanks and Regards.
Thanks for your question! There are a few options for selecting particles from the desired cluster:
- You can select the particles using the cryoDRGN_filtering.ipynb notebook from cryodrgn analyze. There is a section of this notebook for selecting based on GMM or kmeans cluster labels. There is also a cell that generates a visualization of the selected cluster. Here is an example in the tutorial: https://www.notion.so/cryoDRGN-EMPIAR-10076-tutorial-c8728dcc88e744c8827447c3ff094d19#770194967d274239bfca4c52868198aa
- On the command line, there is a script for selecting clusters in the utils subdirectory of the repo: https://github.com/zhonge/cryodrgn/blob/master/utils/select_clusters.py
For example, to select the particles in clusters 3, 5, and 7:
(cryodrgn) $ python select_clusters.py /path/to/analyze/directory/kmeans20/labels.pkl --sel 3 5 7 -o selected_clusters.pkl
If your training job was already run on a filtered subset (i.e. you provided a selection with --ind to cryodrgn train_vae), you also need to include the --parent-ind and --N-orig arguments to get the indices into the original particle stack.
- Finally, there is a new tool, cryodrgn analyze_landscape, available in the latest beta version of cryodrgn (1.0.0-beta) for assigning and selecting classes from the cryodrgn results. I am still working on the documentation for landscape analysis, but a work-in-progress version is here: https://www.notion.so/cryodrgn-conformational-landscape-analysis-a5af129288d54d1aa95388bdac48235a
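
For anyone who wants to see what the cluster selection amounts to, here is a minimal sketch in plain NumPy. It is not the code of select_clusters.py. It assumes that labels.pkl holds a pickled 1-D integer array with one kmeans label per particle, and that the parent selection (if any) is a 1-D array of indices into the original stack. The function names `select_cluster_indices` and `save_indices` are hypothetical and used only for illustration.

```python
import pickle

import numpy as np


def select_cluster_indices(labels, selected, parent_ind=None):
    """Return the indices of particles whose label is in `selected`.

    labels:     1-D array, one cluster label per particle (assumed format).
    selected:   iterable of cluster labels to keep, e.g. [3, 5, 7].
    parent_ind: optional 1-D array mapping each particle used in training
                back to its index in the original stack (assumed format).
    """
    labels = np.asarray(labels)
    # Positions (within the training set) of particles in the chosen clusters
    ind = np.flatnonzero(np.isin(labels, list(selected)))
    if parent_ind is not None:
        # Training used a filtered subset: map back to original-stack indices
        ind = np.asarray(parent_ind)[ind]
    return ind


def save_indices(ind, path):
    """Pickle the selected indices (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(np.asarray(ind), f)


# Toy example with six particles
toy_labels = np.array([3, 0, 5, 7, 1, 3])
toy_ind = select_cluster_indices(toy_labels, [3, 5, 7])
print(toy_ind)  # [0 2 3 5]

# Same selection when training used only particles 10..15 of the original stack
toy_parent = np.arange(10, 16)
print(select_cluster_indices(toy_labels, [3, 5, 7], parent_ind=toy_parent))  # [10 12 13 15]
```

With real data, the labels would be loaded with `pickle.load` from the kmeans directory under the analyze output, and the saved index file could then be used to subset the particle stack before converting it to star format.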