What is the best way to take particles from cryosparc to cryodrgn?
How are most people taking extracted particle images from cryosparc for use in cryoDRGN? In the extract particles job in cryosparc, the extracted particles are not output. I only see the full micrographs to download as well as the file containing the particle coordinates. Do I need to re-extract the particles in Relion or is there something I'm missing in cryosparc where I can download the extracted particle images? Just looking for the best way to handle this and figure everyone on this forum has dealt with this before.
Thanks!
Hi amitrox,
To export particles from cryoSPARC to cryoDRGN, you need two things:
- The particles.cs file from the last iteration of your consensus refinement job
- The path to the particle extraction job, which was the source for the particles for the consensus refinement.
For example, my consensus refinement was J380, which referred to particles extracted in J379. These are the 4 commands I used to import the poses and CTF parameters from the consensus refinement, and the particle images from the extraction job.
.../cryodrgn$ cryodrgn parse_pose_csparc /path-to-CS-project/J380/J380_007_particles.cs -D 480 -o poses.pkl
.../cryodrgn$ cryodrgn parse_ctf_csparc /path-to-CS-project/J380/J380_007_particles.cs -o ctf.pkl
.../cryodrgn/particles_256$ cryodrgn downsample /path-to-CS-project/J380/J380_007_particles.cs -D 256 -o particles.256.mrcs --chunk 10000 --datadir /path-to-CS-project/J379/extract/
.../cryodrgn/particles_128$ cryodrgn downsample ../particles_256/particles.256.txt -D 128 -o particles.128.mrcs --chunk 10000
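If you repeat this for several jobs, the four commands can be assembled from a few variables. The following is only a minimal sketch: the function names, the out_dir layout, and the dry_run switch are made up for illustration, and the only cryodrgn flags used are the ones shown in the commands above. I am assuming the -D value passed to parse_pose_csparc is the box size in pixels of the refinement (480 in my case).

```python
import os
import subprocess


def build_import_commands(cs_file, extract_dir, orig_box, out_dir="."):
    """Return the four cryodrgn command lines as lists of arguments.

    cs_file     : particles .cs file from the consensus refinement
    extract_dir : extract/ directory of the particle extraction job
    orig_box    : box size in pixels (assumed meaning of -D for parse_pose_csparc)
    out_dir     : hypothetical output directory for poses, CTF and particle stacks
    """
    p256 = os.path.join(out_dir, "particles_256")
    p128 = os.path.join(out_dir, "particles_128")
    return [
        ["cryodrgn", "parse_pose_csparc", cs_file, "-D", str(orig_box),
         "-o", os.path.join(out_dir, "poses.pkl")],
        ["cryodrgn", "parse_ctf_csparc", cs_file,
         "-o", os.path.join(out_dir, "ctf.pkl")],
        ["cryodrgn", "downsample", cs_file, "-D", "256",
         "-o", os.path.join(p256, "particles.256.mrcs"),
         "--chunk", "10000", "--datadir", extract_dir],
        # The second downsample reads the .txt list written by the chunked first one.
        ["cryodrgn", "downsample", os.path.join(p256, "particles.256.txt"),
         "-D", "128", "-o", os.path.join(p128, "particles.128.mrcs"),
         "--chunk", "10000"],
    ]


def run_import(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # Make sure the directory of the -o output exists before running.
            out_path = cmd[cmd.index("-o") + 1]
            os.makedirs(os.path.dirname(out_path) or ".", exist_ok=True)
            subprocess.run(cmd, check=True)
```

Calling run_import(build_import_commands(...)) with the default dry_run=True only prints the commands, so you can compare them against the ones above before running anything.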
cryoDRGN will read the particles.cs file to find the name of a particle images file, something like:
000000006985639979312_FoilHole_13405301_Data_13384918_6_20240614_232414_EER_patch_aligned_doseweighted_particles.mrc
It then appends this file name to the path of the particle images directory, which you need to provide using the --datadir flag.
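Before starting a long downsample run, it can help to check that every particle stack named in the .cs file actually exists under the directory you plan to pass as --datadir. Here is a rough sketch of that check. It is not cryoDRGN's own code: it assumes the .cs file is a NumPy structured array that np.load can read, that the stack paths are stored in a field called blob/path, and that only the file name (not the job-relative path) is joined to --datadir, as described above. Both function names are hypothetical.

```python
import os


def resolve_particle_paths(blob_paths, datadir):
    """Join the file name of each referenced stack with datadir.

    Returns (resolved, missing): all unique candidate paths, and the subset
    that does not exist on disk.
    """
    names = set()
    for p in blob_paths:
        if isinstance(p, bytes):  # paths may be stored as bytes
            p = p.decode("utf-8")
        names.add(os.path.basename(p))
    resolved = [os.path.join(datadir, name) for name in sorted(names)]
    missing = [p for p in resolved if not os.path.isfile(p)]
    return resolved, missing


def blob_paths_from_cs(cs_file):
    """Read stack paths from a .cs file (assumed layout, see note above)."""
    import numpy as np  # imported here so the path check alone needs no NumPy

    particles = np.load(cs_file)
    return list(particles["blob/path"])
```

For example, resolve_particle_paths(blob_paths_from_cs("/path-to-CS-project/J380/J380_007_particles.cs"), "/path-to-CS-project/J379/extract/") should return an empty missing list if the --datadir path is right.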
Also, the extract particles job in cryoSPARC should look like this:
The extract/ directory should have a bunch of .mrc files named something like patch_aligned_doseweighted_particles.mrc. I believe each .mrc file contains a stack of particle images for the specified micrograph.
The consensus refinement job should look like this:
While we're on this topic, @michal-g, I also wanted to report a possible bug in the tutorial documentation for the cryodrgn downsample command:
"The input format to specify the particle stack may also be a .star file or a .cs file.
If the paths to the .mrcs particles given by the .star/.cs file are broken, you can overwrite them using the argument --datadir [PATH TO DIRECTORY WITH .MRCS]. In some cases, the --datadir path should point to the project directory in order to complete relative file paths given in the .star or .cs file."
When I used cryodrgn downsample to import particles from cryoSPARC v4.6, the --datadir path had to refer to the directory in the extraction job that contains the particle images, not the cryoSPARC project directory.
Cheers, cbeck
Hey there,
I'm wondering what the best approach is for downsampling particles from two different extract jobs that you've combined (and run a deduplicate job on) in cryoSPARC processing. I have some rare views that I have had to train Topaz on specifically, and I've merged the particles back together later on. The box sizes and everything are the same.
Thanks in advance for any help on this!
Thank you for the detailed writeup @cbeck22. Sean, this should also be possible. If I remember correctly, there is an "Export" button on the outputs tab of a cryoSPARC job. This will create a single directory in your cryoSPARC project directory that links to all of the relevant extract jobs that you can provide as a --datadir argument.
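If the Export route does not work for you, a manual alternative is to build one directory of symbolic links yourself and pass that as --datadir. The sketch below is only an illustration under assumptions: it relies on --datadir needing nothing more than the particle stack file names to resolve, the function name and directory arguments are hypothetical, and it is not what the cryoSPARC Export button does internally.

```python
import os


def link_particle_stacks(extract_dirs, combined_dir):
    """Symlink every .mrc/.mrcs stack from several extract/ directories
    into one combined directory, and return the list of links created.

    Raises FileExistsError if two jobs contain a stack with the same file
    name (or if a link already exists), so nothing is silently overwritten.
    """
    os.makedirs(combined_dir, exist_ok=True)
    linked = []
    for src_dir in extract_dirs:
        for name in sorted(os.listdir(src_dir)):
            if not name.endswith((".mrc", ".mrcs")):
                continue
            dst = os.path.join(combined_dir, name)
            if os.path.lexists(dst):
                raise FileExistsError(f"{name} already present in {combined_dir}")
            # Absolute target so the link stays valid from any working directory.
            os.symlink(os.path.abspath(os.path.join(src_dir, name)), dst)
            linked.append(dst)
    return linked
```

You would call it with the extract/ directories of the two extraction jobs and then give the combined directory to cryodrgn downsample via --datadir.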