cryodrgn
Downsample with .cs particles as source sometimes reads more particles than parse_pose and parse_ctf
I used a particles.cs file from an exported CS job as input for downsample, with the --datadir argument specifying the folder, and it loads all particles nicely. In fact, it loads ALL particles in the extract folders located within the exported CS job, even though some of those particles are excluded from the actual particle stack in CS. parse_pose and parse_ctf, on the other hand, read the correct number of particles from the .cs file, so when I try to run training I get an error because of the mismatch in particle numbers. Would it be possible to read in only the particles that are part of the final stack, rather than all particles located within the extract folder?
I circumvented this for now by re-extracting the particles, which worked just fine but takes up unnecessary disk space.
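For anyone comparing counts, the number of particles a .cs file references can be checked independently of cryoDRGN. The following is a minimal sketch, not cryoDRGN's own loader: it assumes the .cs file is a NumPy structured array readable with np.load and that the stack path of each particle is stored in a field named blob/path. The helper names particles_per_stack and summarize are hypothetical.

```python
from collections import Counter

import numpy as np


def particles_per_stack(cs_path):
    """Return {stack path: number of particles the .cs file references in it}."""
    # Assumption: a .cs file is a NumPy structured array on disk.
    records = np.load(cs_path)
    # Assumption: the per-particle stack path lives in the "blob/path" field.
    paths = records["blob/path"]
    counts = Counter(
        p.decode() if isinstance(p, bytes) else str(p) for p in paths
    )
    return dict(counts)


def summarize(cs_path):
    """Print and return the total number of particles referenced by the .cs file."""
    counts = particles_per_stack(cs_path)
    total = sum(counts.values())
    print(f"{total} particles referenced across {len(counts)} stack files")
    return total
```

If the total reported here matches what parse_pose and parse_ctf read but is smaller than what downsample loads, the extra particles come from the stack files on disk rather than from the .cs file.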
Thanks for reporting. Are you providing a different .cs file to cryodrgn downsample than to the cryodrgn parse_* commands?
No, it is the same file.
Could you email the .cs file to me ([email protected]) and Vineet Bansal @vineetbansal ([email protected])? We will take a look. Thanks!
Hi @epkumpu - we're not seeing anything obvious in the cryodrgn codebase that would cause this behavior. If you can send us the particles.cs file for which you're seeing a different number of processed particles in downsample vs parse_pose, it will be immensely helpful for us to squash this bug. Thanks!
Hello @epkumpu - thanks for the sample .cs data. We have a suspicion of what the problem might be, though to be sure, we were wondering if you could check whether the directory you're specifying as --datadir has your master mrc files directly inside it, without any intervening folders, for example:
- <datadir>/011268377466662442732_FoilHole_25024273_Data_25002032_25002034_20220227_031106_fractions_patch_aligned_doseweighted_particles.mrc
- <datadir>/011268377466662442732_FoilHole_25024273_Data_25002032_25002034_20220227_031106_fractions_patch_aligned_doseweighted_particles.mrc
in addition to where you expect them to be, i.e.:
- <datadir>/J777/extract/011268377466662442732_FoilHole_25024273_Data_25002032_25002034_20220227_031106_fractions_patch_aligned_doseweighted_particles.mrc
- <datadir>/J777/extract/011268377466662442732_FoilHole_25024273_Data_25002032_25002034_20220227_031106_fractions_patch_aligned_doseweighted_particles.mrc
If that is the case, can I ask you to move those files out of --datadir and redo the downsample step to see whether cryoDRGN picks up the correct number of particles? Thanks!
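The check described above can also be scripted. This is a rough sketch using only the Python standard library; it does not reflect how cryoDRGN resolves paths internally, and the helper names stacks_directly_in and find_duplicate_stacks as well as the default *.mrc pattern are assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def stacks_directly_in(datadir, pattern="*.mrc"):
    """List stack files sitting directly inside datadir (no intervening folders)."""
    return sorted(p.name for p in Path(datadir).glob(pattern) if p.is_file())


def find_duplicate_stacks(datadir, pattern="*.mrc"):
    """Map each file name found more than once under datadir to its locations."""
    locations = defaultdict(list)
    # rglob walks datadir itself and every subfolder, e.g. J777/extract.
    for path in Path(datadir).rglob(pattern):
        if path.is_file():
            locations[path.name].append(path.relative_to(datadir))
    return {
        name: sorted(paths)
        for name, paths in locations.items()
        if len(paths) > 1
    }
```

Calling find_duplicate_stacks on the folder passed as --datadir returns an empty dictionary when every stack file name occurs only once; any entry it does return names a file that exists in more than one place, which is the situation to rule out before rerunning downsample.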