Image size should be even error
Describe the bug
Running preprocessing without --no-keep-real produces particles with a box size of 65 pixels instead of 64, and train_cv for opusDSD then complains that the image size isn't even.
Traceback (most recent call last):
File "/home/jkrieger/software/miniconda/envs/opusdsd-0.3.2b/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/jkrieger/software/miniconda/envs/opusdsd-0.3.2b/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/jkrieger/software/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/commands/train_cv.py", line 1157, in <module>
main(args)
File "/home/jkrieger/software/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/commands/train_cv.py", line 725, in main
data = dataset.LazyMRCData(args.particles, norm=args.norm,
File "/home/jkrieger/software/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/dataset.py", line 51, in __init__
assert ny % 2 == 0, "Image size must be even"
AssertionError: Image size must be even
Looking at the particles in Python confirms the problem:
In [1]: particles='/home/jkrieger/ScipionUserData/projects/TestOpusDsd/Runs/000186_OpusDsdProtPreprocess/output_particles/particles.64.ft.txt'
In [2]: from cryodrgn import dataset
Installed qt5 event loop hook.
In [3]: norm=None
In [4]: real_data=True
In [5]: invert_data=True
In [6]: ind=None
In [7]: use_real=True # it actually doesn't matter what value this takes as it isn't used in the code
In [8]: window=False
In [9]: relion31=True
In [10]: data=None
In [11]: datadir=None
In [12]: window_r=.85
In [13]: in_mem=True
In [14]: notinmem=False
In [15]: data = dataset.LazyMRCData(particles, norm=norm,
...: real_data=real_data, invert_data=invert_data,
...: ind=ind, keepreal=use_real, window=False,
...: datadir=datadir, relion31=relion31, window_r=window_r, in_mem=(not notinmem))
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[15], line 1
----> 1 data = dataset.LazyMRCData(particles, norm=norm,
2 real_data=real_data, invert_data=invert_data,
3 ind=ind, keepreal=use_real, window=False,
4 datadir=datadir, relion31=relion31, window_r=window_r, in_mem=(not notinmem))
File ~/software/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/dataset.py:51, in LazyMRCData.__init__(self, mrcfile, norm, real_data, keepreal, invert_data, ind, window, datadir, relion31, window_r, in_mem)
49 ny, nx = particles[0].get().shape
50 assert ny == nx, "Images must be square"
---> 51 assert ny % 2 == 0, "Image size must be even"
52 log('Loaded {} {}x{} images'.format(N, ny, nx))
53 self.particles = particles
AssertionError: Image size must be even
In [18]: mrcfile=particles
In [19]: particles = dataset.load_particles(mrcfile, True, datadir=datadir, relion31=relion31)
In [20]: type(particles)
Out[20]: list
In [21]: N = len(particles)
...: ny, nx = particles[0].get().shape
In [22]: ny
Out[22]: 65
In [23]: nx
Out[23]: 65
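The failing checks can be reproduced outside the dataset class with a minimal standalone sketch (the helper name here is hypothetical; it only mirrors the two assertions quoted from dataset.py above):

```python
import numpy as np

def check_particle_shape(img: np.ndarray) -> None:
    # Mirror the shape assertions from cryodrgn/dataset.py (lines 50-51)
    ny, nx = img.shape
    assert ny == nx, "Images must be square"
    assert ny % 2 == 0, "Image size must be even"

check_particle_shape(np.zeros((64, 64)))  # passes silently

try:
    check_particle_shape(np.zeros((65, 65)))  # the 65-pixel case above
except AssertionError as e:
    print(e)  # Image size must be even
```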
To Reproduce
CUDA_VISIBLE_DEVICES=0
python -m cryodrgn.commands.preprocess Runs/000186_OpusDsdProtPreprocess/extra/input_particles.star -o Runs/000186_OpusDsdProtPreprocess/output_particles/particles.64.mrcs -D 64 --window-r 0.85 --max-threads 16 --relion31 -b 5000
python -m cryodrgn.commands.parse_pose_star Runs/000186_OpusDsdProtPreprocess/extra/input_particles.star -o Runs/000186_OpusDsdProtPreprocess/output_particles/poses.pkl --relion31 -D 64 --Apix 3.54
python -m cryodrgn.commands.parse_ctf_star Runs/000186_OpusDsdProtPreprocess/extra/input_particles.star -o Runs/000186_OpusDsdProtPreprocess/output_particles/ctfs.pkl --relion31 -D 64 --Apix 3.54 --kv 300.0 --cs 2.7 -w 0.1 --ps 0
python -m cryodrgn.commands.train_cv Runs/000186_OpusDsdProtPreprocess/output_particles/particles.64.ft.txt --poses Runs/000186_OpusDsdProtPreprocess/output_particles/poses.pkl --ctf Runs/000186_OpusDsdProtPreprocess/output_particles/ctfs.pkl --zdim 2 -o Runs/000237_OpusDsdProtTrain/output -n 3 --preprocessed --max-threads 1 --enc-layers 3 --enc-dim 1024 --dec-layers 3 --dec-dim 1024 --lazy-single --pe-type vanilla --encode-mode grad --template-type conv -b 5000 --lr 0.00012 --beta-control 1.0 --beta cos --downfrac 0.5 --valfrac 0.2 --lamb 1.0 --bfactor 4.0 --templateres 192
Expected behavior
I'm not actually sure. I'd expect preprocess to work upstream of train_cv, though perhaps not with the arguments I used, as I just noticed it isn't one of the recommended steps in the README. I suppose the convolutional network means there is less need to downsample the map for efficiency.
I think there is probably a problem that --no-keep-real doesn't do what it's supposed to (see #4, which I closed because I wasn't sure this was the right answer).
I'd also expect the keepreal argument to do something in LazyMRCData and to control whether the even-image-size check is applied.
Additional context
- You should probably know that I am making a plugin for opusDSD within the Scipion workflow engine, which can be found at https://github.com/scipion-em/scipion-em-opusdsd. This allows opusDSD to be run from a GUI and included in pipelines. If you would like to meet and see what I'm doing and be involved, you are very welcome to.
- These results are from using the test dataset that was also used for CryoDRGN, which comes from a refinement in a Relion tutorial dataset and contains 1799 particles.
I also have a warning from cryodrgn that we need to have a box size divisible by 8. Is this still true for opusdsd?
Note that I still get this error even if I don't do any downsampling.
This is not true for opusDSD, since I didn't implement apex acceleration. CryoDRGN uses apex.amp mixed-precision acceleration, which requires a box size divisible by 8 if you enable it during training.
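As a rule of thumb, the two constraints can be summarized in a small check (this helper is my own illustration, not part of either codebase): an even box size is always required, while divisibility by 8 only matters when apex.amp is enabled.

```python
def box_size_ok(d: int, use_amp: bool = False) -> bool:
    """Illustrative box-size check: even size is always required;
    divisibility by 8 only matters with apex.amp (cryoDRGN only)."""
    if d % 2 != 0:
        return False
    return not (use_amp and d % 8 != 0)

print(box_size_ok(64))                # True
print(box_size_ok(65))                # False: odd size
print(box_size_ok(68, use_amp=True))  # False: not divisible by 8
print(box_size_ok(68))                # True without apex.amp
```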
James, thank you very much! I will look into this issue!
The preprocess routine in cryoDRGN might not work with opusDSD without special attention. OpusDSD takes the image stacks as input directly, and the images are in real space as they are. In contrast, cryoDRGN needs to take the Fourier transform of the images, so it implements a preprocess routine to do the FFT in advance. Therefore, you need to make sure that the images downsampled by cryoDRGN's preprocess are still in real space (I will check the no-keep-real argument). To downsample the image stack, I usually use relion_preprocess from RELION, or specify a downsampling argument during training, like --downfrac 0.5, which downsamples the images to half of the original dimension. To go under the hood, James, note that there is a "data_augmentation" function in train_cv.py, which handles the downsampling, shift, and blurring operations.
Oh, can you check the content of 'particles.64.ft.txt'? Is it pointing to particles.64.mrcs? You can also check the header of the mrcs with IMOD's header command, e.g. 'header particles.64.mrcs'.
https://github.com/alncat/opusDSD/blob/e931522987ed2b8fc8914943768d5a7452189493/cryodrgn/commands/preprocess.py#L107C21-L107C21 It looks like no-keep-real enables the Hartley transform (HT) on the image, so the output is D+1.
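A quick way to see where the extra pixel can come from: for an even box size D, a symmetrized transform that keeps both edge frequencies -D/2 and +D/2 ends up with D+1 samples per axis. A sketch of just that arithmetic (not the actual preprocess code):

```python
import numpy as np

D = 64
# Centered frequency indices, keeping both -D/2 and +D/2:
freqs = np.arange(-D // 2, D // 2 + 1)
print(freqs[0], freqs[-1], len(freqs))  # -32 32 65
```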
Thanks for all the responses. That’s really helpful!
I’ll have a look at particles.64.ft.txt and see. I’d guess the images are already in Fourier space for that
Yes, inside particles.64.ft.txt it says particles.64.0.ft.mrcs
Sorry for the late reply, I was traveling last weekend. Oh, you can then try loading particles.64.mrcs directly to see if the images are in real space and of even size. Regarding the training command 'python -m cryodrgn.commands.train_cv Runs/000186_OpusDsdProtPreprocess/output_particles/particles.64.ft.txt --poses Runs/000186_OpusDsdProtPreprocess/output_particles/poses.pkl --ctf Runs/000186_OpusDsdProtPreprocess/output_particles/ctfs.pkl --zdim 2 -o Runs/000237_OpusDsdProtTrain/output -n 3 --preprocessed --max-threads 1 --enc-layers 3 --enc-dim 1024 --dec-layers 3 --dec-dim 1024 --lazy-single --pe-type vanilla --encode-mode grad --template-type conv -b 5000 --lr 0.00012 --beta-control 1.0 --beta cos --downfrac 0.5 --valfrac 0.2 --lamb 1.0 --bfactor 4.0 --templateres 192': the arguments '--preprocessed --max-threads 1 --enc-layers 3 --enc-dim 1024 --dec-layers 3 --dec-dim 1024' should be dropped as they relate to cryoDRGN. '-b' is the batch size; 5000 might be too large. You can set it to a number around 20 that fits into GPU memory (depending on your hardware). If you have multiple GPUs, you can try 'multigpu' and 'num-gpus'. Since 64 is a very small size, downfrac can be set to 1.0 (meaning no downsampling), and templateres can be set to a smaller size like 128. Finally, you can try making a mask from the consensus model, following https://relion.readthedocs.io/en/release-3.1/SPA_tutorial/Mask.html . You can also try the '--plot' option, which shows some intermediate results interactively during training.
Thanks for the comments. No worries about the delay.
I think particles.64.0.ft.mrcs is probably in Fourier space and not of even size.
I decided to upsample the particles to 130 for now and use downfrac 1.0, because there is an error that the size has to be bigger than 128, and 130 is the first even number above that. This is still for quick testing at the moment, and it is giving results now, which I think look OK considering how small the dataset is.
I'll try removing those parts, especially --preprocessed --max-threads 1, and change the batch size. Isn't it good to have an option for controlling the layers and dims for the encoder and decoder?
Many thanks again
https://github.com/alncat/opusDSD/blob/86bed17a235c3a166ca03b51aa75963a3f81c63e/cryodrgn/commands/train_cv.py#L948 This line can be deleted since this size limit no longer holds. The intermediate tensors are resampled to 12^3 in the encoder: https://github.com/alncat/opusDSD/blob/86bed17a235c3a166ca03b51aa75963a3f81c63e/cryodrgn/models.py#L663 . Hence, the encoder works with any size now 😺. You can try deleting that line and testing on 64x64 images. It would be great if we could control the number of layers, but this requires refactoring the encoder and ConvTemplate classes (which would also make the code more readable, btw).
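The size-independence described above can be illustrated with a short sketch of resampling an activation to a fixed 12^3 grid via trilinear interpolation (the tensor shapes here are hypothetical; the actual code at the models.py link may differ):

```python
import torch
import torch.nn.functional as F

# Hypothetical intermediate activation: batch 1, 8 channels, 16^3 grid
x = torch.randn(1, 8, 16, 16, 16)

# Resample to a fixed 12^3 grid, independent of the input spatial size
y = F.interpolate(x, size=(12, 12, 12), mode="trilinear", align_corners=False)
print(tuple(y.shape))  # (1, 8, 12, 12, 12)
```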
Ok, yes, I'll delete this line and try it with 64x64 images, and remove the arguments about the layers etc.
Thanks again for all your help
I was actually trying 192 before and it was working. Now that I've deleted the line and those arguments, I'm getting another problem about not having a mask. Do we need to always have one?
It is recommended to have a mask, since the program can then determine the region with densities and crop out empty regions, which saves some memory. If no mask is supplied, the program uses a spherical mask with diameter 0.85 x image size. Since a mask usually comes with the consensus refinement result, I usually train with a mask. I will do some tests without a mask to make sure that option works.
Mask is handled in decoder here, https://github.com/alncat/opusDSD/blob/86bed17a235c3a166ca03b51aa75963a3f81c63e/cryodrgn/models.py#L888
It is handled in encoder here, https://github.com/alncat/opusDSD/blob/86bed17a235c3a166ca03b51aa75963a3f81c63e/cryodrgn/models.py#L509
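The default spherical mask described above (diameter 0.85 x image size) can be sketched in a few lines of numpy. This is an illustration of the geometry only; the actual opusDSD implementation at the links above may differ:

```python
import numpy as np

def spherical_mask(d: int, window_r: float = 0.85) -> np.ndarray:
    """Binary spherical mask of box size d with diameter window_r * d
    (illustrative sketch, not the opusDSD code)."""
    coords = np.linspace(-1.0, 1.0, d, endpoint=False)
    zz, yy, xx = np.meshgrid(coords, coords, coords, indexing="ij")
    r = np.sqrt(xx**2 + yy**2 + zz**2)
    return (r <= window_r).astype(np.float32)

mask = spherical_mask(64)
print(mask.shape, mask[32, 32, 32], mask[0, 0, 0])  # (64, 64, 64) 1.0 0.0
```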
Ah, the current code doesn't work without a mask. It needs some revisions to make it work.
Ok. Thanks for checking.
I guess you mean the dynamic mask from cryosparc? Relion and Xmipp do not automatically make any
Yes, cryosparc generates models together with masks at every iteration. James, you can check my latest commit: I made the default spherical mask work. The diameter of the mask can be controlled by --window-r.
Ok, I’ll probably be able to try it tomorrow or Thursday. Thanks
Hello,
Sorry for the delay. I've been quite busy and I got moved to a different workstation.
I've just given it another try and this part seems to be solved, but now there's another error:
(opusdsd-0.3.2b) flex@pascal ~/ScipionUserData/projects/TestOpusDsd $ eval "$(/home/flex/anaconda3/bin/conda shell.bash hook)"&& conda activate opusdsd-0.3.2b && CUDA_VISIBLE_DEVICES=0 python -m cryodrgn.commands.train_cv Runs/000160_OpusDsdProtTrain/extra/input_particles.star --poses Runs/000160_OpusDsdProtTrain/output/poses.pkl --ctf Runs/000160_OpusDsdProtTrain/output/ctfs.pkl --zdim 12 -o Runs/000160_OpusDsdProtTrain/output -n 3 --lazy-single --pe-type vanilla --encode-mode grad --template-type conv -b 20 --lr 0.00012 --beta-control 1.0 --beta cos --downfrac 1.0 --valfrac 0.2 --lamb 1.0 --bfactor 4.0 --templateres 192 --split Runs/000160_OpusDsdProtTrain/extra/sp-split.pkl --relion31
2023-11-29 15:57:29 /home/flex/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/commands/train_cv.py Runs/000160_OpusDsdProtTrain/extra/input_particles.star --poses Runs/000160_OpusDsdProtTrain/output/poses.pkl --ctf Runs/000160_OpusDsdProtTrain/output/ctfs.pkl --zdim 12 -o Runs/000160_OpusDsdProtTrain/output -n 3 --lazy-single --pe-type vanilla --encode-mode grad --template-type conv -b 20 --lr 0.00012 --beta-control 1.0 --beta cos --downfrac 1.0 --valfrac 0.2 --lamb 1.0 --bfactor 4.0 --templateres 192 --split Runs/000160_OpusDsdProtTrain/extra/sp-split.pkl --relion31
2023-11-29 15:57:29 Namespace(particles='/data/flex/ScipionUserData/projects/TestOpusDsd/Runs/000160_OpusDsdProtTrain/extra/input_particles.star', outdir='/data/flex/ScipionUserData/projects/TestOpusDsd/Runs/000160_OpusDsdProtTrain/output', ref_vol=None, zdim=12, poses='/data/flex/ScipionUserData/projects/TestOpusDsd/Runs/000160_OpusDsdProtTrain/output/poses.pkl', ctf='/data/flex/ScipionUserData/projects/TestOpusDsd/Runs/000160_OpusDsdProtTrain/output/ctfs.pkl', group=None, group_stat=None, load=None, latents=None, split='Runs/000160_OpusDsdProtTrain/extra/sp-split.pkl', valfrac=0.2, checkpoint=1, log_interval=1000, verbose=False, seed=23942, ind=None, invert_data=True, window=True, window_r=0.85, datadir=None, relion31=True, lazy_single=True, notinmem=False, lazy=False, preprocessed=False, max_threads=16, tilt=None, tilt_deg=45, num_epochs=3, batch_size=20, wd=0, lr=0.00012, lamb=1.0, downfrac=1.0, templateres=192, bfactor=4.0, beta='cos', beta_control=1.0, norm=None, tmp_prefix='tmp', amp=False, multigpu=False, num_gpus=4, do_pose_sgd=False, pretrain=1, emb_type='quat', pose_lr=0.0003, pose_enc=False, pose_only=False, plot=False, qlayers=3, qdim=256, encode_mode='grad', enc_mask=None, use_real=False, optimize_b=False, players=3, pdim=256, pe_type='vanilla', template_type='conv', warp_type=None, symm=None, num_struct=1, deform_size=2, pe_dim=None, domain='fourier', activation='relu')
2023-11-29 15:57:29 Use cuda True
2023-11-29 15:57:29 Loading dataset from /data/flex/ScipionUserData/projects/TestOpusDsd/Runs/000160_OpusDsdProtTrain/extra/input_particles.star
2023-11-29 15:57:29 Loaded 1799 64x64 images
2023-11-29 15:57:29 first image: [[-0.09194159 0.39910644 -0.02999168 ... 0.65424025 2.1068065
0.6698552 ]
[-0.2128554 0.75497377 0.22428153 ... 1.1024915 2.5193024
0.5522364 ]
[ 0.53614616 0.6558835 0.09200134 ... -0.04922847 1.5195653
0.58241683]
...
[-0.12323537 0.11786314 -0.9315448 ... -0.92004204 -1.1743912
-0.9257372 ]
[ 0.04064267 0.4853493 -0.06305265 ... 0.2970134 -0.2873671
-1.0044745 ]
[-0.48678073 -0.38818717 -0.8513743 ... 1.0835191 0.2279402
-1.3637027 ]]
2023-11-29 15:57:29 Image Mean, Std are 0.0024950394872576 +/- 0.8993831276893616
2023-11-29 15:57:29 Reading all images into memory!
2023-11-29 15:57:29 loaded eulers
euler difference: tensor(4.5443e-05) 1799
max difference: torch.return_types.max(
values=tensor([5.9485e-05, 1.7452e-04, 6.1035e-05]),
indices=tensor([ 932, 1219, 705]))
[[ 107.801792 105.980954 -131.130655]
[ 78.074471 117.122663 78.76982 ]
[-141.683945 45.648309 -103.320529]
[-138.71656 45.619975 146.100663]
[-169.744927 39.55955 83.692854]]
tensor([[ 72.1982, 105.9810, 23.3289],
[101.9255, 117.1227, 203.1557],
[-38.3161, 45.6483, 245.0045],
[-41.2834, 45.6199, -7.3841],
[-10.2551, 39.5596, 86.0520]])
nn: 1799 batch_size: 20
[0, 0, 20, 0, 0, 40, 0, 0, 20, 80, 60, 40, 20, 100, 20, 80, 240, 0, 0, 0, 60, 0, 20, 0, 20, 0, 0, 40, 0, 0, 0, 60, 0, 20, 100, 20, 20, 120, 20, 20, 0, 0, 40, 0, 20, 80, 0, 0]
1799 1799 [2, 5, 8, 9, 10, 11, 12, 13, 14, 15, 16, 20, 22, 24, 27, 31, 33, 34, 35, 36, 37, 38, 39, 42, 44, 45]
2023-11-29 15:57:29 Loading ctf params from /data/flex/ScipionUserData/projects/TestOpusDsd/Runs/000160_OpusDsdProtTrain/output/ctfs.pkl
2023-11-29 15:57:29 Image size (pix) : 64
2023-11-29 15:57:29 A/pix : 5.53125
2023-11-29 15:57:29 DefocusU (A) : 35136.05859375
2023-11-29 15:57:29 DefocusV (A) : 33578.890625
2023-11-29 15:57:29 Dfang (deg) : 100.2699966430664
2023-11-29 15:57:29 voltage (kV) : 300.0
2023-11-29 15:57:29 cs (mm) : 2.700000047683716
2023-11-29 15:57:29 w : 0.10000000149011612
2023-11-29 15:57:29 Phase shift (deg) : 0.0
2023-11-29 15:57:29 first ctf params is: [5.531250e+00 3.513606e+04 3.357889e+04 1.002700e+02 3.000000e+02
2.700000e+00 1.000000e-01 0.000000e+00]
initializing 2d grid of size 64
/home/flex/anaconda3/envs/opusdsd-0.3.2b/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3526.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
2023-11-29 15:57:31 creating ctf grid False with grid tensor([[ 0, 1, 2, ..., 30, 31, 32],
[ 1, 1, 2, ..., 30, 31, 32],
[ 2, 2, 3, ..., 30, 31, 32],
...,
[ 3, 3, 4, ..., 30, 31, 32],
[ 2, 2, 3, ..., 30, 31, 32],
[ 1, 1, 2, ..., 30, 31, 32]], device='cuda:0')
tensor(0, device='cuda:0')
2023-11-29 15:57:31 created ctf grid with shape: torch.Size([64, 33, 2]), max_r: 45
2023-11-29 15:57:31 Using circular lattice with radius 32
2023-11-29 15:57:31 model: image supplemented into encoder will be of size 64
2023-11-29 15:57:31 encoder: the input image size is 54
2023-11-29 15:57:31 convtemplate: the output volume is of size 192, resample intermediate activations of size 16 to 12
2023-11-29 15:57:31 decoder: downsampling apix from 5.53125 to 5.53125
torch.Size([1, 54, 54])
2023-11-29 15:57:31 HetOnlyVAE(
(encoder): Encoder(
(transformer_e): SpatialTransformer()
(down1): Sequential(
(0): Conv3d(1, 32, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(1): LeakyReLU(negative_slope=0.2)
(2): Conv3d(32, 64, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(3): LeakyReLU(negative_slope=0.2)
(4): Conv3d(64, 128, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(5): LeakyReLU(negative_slope=0.2)
)
(down2): Sequential(
(0): Conv3d(128, 256, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(1): LeakyReLU(negative_slope=0.2)
(2): Conv3d(256, 512, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(3): LeakyReLU(negative_slope=0.2)
(4): Conv3d(512, 512, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(5): LeakyReLU(negative_slope=0.2)
)
(down3): Sequential(
(0): Linear(in_features=512, out_features=512, bias=True)
(1): LeakyReLU(negative_slope=0.2)
)
(mu): Linear(in_features=512, out_features=12, bias=True)
(logstd): Linear(in_features=512, out_features=12, bias=True)
)
(decoder): VanillaDecoder(
(template): ConvTemplate(
(template1): Sequential(
(0): Linear(in_features=12, out_features=512, bias=True)
(1): LeakyReLU(negative_slope=0.2)
(2): Linear(in_features=512, out_features=2048, bias=True)
(3): LeakyReLU(negative_slope=0.2)
)
(template2): Sequential(
(0): ConvTranspose3d(2048, 1024, kernel_size=(2, 2, 2), stride=(2, 2, 2))
(1): LeakyReLU(negative_slope=0.2)
(2): ConvTranspose3d(1024, 512, kernel_size=(2, 2, 2), stride=(2, 2, 2))
(3): LeakyReLU(negative_slope=0.2)
)
(template3): Sequential(
(0): ConvTranspose3d(512, 256, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(1): LeakyReLU(negative_slope=0.2)
(2): ConvTranspose3d(256, 128, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(3): LeakyReLU(negative_slope=0.2)
)
(template4): Sequential(
(0): ConvTranspose3d(128, 64, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(1): LeakyReLU(negative_slope=0.2)
(2): ConvTranspose3d(64, 32, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(3): LeakyReLU(negative_slope=0.2)
(4): ConvTranspose3d(32, 16, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
(5): LeakyReLU(negative_slope=0.2)
)
(conv_out): ConvTranspose3d(16, 1, kernel_size=(4, 4, 4), stride=(2, 2, 2), padding=(1, 1, 1))
)
(transformer): SpatialTransformer()
)
)
template_type: conv
2023-11-29 15:57:31 61402601 parameters in model
2023-11-29 15:57:31 28196856 parameters in encoder
2023-11-29 15:57:31 33205745 parameters in decoder
2023-11-29 15:57:32 loading train validation split from Runs/000160_OpusDsdProtTrain/extra/sp-split.pkl
num_samples: 940
num_samples: 120
2023-11-29 15:57:32 image will be downsampled to 1.0 of original size 64
2023-11-29 15:57:32 reconstruction will be blurred by bfactor 4.0
2023-11-29 15:57:32 learning rate [0.00012], bfactor: 4.333333333333333, beta_max: 1.0, beta_control: 1.0 for epoch 0
ns: [0, 0, 20, 0, 0, 20, 0, 0, 20, 40, 40, 20, 20, 80, 20, 60, 180, 0, 0, 0, 40, 0, 0, 0, 0, 0, 0, 20, 0, 0, 0, 40, 0, 20, 80, 0, 20, 100, 0, 20, 0, 0, 20, 0, 0, 60, 0, 0]
current_ind: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Traceback (most recent call last):
File "/home/flex/anaconda3/envs/opusdsd-0.3.2b/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/flex/anaconda3/envs/opusdsd-0.3.2b/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/flex/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/commands/train_cv.py", line 1144, in <module>
main(args)
File "/home/flex/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/commands/train_cv.py", line 991, in main
rot, tran = posetracker.get_pose(ind)
File "/home/flex/scipion3/software/em/opusdsd-0.3.2b/cryodrgn/pose.py", line 312, in get_pose
rot = self.rots[ind]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
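This failure mode can be reproduced in isolation: indexing a CPU tensor with a CUDA index tensor raises exactly this RuntimeError, and the generic remedy is to put the indices on the same device as the indexed tensor. This is only a sketch of the failure and a common fix; I haven't verified how the actual codebase resolves it:

```python
import torch

rots = torch.eye(3).repeat(5, 1, 1)  # pose table stored on CPU, shape (5, 3, 3)
ind = torch.tensor([0, 2])           # batch indices

if torch.cuda.is_available():
    cuda_ind = ind.cuda()
    try:
        rots[cuda_ind]               # CPU tensor indexed with CUDA indices
    except RuntimeError as e:
        print(e)                     # "indices should be either on cpu or ..."
    ind = cuda_ind.cpu()             # fix: match devices before indexing

rot = rots[ind]
print(tuple(rot.shape))  # (2, 3, 3)
```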
I also had to uninstall and reinstall pytorch because it was limited to cuda 10 and didn't support the RTX 3090 GPUs I have on the new machine. I'll create another issue about that.
Without the supported GPU, I didn't get this error, which makes sense since everything was on the CPU.
Thank you very much for reporting this! I will look into it.
You’re welcome
James, I reproduced this bug! I fixed it in this commit https://github.com/alncat/opusDSD/commit/b722d2b97aac9a6cf82cfb773f7407214873cd36 . The training script runs correctly now.
but there might still be some bugs without extensive testing.
Thanks!
I'll continue testing it and let you know what I come across
James, I found that opus-dsd only works with pytorch 1.11.0 or below. I tested it with pytorch 1.12.0 and found some bizarre behaviours. I created an environment file containing cuda 11.3 and pytorch 1.10.1 in the recent commit https://github.com/alncat/opusDSD/commit/07234404fad42370c801696563261c31a9dfa754 . Opus-dsd works well in this environment.
Great. Thanks very much