make_tcr_clumping_plot failing when trying to save figure
When running the reanalyze step described in the README (running run_conga.py with --restart and --all), the run fails with the error ValueError: Image size of 4124x92909 pixels is too large. It must be less than 2^16 in each direction. while inside make_tcr_clumping_plots.
The error happens when plt.savefig(logo_pngfile, dpi=300) is called in make_logo_plots (line 1450 in conga/plotting.py), which is called by make_cluster_logo_plots_figure (line 3130), which in turn is called by make_tcr_clumping_plots (line 3189). It seems like the SVG to PNG conversion using imagemagick was successful for the individual logo plots, and the error happens when the images are merged together. Some quick Googling suggests that the issue might be related to plt.text calls in make_logo_plots, but at the moment I am unsure of any specific cause.
The commands I ran before the reanalyze step were:
python /path/to/conga/scripts/setup_10x_for_conga.py --filtered_contig_annotations_csvfile filtered_contig_annotations.csv --organism human --no_kpca
python /path/to/conga/scripts/run_conga.py --graph_vs_graph --no_kpca --gex_data filtered_feature_bc_matrix.h5 --gex_data_type 10x_h5 --clones_file filtered_contig_annotations_tcrdist_clones.tsv --organism human --outfile_prefix CoNGA_out
The specific command I ran for the reanalyze step was:
python /path/to/conga/scripts/run_conga.py --restart CoNGA_out_final.h5ad --all --no_kpca --outfile_prefix CoNGA_out_restarted
The CoNGA run is being run on the gene expression and TCR data of 300k cells, using 10X cellranger outputs (the very large size of the data could be a contributing factor to the issue). I am using the most recent version of CoNGA (as of March 4, 2022, the most recent commit was December 10, 2021), and running CoNGA inside a conda python environment that was created using the instructions provided in the README.
My conda environment is the following:
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 4.5 1_gnu
argon2-cffi 20.1.0 py36h8f6f2f9_2 conda-forge
arpack 3.7.0 hc6cf775_2 conda-forge
async_generator 1.10 py_0 conda-forge
atk-1.0 2.36.0 h3371d22_4 conda-forge
attrs 21.4.0 pyhd8ed1ab_0 conda-forge
backcall 0.2.0 pyhd3eb1b0_0
blas 1.0 mkl
bleach 4.1.0 pyhd8ed1ab_0 conda-forge
blosc 1.21.0 h8c45485_0
bzip2 1.0.8 h7b6447c_0
ca-certificates 2021.10.8 ha878542_0 conda-forge
cairo 1.16.0 h18b612c_1001 conda-forge
certifi 2021.5.30 py36h5fab9bb_0 conda-forge
cffi 1.14.6 py36hc120d54_0 conda-forge
cycler 0.11.0 pyhd3eb1b0_0
dbus 1.13.18 hb2f20db_0
decorator 4.4.2 pypi_0 pypi
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
expat 2.4.4 h295c915_0
fastcluster 1.2.4 pypi_0 pypi
fftw 3.3.9 h27cfd23_1
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.13.1 h6c09931_0
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
freetype 2.11.0 h70c0345_0
fribidi 1.0.10 h36c2ea0_0 conda-forge
gdk-pixbuf 2.42.6 h04a7f16_0 conda-forge
get-version 2.1 pypi_0 pypi
gettext 0.19.8.1 h0b5b191_1005 conda-forge
ghostscript 9.54.0 h9c3ff4c_1 conda-forge
giflib 5.2.1 h36c2ea0_2 conda-forge
glib 2.68.3 h9c3ff4c_0 conda-forge
glib-tools 2.68.3 h9c3ff4c_0 conda-forge
glpk 4.65 h9202a9a_1004 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
graphite2 1.3.14 h23475e2_0
graphviz 2.48.0 h85b4f2f_0 conda-forge
gst-plugins-base 1.14.0 h8213a91_2
gstreamer 1.14.0 h28cd5cc_2
gtk2 2.24.33 h539f30e_1 conda-forge
gts 0.7.6 h64030ff_2 conda-forge
harfbuzz 2.8.1 h6f93f22_0
hdf5 1.10.4 hb1b8bf9_0
icu 58.2 he6710b0_3
igraph 0.9.4 ha184e22_0 conda-forge
imagemagick 7.0.11_13 pl5320hb118871_0 conda-forge
intel-openmp 2022.0.1 h06a4308_3633
ipykernel 5.5.5 py36hcb3619a_0 conda-forge
ipython 7.16.1 py36h5ca1d4c_0
ipython_genutils 0.2.0 pyhd3eb1b0_1
jbig 2.1 h7f98852_2003 conda-forge
jedi 0.17.0 py36_0
jinja2 3.0.3 pyhd8ed1ab_0 conda-forge
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9d h7f8727e_0
jsonschema 3.0.2 py36_0 conda-forge
jupyter_client 7.1.2 pyhd8ed1ab_0 conda-forge
jupyter_core 4.8.1 py36h5fab9bb_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
kiwisolver 1.3.1 py36h2531618_0
lcms2 2.12 h3be6417_0
ld_impl_linux-64 2.35.1 h7274673_9
legacy-api-wrap 1.2 pypi_0 pypi
leidenalg 0.8.7 py36hc4f0c31_0 conda-forge
libblas 3.9.0 1_h6e990d7_netlib conda-forge
libcblas 3.9.0 3_h893e4fe_netlib conda-forge
libffi 3.3 he6710b0_2
libgcc-ng 9.3.0 h5101ec6_17
libgd 2.3.3 h695aa2c_0
libgfortran-ng 7.5.0 ha8ba4b0_17
libgfortran4 7.5.0 ha8ba4b0_17
libglib 2.68.3 h3e27bee_0 conda-forge
libgomp 9.3.0 h5101ec6_17
libiconv 1.16 h516909a_0 conda-forge
liblapack 3.9.0 3_h893e4fe_netlib conda-forge
libllvm10 10.0.1 hbcb73fb_5
libpng 1.6.37 hbc83047_0
librsvg 2.50.7 hc3c00ef_0 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libstdcxx-ng 9.3.0 hd4cf53a_17
libtiff 4.2.0 h85742a9_0
libtool 2.4.6 h58526e2_1007 conda-forge
libuuid 1.0.3 h7f8727e_2
libwebp 1.2.2 h55f646e_0
libwebp-base 1.2.2 h7f8727e_0
libxcb 1.14 h7b6447c_0
libxml2 2.9.12 h03d6c58_0
llvmlite 0.36.0 py36h612dafd_4
louvain 0.7.0 py36hc4f0c31_0 conda-forge
lz4-c 1.9.3 h295c915_1
lzo 2.10 h7b6447c_2
markupsafe 2.0.1 py36h8f6f2f9_0 conda-forge
matplotlib 3.3.4 py36h06a4308_0
matplotlib-base 3.3.4 py36h62a2d02_0
metis 5.1.0 h58526e2_1006 conda-forge
mistune 0.8.4 py36h8f6f2f9_1004 conda-forge
mkl 2020.2 256
mkl-service 2.3.0 py36he8ac12f_0
mkl_fft 1.3.0 py36h54f3939_0
mkl_random 1.1.1 py36h0573a6f_0
mock 4.0.3 pyhd3eb1b0_0
mpfr 4.1.0 h9202a9a_1 conda-forge
nbclient 0.5.9 pyhd8ed1ab_0 conda-forge
nbconvert 6.0.7 py36h5fab9bb_3 conda-forge
nbformat 5.1.3 pyhd8ed1ab_0 conda-forge
ncurses 6.3 h7f8727e_2
nest-asyncio 1.5.4 pyhd8ed1ab_0 conda-forge
networkx 2.5.1 pypi_0 pypi
notebook 6.3.0 py36h5fab9bb_0 conda-forge
numba 0.53.1 py36ha9443f7_0
numexpr 2.7.3 py36hb2eb853_0
numpy 1.19.2 py36h54aff64_0
numpy-base 1.19.2 py36hfa32c7d_0
olefile 0.46 py36_0
openjpeg 2.4.0 h3ad879b_0
openssl 1.1.1k h7f98852_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.1.5 py36ha9443f7_0
pandoc 2.17.1.1 ha770c72_0 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
pango 1.48.7 hb8ff022_0 conda-forge
parso 0.8.3 pyhd3eb1b0_0
patsy 0.5.1 py36_0
pcre 8.45 h295c915_0
perl 5.32.1 0_h7f98852_perl5 conda-forge
pexpect 4.8.0 pyhd3eb1b0_3
pickleshare 0.7.5 pyhd3eb1b0_1003
pillow 8.3.1 py36h2c7a002_0
pip 21.2.2 py36h06a4308_0
pixman 0.38.0 h516909a_1003 conda-forge
pkg-config 0.29.2 h36c2ea0_1008 conda-forge
prometheus_client 0.13.1 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.20 pyhd3eb1b0_0
ptyprocess 0.7.0 pyhd3eb1b0_2
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pygments 2.11.2 pyhd3eb1b0_0
pynndescent 0.5.6 pypi_0 pypi
pyparsing 3.0.4 pyhd3eb1b0_0
pyqt 5.9.2 py36h05f1152_2
pyrsistent 0.17.3 py36h8f6f2f9_2 conda-forge
pytables 3.6.1 py36h71ec239_0
python 3.6.13 h12debd9_1
python-dateutil 2.8.2 pyhd3eb1b0_0
python-igraph 0.9.6 py36h644ed5e_0 conda-forge
python_abi 3.6 2_cp36m conda-forge
pytz 2021.3 pyhd3eb1b0_0
pyyaml 5.4.1 py36h27cfd23_1
pyzmq 22.1.0 py36h7068817_0 conda-forge
qt 5.9.7 h5867ecd_1
readline 8.1.2 h7f8727e_1
scanpy 1.7.2 pypi_0 pypi
scikit-learn 0.24.2 py36ha9443f7_0
scipy 1.5.2 py36h0b6359f_0
seaborn 0.11.2 pyhd3eb1b0_0
send2trash 1.8.0 pyhd8ed1ab_0 conda-forge
setuptools 58.0.4 py36h06a4308_0
sinfo 0.3.4 pypi_0 pypi
sip 4.19.8 py36hf484d3e_0
six 1.16.0 pyhd3eb1b0_1
sqlite 3.37.2 hc218d9a_0
statsmodels 0.12.2 py36h27cfd23_0
stdlib-list 0.8.0 pypi_0 pypi
suitesparse 5.10.1 hd8046ac_0 conda-forge
tbb 2020.3 intel_304 intel
terminado 0.12.1 py36h5fab9bb_0 conda-forge
testpath 0.5.0 pyhd8ed1ab_0 conda-forge
texttable 1.6.4 pyhd8ed1ab_0 conda-forge
threadpoolctl 2.2.0 pyh0d69192_0
tk 8.6.11 h1ccaba5_0
tornado 6.1 py36h27cfd23_0
tqdm 4.62.3 pypi_0 pypi
traitlets 4.3.3 py36h06a4308_0
umap-learn 0.5.2 pypi_0 pypi
wcwidth 0.2.5 pyhd3eb1b0_0
webencodings 0.5.1 py_1 conda-forge
wheel 0.37.1 pyhd3eb1b0_0
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.2 h470a237_5 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-libxt 1.2.1 h7f98852_2 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h7b6447c_0
yaml 0.2.5 h7b6447c_0
zeromq 4.3.4 h9c3ff4c_0 conda-forge
zlib 1.2.11 h7f8727e_4
zstd 1.4.9 haebb681_0
Hi Aviv, Thanks for trying conga! I've run some really big sets and I haven't run into this yet, very cool! As a quick check, if you manually reduce the dpi in the plt.savefig command, say from 300 to 200 or 100, does that "fix" it (i.e., make the error go away)? Also curious if you have any log output from before the error... Take care, Phil
Hi Phil,
Here are the last couple of lines of log output leading up to the error: Stdout:
making cluster logos: 603 610 CoNGA_out_restarted_tcr_clumping_logos.png
making cluster logos: 604 610 CoNGA_out_restarted_tcr_clumping_logos.png
making cluster logos: 605 610 CoNGA_out_restarted_tcr_clumping_logos.png
making cluster logos: 606 610 CoNGA_out_restarted_tcr_clumping_logos.png
making cluster logos: 607 610 CoNGA_out_restarted_tcr_clumping_logos.png
making cluster logos: 608 610 CoNGA_out_restarted_tcr_clumping_logos.png
making cluster logos: 609 CoNGA_out_restarted_tcr_clumping_logos.png
making: CoNGA_out_restarted_tcr_clumping_logos.png
Stderr:
.................................................. 80000
.................................................. 85000
.................................................. 90000
.................................................. 95000
.................................................. 100000
.................................................. 105000
.................................................. 110000
...................................
... storing 'test' as categorical
Traceback (most recent call last):
File "/path/to/conga/scripts/run_conga.py", line 831, in <module>
pvalue_threshold_for_logos=args.pvalue_threshold_for_tcr_clumping,
File "/path/to/conga/conga/plotting.py", line 3193, in make_tcr_clumping_plots
**logo_plot_args)
File "/path/to/conga/conga/plotting.py", line 3138, in make_cluster_logo_plots_figure
**kwargs)
File "/path/to/conga/conga/plotting.py", line 1450, in make_logo_plots
plt.savefig(logo_pngfile, dpi=300)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/pyplot.py", line 859, in savefig
res = fig.savefig(*args, **kwargs)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/figure.py", line 2311, in savefig
self.canvas.print_figure(fname, **kwargs)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/backend_bases.py", line 2217, in print_figure
**kwargs)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/backend_bases.py", line 1639, in wrapper
return func(*args, **kwargs)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 509, in print_png
FigureCanvasAgg.draw(self)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 402, in draw
self.renderer = self.get_renderer(cleared=True)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 418, in get_renderer
self.renderer = RendererAgg(w, h, self.figure.dpi)
File "/path/to/miniconda3/envs/conga_new_env/lib/python3.6/site-packages/matplotlib/backends/backend_agg.py", line 96, in __init__
self._renderer = _RendererAgg(int(width), int(height), dpi)
ValueError: Image size of 4124x92909 pixels is too large. It must be less than 2^16 in each direction.
I will try to run the reanalyze step again with the DPI reduced to 100. Do you have any suggestions for "fast-forwarding" to this specific step when running the run_conga.py command? With the size of the data, running the entire reanalyze step up to the point where it previously crashed would take a few hours.
Best, Aviv
Thanks! You could try replacing " --all " with " --tcr_clumping " but it will still take a little while, I'm afraid.
That is a lot of convergent TCR clusters! Another thing you could do to focus on the most interesting ones would be to add " --min_cluster_size_for_tcr_clumping_logos 10 "
The current default min size is 3, which may be too low for big datasets.
Are you expecting that degree of TCR sequence convergence? It makes me a tiny bit worried that somehow clonotypes might be getting "split" during the preprocessing, leading to apparent high sequence convergence (identical sequences shared between different clonotypes, but they are actually the same clonotype). Just a thought.
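One way to look for that kind of splitting is to check whether the same TCR appears under more than one clonotype ID in the clones file. The following is only a sketch: the column names (clone_id, va_gene, ja_gene, cdr3a, vb_gene, jb_gene, cdr3b) are assumptions about the clones TSV header and should be adjusted to match the actual file, and the helper names find_shared_tcrs and load_rows are hypothetical, not part of conga.

```python
import csv
from collections import defaultdict

# ASSUMED column names for the clones TSV; check them against the real header.
TCR_COLUMNS = ("va_gene", "ja_gene", "cdr3a", "vb_gene", "jb_gene", "cdr3b")


def find_shared_tcrs(rows, id_column="clone_id", tcr_columns=TCR_COLUMNS):
    """Return {tcr_tuple: set_of_clone_ids} for TCRs seen under more than one ID."""
    ids_by_tcr = defaultdict(set)
    for row in rows:
        tcr = tuple(row[col] for col in tcr_columns)
        ids_by_tcr[tcr].add(row[id_column])
    return {tcr: ids for tcr, ids in ids_by_tcr.items() if len(ids) > 1}


def load_rows(tsv_path):
    """Read a tab-separated clones file into a list of dicts (one per row)."""
    with open(tsv_path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


# Small in-memory example with made-up rows: clonotypes "1" and "2" carry the same TCR.
example_rows = [
    {"clone_id": "1", "va_gene": "TRAV1", "ja_gene": "TRAJ1", "cdr3a": "CAVA",
     "vb_gene": "TRBV1", "jb_gene": "TRBJ1", "cdr3b": "CASA"},
    {"clone_id": "2", "va_gene": "TRAV1", "ja_gene": "TRAJ1", "cdr3a": "CAVA",
     "vb_gene": "TRBV1", "jb_gene": "TRBJ1", "cdr3b": "CASA"},
    {"clone_id": "3", "va_gene": "TRAV2", "ja_gene": "TRAJ2", "cdr3a": "CAVB",
     "vb_gene": "TRBV2", "jb_gene": "TRBJ2", "cdr3b": "CASB"},
]
shared = find_shared_tcrs(example_rows)
```

On real data, this could be called as find_shared_tcrs(load_rows("filtered_contig_annotations_tcrdist_clones.tsv")). If the file also has nucleotide-level CDR3 columns, adding them to tcr_columns may help tell the two cases apart: matches at the nucleotide level look more like split clonotypes, while matches only at the amino-acid level are more consistent with genuine convergence.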
Hi Phil,
Rerunning the reanalysis step after changing the DPI value to 100 in the make_logo_plots function made the error go away. As expected, the logos in the generated TCR clumping and graph_vs_graph images are lower resolution and slightly more difficult to read, but as a workaround to get that part of the code running it worked. I can experiment with raising the DPI until I find the highest value that improves the logo resolution without breaking the savefig call. Thank you!
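That DPI limit can be estimated from the numbers in the error message alone. The rendered size in pixels is the figure size in inches times the DPI, and the Agg backend requires each dimension to be below 2^16 = 65536. The sketch below assumes that relationship holds exactly (matplotlib's own rounding could shift the result by a pixel or so), and the helper name max_safe_dpi is hypothetical, not part of conga or matplotlib.

```python
import math

AGG_PIXEL_LIMIT = 2 ** 16  # each image dimension must be strictly below this


def max_safe_dpi(failed_pixels, failed_dpi, limit=AGG_PIXEL_LIMIT):
    """Largest integer dpi that keeps a dimension below the pixel limit.

    failed_pixels: the oversized dimension reported in the error message
    failed_dpi:    the dpi that was used when the error occurred
    """
    inches = failed_pixels / failed_dpi       # physical size of that dimension
    return math.floor((limit - 1) / inches)   # stay strictly under the limit


# From the error above: 4124x92909 pixels at dpi=300; the height is the problem.
print(max_safe_dpi(92909, 300))  # 211
```

Under these assumptions, a figure of this size should save at up to roughly dpi=211, so 200 would be a reasonable value to try. The figure height grows with the number of clusters being plotted, so the safe value will differ between runs.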
The expected amount of TCR sequence convergence within the datasets is unknown. Are there any easily modifiable parameters in the preprocessing steps that could be tweaked to help reduce any potential "splitting" of clonotypes?
I ran into other issues that prevented me from running the rest of the CoNGA workflow, but I will open a new issue with the details of those problems.
Best, Aviv