zip argument #1 must support iteration error + other question
Hey Martin,
Thanks for making this tool, I'm finding it very useful for my current project.
I have a profile HMM database obtained from CONJScan that I want to use to scan a FASTA file containing multiple sequences. I am running into some issues that I can't seem to work around.
To preface my issue, let me explain what I am trying to do: using the CONJScan database and Python, I iterate over the profile HMMs in a for-loop. In each iteration, I use one profile HMM to scan a FASTA file containing multiple sequences. At the end of each iteration, I use dna_features_viewer to save a uniquely named graphic visualizing my alignments.
There are two problems I am encountering:
- Occasionally, I receive an error saying that `zip argument #1 must be iterable`. This refers to the line `for ax, hit in zip(axes, hits): ...`, where argument #1 of `zip(axes, hits)` is not iterable. I am not sure why this happens because, aside from the ad-hoc loop I created to go through each profile HMM in my database, everything mimics the example provided on the readthedocs.io page.
- At the end of the process, I will have multiple hits from different HMM profiles on the same FASTA sequence. However, I would like to visualize them together, rather than separately. I am unsure if I am using the tool incorrectly or if this is currently unsupported.
Copied below is my code; please excuse the messiness, as I am still testing things out.
    import pyhmmer
    import os
    from dna_features_viewer import GraphicFeature, GraphicRecord
    import matplotlib.pyplot as plt

    directory = 'profiles'

    #iterate over profiles in folder
    #this is to iterate over a folder containing many profile Hmm (CONJScan database)
    for hmmprofile in os.listdir(directory):
        f = os.path.join(directory, hmmprofile)
        if os.path.isfile(f):
            try:
                with pyhmmer.plan7.HMMFile(f) as hmm_file:
                    hmm = next(hmm_file)
                with pyhmmer.easel.SequenceFile("test.fasta", digital=True) as seq_file: #test.fasta contains many sequences in amino acid format
                    sequences = list(seq_file)
                pipeline = pyhmmer.plan7.Pipeline(hmm.alphabet)
                hits = pipeline.search_hmm(hmm, sequences)
                ali = hits[0].domains[0].alignment
                hmm_name = (ali.hmm_name.decode()) #storing the name of the hmm profile in the event that a search succeeds
                # create an index so we can retrieve a Sequence from its name
                seq_index = { seq.name:seq for seq in sequences }
                fig, axes = plt.subplots(nrows=len(hits), figsize=(30, 30), sharex=True)
                try:
                    for ax, hit in zip(axes, hits):
                        # add one feature per domain
                        features = [
                            GraphicFeature(start=d.alignment.target_from-1, end=d.alignment.target_to, color='#00FF00', label=hmm_name) #using the hmm_name to create labels for the graphic feature
                            for d in hit.domains
                        ]
                        length = len(seq_index[hit.name])
                        desc = seq_index[hit.name].description.decode()
                        # render the feature records
                        record = GraphicRecord(sequence_length=length, features=features)
                        record.plot(ax=ax)
                        ax.set_title(desc)
                        try:
                            ax.figure.tight_layout()
                            ax.figure.savefig(desc + hmm_name + ".png") #using both the descriptor + hmm_name to create a unique result and saving the graphic as a png
                        except Exception as e:
                            # print(e)
                            continue
                except Exception as e:
                    # print(e)
                    continue
            except Exception as e:
                # print(e)
                continue
Any advice you can provide would help immensely. Thank you.
Hi @willhuynh11 !
The error you're getting, `zip argument #1 must support iteration`, is quite transparent: it means that the first argument to `zip`, here `axes`, is not iterable. I cannot test this right now, but I suppose `axes` may be `None` when `hits` is empty: in modern versions of matplotlib, passing `nrows=0` to `subplots` raises an error, but you could be using a version that just returns `None` there.
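Another case worth ruling out is a search that returns exactly one hit: with the default `squeeze=True`, `plt.subplots(nrows=1)` returns a single `Axes` object rather than an array, and `zip` cannot iterate over that either. Below is a minimal sketch of a guard that covers both cases. The helper name `as_axes_list` is hypothetical (it is not part of pyhmmer or matplotlib), and it is only an illustration of normalizing the value before zipping:

```python
from collections.abc import Iterable


def as_axes_list(axes):
    """Return the `axes` value from `plt.subplots` as a flat list.

    Hypothetical helper, not part of matplotlib or pyhmmer.
    """
    # Nothing to plot on (the "axes is None" case suspected above).
    if axes is None:
        return []
    # NumPy arrays expose `.flat`, which also flattens 2-D grids of axes.
    flat = getattr(axes, "flat", None)
    if flat is not None:
        return list(flat)
    # Any other iterable (list, tuple) is copied as-is.
    if isinstance(axes, Iterable):
        return list(axes)
    # A single Axes object, as returned for nrows=1 with squeeze=True.
    return [axes]
```

In the loop from the question, this would mean writing `zip(as_axes_list(axes), hits)` instead of `zip(axes, hits)`. Alternatively, passing `squeeze=False` to `plt.subplots` always returns a 2-D array, which can be iterated with `axes.flat`. Either way, it is probably worth skipping the plot entirely when `len(hits) == 0`.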