pypdb
TypeError: 'NoneType' object is not subscriptable AND json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Hi, thank you for developing and maintaining this useful package. With the latest GitHub version of pypdb, I tried to run MMseqs2 searches of many sequences using Query as below, but the command returned an error in some cases. The error seems to depend on the query sequence. Here is a reproducible example with one of the problematic sequences.
Query
from pypdb import Query
aa_query = 'MGEILYFDTVLAPLSLFLPIGYHAYLWQCFKSKPSHTYIGIDALRRKGWFLDMKEDVDQKGMLAIQSVRNTLMSTIFIASIAVLVSMALAALTNNAYNASQLFRSAFFGSQIGGIVVLKYGSASLFLLVSFLCSSMAVGFLIDANFLINIGIGQFSSPAYTQTIFERGFTLALIGNRMLCMTFPLILWIFGPVSMALSSLALVWGLYELDFPGKLPSVKHG'
q = Query(aa_query, query_type='sequence', return_type='polymer_entity')
out = q.search()
/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/util/http_requests.py:65: UserWarning: Too many failures on requests. Exiting...
  warnings.warn("Too many failures on requests. Exiting...")
/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/pypdb.py:292: UserWarning: Retrieval failed, returning None
  warnings.warn("Retrieval failed, returning None")
Traceback (most recent call last):
  File "/Users/kf/Dropbox/repos/csubst/csubst/csubst", line 309, in <module>
    args.handler(args)
  File "/Users/kf/Dropbox/repos/csubst/csubst/csubst", line 34, in command_site
    main_site(g)
  File "/Volumes/kfssd1/Dropbox/repos/csubst/csubst/main_site.py", line 658, in main_site
    g['pdb'] = pdb_sequence_search(g)
  File "/Volumes/kfssd1/Dropbox/repos/csubst/csubst/main_site.py", line 605, in pdb_sequence_search
    best_hit = mmseqs2_out['result_set'][0]
TypeError: 'NoneType' object is not subscriptable
Process finished with exit code 1
perform_search
After reading #26, I also checked perform_search, but it ended up with another error as below. I would appreciate it if you could give me any advice. Thank you.
from pypdb.clients.search.search_client import perform_search
from pypdb.clients.search.operators.sequence_operators import SequenceOperator
aa_query = 'MGEILYFDTVLAPLSLFLPIGYHAYLWQCFKSKPSHTYIGIDALRRKGWFLDMKEDVDQKGMLAIQSVRNTLMSTIFIASIAVLVSMALAALTNNAYNASQLFRSAFFGSQIGGIVVLKYGSASLFLLVSFLCSSMAVGFLIDANFLINIGIGQFSSPAYTQTIFERGFTLALIGNRMLCMTFPLILWIFGPVSMALSSLALVWGLYELDFPGKLPSVKHG'
seq_op = SequenceOperator(sequence=aa_query, identity_cutoff=0.99, evalue_cutoff=1000)
out = perform_search(search_operator=seq_op, return_with_scores=True)
Querying RCSB Search using the following parameters:
{"query": {"type": "terminal", "service": "sequence", "parameters": {"evalue_cutoff": 1000, "identity_cutoff": 0.99, "target": "pdb_protein_sequence", "value": "MGEILYFDTVLAPLSLFLPIGYHAYLWQCFKSKPSHTYIGIDALRRKGWFLDMKEDVDQKGMLAIQSVRNTLMSTIFIASIAVLVSMALAALTNNAYNASQLFRSAFFGSQIGGIVVLKYGSASLFLLVSFLCSSMAVGFLIDANFLINIGIGQFSSPAYTQTIFERGFTLALIGNRMLCMTFPLILWIFGPVSMALSSLALVWGLYELDFPGKLPSVKHG"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/clients/search/search_client.py", line 183, in perform_search
    return perform_search_with_graph(query_object=search_operator,
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/clients/search/search_client.py", line 271, in perform_search_with_graph
    for query_hit in response.json()["result_set"]:
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Thank you very much for reporting this issue. It's interesting that it depends on the query. If you run the same advanced search through the GUI on the RCSB website, does the search result have any special features?
I just tried it and got no hits in the RCSB advanced search.
Thank you! Does this mean that it is an issue with the API, in which case we would need to throw an error?
A no-hit result seems to be valid, and the RCSB's advanced search does not treat it as an error. Since this is the expected behavior in the GUI, it might be good to handle the no-hit case without an error so that pypdb is consistent with the RCSB's GUI.
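One possible way to handle the no-hit case without an error is to check the response body before decoding it. The sketch below assumes that the empty body behind "Expecting value: line 1 column 1 (char 0)" is how the search service reports zero hits (possibly together with an HTTP 204 status); that is an assumption, not something confirmed in this thread. parse_result_set is a hypothetical helper, not part of pypdb, and the JSON payload in the last line is made up for illustration.

```python
import json


def parse_result_set(status_code, body_text):
    """Return the list of hits, or an empty list when there is nothing to decode.

    Hypothetical helper, not part of pypdb. Assumes that an empty body
    (or HTTP 204) means "no hits" rather than a failed request.
    """
    if status_code == 204 or not body_text.strip():
        return []
    payload = json.loads(body_text)
    return payload.get("result_set", [])


# An empty body no longer raises JSONDecodeError.
print(parse_result_set(204, ""))
# Made-up payload, only to show the non-empty path.
print(parse_result_set(200, '{"result_set": [{"identifier": "XXXX_1"}]}'))
```

With a requests response object, the call would look like parse_result_set(response.status_code, response.text).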
I am still wondering how I should handle this error. Any suggestions are appreciated.
Hi, I think that it should probably return an empty dict and emit a warning. I have been pretty swamped and haven't found time to troubleshoot this further. If you have a workaround that you like, would you mind posting the code here (or opening a PR)? I can incorporate it into the next version.
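A minimal sketch of the "empty dict plus warning" behavior, written as a caller-side wrapper rather than a change to pypdb itself. search_or_empty is a hypothetical name, and the lambda stands in for something like Query(...).search so the example runs without network access.

```python
import warnings


def search_or_empty(search_fn):
    """Call search_fn() and normalize a None result to an empty dict.

    Hypothetical wrapper, not part of pypdb. search_fn is any zero-argument
    callable, for example the bound method Query(...).search.
    """
    result = search_fn()
    if result is None:
        warnings.warn("Search returned no result; returning an empty dict")
        return {}
    return result


# Stand-in for a search that finds nothing, so the sketch runs offline.
out = search_or_empty(lambda: None)
hits = out.get("result_set", [])
best_hit = hits[0] if hits else None
```

Note that Query.search() also returns None after repeated request failures (see the "Retrieval failed, returning None" warning above), so a wrapper like this cannot tell a genuine no-hit result from a network problem.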
I'm sorry I missed your reply. As a workaround, I'd like to handle it with try-except for now. If I get a chance to learn more about pypdb’s code in the future, I'll come back to this issue.
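For reference, a try-except workaround along these lines might look like the sketch below. It covers both errors from the original report: TypeError when the search returns None, and JSONDecodeError, which is a subclass of ValueError. best_hit_or_none is a hypothetical helper, not part of pypdb, and the lambdas are offline stand-ins for the real search calls.

```python
import json


def best_hit_or_none(search_fn):
    """Return the first hit from search_fn(), or None if the search yields nothing.

    Hypothetical helper, not part of pypdb. search_fn is any zero-argument
    callable that returns a dict with a "result_set" list.
    """
    try:
        return search_fn()["result_set"][0]
    except (TypeError, KeyError, IndexError, ValueError):
        # TypeError: the search returned None.
        # KeyError / IndexError: no "result_set" key, or an empty hit list.
        # ValueError: includes json.JSONDecodeError from an empty response body.
        return None


# Stand-ins that reproduce both failure modes without network access.
print(best_hit_or_none(lambda: None))
print(best_hit_or_none(lambda: json.loads("")))
```

Catching these exceptions broadly also hides genuine request failures, so it may be worth logging the exception before returning None.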