
TypeError: 'NoneType' object is not subscriptable AND json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Open kfuku52 opened this issue 3 years ago • 7 comments

Hi, thank you for developing and maintaining this useful package. With the latest GitHub version of pypdb, I tried to run MMseqs2 searches on many sequences using Query, as shown below, but the search failed with an error for some of them. The error seems to depend on the query sequence. Here is a reproducible example using one of the problematic sequences.

Query

from pypdb import Query

aa_query = 'MGEILYFDTVLAPLSLFLPIGYHAYLWQCFKSKPSHTYIGIDALRRKGWFLDMKEDVDQKGMLAIQSVRNTLMSTIFIASIAVLVSMALAALTNNAYNASQLFRSAFFGSQIGGIVVLKYGSASLFLLVSFLCSSMAVGFLIDANFLINIGIGQFSSPAYTQTIFERGFTLALIGNRMLCMTFPLILWIFGPVSMALSSLALVWGLYELDFPGKLPSVKHG'
q = Query(aa_query, query_type='sequence', return_type='polymer_entity')
out = q.search()
/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/util/http_requests.py:65: UserWarning: Too many failures on requests. Exiting...
  warnings.warn("Too many failures on requests. Exiting...")
/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/pypdb.py:292: UserWarning: Retrieval failed, returning None
  warnings.warn("Retrieval failed, returning None")
Traceback (most recent call last):
  File "/Users/kf/Dropbox/repos/csubst/csubst/csubst", line 309, in <module>
    args.handler(args)
  File "/Users/kf/Dropbox/repos/csubst/csubst/csubst", line 34, in command_site
    main_site(g)
  File "/Volumes/kfssd1/Dropbox/repos/csubst/csubst/main_site.py", line 658, in main_site
    g['pdb'] = pdb_sequence_search(g)
  File "/Volumes/kfssd1/Dropbox/repos/csubst/csubst/main_site.py", line 605, in pdb_sequence_search
    best_hit = mmseqs2_out['result_set'][0]
TypeError: 'NoneType' object is not subscriptable

Process finished with exit code 1

perform_search

After reading #26, I also tried perform_search, but it failed with a different error, shown below. I would appreciate any advice. Thank you.

from pypdb.clients.search.search_client import perform_search
from pypdb.clients.search.operators.sequence_operators import SequenceOperator

aa_query = 'MGEILYFDTVLAPLSLFLPIGYHAYLWQCFKSKPSHTYIGIDALRRKGWFLDMKEDVDQKGMLAIQSVRNTLMSTIFIASIAVLVSMALAALTNNAYNASQLFRSAFFGSQIGGIVVLKYGSASLFLLVSFLCSSMAVGFLIDANFLINIGIGQFSSPAYTQTIFERGFTLALIGNRMLCMTFPLILWIFGPVSMALSSLALVWGLYELDFPGKLPSVKHG'
seq_op = SequenceOperator(sequence=aa_query, identity_cutoff=0.99, evalue_cutoff=1000)
out = perform_search(search_operator=seq_op, return_with_scores=True)
Querying RCSB Search using the following parameters:
 {"query": {"type": "terminal", "service": "sequence", "parameters": {"evalue_cutoff": 1000, "identity_cutoff": 0.99, "target": "pdb_protein_sequence", "value": "MGEILYFDTVLAPLSLFLPIGYHAYLWQCFKSKPSHTYIGIDALRRKGWFLDMKEDVDQKGMLAIQSVRNTLMSTIFIASIAVLVSMALAALTNNAYNASQLFRSAFFGSQIGGIVVLKYGSASLFLLVSFLCSSMAVGFLIDANFLINIGIGQFSSPAYTQTIFERGFTLALIGNRMLCMTFPLILWIFGPVSMALSSLALVWGLYELDFPGKLPSVKHG"}}, "request_options": {"return_all_hits": true}, "return_type": "entry"} 

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/clients/search/search_client.py", line 183, in perform_search
    return perform_search_with_graph(query_object=search_operator,
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/pypdb/clients/search/search_client.py", line 271, in perform_search_with_graph
    for query_hit in response.json()["result_set"]:
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/kf/miniconda3/envs/pymol/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

kfuku52 avatar Nov 28 '21 12:11 kfuku52

Thank you very much for reporting this issue. It's interesting that it depends on the query. If you run the same search through the advanced search GUI on the RCSB website, is there anything unusual about the result?

williamgilpin avatar Nov 28 '21 12:11 williamgilpin

I just tried it and got no hits in the RCSB advanced search.

kfuku52 avatar Nov 28 '21 12:11 kfuku52

Thank you! Does this mean that it is an issue with the API, in which case we would need to throw an error?

williamgilpin avatar Nov 28 '21 13:11 williamgilpin

Returning no hits seems to be a valid result, and RCSB's advanced search does not treat it as an error. Since this is the expected behavior in the GUI, it might be good for pypdb to handle a no-hit result without raising an error, so that it is consistent with the RCSB GUI.

kfuku52 avatar Nov 28 '21 14:11 kfuku52

I am still wondering how I should handle this error. Any suggestions are appreciated.

kfuku52 avatar Dec 10 '21 10:12 kfuku52

Hi, I think that it should probably return an empty dict and emit a warning. I have been pretty swamped and haven't found time to troubleshoot this further. If you have a workaround that you like, would you mind posting the code here (or opening a PR)? I can incorporate it into the next version.

williamgilpin avatar Dec 10 '21 11:12 williamgilpin
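The "return an empty dict and emit a warning" idea suggested above could look roughly like the sketch below. This is not pypdb's actual code: parse_search_response is a hypothetical helper, and it assumes that a no-hit search comes back with an empty response body, which is what the JSONDecodeError at char 0 in the traceback suggests.

```python
import json
import warnings


def parse_search_response(body_text):
    """Hypothetical helper: decode a search response body.

    Assumption: a search with no hits returns an empty body, so an
    empty body is treated as "no results" rather than as a failure.
    """
    if not body_text or not body_text.strip():
        warnings.warn("Search returned no hits; returning an empty dict")
        return {}
    # A non-empty body is expected to be JSON; real decode errors still propagate.
    return json.loads(body_text)
```

With this shape, callers can use `parse_search_response(text).get("result_set", [])` and get an empty list for a no-hit query instead of an exception.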

I'm sorry I missed your reply. As a workaround, I'd like to handle it with try-except for now. If I get a chance to learn more about pypdb’s code in the future, I'll come back to this issue.

kfuku52 avatar Dec 30 '21 16:12 kfuku52
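For reference, the try-except workaround mentioned above might be sketched as follows. first_hit is a hypothetical helper, not part of pypdb; it takes the search as a callable so that it covers both failure modes reported in this issue: Query.search() returning None, and perform_search raising json.decoder.JSONDecodeError.

```python
import json
import warnings


def first_hit(search_fn):
    """Hypothetical helper: run search_fn() and return the first hit, or None.

    search_fn is any zero-argument callable that performs the search.
    """
    try:
        result = search_fn()
    except json.JSONDecodeError:
        # perform_search path: the response body could not be decoded as JSON.
        warnings.warn("Search response was not valid JSON; treating as no hits")
        return None
    if not result:
        # Query.search() path: pypdb warned and returned None (or nothing came back).
        return None
    # Query.search() returns a dict with "result_set"; perform_search returns a list.
    hits = result.get("result_set", []) if isinstance(result, dict) else result
    return hits[0] if hits else None
```

In the first example from this issue, `best_hit = first_hit(q.search)` would then yield None for the problematic sequence instead of raising TypeError. Note that this also hides genuine request failures, so the emitted warnings are worth checking.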