litstudy
Scopus400Error: Error translating query - Refining results with "source title" query argument
Hi,
Is it possible to refine/process publications from Scopus limited to source titles containing a specified keyword? For example, my query (and variations thereof) gives me the above error after refining roughly 2000 publications:
( TITLE-ABS-KEY ( "recommend* sys*" OR "recommend* servi*" ) AND SRCTITLE ( "comput*" OR "acm" ) )
I had a look at the API reference and the existing issues, but I had some trouble finding an answer to my question.
Thank you
Thanks for using litstudy!
I cannot see the error. The query looks fine. Does the query work if you use it on the Scopus website?
The query works on Scopus, and I can find publications there and export them. I've tried variations without quotation marks, with/without brackets, with only one keyword, with/without wildcards, etc., but I get the same error.
Here is the error:
Scopus400Error Traceback (most recent call last)
Cell In[2], line 7
4 import logging
5 logging.getLogger().setLevel(logging.CRITICAL)
----> 7 docs_scopus, docs_not_found = litstudy.refine_scopus(docs_scopus)
8 print(len(docs_scopus), "papers found on Scopus")
9 print(len(docs_not_found), "papers NOT found on Scopus")
File ~\AppData\Roaming\Python\Python311\site-packages\litstudy\sources\scopus.py:248, in refine_scopus(docs, search_title)
244 return ScopusDocument.from_eid(record.eid)
246 return None
--> 248 return docs._refine_docs(callback)
File ~\AppData\Roaming\Python\Python311\site-packages\litstudy\types.py:53, in DocumentSet._refine_docs(self, callback)
50 old_docs = []
52 for i, doc in enumerate(progress_bar(self.docs)):
---> 53 new_doc = callback(doc)
55 if new_doc is not None:
56 new_indices.append(i)
File ~\AppData\Roaming\Python\Python311\site-packages\litstudy\sources\scopus.py:236, in refine_scopus.<locals>.callback(doc)
234 if len(title) > 10 and search_title:
235 query = f"TITLE({title})"
--> 236 response = ScopusSearch(query, view="STANDARD", download=False)
237 nresults = response.get_results_size()
239 if nresults > 0 and nresults < 10:
File ~\AppData\Roaming\Python\Python311\site-packages\pybliometrics\scopus\scopus_search.py:206, in ScopusSearch.__init__(self, query, refresh, view, verbose, download, integrity_fields, integrity_action, subscriber, **kwds)
204 self._query = query
205 self._view = view
--> 206 Search.__init__(self, query=query, api='ScopusSearch', count=count,
207 cursor=subscriber, download=download,
208 verbose=verbose, **kwds)
File ~\AppData\Roaming\Python\Python311\site-packages\pybliometrics\scopus\superclasses\search.py:62, in Search.__init__(self, query, api, count, cursor, download, verbose, **kwds)
59 self._cache_file_path = get_folder(api, self._view)/stem
61 # Init
---> 62 Base.__init__(self, params=params, url=URLS[api], download=download,
63 api=api, verbose=verbose)
File ~\AppData\Roaming\Python\Python311\site-packages\pybliometrics\scopus\superclasses\base.py:66, in Base.__init__(self, params, url, api, download, verbose, *args, **kwds)
64 self._json = loads(fname.read_text())
65 else:
---> 66 resp = get_content(url, api, params, *args, **kwds)
67 header = resp.headers
69 if ab_ref_retrieval:
File ~\AppData\Roaming\Python\Python311\site-packages\pybliometrics\scopus\utils\get_content.py:116, in get_content(url, api, params, **kwds)
114 except:
115 reason = ""
--> 116 raise errors[resp.status_code](reason)
117 except KeyError:
118 resp.raise_for_status()
Scopus400Error: Error translating query
--
I'm using a Jupyter notebook and get the same error both via the university VPN and on campus.
This seems to be a bug. litstudy searches Scopus for the title of each paper using the query "TITLE({title})", but for certain titles this produces a query with syntax that Scopus rejects. This will need further investigation.
However, I don't really understand the line litstudy.refine_scopus(docs_scopus). You have loaded documents from Scopus into docs_scopus and then want to refine them again using Scopus? Or do you load the original documents from a file?
Ahh, maybe that explains why the refining always only works until a certain publication before the error appears.
I exported the .csv from Scopus and loaded the file into docs_scopus and then refined them. Is this only meant to be done for non-Scopus datasets?
Ahh, maybe that explains why the refining always only works until a certain publication before the error appears.
Indeed. If you could figure out which publication it fails on, you can remove that one from the dataset as a temporary solution.
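One way to locate the offending publication is to run the lookup one title at a time and record which ones raise. This is only a minimal sketch: `find_failures` and `fake_lookup` are hypothetical names, the first title is made up, and `fake_lookup` merely stands in for the real Scopus title search (it is assumed here to reject parentheses, which may not be the actual cause).

```python
def find_failures(titles, lookup):
    """Call lookup(title) for each title and collect the ones that raise."""
    failures = []
    for index, title in enumerate(titles):
        try:
            lookup(title)
        except Exception as exc:  # broad on purpose: we only want to locate the failure
            failures.append((index, title, exc))
    return failures


def fake_lookup(title):
    """Hypothetical stand-in for the Scopus search; rejects titles with parentheses."""
    if "(" in title or ")" in title:
        raise ValueError("Error translating query")


titles = [
    "An example title without punctuation",
    "Research on the number of prime numbers between n² and (n+1)²",
]

failures = find_failures(titles, fake_lookup)
for index, title, exc in failures:
    print(f"document {index} failed: {title!r} ({exc})")
```

In a real notebook, `lookup` would be whatever function performs the title search, and the reported indices tell you which rows to drop from the CSV.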
I exported the .csv from scopus and loaded the file into docs_scopus and then refined them. Is this only meant to be done for non-scopus datasets?
That is fine, if you load it from a CSV file it indeed makes sense to refine it afterwards. The function refine_scopus should work on any dataset from any source. It fails here because of a bug :-(
If you would like to look into this issue, we are happy to accept pull requests!
I think what needs to happen is that the title is "stripped" of punctuation before it is passed to Scopus. For example, if the title is something like:
Research on the number of prime numbers between n² and (n+1)²
The query sent to Scopus will be:
TITLE(Research on the number of prime numbers between n² and (n+1)²)
but all those non-alphabetic characters result in a query that is not accepted by Scopus.
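A rough sketch of what such stripping could look like is below. `sanitize_title` and `build_title_query` are hypothetical helpers, not existing litstudy functions, and I have not verified which characters Scopus actually rejects; this version simply keeps letters, decimal digits and whitespace and replaces everything else with a space.

```python
def sanitize_title(title: str) -> str:
    """Keep letters, decimal digits and whitespace; turn everything else into a space."""
    kept = (
        ch if (ch.isalpha() or ch.isdecimal() or ch.isspace()) else " "
        for ch in title
    )
    # split()/join collapses the runs of spaces left behind by removed characters
    return " ".join("".join(kept).split())


def build_title_query(title: str) -> str:
    """Build the TITLE(...) query from the sanitized title (assumed query format)."""
    return f"TITLE({sanitize_title(title)})"


query = build_title_query(
    "Research on the number of prime numbers between n² and (n+1)²"
)
print(query)
# TITLE(Research on the number of prime numbers between n and n 1)
```

Using `str.isalpha` rather than an ASCII-only filter keeps accented and non-Latin letters in titles, while superscripts such as ² are dropped because they are neither letters nor decimal digits.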
Additionally, in the case where a Document already has a ScopusID, we can just query Scopus directly for the publication without having to search based on the title (I think the CSV file already provides the ScopusID).
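The control flow for that could look roughly like the sketch below. Everything here is an assumption for illustration: documents are modelled as plain dicts with a hypothetical "scopus_id" key, and `lookup_by_id` and `search_by_title` are placeholder callables standing in for the real Scopus calls.

```python
def refine_one(doc, lookup_by_id, search_by_title):
    """Prefer a direct ID lookup; fall back to a title search only without an ID."""
    scopus_id = doc.get("scopus_id")  # hypothetical field name
    if scopus_id:
        return lookup_by_id(scopus_id)
    return search_by_title(doc["title"])


# Placeholder callables that only record which path was taken.
def lookup_by_id(scopus_id):
    return ("by_id", scopus_id)


def search_by_title(title):
    return ("by_title", title)


with_id = {"title": "Some title (with punctuation)", "scopus_id": "EXAMPLE-ID"}
without_id = {"title": "Some other title"}

print(refine_one(with_id, lookup_by_id, search_by_title))
print(refine_one(without_id, lookup_by_id, search_by_title))
```

With this ordering, a title containing problematic characters never reaches the search query as long as the document carries an ID.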