pybliometrics
ScienceDirect: Object Retrieval
It seems that the only consistent way to identify an object is by its eid, which has the following structure:
<file_eid>-<object_ref>.<object_suffix>
An example is 1-s2.0-S0893608024005562-si15.svg
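To make the structure concrete, here is a minimal sketch that splits an object eid into its three parts. The helper name split_object_eid is hypothetical and not part of pybliometrics; it assumes the object reference never contains a hyphen and the suffix never contains a dot.

```python
def split_object_eid(eid: str) -> tuple[str, str, str]:
    """Split '<file_eid>-<object_ref>.<object_suffix>' into its three parts.

    Hypothetical helper for illustration only, not a pybliometrics function.
    """
    # The suffix follows the last dot; the file eid itself contains a dot
    # ('1-s2.0-...'), so we must split from the right.
    stem, _, suffix = eid.rpartition(".")
    # The object reference follows the last hyphen of what remains.
    file_eid, _, object_ref = stem.rpartition("-")
    if not (file_eid and object_ref and suffix):
        raise ValueError(f"Not an object eid: {eid!r}")
    return file_eid, object_ref, suffix


print(split_object_eid("1-s2.0-S0893608024005562-si15.svg"))
# ('1-s2.0-S0893608024005562', 'si15', 'svg')
```

The last two parts joined by a dot (here si15.svg) form the object file name.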
Therefore the most reliable strategy is to retrieve objects by passing the document identifier and object file name:
ObjectRetrieval('10.1016/j.neunet.2024.106632', filename='gr3.jpg')
To get the file names, users can use the ObjectMetadata class:
o_md = ObjectMetadata('10.1016/j.neunet.2024.106632')
filenames = [f['filename'] for f in o_md.results]
How would users know the filename beforehand?
There is a naming convention. All items are enumerated, with a prefix and suffix that depend on their type (figure, math formula, pdf):
- Standard figures: gr<nr>.jpg
- Formulas: si<nr>.svg
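As a rough sketch of how this convention could be used to guess a filename without querying the metadata first: the mapping and the helper object_filename below are hypothetical, cover only the two patterns listed here, and a guessed name may not exist for a given document.

```python
# Prefix and suffix per object type, taken from the two patterns listed
# in this thread. Other types (such as pdf) are left out because their
# pattern is not given here.
NAMING = {
    "figure": ("gr", "jpg"),
    "formula": ("si", "svg"),
}


def object_filename(kind: str, nr: int) -> str:
    """Build a filename such as 'gr3.jpg' (hypothetical helper)."""
    prefix, suffix = NAMING[kind]
    return f"{prefix}{nr}.{suffix}"


print(object_filename("figure", 3))    # gr3.jpg
print(object_filename("formula", 15))  # si15.svg
```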
Manually, there are two options:
- Use the ObjectMetadata class and get the filenames of all objects:
o_md = ObjectMetadata('10.1016/j.neunet.2024.106632')
filenames = [f['filename'] for f in o_md.results]
- Check the paper online and inspect the download link: https://ars.els-cdn.com/content/image/1-s2.0-S1566253524004342-gr2_lrg.jpg
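For the second option, the filename can be read off the download link programmatically. This is only a sketch: filename_from_link is a hypothetical helper, and it assumes the last path segment of the link follows the <file_eid>-<object_ref>.<object_suffix> structure, with no hyphen inside the object reference.

```python
from posixpath import basename
from urllib.parse import urlparse


def filename_from_link(url: str) -> str:
    """Return the object filename from a download link (hypothetical helper)."""
    # Last path segment, e.g. '1-s2.0-S1566253524004342-gr2_lrg.jpg'.
    object_eid = basename(urlparse(url).path)
    # Everything after the last hyphen is the object filename.
    file_eid, sep, filename = object_eid.rpartition("-")
    if not (sep and file_eid and filename):
        raise ValueError(f"Unexpected link format: {url!r}")
    return filename


link = "https://ars.els-cdn.com/content/image/1-s2.0-S1566253524004342-gr2_lrg.jpg"
print(filename_from_link(link))  # gr2_lrg.jpg
```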
Alright, then let's make the class work with the filename. I will include your hints in the documentation.