Problems retrieving TaskDocs for materials
I'm trying to retrieve charge density data, and the corresponding task information for the calculations that produced that data.
I'd like to be able to download the VASP input and output files associated with the volumetric charge density data for some materials.
Version Info
python==3.9.16
mp-api==0.30.10
pymatgen==2023.3.23
boto3==1.26.99
emmet-core==0.51.1
Reproduction
I'm trying to retrieve charge density for materials with inc_task_doc=True
from mp_api.client import MPRester

mpid = "mp-149"
with MPRester("<api_key>") as mpr:
    chgcar = mpr.get_charge_density_from_material_id(mpid, inc_task_doc=True)
Produces output:
ValueError: No POTCAR for Si with functional PBE found. Please set the PMG_VASP_PSP_DIR environment in .pmgrc.yaml, or you may need to set PMG_DEFAULT_FUNCTIONAL to PBE_52 or PBE_54 if you are using newer psps from VASP.
Full Stack Trace
Retrieving MaterialsDoc documents: 100%|██████████| 1/1 [00:00<00:00, 27413.75it/s]
Retrieving ChgcarDataDoc documents: 100%|██████████| 2/2 [00:00<00:00, 60787.01it/s]
Retrieving ChgcarDataDoc documents: 100%|██████████| 1/1 [00:00<00:00, 25575.02it/s]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[8], line 2
1 with MPRester("") as mpr:
----> 2 chgcar = mpr.get_charge_density_from_material_id(mpid, inc_task_doc=True) # task=True ?? Look at github
3 # print(chgcar)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/mprester.py:1101, in MPRester.get_charge_density_from_material_id(self, material_id, inc_task_doc)
1098 raise MPRestError(f"No charge density fetched for {material_id}.")
1100 if inc_task_doc:
-> 1101 task_doc = self.tasks.get_data_by_id(latest_doc.task_id)
1102 return chgcar, task_doc
1104 return chgcar
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:839, in BaseRester.get_data_by_id(self, document_id, fields)
836 results = [] # type: List
838 try:
--> 839 results = self._query_resource_data(criteria=criteria, fields=fields, suburl=document_id) # type: ignore
840 except MPRestError:
842 if self.primary_key == "material_id":
843 # see if the material_id has changed, perhaps a task_id was supplied
844 # this should likely be re-thought
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:797, in BaseRester._query_resource_data(self, criteria, fields, suburl, use_document_model, timeout)
774 def _query_resource_data(
775 self,
776 criteria: Optional[Dict] = None,
(...)
780 timeout: Optional[int] = None,
781 ) -> Union[List[T], List[Dict]]:
782 """
783 Query the endpoint for a list of documents without associated meta information. Only
784 returns a single page of results.
(...)
794 A list of documents
795 """
--> 797 return self._query_resource( # type: ignore
798 criteria=criteria,
799 fields=fields,
800 suburl=suburl,
801 use_document_model=use_document_model,
802 chunk_size=1000,
803 num_chunks=1,
804 ).get("data")
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:295, in BaseRester._query_resource(self, criteria, fields, suburl, use_document_model, parallel_param, num_chunks, chunk_size, timeout)
292 if not url.endswith("/"):
293 url += "/"
--> 295 data = self._submit_requests(
296 url=url,
297 criteria=criteria,
298 use_document_model=use_document_model,
299 parallel_param=parallel_param,
300 num_chunks=num_chunks,
301 chunk_size=chunk_size,
302 timeout=timeout,
303 )
305 return data
307 except RequestException as ex:
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:429, in BaseRester._submit_requests(self, url, criteria, use_document_model, parallel_param, num_chunks, chunk_size, timeout)
425 remaining_docs_avail = {}
427 initial_params_list = [{"url": url, "verify": True, "params": copy(crit)} for crit in new_criteria]
--> 429 initial_data_tuples = self._multi_thread(use_document_model, initial_params_list)
431 for data, subtotal, crit_ind in initial_data_tuples:
433 subtotals.append(subtotal)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:634, in BaseRester._multi_thread(self, use_document_model, params_list, progress_bar, timeout)
630 finished, futures = wait(futures, return_when=FIRST_COMPLETED)
632 for future in finished:
--> 634 data, subtotal = future.result()
636 if progress_bar is not None:
637 progress_bar.update(len(data["data"]))
File ~/.conda/envs/materials-project/lib/python3.9/concurrent/futures/_base.py:439, in Future.result(self, timeout)
437 raise CancelledError()
438 elif self._state == FINISHED:
--> 439 return self.__get_result()
441 self._condition.wait(timeout)
443 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:
File ~/.conda/envs/materials-project/lib/python3.9/concurrent/futures/_base.py:391, in Future.__get_result(self)
389 if self._exception:
390 try:
--> 391 raise self._exception
392 finally:
393 # Break a reference cycle with the exception in self._exception
394 self = None
File ~/.conda/envs/materials-project/lib/python3.9/concurrent/futures/thread.py:58, in _WorkItem.run(self)
55 return
57 try:
---> 58 result = self.fn(*self.args, **self.kwargs)
59 except BaseException as exc:
60 self.future.set_exception(exc)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/mp_api/client/core/client.py:685, in BaseRester._submit_request_and_process(self, url, verify, params, use_document_model, timeout)
682 if response.status_code == 200:
684 if self.monty_decode:
--> 685 data = json.loads(response.text, cls=MontyDecoder)
686 else:
687 data = json.loads(response.text)
File ~/.conda/envs/materials-project/lib/python3.9/json/__init__.py:359, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
357 if parse_constant is not None:
358 kw['parse_constant'] = parse_constant
--> 359 return cls(**kw).decode(s)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:475, in MontyDecoder.decode(self, s)
473 else:
474 d = json.loads(s)
--> 475 return self.process_decoded(d)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in MontyDecoder.process_decoded(self, d)
451 elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
452 return bson.objectid.ObjectId(d["oid"])
--> 454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
457 return [self.process_decoded(x) for x in d]
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in <dictcomp>(.0)
451 elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
452 return bson.objectid.ObjectId(d["oid"])
--> 454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
457 return [self.process_decoded(x) for x in d]
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:457, in MontyDecoder.process_decoded(self, d)
454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
--> 457 return [self.process_decoded(x) for x in d]
459 return d
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:457, in <listcomp>(.0)
454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
--> 457 return [self.process_decoded(x) for x in d]
459 return d
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in MontyDecoder.process_decoded(self, d)
451 elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
452 return bson.objectid.ObjectId(d["oid"])
--> 454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
457 return [self.process_decoded(x) for x in d]
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in <dictcomp>(.0)
451 elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
452 return bson.objectid.ObjectId(d["oid"])
--> 454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
457 return [self.process_decoded(x) for x in d]
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in MontyDecoder.process_decoded(self, d)
451 elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
452 return bson.objectid.ObjectId(d["oid"])
--> 454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
457 return [self.process_decoded(x) for x in d]
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:454, in <dictcomp>(.0)
451 elif (bson is not None) and modname == "bson.objectid" and classname == "ObjectId":
452 return bson.objectid.ObjectId(d["oid"])
--> 454 return {self.process_decoded(k): self.process_decoded(v) for k, v in d.items()}
456 if isinstance(d, list):
457 return [self.process_decoded(x) for x in d]
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/monty/json.py:427, in MontyDecoder.process_decoded(self, d)
425 data = {k: v for k, v in d.items() if not k.startswith("@")}
426 if hasattr(cls_, "from_dict"):
--> 427 return cls_.from_dict(data)
428 if pydantic is not None and issubclass(cls_, pydantic.BaseModel): # pylint: disable=E1101
429 return cls_(**data)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/pymatgen/io/vasp/inputs.py:2262, in Potcar.from_dict(cls, d)
2256 @classmethod
2257 def from_dict(cls, d):
2258 """
2259 :param d: Dict representation
2260 :return: Potcar
2261 """
-> 2262 return Potcar(symbols=d["symbols"], functional=d["functional"])
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/pymatgen/io/vasp/inputs.py:2243, in Potcar.__init__(self, symbols, functional, sym_potcar_map)
2241 self.functional = functional
2242 if symbols is not None:
-> 2243 self.set_symbols(symbols, functional, sym_potcar_map)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/pymatgen/io/vasp/inputs.py:2339, in Potcar.set_symbols(self, symbols, functional, sym_potcar_map)
2337 else:
2338 for el in symbols:
-> 2339 p = PotcarSingle.from_symbol_and_functional(el, functional)
2340 self.append(p)
File ~/.conda/envs/materials-project/lib/python3.9/site-packages/pymatgen/io/vasp/inputs.py:1897, in PotcarSingle.from_symbol_and_functional(symbol, functional)
1895 d = SETTINGS.get("PMG_VASP_PSP_DIR")
1896 if d is None:
-> 1897 raise ValueError(
1898 f"No POTCAR for {symbol} with functional {functional} found. Please set the PMG_VASP_PSP_DIR "
1899 "environment in .pmgrc.yaml, or you may need to set PMG_DEFAULT_FUNCTIONAL to PBE_52 or "
1900 "PBE_54 if you are using newer psps from VASP."
1901 )
1902 paths_to_try = [
1903 os.path.join(d, funcdir, f"POTCAR.{symbol}"),
1904 os.path.join(d, funcdir, symbol, "POTCAR"),
1905 ]
1906 for p in paths_to_try:
ValueError: No POTCAR for Si with functional PBE found. Please set the PMG_VASP_PSP_DIR environment in .pmgrc.yaml, or you may need to set PMG_DEFAULT_FUNCTIONAL to PBE_52 or PBE_54 if you are using newer psps from VASP.
Looking through the stack trace, it looks like the API is trying to retrieve the docs associated with the latest task and is unable to locate some VASP files for that task.
To get around this issue, I also tried retrieving ALL of the download information for this material:
with MPRester("<api-key>") as mpr:
    data = mpr.get_download_info(material_ids=["mp-149"])
And received the output (task docs metadata, task NOMAD url where it exists):
({MPID(mp-149): [{'task_id': 'mp-655585',
'calc_type': <CalcType.GGA_Static: 'GGA Static'>},
{'task_id': 'mp-656511',
'calc_type': <CalcType.GGA_NSCF_Line: 'GGA NSCF Line'>},
{'task_id': 'mp-655936',
'calc_type': <CalcType.GGA_NSCF_Uniform: 'GGA NSCF Uniform'>},
{'task_id': 'mp-11721',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-149',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-1057373', 'calc_type': <CalcType.GGA_Static: 'GGA Static'>},
{'task_id': 'mp-1057366',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-1057380',
'calc_type': <CalcType.GGA_NSCF_Uniform: 'GGA NSCF Uniform'>},
{'task_id': 'mp-1059585',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-1059589', 'calc_type': <CalcType.GGA_Static: 'GGA Static'>},
{'task_id': 'mp-1059603',
'calc_type': <CalcType.GGA_NSCF_Uniform: 'GGA NSCF Uniform'>},
{'task_id': 'mp-1120258',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-1120259',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-1141021',
'calc_type': <CalcType.GGA_DFPT_Dielectric: 'GGA DFPT Dielectric'>},
{'task_id': 'mp-1248038',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>},
{'task_id': 'mp-1249516',
'calc_type': <CalcType.GGA_NMR_Electric_Field_Gradient: 'GGA NMR Electric Field Gradient'>},
{'task_id': 'mp-1267607',
'calc_type': <CalcType.GGA_NMR_Nuclear_Shielding: 'GGA NMR Nuclear Shielding'>},
{'task_id': 'mp-1440634', 'calc_type': <CalcType.GGA_Static: 'GGA Static'>},
{'task_id': 'mp-1686587',
'calc_type': <CalcType.GGA_NSCF_Uniform: 'GGA NSCF Uniform'>},
{'task_id': 'mp-1791788', 'calc_type': <CalcType.GGA_Static: 'GGA Static'>},
{'task_id': 'mp-1594776',
'calc_type': <CalcType.GGA_NSCF_Line: 'GGA NSCF Line'>},
{'task_id': 'mp-1592727',
'calc_type': <CalcType.GGA_NSCF_Line: 'GGA NSCF Line'>},
{'task_id': 'mp-1947498',
'calc_type': <CalcType.R2SCAN_Structure_Optimization: 'R2SCAN Structure Optimization'>},
{'task_id': 'mp-1950734',
'calc_type': <CalcType.PBESol_Structure_Optimization: 'PBESol Structure Optimization'>},
{'task_id': 'mp-1059604',
'calc_type': <CalcType.GGA_NSCF_Line: 'GGA NSCF Line'>},
{'task_id': 'mp-1057384',
'calc_type': <CalcType.GGA_NSCF_Line: 'GGA NSCF Line'>},
{'task_id': 'mp-1536661',
'calc_type': <CalcType.SCAN_Structure_Optimization: 'SCAN Structure Optimization'>},
{'task_id': 'mp-2250750',
'calc_type': <CalcType.GGA_NSCF_Uniform: 'GGA NSCF Uniform'>},
{'task_id': 'mp-2299819',
'calc_type': <CalcType.HSE06_Static: 'HSE06 Static'>},
{'task_id': 'mp-2291052', 'calc_type': <CalcType.GGA_Static: 'GGA Static'>},
{'task_id': 'mp-2683378',
'calc_type': <CalcType.GGA_Structure_Optimization: 'GGA Structure Optimization'>}]},
['https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-11721',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-149',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-1057366',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-1057380',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-1059585',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-1059589',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-1059604',
'https://nomad-lab.eu/prod/rae/api/raw/query?external_id=mp-1057384'])
So I can get the task info this way, but it's not clear which of these calculations is associated with the material's charge density data.
The two questions I have:
- Is the ValueError seen with the get_charge_density_from_material_id method a bug?
- Is there a way to find the task_id that produced the charge density data for any one material? Then I could download the VASP files associated with that task.
Thanks for any help you can offer!
@keeganq, thanks for reporting this issue. This happens because the API client by default tries to deserialize data into the appropriate pymatgen objects. Since you have pymatgen installed but do not have a fully functional POTCAR configuration, the deserialization fails. We are aware of this on our end and are planning a couple of different changes to fix it. For now, the easiest thing to do is to pass monty_decode=False to MPRester alongside your API key. This should disable all deserialization by the client.
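To make the failure mode concrete, here is a minimal stdlib-only sketch of the decode step, loosely modeled on the monty frames in the stack trace above. It is not monty's or pymatgen's actual code: FakePotcar, REGISTRY, and load are hypothetical names, and the real decoder imports classes by module name rather than using a lookup table. It only illustrates why skipping the decode step avoids the POTCAR lookup.

```python
import json


class FakePotcar:
    """Hypothetical stand-in for a class that needs local files to be rebuilt."""

    psp_dir = None  # plays the role of PMG_VASP_PSP_DIR being unset

    def __init__(self, symbols, functional):
        if self.psp_dir is None:
            raise ValueError(
                f"No POTCAR for {symbols[0]} with functional {functional} found."
            )
        self.symbols = symbols
        self.functional = functional

    @classmethod
    def from_dict(cls, d):
        return cls(symbols=d["symbols"], functional=d["functional"])


# Hypothetical lookup table; the real decoder resolves classes by importing modules.
REGISTRY = {("fake.inputs", "Potcar"): FakePotcar}


def process_decoded(d):
    """Recursively rebuild objects from dicts tagged with @module/@class."""
    if isinstance(d, dict):
        cls_ = REGISTRY.get((d.get("@module"), d.get("@class")))
        if cls_ is not None:
            data = {k: v for k, v in d.items() if not k.startswith("@")}
            return cls_.from_dict(data)
        return {k: process_decoded(v) for k, v in d.items()}
    if isinstance(d, list):
        return [process_decoded(x) for x in d]
    return d


def load(text, monty_decode=True):
    """Parse a response body, optionally rebuilding tagged objects."""
    raw = json.loads(text)
    return process_decoded(raw) if monty_decode else raw


response_text = json.dumps(
    {"data": [{"orig_inputs": {"potcar": {
        "@module": "fake.inputs", "@class": "Potcar",
        "symbols": ["Si"], "functional": "PBE"}}}]}
)

# With decoding off, the payload stays as plain dicts and nothing is constructed.
raw = load(response_text, monty_decode=False)

# With decoding on, from_dict runs and hits the missing local configuration.
try:
    load(response_text, monty_decode=True)
    decode_error = None
except ValueError as exc:
    decode_error = str(exc)
```

In the real client the equivalent switch is the monty_decode=False argument to MPRester mentioned above.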
Additionally, I have just realized that the latest changes to the TaskDoc model in emmet-core have broken pulling task data through the API. I have just pinned emmet-core<=0.50.0 and released a patch as mp-api==0.30.11. Before pulling data, I would update your installation of both packages.
Thanks @munrojm! This is looking much better now. I am able to retrieve a TaskDoc with get_charge_density_from_material_id(<mpid>, inc_task_doc=True). Would it be safe to assume that this TaskDoc is the one that is associated with the calculations used for the volumetric charge density data?
Yup! That is correct. The CHGCAR is taken from that specific calculation.
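For anyone landing here later, a small sketch of how the pair could be unpacked to get at the originating task. The helper name chgcar_with_source_task is made up for illustration, and reading task_id off the returned document (as a dict key or an attribute) is an assumption about its shape, not something confirmed in this thread.

```python
from typing import Any, Tuple


def chgcar_with_source_task(mpr: Any, material_id: str) -> Tuple[Any, Any, Any]:
    """Return (chgcar, task_doc, task_id) for one material.

    `mpr` is expected to be an open MPRester, or anything exposing the same method.
    """
    # With inc_task_doc=True the client returns a (chgcar, task_doc) pair,
    # as seen in the mprester.py frame of the stack trace.
    chgcar, task_doc = mpr.get_charge_density_from_material_id(
        material_id, inc_task_doc=True
    )
    # Assumption: the task document exposes `task_id` as a dict key or attribute.
    if isinstance(task_doc, dict):
        task_id = task_doc.get("task_id")
    else:
        task_id = getattr(task_doc, "task_id", None)
    return chgcar, task_doc, task_id


# Intended usage (needs network access and a valid key, so left commented out):
# from mp_api.client import MPRester
# with MPRester("<api_key>", monty_decode=False) as mpr:
#     chgcar, task_doc, task_id = chgcar_with_source_task(mpr, "mp-149")
```

The resulting task_id can then be matched against the entries returned by get_download_info to pick out the right calculation.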
An update on this: I was able to configure pymatgen with a local set of POTCAR files, and was previously able to retrieve TaskDocs with monty_decode=True in MPRester, as you suggested. These TaskDocs would have decoded objects, specifically TaskDoc.orig_inputs.potcar would be a list of pymatgen.io.vasp.inputs.PotcarSingle objects.
Unfortunately, this isn't working after some recent changes to the API. The potcar is instead returned as an emmet Potcar object, i.e. it was not decoded. I think I've identified the problem, and it looks very intentional:
https://github.com/materialsproject/api/blob/3ffecd21a859d8a9314ce64faa0d76c15ad29c5c/mp_api/client/mprester.py#L216-L218
Assuming that this behavior was intended, is there a new recommended way to decode objects in the TaskDoc?
Thanks as always for your help!
I've actually disabled Monty decoding by default for the task endpoint while we get a better solution for this. You can instead pass the data to the process_decoded method of MontyDecoder to decode it manually. Instantiating the TaskDoc with the data as input arguments should also decode any data that isn't nested using monty.
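A rough sketch of what the manual decoding could look like. The helper name decode_task_data and its optional decoder argument are hypothetical, added only so the function can be exercised without monty installed. The one real library call is MontyDecoder().process_decoded(...), which also appears in the stack trace earlier in this thread.

```python
from typing import Any


def decode_task_data(raw: Any, decoder: Any = None) -> Any:
    """Manually run Monty decoding over undecoded task data.

    `raw` is whatever undecoded structure came back from the task endpoint,
    for example the dict behind orig_inputs.potcar.
    """
    if decoder is None:
        # Imported lazily so this helper can be defined without monty installed.
        from monty.json import MontyDecoder

        decoder = MontyDecoder()
    # process_decoded walks dicts and lists and rebuilds any object tagged
    # with @module/@class, calling the class's from_dict where available.
    return decoder.process_decoded(raw)
```

Note that decoding a Potcar this way still goes through pymatgen's POTCAR lookup, so it needs the local POTCAR configuration discussed earlier in the thread.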