qca-dataset-submission
Retagging CI Sometimes Does Not Retag
Problem: In #427 (TD) and #432 (Opt) the CI did not retag the datasets. After waiting long enough to rule out any update lag, the record tags had still not been updated, even though the CI reported success.
Workaround: In both cases, ds.modify_records(new_tag=base_tag) was run to update the tags, and the change could be confirmed immediately.
To Reproduce: The tags may be readily checked with:
from collections import Counter
from qcportal import PortalClient

ADDRESS = "https://api.qcarchive.molssi.org:443/"
qc_client = PortalClient(ADDRESS, cache_dir=".")

# dataset_name must be set to the name of the dataset being checked
ds = qc_client.get_dataset("optimization", dataset_name)

# Count the tag of every record that still has a task attached
count_tags = Counter()
for _, _, rec in ds.iterate_records():
    if rec.task is not None:
        count_tags[rec.task.tag] += 1
print("Overall", count_tags)
> Overall Counter({'openff': 1000})
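The check and the workaround above could be combined into one step, so that the manual retag only runs when a mismatch is actually found. This is a minimal sketch, not the CI's implementation: the helper names count_task_tags and retag_if_needed are hypothetical, and it assumes only the calls already used in this issue (ds.iterate_records(), rec.task.tag, and ds.modify_records(new_tag=...)).

```python
from collections import Counter


def count_task_tags(ds):
    """Count tags over all records that still have a task attached."""
    counts = Counter()
    for _, _, rec in ds.iterate_records():
        if rec.task is not None:
            counts[rec.task.tag] += 1
    return counts


def retag_if_needed(ds, base_tag):
    """Hypothetical helper: re-apply base_tag when any task has another tag.

    Returns the number of tasks that did not carry base_tag before the call.
    """
    counts = count_task_tags(ds)
    stale = sum(n for tag, n in counts.items() if tag != base_tag)
    if stale:
        # Same call as the manual workaround described in this issue
        ds.modify_records(new_tag=base_tag)
    return stale
```

Calling retag_if_needed(ds, base_tag) and then re-running the tag count would show whether the update took effect.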
Solution: (I'm 70% on this) This issue did not occur for #428, which used the new "mw_" feature, while the other PRs did not. If that is the cause, then the fix should be relatively straightforward.
While fixing this, it would make sense to address #423 and maybe #357 as well. Also, the way tags are accessed returns a PaginatedList, which may not be up to date. Switch all instances of list(map(lambda x: x.name, pr.labels)) to [ label.name for label in pr.get_labels() ] (or maybe just pr.get_labels()?)
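One way to make the proposed switch in a single place is a small wrapper, so every call site reads labels the same way. This is only a sketch: label_names and has_label are hypothetical names, and it assumes pr is a PyGithub pull request object (or anything else exposing get_labels() that yields objects with a name attribute).

```python
def label_names(pr):
    """Return the label names of a PR, read through pr.get_labels()."""
    return [label.name for label in pr.get_labels()]


def has_label(pr, wanted):
    """Check whether the PR currently carries the label `wanted`."""
    return wanted in label_names(pr)
```

Whether this actually resolves the stale-label concern would still need to be confirmed against a PR whose labels change during a CI run.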
Let's discuss this in the QCSubmit meeting on Mar 11.
Possibly the MolSSI server gets tied up, either on Fridays or because retagging the calculations while NRP is simultaneously fetching them causes a communication issue?