deeprank2

Evaluate whether we want to stop suppressing errors by default in the `_process_one_query` method of `QueryCollection`

Open DaniBodor opened this issue 2 years ago • 3 comments

In principle, we expect users to have "good" PDB files. Currently, we catch errors in case there is a problem with the PDBs. Evaluate whether errors still occur when all PDBs are good, and whether we prefer to raise errors by default with an option to suppress them, or to catch them by default and only raise a warning (as is the situation now).

DaniBodor avatar Nov 06 '23 10:11 DaniBodor

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Dec 08 '23 03:12 github-actions[bot]

The decision is to raise an error by default when a faulty PDB file is encountered, which stops execution at that point. We will also implement a flag that lets users suppress the error and proceed with only the non-faulty data.

Potential issues with this approach:

  • When running on a large dataset in which only a small percentage of the data is faulty, a failure in one of the last queries wastes time and resources, because the processing step has to be re-run.
  • We need to deal with duplicate files (probably the old files are simply overwritten by the new ones).
    • We can check whether it is possible to (optionally) skip queries for which data already exists, which would solve both problems above.
  • When suppressing, we should log clearly how many times the error would have occurred, along with a list of all instances.
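
A minimal sketch of how these points could fit together, for discussion only. This is not the deeprank2 implementation: `process_queries`, `process_one`, the `suppress_errors` and `skip_existing` flags, and the one-`.hdf5`-file-per-query layout are all hypothetical names and assumptions made for illustration.

```python
from __future__ import annotations

import logging
from pathlib import Path
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


def process_queries(
    queries: Iterable[str],
    process_one: Callable[[str], None],
    output_dir: Path,
    suppress_errors: bool = False,  # hypothetical flag: False means fail hard
    skip_existing: bool = False,  # hypothetical flag: reuse earlier output
) -> list[tuple[str, str]]:
    """Run process_one on every query; return (query_id, error) per suppressed failure."""
    failures: list[tuple[str, str]] = []
    for query_id in queries:
        # Assumption: each query writes exactly one file named after its id.
        if skip_existing and (output_dir / f"{query_id}.hdf5").exists():
            continue
        try:
            process_one(query_id)
        except Exception as e:
            if not suppress_errors:
                raise  # default: stop at the first faulty input
            failures.append((query_id, repr(e)))
    if failures:
        # One summary at the end: the count plus every failing instance.
        details = "\n".join(f"  {q}: {msg}" for q, msg in failures)
        logger.warning("%d queries failed and were skipped:\n%s", len(failures), details)
    return failures
```

With `skip_existing=True`, a run that stopped on a late faulty query could be restarted without redoing the queries that already produced output, which also avoids overwriting existing files.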

DaniBodor avatar Jul 09 '24 14:07 DaniBodor

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Aug 09 '24 03:08 github-actions[bot]

We still suppress the errors, but provide better feedback on what is going on. No option was added to enforce hard failures.

DaniBodor avatar Sep 06 '24 12:09 DaniBodor