drizzlepac
temporary ecsv file not removed and was ingested
Issue HLA-1174 was created on JIRA by Lisa Sherbert:
While testing HSTSDP-2022 to make sure catalog files were not produced, I found an unexpected one and found it was also ingested.
cmd is: sqsh -S GROUCHO -D dadstest2 -e -i hla.sql -P z -U
[0] GROUCHO.dadstest2.1> SELECT CONVERT(VARCHAR(55), afi_file_name) afi_file_name,
[0] GROUCHO.dadstest2.2> afi_archive_class,
[0] GROUCHO.dadstest2.3> afi_generation_date
[0] GROUCHO.dadstest2.4> FROM dbo.archive_files
[0] GROUCHO.dadstest2.5> WHERE afi_generation_date > '2023-12-08' and afi_file_name like '%ecsv'
afi_file_name afi_archive_class afi_generation_date
------------------------------------------------------- ----------------- --------------------------
hst_5397_17_wfpc2_pc_f555w_u27817_point-cat-fxm.ecsv HFS Dec 8 2023 09:50:41:000PM
(1 row affected)
hst_5397_17_wfpc2_pc_f555w_u27817_point-cat-fxm.ecsv is a temporary file that we really do not want to ingest. If it is not produced the next time the dataset is ingested, it is not removed from the on-line cache, so we would really like it to be gone before we ingest.
On tldmscsched2: In nigel_u27817_1702071124.524341/ALOG_1702071626_WFPC2_SingleVisitMosaic_u27817.out, I see:
2023342214027 INFO src=wfpc2_svm.set_env_vars msg="os.environ['SVM_CATALOG_PC'] = 'OFF'"
...
2023342214609 WARNING src=drizzlepac.haputils.svm_quality_analysis- [compare_ra_dec_crossmatches] Catalog hst_5397_17_wfpc2_pc_f555w_u27817_point-cat.ecsv Missing! No comparison can be made.
...
2023342214609 WARNING src=drizzlepac.haputils.svm_quality_analysis- Catalog hst_5397_17_wfpc2_pc_f555w_u27817_point-cat.ecsv does not exist. Both the Point and Segment catalogs must exist for comparison.
...
2023342214645 INFO src=drizzlepac.haputils.svm_quality_analysis- Crossmatch reference image hst_5397_17_wfpc2_pc_f555w_u27817_drz.fits contains 1 sources.
2023342214645 INFO src=drizzlepac.haputils.svm_quality_analysis-
2023342214645 INFO src=drizzlepac.haputils.svm_quality_analysis- Wrote temporary source catalog hst_5397_17_wfpc2_pc_f555w_u27817_point-cat-fxm.ecsv
2023342214645 WARNING src=drizzlepac.haputils.svm_quality_analysis- HAP Point sourcelist interfilter comparison (compare_interfilter_crossmatches) encountered a problem.
2023342214645 ERROR src=drizzlepac.haputils.svm_quality_analysis- message
Traceback (most recent call last):
File "/hsttst/project/pipeline/pkgs/miniconda3/envs/caldp_satandtools/lib/python3.9/site-packages/drizzlepac/haputils/svm_quality_analysis.py", line 1920, in run_quality_analysis
compare_interfilter_crossmatches(total_obj_list, json_timestamp=json_timestamp,
File "/hsttst/project/pipeline/pkgs/miniconda3/envs/caldp_satandtools/lib/python3.9/site-packages/drizzlepac/haputils/svm_quality_analysis.py", line 668, in compare_interfilter_crossmatches
filtobj_dict[imgname] = transform_coords(filtobj_dict[imgname],
File "/hsttst/project/pipeline/pkgs/miniconda3/envs/caldp_satandtools/lib/python3.9/site-packages/drizzlepac/haputils/svm_quality_analysis.py", line 925, in transform_coords
xy_centroid_values = np.stack((filtobj_subdict['sources']['xcentroid'],
File "/hsttst/project/pipeline/pkgs/miniconda3/envs/caldp_satandtools/lib/python3.9/site-packages/astropy/table/table.py", line 2055, in __getitem__
return self.columns[item]
File "/hsttst/project/pipeline/pkgs/miniconda3/envs/caldp_satandtools/lib/python3.9/site-packages/astropy/table/table.py", line 264, in __getitem__
return OrderedDict.__getitem__(self, item)
I’m assuming the temp file stayed around due to that Traceback?
Comment by Lisa Sherbert on JIRA:
Would this error only happen if metrics collection is turned on? If so, then it should not happen in Ops because metrics are not collected there.
Turn off COLLECT_INS_METRICS and test it on Test.
Comment by Lisa Sherbert on JIRA:
I was able to verify that u27817 did NOT produce the temporary ecsv file when COLLECT_INS_METRICS is set to false, which is Good. This issue should NOT happen in Operations.
Test is supposed to be Ops-like but in this case it is not. We collect the metrics but have not been able to do anything with them lately.
Comment by Lisa Sherbert on JIRA:
It may not even need to be worked? It may be you want to keep that file around if that kind of error occurs? Likely needs to be discussed with Michele.
I was mainly concerned that we were ingesting it, but that will NOT be an issue in Ops. Test is collecting metrics, but Ops will NOT. (Why Test still collects them is an open question, since we are not doing anything with them and cannot, because the tools we were going to use no longer work.)
It is a difficult problem to weed out files that should not be ingested and still allow calibration to create new products to be ingested. I thought at some point we were using the manifest file to know what to ingest, but that does not seem to be the case? At least not with WFPC2.
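To make the manifest idea concrete, here is a rough sketch of how an ingest step could combine a manifest allow-list with an exclusion pattern for temporary files. Everything in it is an assumption for illustration: `select_for_ingest` is a hypothetical helper, the `*-fxm.ecsv` pattern is inferred only from the single file name in this report, and the manifest is treated as a plain list of file names.

```python
import fnmatch

# Assumed naming convention for temporary catalogs, inferred from the one
# file reported in this issue; not a documented drizzlepac rule.
TEMP_PATTERNS = ("*-fxm.ecsv",)


def select_for_ingest(candidates, manifest_entries=None,
                      temp_patterns=TEMP_PATTERNS):
    """Return the candidate file names that should be ingested.

    candidates       -- file names found in the processing directory
    manifest_entries -- optional list of names from a manifest file; when
                        given, only listed files are eligible
    temp_patterns    -- glob patterns for files that are never ingested
    """
    allowed = None if manifest_entries is None else set(manifest_entries)
    selected = []
    for name in candidates:
        if any(fnmatch.fnmatchcase(name, pat) for pat in temp_patterns):
            continue  # matches a known temporary-file pattern
        if allowed is not None and name not in allowed:
            continue  # not listed in the manifest
        selected.append(name)
    return selected


found = [
    "hst_5397_17_wfpc2_pc_f555w_u27817_drz.fits",
    "hst_5397_17_wfpc2_pc_f555w_u27817_point-cat-fxm.ecsv",
]
print(select_for_ingest(found))
```

The pattern check runs first, so a temporary file is skipped even if it somehow ends up listed in the manifest. New calibration products are still picked up as long as they appear in the manifest and do not match an exclusion pattern.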