EPATADA Create function to find paired data (T, pH, hardness dependent criteria)

Is your feature request related to a problem? Please describe.

Water quality assessment criteria that depend on temperature (T), pH, hardness, etc. (e.g. criteria for metals) require either a site-specific default value or, preferably, a paired T, pH, hardness, etc. sample collected on the same day and at the same time and location as the pollutant sample (e.g. the metals sample).

Describe the solution you'd like

Use logic similar to the functions that identify duplicate results between/within orgs and to the TADA_PairReplicates function. Create a new function, TADA_FindPairedData.

Group data by org, site, date, and characteristics of interest (the function would require a site input and the characteristics of interest).
Define the main characteristic (a pollutant such as a metal; only one) and the paired characteristics of interest (pH, T, DO, hardness, etc.; select multiple).
Look within a user-defined time window (default: 10 minutes). See the example logic in TADA_PairReplicates.
Look within the site (default) or include nearby sites (user-defined area/radius). Leverage TADA_FindNearbySites.
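
The steps above could be sketched roughly as follows. This is a minimal illustration in plain Python, not the EPATADA implementation: the record fields (`org`, `site`, `char`, `time`, `value`), the characteristic labels, and the function name `find_paired_data` are all hypothetical, and the nearby-sites option is not covered.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified records. Field names and characteristic labels are
# illustrative only and are not the actual WQX/TADA column names.
results = [
    {"org": "ORG1", "site": "S1", "char": "ZINC", "time": datetime(2024, 1, 22, 10, 0), "value": 12.0},
    {"org": "ORG1", "site": "S1", "char": "PH", "time": datetime(2024, 1, 22, 10, 4), "value": 7.2},
    {"org": "ORG1", "site": "S1", "char": "HARDNESS", "time": datetime(2024, 1, 22, 10, 8), "value": 95.0},
    {"org": "ORG1", "site": "S1", "char": "HARDNESS", "time": datetime(2024, 1, 22, 10, 30), "value": 110.0},
    {"org": "ORG1", "site": "S2", "char": "PH", "time": datetime(2024, 1, 22, 10, 1), "value": 6.9},
]


def find_paired_data(records, main_char, paired_chars, window=timedelta(minutes=10)):
    """For each result of main_char, find the closest-in-time result for each
    paired characteristic from the same org and site, within `window`."""
    paired = []
    for main in (r for r in records if r["char"] == main_char):
        row = {"main": main, "pairs": {}}
        for char in paired_chars:
            candidates = [
                r for r in records
                if r["char"] == char
                and r["org"] == main["org"]
                and r["site"] == main["site"]
                and abs(r["time"] - main["time"]) <= window
            ]
            # None marks a characteristic with no pair inside the window.
            row["pairs"][char] = min(
                candidates,
                key=lambda r: abs(r["time"] - main["time"]),
                default=None,
            )
        paired.append(row)
    return paired


pairs = find_paired_data(results, "ZINC", ["PH", "HARDNESS", "TEMP"])
```

In this sketch the zinc result is paired with the pH and hardness results taken 4 and 8 minutes later at the same site, the hardness result 30 minutes later falls outside the window, and `TEMP` comes back as `None` because no temperature result exists.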

Reminders for TADA contributors addressing this issue

New features should include all of the following work:

[ ] Create the function/code.
[ ] Document all code using comments to describe what it does.
[ ] Create tests in tests folder.
[ ] Create help file using roxygen2 above code.
[ ] Create working examples in help file (via roxygen2).
[ ] Add to appropriate vignette (or create new one).

Jan 22 '24 17:01 cristinamullin

I have done similar work on Illinois data in the past (finding paired T, pH, hardness, etc. samples from the same day, time and location), but only from the same MonitoringLocation/date. I like the idea of leveraging TADA_FindNearbySites().

Jan 23 '24 13:01 hillarymarler

Once we've created TADA.MonitoringLocationIdentifier (which would then be modified after site review if the decision is made to treat two monitoring locations as one), maybe we will not need to leverage TADA_FindNearbySites() in this function?

Jun 26 '24 16:06 hillarymarler

There needs to be an option to substitute a default value for hardness, pH, or temperature when no paired result is available.
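
One way such an option might look, as a rough sketch only: the helper name `resolve_value`, the `defaults` mapping, and the numbers in it are hypothetical placeholders, not recommended or regulatory values.

```python
def resolve_value(char, paired_value, defaults):
    """Return (value, source) for a paired characteristic.

    Prefers the paired measurement, then a user-supplied default, and
    otherwise reports the value as missing.
    """
    if paired_value is not None:
        return paired_value, "paired"
    if char in defaults:
        return defaults[char], "default"
    return None, "missing"


# Placeholder defaults supplied by the user (illustrative numbers only).
user_defaults = {"HARDNESS": 100.0, "PH": 7.0}

hardness = resolve_value("HARDNESS", None, user_defaults)
ph = resolve_value("PH", 7.4, user_defaults)
temperature = resolve_value("TEMP", None, user_defaults)
```

Returning the source alongside the value keeps it visible downstream whether a criterion was calculated from a measured pair or from a substituted default.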

Jul 25 '24 15:07 hillarymarler

I've been working on this a bit. In the case of hardness, there are multiple characteristic names that correspond to hardness. In the current draft, users can rank the characteristic names so that if more than one is present as a possible pair for a result, the highest-ranked one is selected.

I am not sure what the default order of the ranking should be if the user does not provide a ranking ref.
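
The ranking idea can be illustrated with a small sketch. It assumes a user-supplied list ordered from highest to lowest priority; the function name `select_ranked`, the record fields, and the ranking order shown are hypothetical, and the characteristic names are used only as examples. The default order when no ranking is supplied is deliberately left open here.

```python
def select_ranked(candidates, ranking):
    """Pick the candidate whose characteristic name ranks highest.

    `ranking` lists characteristic names from highest to lowest priority.
    Candidates whose name is not in `ranking` are ignored.
    """
    rank = {name: position for position, name in enumerate(ranking)}
    ranked = [c for c in candidates if c["char"] in rank]
    if not ranked:
        return None
    return min(ranked, key=lambda c: rank[c["char"]])


# Illustrative ranking supplied by the user (order is an assumption).
hardness_ranking = ["Hardness, Ca, Mg", "Total hardness", "Hardness, carbonate"]

# Two possible hardness pairs for the same pollutant result.
possible_pairs = [
    {"char": "Total hardness", "value": 102.0},
    {"char": "Hardness, Ca, Mg", "value": 98.0},
]

selected = select_ranked(possible_pairs, hardness_ranking)
unranked = select_ranked([{"char": "Hardness, non-carbonate", "value": 5.0}], hardness_ranking)
```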

Aug 15 '24 14:08 hillarymarler

I have a working draft of a pairing function in the demo_impairment_functions branch (https://github.com/USEPA/EPATADA/blob/demo_impairment_functions/R/Module3.R). Relevant code begins at line 282 and includes two functions, TADA_CreatePairRef and TADA_PairForCriteriaCalc.

I have not yet included a way to set a default value for missing values as I haven't figured out how to incorporate that yet.

Aug 27 '24 12:08 hillarymarler

I moved this function to its own branch/PR (it is no longer in the demo_impairment_functions branch).

Aug 29 '24 20:08 hillarymarler