Create function to find paired data (T, pH, hardness dependent criteria)
Is your feature request related to a problem? Please describe.
Water quality assessment criteria that depend on temperature (T), pH, hardness, etc. (e.g. criteria for metals) require either a site-specific default value or, preferably, a paired T, pH, hardness, etc. sample collected on the same day and at the same time and location as the pollutant sample (e.g. metals).
Describe the solution you'd like
Use logic similar to the functions that identify duplicate results between/within organizations and to the TADA_PairReplicates function. Create a new function, TADA_FindPairedData.
- Group data by org, site, date, characteristics of interest (function would require a site input & characteristics of interest)
- Define the main characteristic (a pollutant such as a metal; only one) and the paired characteristics of interest (pH, T, DO, hardness, etc.; select multiple)
- Look within a user defined time window (default: 10 minutes). See example logic in TADA_PairReplicates
- Look within site (default) or include nearby sites (user defined area/radius). Leverage TADA_FindNearbySites
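The steps above could be sketched roughly as follows in base R. This is only a minimal illustration under stated assumptions: the column names (`id`, `org`, `site`, `datetime`, `char`, `value`) and the helper `find_paired_data()` are hypothetical stand-ins, not the TADA schema or an existing TADA function, and the nearby-sites option is left out.

```r
# Toy data; column names are illustrative stand-ins, not the TADA schema.
results <- data.frame(
  id = 1:5,
  org = "ORG1",
  site = c("S1", "S1", "S1", "S1", "S2"),
  datetime = as.POSIXct(
    c("2024-05-01 10:00:00", "2024-05-01 10:04:00", "2024-05-01 10:30:00",
      "2024-05-01 10:07:00", "2024-05-01 10:01:00"),
    tz = "UTC"
  ),
  char = c("ZINC", "PH", "PH", "HARDNESS", "PH"),
  value = c(12.5, 7.2, 7.9, 110, 6.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pair one main characteristic with several others
# measured at the same org/site within `window_mins` minutes.
find_paired_data <- function(df, main_char, paired_chars, window_mins = 10) {
  main <- df[df$char == main_char, , drop = FALSE]
  candidates <- df[df$char %in% paired_chars, , drop = FALSE]

  # Every main result joined to every candidate at the same org and site.
  pairs <- merge(main, candidates, by = c("org", "site"),
                 suffixes = c(".main", ".pair"))

  # Absolute time difference in minutes, then apply the window.
  pairs$diff_mins <- abs(as.numeric(
    difftime(pairs$datetime.pair, pairs$datetime.main, units = "mins")
  ))
  pairs <- pairs[pairs$diff_mins <= window_mins, , drop = FALSE]

  # Keep the closest candidate per main result and paired characteristic.
  pairs <- pairs[order(pairs$id.main, pairs$char.pair, pairs$diff_mins), ,
                 drop = FALSE]
  pairs <- pairs[!duplicated(pairs[, c("id.main", "char.pair")]), ,
                 drop = FALSE]
  rownames(pairs) <- NULL
  pairs
}

paired <- find_paired_data(results, "ZINC", c("PH", "HARDNESS"))
print(paired[, c("id.main", "char.pair", "value.pair", "diff_mins")])
```

In the toy data, the pH result taken 30 minutes later and the pH result from site S2 are excluded, so the zinc result is left with one pH pair and one hardness pair.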
Reminders for TADA contributors addressing this issue
New features should include all of the following work:
- [ ] Create the function/code.
- [ ] Document all code using comments to describe what it does.
- [ ] Create tests in tests folder.
- [ ] Create help file using roxygen2 above code.
- [ ] Create working examples in help file (via roxygen2).
- [ ] Add to appropriate vignette (or create new one).
I have done similar work on Illinois data in the past (finding paired T, pH, hardness, etc. samples from the same day, time and location), but only within the same MonitoringLocation/date. I like the idea of leveraging TADA_FindNearbySites().
Once we've created TADA.MonitoringLocationIdentifier (which would then be modified after site review if the decision is made to treat two Monitoring Locations as one?), maybe we will not need to leverage TADA_FindNearbySites() in this function?
There needs to be an option to substitute a default value for hardness, pH, temperature.
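One possible shape for that option, sketched in base R: after pairing, any missing paired value is filled from a user-supplied named vector, and a companion column records where each value came from. The helper `apply_defaults()`, the column names, and the default values shown are all hypothetical placeholders (the numbers are not regulatory defaults).

```r
# Toy wide table of pollutant results with their paired values (NA = no pair found).
paired_wide <- data.frame(
  result_id = 1:3,
  zinc_ugl = c(12.5, 30.1, 8.4),
  hardness = c(110, NA, NA),
  ph = c(7.2, 7.9, NA)
)

# User-supplied defaults; placeholder numbers only.
defaults <- c(hardness = 100, ph = 7.0)

# Hypothetical helper: fill missing paired values and flag their source.
apply_defaults <- function(df, defaults) {
  for (nm in names(defaults)) {
    missing <- is.na(df[[nm]])
    df[[paste0(nm, "_source")]] <- ifelse(missing, "default", "paired")
    df[[nm]][missing] <- defaults[[nm]]
  }
  df
}

filled <- apply_defaults(paired_wide, defaults)
print(filled)
```

Keeping a `_source` flag alongside each value would let reviewers see which criteria were calculated from defaults rather than from measured pairs.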
I've been working on this a bit. In the case of hardness, there are multiple characteristic names that correspond to hardness. In the current draft, users can rank the characteristic names so that, if more than one is present as a possible pair for a result, the highest ranked one is selected.
I am not sure what the default order of the ranking should be if the user fails to provide a ranking ref.
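To make the ranking idea concrete, here is a small base R sketch. It is not the code in the draft; `select_ranked_pair()`, the column names, and the characteristic names and their order are illustrative assumptions, and the sketch deliberately requires the user to supply a ranking rather than guessing a default order.

```r
# Toy candidate pairs: several hardness-type characteristics per pollutant result.
candidates <- data.frame(
  result_id = c(1, 1, 2, 3, 3),
  char = c("TOTAL HARDNESS", "HARDNESS, CA, MG", "TOTAL HARDNESS",
           "HARDNESS, CARBONATE", "HARDNESS, CA, MG"),
  value = c(105, 110, 98, 90, 120),
  stringsAsFactors = FALSE
)

# User-supplied ranking, most preferred first (illustrative order only).
rank_ref <- c("HARDNESS, CA, MG", "TOTAL HARDNESS", "HARDNESS, CARBONATE")

# Hypothetical helper: keep the highest ranked candidate for each result.
select_ranked_pair <- function(df, rank_ref) {
  df$rank <- match(df$char, rank_ref)
  df <- df[!is.na(df$rank), , drop = FALSE]   # drop unranked characteristics
  df <- df[order(df$result_id, df$rank), , drop = FALSE]
  df <- df[!duplicated(df$result_id), , drop = FALSE]
  rownames(df) <- NULL
  df
}

selected <- select_ranked_pair(candidates, rank_ref)
print(selected)
```

Because `match()` returns the position of each name in `rank_ref`, a lower number means a higher rank, and sorting before removing duplicates leaves one best candidate per result.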
I have a working draft of a pairing function in the demo_impairment_functions branch (https://github.com/USEPA/EPATADA/blob/demo_impairment_functions/R/Module3.R). Relevant code begins at line 282 and includes two functions, TADA_CreatePairRef and TADA_PairForCriteriaCalc.
I have not yet included a way to set a default value for missing values, as I haven't figured out how to incorporate that.
I moved this function to its own branch/BR (it is no longer in the demo_impairment_functions branch).