metacatui
support structured funding info from EML 2.2.0
Describe the feature you'd like
Structured funding information was added to EML 2.2.0: https://eml.ecoinformatics.org/whats-new-in-eml-2-2-0.html#structured-funding-information We should support editing that more detailed information.
- [x] Display structured funding info
- [ ] Edit structured funding info
Is your feature request related to a problem? Please describe.
It's difficult to machine-parse our current natural language funding information.
Additional context
- If funding information is present in a 2.1.1 or earlier document, we should make an attempt to forward convert that to the new EML 2.2.0 award element structure if possible using common patterns for award identifiers.
- Funder identifiers might be able to be looked up from the FundRef system by CrossRef, or maybe through ROR, or others.
We run into an issue now with editing EML 2.2.0 documents when we use the editor. The awards field will be empty if we are using the new structured funding info and you can't save until you add something into that field because the field is required. So we will need to do an extra step in R to remove the funding information added every time we use the editor.
@amoeba is this a side effect of simply serializing into EML 2.2.0 format? The new award field is optional and so shouldn't be present in any autoconverted document unless specific steps are taken to transform funding field info into the award subfields. Thoughts?
Sounds like we missed this in the switch to EML 2.2.0: I think we just need to change MetacatUI's logic for checking whether funding information has been set. Currently, dataset/project/funding is probably required but we should really check for dataset/project/funding and/or dataset/project/award, considering having at least one as "having funding".
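A minimal sketch of that relaxed check, assuming the parsed dataset/project element is available as a plain object with optional `funding` and `award` properties. The function name `hasFundingInfo` and the object shape are hypothetical, for illustration only; they are not MetacatUI's actual model API.

```javascript
// Hypothetical helper: treats a project as "having funding" when it has
// at least one non-empty funding string OR at least one non-empty award.
function hasFundingInfo(project) {
  if (!project) return false;

  const nonEmpty = (v) => typeof v === "string" && v.trim().length > 0;

  // Accept a single value or an array for either property.
  const funding = [].concat(project.funding || []);
  const awards = [].concat(project.award || []);

  const hasFunding = funding.some(nonEmpty);
  const hasAward = awards.some(
    (a) => Boolean(a) && (nonEmpty(a.awardNumber) || nonEmpty(a.funderName))
  );

  return hasFunding || hasAward;
}
```

With a check like this, a document that only carries structured award entries would pass validation without a dummy funding value.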
After looking, I ran into a question: Do we have a use case for deployments wanting to accept funding or award or is our plan to make the funding portion of the editor automatically migrate funding into an award element on parse and only allow editing of funding information using the award element?
The way we redefined funding, it becomes more of a natural language description and can coexist with the structured funding info in award. I could see having a paragraph in funding along with one or more award entries. That said, if all funding contains is something like National Science Foundation 198765, then I think it would be better to convert that into the award structure.
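One way that rule could look, as a sketch only: convert when the whole funding string is a funder name followed by an award number, and otherwise leave it alone as natural language. The function name `fundingToAward` and the single NSF pattern are assumptions for illustration; the funder identifier is the NSF one used elsewhere in this thread. An EML award also carries a title, which this conversion cannot supply on its own.

```javascript
// Hypothetical pattern: the ENTIRE string must be "NSF" or
// "National Science Foundation", optional separators or the word "award",
// then a 6- or 7-digit number. Anything longer is treated as prose.
const NSF_ONLY = /^(?:national science foundation|nsf)[\s:#-]*(?:award)?[\s:#-]*(\d{6,7})$/i;

function fundingToAward(fundingText) {
  const match = String(fundingText).trim().match(NSF_ONLY);
  if (!match) return null; // keep as natural-language funding

  return {
    funderName: "National Science Foundation",
    funderIdentifier: "https://doi.org/10.13039/100000001",
    awardNumber: match[1],
  };
}
```

For example, `fundingToAward("National Science Foundation 198765")` yields an award object, while a sentence that merely mentions NSF returns `null` and stays in funding.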
Gotcha. I think allowing editing of the old funding field will probably be useful to someone.
I mocked up two different approaches to allowing simultaneous entry of project/funding and project/award over on https://www.figma.com/file/MztNBDfq7LZ1ZvTLarnftC/Funding-VIew?node-id=0%3A1
One where the autocomplete is kind of a separate thing from the form entry:
And one where it's integrated into the award number form field:
I kinda like the second one a bit more as the separation concept is a bit weird and the second one matches better with our current UI.
I'm adding the "bug" label since this presents issues when we process datasets, as @laijasmine described above. It essentially means that every time we edit a dataset in the editor that we have added structured funding to, we have to go back in R and remove the dummy information we are forced to add.
Thanks @jeanetteclark for the bump. I think we can put an immediate fix in before we get in full structured funding support by relaxing the validation as I commented in https://github.com/NCEAS/metacatui/issues/1403#issuecomment-677980583. @laurenwalker do you see any issue with popping a quick fix in so @jeanetteclark and @laijasmine's issue above can get resolved quickly? I could get a fix in today I think.
Hey @amoeba @laurenwalker just wanted to check in here on the quick fix - this continues to plague our interns.
Workaround landed in c33fff7a0078c3f080fb2f036ffbe90a2961ed13 (issue #1533). When we add in full support for structured funding information, we'll want to (though we don't strictly need to) revisit the validation logic patch to simplify it.
@laurenwalker please make sure this is in your release roadmap for ADC for the near future -- it is one of the things that makes a lot of extra work for the datateam on every dataset.
Until a full award editor is added to the EML editor, we can make things more compatible with EML 2.2.0 by creating an EML award with an award number. The award number can be intelligently parsed from the user input or NSF Award API lookup selection.
From Jeanette:
things might get a little easier if we:
- serialize into awards
- parse input more intelligently (if it's an NSF award number we should just have the 6-digit award number, nothing else)
- make sure the NSF lookup is working and populate the award section with the relevant info if it is
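The first two points above can be sketched roughly as follows. All function names here (`escapeXml`, `serializeAward`, `nsfAward`) are hypothetical and not existing MetacatUI code. The child element order follows the award examples in this thread (funderName, funderIdentifier, awardNumber, title), with awardUrl last, which I believe matches the EML 2.2.0 schema but should be checked against it. Accepting 6 or 7 digits is also an assumption.

```javascript
// Escape the characters that are unsafe in XML text content.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Assumed child order for an EML 2.2.0 award element.
const AWARD_FIELDS = ["funderName", "funderIdentifier", "awardNumber", "title", "awardUrl"];

// Serialize an award object to an <award> element, skipping empty fields.
function serializeAward(award) {
  const children = AWARD_FIELDS
    .filter((name) => award[name] !== undefined && award[name] !== null && award[name] !== "")
    .map((name) => `  <${name}>${escapeXml(award[name])}</${name}>`);
  return ["<award>", ...children, "</award>"].join("\n");
}

// Pull a bare award number out of free-form user input such as "NSF 198765".
// Returns null when no standalone 6- or 7-digit run is found.
function nsfAward(userInput, title) {
  const match = String(userInput).match(/(?<!\d)\d{6,7}(?!\d)/);
  if (!match) return null;
  return {
    funderName: "National Science Foundation",
    funderIdentifier: "https://doi.org/10.13039/100000001",
    awardNumber: match[0],
    title: title,
    awardUrl: "https://www.nsf.gov/awardsearch/showAward?AWD_ID=" + match[0],
  };
}
```

Usage would be something like `serializeAward(nsfAward("NSF 198765", "Some award title"))`, with the title coming from the lookup selection.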
Here is what the data team does in R right now for each dataset to update the funding section. This could be replicated in MetacatUI when someone selects an NSF award from the lookup:
function(awards, eml_version = "2.2"){
  # Validate inputs: a character vector of award numbers and a supported EML version
  stopifnot(is.character(awards))
  stopifnot(eml_version %in% c("2.1", "2.1.1", "2.2", "2.2.0"))

  award_nums <- awards

  # Query the NSF award API once per award number
  result <- lapply(award_nums, function(x){
    url <- paste0("https://api.nsf.gov/services/v1/awards.json?id=", x, "&printFields=coPDPI,pdPIName,title")

    t <- tryCatch(jsonlite::fromJSON(url),
                  error = function(j) {
                    # Stop with a hint when the request itself fails
                    stop("The NSF API is most likely down. Check back later. ",
                         conditionMessage(j), call. = FALSE)
                  })

    if ("serviceNotification" %in% names(t$response)) {
      warning(paste(t$response$serviceNotification$notificationMessage, "\n",
                    t$response$serviceNotification$notificationType, "for award", x,
                    "\n this award will not be included in the project section."), call. = FALSE)
      t <- NULL
    } else if (length(t$response$award) == 0) {
      warning(paste("Empty result for award", x, "\n this award will not be included in the project section."), call. = FALSE)
      t <- NULL
    }
    t
  })

  # Drop awards that returned nothing
  keep <- !vapply(result, is.null, logical(1))
  result <- result[keep]
  award_nums <- award_nums[keep]

  if (length(award_nums) == 0){
    stop("No valid award numbers were found.", call. = FALSE)
  }

  # extract_name() is a helper that is not shown here
  co_pis <- lapply(result, function(x){
    extract_name(x$response$award$coPDPI)
  })
  co_pis <- unlist(co_pis, recursive = FALSE)
  co_pis <- do.call("rbind", co_pis)
  if (!is.null(co_pis)){
    co_pis$role <- "coPrincipalInvestigator"
  }

  pis <- lapply(result, function(x){
    extract_name(x$response$award$pdPIName)
  })
  pis <- unlist(pis, recursive = FALSE)
  pis <- dplyr::mutate(do.call("rbind", pis), role = "principalInvestigator")

  # Combine PIs and co-PIs, removing duplicates
  people <- dplyr::distinct(dplyr::bind_rows(co_pis, pis))

  p_list <- list()
  for (i in seq_len(nrow(people))){
    p_list[[i]] <- EML::eml$personnel(individualName = list(givenName = people$firstName[i],
                                                            surName = people$lastName[i]),
                                      role = people$role[i])
  }

  titles <- lapply(result, function(x){
    unlist(x$response$award$title)
  })

  if (eml_version %in% c("2.1", "2.1.1")){
    # EML 2.1.x: funding is free text such as "NSF 1234567"
    award_nums <- paste("NSF", award_nums)
    proj <- EML::eml$project(title = titles, personnel = p_list, funding = award_nums)
  } else {
    # EML 2.2.0: one structured award element per award number
    awards <- list()
    for (i in seq_along(award_nums)){
      awards[[i]] <- list(title = titles[i],
                          funderName = "National Science Foundation",
                          funderIdentifier = "https://doi.org/10.13039/100000001",
                          awardNumber = award_nums[i],
                          awardUrl = paste0("https://www.nsf.gov/awardsearch/showAward?AWD_ID=", award_nums[i]))
    }
    proj <- list(title = titles, personnel = p_list, award = awards)
  }

  return(proj)
}
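A rough idea of how the award part of that R function might translate to MetacatUI, as a sketch under stated assumptions: the names `nsfAwardLookupUrl` and `awardFromNsfResponse` are hypothetical, and the response shape (`response.award`, `response.serviceNotification`) is inferred only from the R code above, not checked against the NSF API documentation. Personnel handling is left out.

```javascript
// Build the same lookup URL the R code uses.
function nsfAwardLookupUrl(awardId) {
  return "https://api.nsf.gov/services/v1/awards.json?id=" +
    encodeURIComponent(awardId) +
    "&printFields=coPDPI,pdPIName,title";
}

// Map an already-parsed lookup response to an EML 2.2.0 style award object.
// Returns null for service notifications and empty results, mirroring the
// cases the R code skips with a warning.
function awardFromNsfResponse(json, awardId) {
  const response = (json && json.response) || {};
  if (response.serviceNotification) return null;

  const hits = response.award || [];
  if (hits.length === 0) return null;

  return {
    title: hits[0].title,
    funderName: "National Science Foundation",
    funderIdentifier: "https://doi.org/10.13039/100000001",
    awardNumber: awardId,
    awardUrl: "https://www.nsf.gov/awardsearch/showAward?AWD_ID=" + awardId,
  };
}
```

In the editor this could run when someone selects an award from the lookup, with the parsed JSON from a request to `nsfAwardLookupUrl(id)` passed in as `json`.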
We are using the structured award on WFSI (wfsi-data.org). We added a field on the Metadata Overview page that pulls from a controlled list of "Funding Awards". In some use cases there are multiple awards to apply to the dataset. We have to manually add this metadata instead of using our UI enhancement. In this case, we opted to have one project with multiple awards because Metacat does not index subprojects (the project/relatedProject element) in Solr.
Single Award
<project>
  <title>Funding Award RC19-1064: 3D fuel characterization for evaluating physics-based fire behavior, fire effects and smoke models on US U. S. Department of Defense (DoD) military lands.</title>
  <personnel id="1020212546916141">
    <individualName><givenName>Roger</givenName><surName>Ottmar</surName></individualName>
    <organizationName>USDA FS PNW Research Station</organizationName>
    <electronicMailAddress>[email protected]</electronicMailAddress>
    <userId directory="https://orcid.org">https://orcid.org/0000-0002-4385-4052</userId>
    <role>Principal Investigator</role>
  </personnel>
  <award>
    <funderName>U. S. Department of Defense (DoD), Strategic Environmental Research and Development Program (SERDP)</funderName>
    <funderIdentifier>http://dx.doi.org/10.13039/100013316</funderIdentifier>
    <awardNumber>RC19-1064</awardNumber>
    <title>3D fuel characterization for evaluating physics-based fire behavior, fire effects and smoke models on US U. S. Department of Defense (DoD) military lands.</title>
  </award>
</project>
Multiple Awards
This is not optimal as we added something in the abstract that says which award the PIs are connected to.
project/relatedProject may be useful but the top level project would have to be something that connects the two awards.
It is really unclear what a project is vs award.
<project>
  <title>Funding Awards RC20-1346 and RC19-1119 for DoD Wildland Fire Science Initiative (WFSI)</title>
  <personnel id="1396442241546151">
    <individualName>
      <givenName>Andrew</givenName>
      <surName>Hudak</surName>
    </individualName>
    <organizationName>USDA FS RMRS</organizationName>
    <electronicMailAddress>[email protected]</electronicMailAddress>
    <userId directory="https://orcid.org">https://orcid.org/0000-0001-7480-1458</userId>
    <role>Principal Investigator</role>
  </personnel>
  <personnel id="1396442241546150">
    <individualName>
      <givenName>Chad</givenName>
      <surName>Hoffman</surName>
    </individualName>
    <organizationName>Colorado State University</organizationName>
    <electronicMailAddress>[email protected]</electronicMailAddress>
    <userId directory="https://orcid.org">https://orcid.org/0000-0001-8715-937X</userId>
    <role>Principal Investigator</role>
  </personnel>
  <abstract>
    <section><title>RC20-1346 PI</title><para>Andrew Hudak</para></section>
    <section><title>RC19-1119 PI</title><para>Chad Hoffman</para></section>
  </abstract>
  <award>
    <funderName>U. S. Department of Defense (DoD), Strategic Environmental Research and Development Program (SERDP)</funderName>
    <funderIdentifier>http://dx.doi.org/10.13039/100013316</funderIdentifier>
    <awardNumber>RC20-1346</awardNumber>
    <title>Object-based aggregation of fuel structures, physics-based fire behavior and self-organizing smoke plumes for improved fuel, fire, and smoke management on military lands.</title>
  </award>
  <award>
    <funderName>U. S. Department of Defense (DoD), Strategic Environmental Research and Development Program (SERDP)</funderName>
    <funderIdentifier>http://dx.doi.org/10.13039/100013316</funderIdentifier>
    <awardNumber>RC19-1119</awardNumber>
    <title>Characterizing multiscale feedback between forest structure, fire behavior and effects: integrating measurements and mechanistic modeling for improved understanding of patterns and processes.</title>
  </award>
</project>