ckanr
package_update deletes resource, some unspecified tags
Running package_update() deletes the resource associated with the package on my CKAN instance, along with some (but not all) of the package's metadata tags that I do not specify.
upd <- list(
owner_org = x$owner_org,
name = x$name,
title = x$title,
country = x$country,
document_type = "A new thing",
notes = "This is a new note added from code again"
)
r <- package_update(
upd,
x$id
)
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.1.0 jsonlite_1.5 httr_1.3.1 ckanr_0.1.2.9924 DBI_1.0.0 DT_0.4 shinydashboard_0.7.1 shinyjs_1.0
[9] googlesheets_0.3.0 shiny_1.2.0 bindrcpp_0.2.2 dplyr_0.7.8 nrgiR_0.1.0 data.table_1.10.4-3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 dbplyr_1.3.0 pillar_1.2.2 compiler_3.5.0 cellranger_1.1.0 later_0.7.5 bindr_0.1.1 tools_3.5.0 digest_0.6.15
[10] tibble_1.4.2 pkgconfig_2.0.1 rlang_0.3.0.1 cli_1.0.0 rstudioapi_0.8 crosstalk_1.0.0 curl_3.2 yaml_2.1.18 stringr_1.3.0
[19] xml2_1.2.0 htmlwidgets_1.2 tidyselect_0.2.4 glue_1.2.0 R6_2.2.2 purrr_0.2.5 magrittr_1.5 promises_1.0.1 htmltools_0.3.6
[28] assertthat_0.2.0 mime_0.5 xtable_1.8-2 httpuv_1.4.5.1 utf8_1.1.3 stringi_1.1.7 openssl_1.0.1 crayon_1.3.4
Hi @moman822, the (best) way to update a package is to first get the package data, then update fields as required, then update the package with the updated original data.
Hope this helps!
@florianm can you show (or point to) a brief example of what you mean? Do you mean somehow updating the package data locally first? Thanks.
@moman822 here's what I'd do:
# Step 1: get the dataset details as R list
ds_id <- "my-dataset-id-md5-hash"
ds <- ckanr::package_show(ds_id)
# Step 2: update selected fields
ds$title <- "An updated title"
ds$description <- "I have updated title and description, while leaving all other fields unchanged."
# ds contains all other package data, including tags and resources
# Step 3: Update the dataset on CKAN with the locally modified list
result <- ckanr::package_update(ds, ds_id)
Does this work for you?
closing due to inactivity
Hi, I had this same issue, and I think that is actually a bug that must be fixed. As you can read in the documentation pdf file, page 42, the approach is to include in the list just the fields you want to update. Please see this text from the documentation (https://cran.r-project.org/web/packages/ckanr/ckanr.pdf):
## update just chosen things
# Make some changes
x <- list(maintainer_email = "[email protected]")
# Then update the package
package_update(x, '585d7ea2-ded0-4fed-9b08-61f7e83a3cb2')
@fernandobarbalho What is a bug? you said "that", but what is "that"?
Sorry, I wasn't clear. The bug is that the updated package loses the information in fields that are not explicitly included in the list. As I highlighted above, according to the documentation the user should be able to list in the x argument just the fields they want to update.
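One way to get "update just the chosen fields" behaviour is to merge the changes into the full metadata locally before sending it back. The following is a minimal sketch, assuming the metadata can be treated as a plain named list. `merge_fields` is a hypothetical helper, not part of ckanr, and `current` is mock data standing in for a real package.

```r
# Hypothetical helper (not part of ckanr): overlay a partial set of fields
# onto the full metadata so that unspecified fields are kept.
merge_fields <- function(current, changes) {
  stopifnot(is.list(current), is.list(changes))
  # modifyList() replaces matching names and keeps everything else
  utils::modifyList(current, changes)
}

# Mock data standing in for the result of ckanr::package_show()
current <- list(
  name = "my-dataset",
  title = "Old title",
  tags = list(list(name = "budget")),
  resources = list(list(name = "data.csv"))
)

updated <- merge_fields(current, list(title = "New title"))

# The full list would then be sent back, for example:
# ckanr::package_update(updated, id)
```

Only `title` changes here; `tags` and `resources` are carried over untouched, which is the part that gets lost when only the changed fields are sent.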
If I may offer my opinion, the "delete what's not included" behaviour feels logical. Updating the package with a dict implies that the dict contains the entire package metadata. The dict returned by package_show contains (for free and in one command) all package metadata and is easier to manipulate than starting from scratch with an empty dict and having to get all the structure correct (I'm not good at this, especially when under-caffeinated, so package_show saved my bacon more than once).
As an approach to prevent unintended data loss, maybe the package_update help could expand more on this topic? E.g. something along the lines of:
package_update replaces all package metadata with the entire dict, and only the dict. Any data present in the package but not in the dict will be dropped.
To add package metadata and retain the existing package metadata:
- get the current package metadata with package_show
- modify and/or add to the package metadata
- update the package with the package metadata
To replace package metadata with new data and drop any other existing package metadata:
- create a new package metadata dict, or get the current metadata and remove items from it
- update the package metadata
Examples:
## Update existing package metadata
# Step 1: get the dataset details as R list
ds_id <- "my-dataset-id-md5-hash"
ds <- ckanr::package_show(ds_id)
# Step 2: update selected fields
ds$title <- "An updated title"
ds$description <- "I have updated title and description, while leaving all other fields unchanged."
# ds contains all other package data, including tags and resources
# Step 3: Update the dataset on CKAN with the locally modified list
result <- ckanr::package_update(ds, ds_id)
## Replace existing package metadata
# Step 4: illustrate possible or intended data loss - what's not in the dict will be deleted in the package:
ds$description <- NULL
result_with_deleted_description <- ckanr::package_update(ds, ds_id)
@fernandobarbalho @moman822 if the help for package_update were to include the above clarification and example, would you feel informed enough?
Yes, it would be perfect.
Thanks for your answer.
@sckott should I PR an update to the docs for package_update?
@florianm, I am sorry, but the code you suggested did not work for me. The x argument of package_update() must be of class list, as the source code shows:
if (class(x) != "list") {
stop("x must be of class list", call. = FALSE)
}
When I execute the script below...
ds_ckan<-ckanr::package_show("343739de-7560-47af-8a0e-d51b363ec273",url="https://apickan.tesouro.gov.br/ckan", as= "list")
ds_ckan$license_title <- "Licença Aberta para Bases de Dados (ODbL) do Open Data Commons"
ds_ckan$license_id<- "odc-odbl"
ds_ckan$license_url <- "http://www.opendefinition.org/licenses/odc-by"
class(ds_ckan)
... the last executed line gives me the following result
> class(ds_ckan)
[1] "ckan_package"
which means that the condition required by package_update is not met, as can be seen when I execute the following script:
ckanr::package_update( x= ds_ckan,
id= "343739de-7560-47af-8a0e-d51b363ec273",
url ="https://apickan.tesouro.gov.br/ckan",
key = "my_key")
Error: x must be of class list
> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Matrix products: default
locale:
[1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ckanr_0.1.2.9924 DBI_1.0.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.0 pillar_1.3.1 compiler_3.4.3 dbplyr_1.3.0 bindr_0.1.1
[6] prettyunits_1.0.2 remotes_2.0.2 tools_3.4.3 digest_0.6.18 pkgbuild_1.0.2
[11] pkgload_1.0.2 tibble_2.0.1 jsonlite_1.6 memoise_1.1.0 pkgconfig_2.0.2
[16] rlang_0.3.1 cli_1.0.1 rstudioapi_0.9.0 curl_3.3 yaml_2.2.0
[21] bindrcpp_0.2.2 withr_2.1.2 httr_1.4.0 dplyr_0.7.8 desc_1.2.0
[26] fs_1.2.6 devtools_2.0.1 tidyselect_0.2.5 rprojroot_1.3-2 glue_1.3.0
[31] R6_2.3.0 processx_3.2.1 sessioninfo_1.1.1 purrr_0.3.0 callr_3.1.1
[36] magrittr_1.5 backports_1.1.3 ps_1.3.0 usethis_1.4.0 assertthat_0.2.0
[41] crayon_1.3.4
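The error follows from the strict comparison `class(x) != "list"`: an object whose class attribute is "ckan_package" fails it even if it is a list underneath. Below is a minimal sketch of one possible workaround, assuming the object is an ordinary list carrying an S3 class attribute. `pkg` is mock data, not a real ckanr result.

```r
# Mock object: a list tagged with the S3 class seen in the output above
pkg <- structure(
  list(id = "abc", title = "Example"),
  class = "ckan_package"
)

class(pkg) != "list"   # the check quoted above would stop here
is.list(pkg)           # the underlying type is still a list

# Dropping the class attribute leaves a plain list
plain <- unclass(pkg)
class(plain)

# plain could then be passed on, for example:
# ckanr::package_update(plain, id)
```

Whether this round-trips cleanly for every field of a real package has not been checked here; nested elements may carry their own classes.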
@florianm, good news: calling package_show with as = "table" works nicely, so I suggest putting this in the documentation:
ds_ckan<-ckanr::package_show(pack_id, url, as= "table")
thanks heaps @fernandobarbalho!
should I PR an update to the docs for package_update
@florianm yes, please do
Still in my "later" basket, sorry haven't had bandwidth yet.
no worries
This has come up again in the context of updating resources, where everything that isn't specified is deleted. In that function, the documentation explicitly states "Other metadata, such as name or description, are not updated." That is definitely not the case!
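Since omitted fields are dropped, it can help to compare field names before sending an update. This is a sketch only: `dropped_fields` is a hypothetical helper, not part of ckanr, the resource metadata is mock data, and only top-level names are compared.

```r
# Hypothetical helper (not part of ckanr): top-level fields present in the
# current metadata but missing from the list about to be sent.
dropped_fields <- function(current, proposed) {
  setdiff(names(current), names(proposed))
}

# Mock resource metadata standing in for what the server currently holds
current <- list(
  id = "r1",
  name = "data.csv",
  description = "Raw data",
  format = "CSV"
)
proposed <- list(id = "r1", format = "XLSX")

lost <- dropped_fields(current, proposed)
if (length(lost) > 0) {
  message("Fields that would be dropped: ", paste(lost, collapse = ", "))
}
```

Running the check before the update call gives a chance to copy the missing fields across first.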
@florianm, do you expect you'd have bandwidth for updating the docs on this? No worries if not, I can take a go on it. Re-opening this issue to keep my eyes on it 👀
Thanks for the reminder, this has been in my "later" basket for too long!
Edit: PR sent, apologies haven't been able to run tests locally (no test server creds).
@sharlagelfand only if that's appropriate - could you DM me test server credentials on the rOpenSci slack? Could you test-drive the examples of package_update and resource_update? Happy to amend PR until correct and satisfactory.