
Creating dataset using create_dataset() yields 500 error

Open thomascli19 opened this issue 5 years ago • 19 comments

Hi There,

I was trying out some of the functions in the dataverse package and found that create_dataset() returns a 500 error every time I run it. Unfortunately, I can't share my server and API key, but I can show what I attempted:

## load package
library("dataverse")

## code
meta2<-list(title="test",author="Li,Thomas",datasetContact="Fish, Fishy",dsDescription="FISH",subject="Quantitative Sciences",depositor="Fish, Fishy",dateOfDeposit="Fish Time",datasetContactEmail="[email protected]")
create_dataset("MLPOCs2018",body=meta2) 

When running the debugger, I noticed the 500 error occurs when the POST() function is used from the 'httr' package (which makes sense), so I am trying to see the cause of this (be it permission or something else).
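
One way to see what the server actually returns is to send the request with httr directly and read the response body. The following is a minimal sketch, not the package's own implementation. It assumes the native API route /api/dataverses/{alias}/datasets and the X-Dataverse-key header; the helper names dataset_create_url() and post_dataset_json() and the DATAVERSE_KEY environment variable are hypothetical choices made for illustration.

```r
# Requires the httr package to be installed (called via httr:: below).

# Hypothetical helper: build the native API URL for creating a dataset
# inside a given dataverse (assumed route: /api/dataverses/{alias}/datasets).
dataset_create_url <- function(server, dataverse) {
  server <- sub("^https?://", "", server)  # accept "host" or "https://host"
  server <- sub("/+$", "", server)         # drop any trailing slash
  sprintf("https://%s/api/dataverses/%s/datasets", server, dataverse)
}

# Hypothetical helper: POST a JSON string and return the HTTP status and the
# raw response text, so the server's error message is visible.
post_dataset_json <- function(server, dataverse, json,
                              key = Sys.getenv("DATAVERSE_KEY")) {
  resp <- httr::POST(
    dataset_create_url(server, dataverse),
    httr::add_headers("X-Dataverse-key" = key),
    httr::content_type_json(),
    body = json  # a length-one character vector is sent as is
  )
  list(
    status = httr::status_code(resp),
    body   = httr::content(resp, as = "text", encoding = "UTF-8")
  )
}
```

Calling post_dataset_json("demo.dataverse.org", "MLPOCs2018", json) with a JSON string should return the status code together with whatever message the server sends back, which may say more than the bare 500.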

Just in case, the sessionInfo() yields the following:


R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] zip_2.0.2         flowCore_1.50.0   magick_2.0        ggplot2_3.2.0     usethis_1.5.0     devtools_2.0.2   
 [7] magrittr_1.5      data.table_1.12.2 dplyr_0.8.1       dataverse_0.2.0  

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5    remotes_2.1.0       purrr_0.3.2         lattice_0.20-38     pcaPP_1.9-73       
 [6] colorspace_1.4-1    testthat_2.1.1      stats4_3.6.0        yaml_2.2.0          rlang_0.4.0        
[11] pkgbuild_1.0.3      pillar_1.4.1        glue_1.3.1          withr_2.1.2         BiocGenerics_0.30.0
[16] sessioninfo_1.1.1   matrixStats_0.54.0  robustbase_0.93-5   munsell_0.5.0       gtable_0.3.0       
[21] mvtnorm_1.0-11      memoise_1.1.0       Biobase_2.44.0      callr_3.2.0         ps_1.3.0           
[26] curl_3.3            parallel_3.6.0      DEoptimR_1.0-8      Rcpp_1.0.1          corpcor_1.6.9      
[31] backports_1.1.4     scales_1.0.0        desc_1.2.0          pkgload_1.0.2       jsonlite_1.6       
[36] graph_1.62.0        fs_1.3.1            digest_0.6.19       processx_3.3.1      grid_3.6.0         
[41] rprojroot_1.3-2     cli_1.1.0           lazyeval_0.2.2      tibble_2.1.3        cluster_2.1.0      
[46] crayon_1.3.4        rrcov_1.4-7         pkgconfig_2.0.2     MASS_7.3-51.4       xml2_1.2.0         
[51] prettyunits_1.0.2   assertthat_0.2.1    httr_1.4.0          rstudioapi_0.10     R6_2.4.0           
[56] compiler_3.6.0  

If needed, I can also provide the log from the server that shows the "API internal error"

Thanks!

thomascli19 avatar Jul 01 '19 15:07 thomascli19

If needed, I can also provide the log from the server that shows the "API internal error"

The server.log file would be quite valuable to us. Can you please either attach it here (you'll have to add ".txt") or email it to [email protected]?

pdurbin avatar Jul 01 '19 15:07 pdurbin

Yeah sure thing! I have attached the .txt file with this message. dataverse_API_internal error.txt

thomascli19 avatar Jul 01 '19 17:07 thomascli19

@thomascli19 thanks, are you running Dataverse 4.15? If so, here's where the NullPointerException is being thrown:

https://github.com/IQSS/dataverse/blob/v4.15/src/main/java/edu/harvard/iq/dataverse/util/json/JsonParser.java#L273

pdurbin avatar Jul 01 '19 17:07 pdurbin

I'm actually running version 4.9.4. Would this NullPointerException that is being thrown also appear in this version?

thomascli19 avatar Jul 01 '19 17:07 thomascli19

Line 273 looks the same:

https://github.com/IQSS/dataverse/blob/v4.9.4/src/main/java/edu/harvard/iq/dataverse/util/json/JsonParser.java#L273

Can you please try it on https://demo.dataverse.org to see if you get a 500 error there too?

pdurbin avatar Jul 01 '19 18:07 pdurbin

Just tried and it yields the same error, even when I input the "body" argument as an empty list.

thomascli19 avatar Jul 01 '19 19:07 thomascli19

@thomascli19 that's not so great if "create dataset" doesn't work.

I assume you're aware that you could also use curl or the Python or Java libraries listed at http://guides.dataverse.org/en/4.15/api/client-libraries.html to create datasets. (I'm using pyDataverse myself.)

In #21 a new maintainer is being sought. Let me at least mention @monogan in case he has time to take a look.

pdurbin avatar Jul 01 '19 19:07 pdurbin

Thanks for pinging me @pdurbin. This is an important package that definitely needs to be maintained. I'll admit that I would love to give back here, but I'm struggling to find the time for now. I think I can get there in 6-12 months, but I suspect that's not a great timeframe for you or @thomascli19.

Let me also mention @maxheld83. Max, would you have any interest in taking the baton as the maintainer? Did I ever forward you the information that Thomas Leeper gave me on this project? If having a second fiddle was pivotal to your taking the lead, just let me know.

In short, I'm maintaining interest, and if need be I think I can get up to speed and take the baton. If someone else like Max would like to take the helm, I would gladly stand aside or play backup, as is preferred.

ghost avatar Jul 02 '19 04:07 ghost

Thanks @pdurbin and @monogan, I'm not particularly concerned about the time frame at this point since I have a workaround to create datasets (which is simply going on the dataverse site and manually making one). I wish I could help out, but my experience with this sort of stuff is extremely limited. I really appreciate it though!

thomascli19 avatar Jul 02 '19 12:07 thomascli19

Glad you've got a workaround @thomascli19. If anyone on this thread (or anyone who reads this or anyone a reader knows) wants to pick up the baton, I support it. Otherwise, let's keep touching base in the future. I hope to find the time to do this soon.

ghost avatar Jul 02 '19 21:07 ghost

For what it's worth, I put a call out to the Dataverse community: https://groups.google.com/d/msg/dataverse-community/j-cGlnpKs14/S3PW3f1MBQAJ

pdurbin avatar Jul 02 '19 21:07 pdurbin

Great move @pdurbin! Thank you for doing that. I'll reiterate, if anyone tells you, "I'll do it if I can find a good second fiddle," feel free to put them in touch with me. It'd be easier for me to swing that than playing point.

ghost avatar Jul 02 '19 21:07 ghost

Any news on this issue? I'm coming across the same problem with create_dataset(). I've tried different arguments including:

  1. https://cran.r-project.org/web/packages/dataverse/vignettes/A-introduction.html
# create a list of metadata
metadat <- list(title = "My Study",
                creator = "Doe, John",
                description = "An example study")

# create the dataset
dat <- initiate_dataset("mydataverse", body = metadat)
  2. https://cran.r-project.org/web/packages/dataverse/vignettes/D-archiving.html
# create the dataset
ds <- create_dataset("mydataverse")

Note: I also tried the SWORD-based workflow described on this same webpage to no avail. initiate_sword_dataset() runs but add_file() throws an error.

  3. body parameters to match the online HTML tags for the required fields, similar to @thomascli19's example
# create a list of metadata
metadat <- list(title = "My Study",
                author = "Doe, John",
                datasetContact = "Doe, John",
                dsDescription = "An example study",
                subject = "Other")

# create the dataset
dat <- initiate_dataset("mydataverse", body = metadat)

maia-sh avatar Oct 18 '20 19:10 maia-sh

I got stuck here too. It would have saved me time if this was mentioned in the README. Will try pyDataverse for the time being.

sindribaldur avatar Oct 20 '20 16:10 sindribaldur

I'm sorry this isn't working. I'll have time this weekend to look.

wibeasley avatar Oct 20 '20 18:10 wibeasley

This workaround (https://github.com/IQSS/dataverse-client-r/issues/82#issuecomment-788738907), which uses initiate_sword_dataset() instead of create_dataset() and add_dataset_file() instead of add_file(), appears to work well.

kuriwaki avatar Mar 02 '21 14:03 kuriwaki

I previously got the same errors, but I have httr code working in R now. If others can help me with my question in #82(comment), we may be able to resolve this issue as well, since the dataverse package uses httr in a similar fashion.

Danny-dK avatar Mar 03 '21 08:03 Danny-dK

With the current dev version, the error is narrowed down a bit further.

library("dataverse")
packageVersion("dataverse") # dev
#> [1] '0.3.11'
metadat <- list(
  title = "test upload",
  creator = "Shiro Kuriwaki",
  datasetContact = "Shiro Kuriwaki",
  description = "Test Create",
  Subject = "Other"
)

create_dataset(
  dataverse = "kuriwaki",
  body = jsonlite::toJSON(metadat, auto_unbox = TRUE),
  server = "demo.dataverse.org",
  key = rstudioapi::askForPassword()
)

Created on 2022-04-09 by the reprex package (v2.0.1)

This gives

Error in create_dataset(dataverse = "kuriwaki", body = jsonlite::toJSON(metadat, : 
Forbidden (HTTP 403). Failed to Validation Failed: Description Text is required. (Invalid 
value:edu.harvard.iq.dataverse.DatasetField[ id=null ]), Subject is required. (Invalid 
value:edu.harvard.iq.dataverse.DatasetField[ id=null ]), Title is required. (Invalid 
value:edu.harvard.iq.dataverse.DatasetField[ id=null ]), Contact E-mail is required. (Invalid 
value:edu.harvard.iq.dataverse.DatasetField[ id=null ]), Author Name is required. (Invalid 
value:edu.harvard.iq.dataverse.DatasetField[ id=null ])..

So perhaps getting the metadata slots right could do it.

kuriwaki avatar Apr 09 '22 17:04 kuriwaki

Yes. You have three options. You can create a JSON file with the required metadata fields (https://github.com/IQSS/dataverse-client-r/issues/82#issuecomment-790443425); the correct JSON format is available at https://guides.dataverse.org/en/5.3/_downloads/dataset-create-new-all-default-fields.json . You can recreate the JSON with some very ugly paste code, which doesn't require the JSON file (https://github.com/IQSS/dataverse-client-r/issues/82#issuecomment-790674787). Or you can do it with an ugly for loop that requires some JSON files (https://github.com/IQSS/dataverse-client-r/issues/82#issuecomment-796942036).

Even pyDataverse requires either a JSON file or a CSV template (https://github.com/IQSS/dataverse-client-r/issues/82#issuecomment-794118699). There does not seem to be a helper function that builds the required metadata the way initiate_sword_dataset() does (https://github.com/IQSS/dataverse-client-r/issues/82#issuecomment-792241533).
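
As an illustration of what such a helper might look like, here is a minimal sketch that builds the nested JSON body from a few arguments. It is an assumption-laden example, not part of the dataverse package: the function names primitive(), compound() and build_dataset_body() are hypothetical, the example e-mail address is made up, and the field names (title, authorName, datasetContactEmail, dsDescriptionValue, subject) are taken to correspond to the five required fields in the validation error. Check them against the default-fields JSON linked above before relying on them.

```r
# Hypothetical helper: a single-valued primitive field.
primitive <- function(type_name, value) {
  list(typeName = type_name, typeClass = "primitive",
       multiple = FALSE, value = value)
}

# Hypothetical helper: a compound field holding one set of child fields.
compound <- function(type_name, ...) {
  list(typeName = type_name, typeClass = "compound",
       multiple = TRUE, value = list(list(...)))
}

# Hypothetical helper: assemble the body for the five required citation fields.
build_dataset_body <- function(title, author, contact_email,
                               description, subject = "Other") {
  fields <- list(
    primitive("title", title),
    compound("author", authorName = primitive("authorName", author)),
    compound("datasetContact",
             datasetContactEmail = primitive("datasetContactEmail", contact_email)),
    compound("dsDescription",
             dsDescriptionValue = primitive("dsDescriptionValue", description)),
    # subject is a controlled vocabulary; as.list() keeps it a JSON array
    list(typeName = "subject", typeClass = "controlledVocabulary",
         multiple = TRUE, value = as.list(subject))
  )
  list(datasetVersion = list(metadataBlocks = list(
    citation = list(fields = fields, displayName = "Citation Metadata")
  )))
}

body <- build_dataset_body(
  title         = "My Study",
  author        = "Doe, John",
  contact_email = "jdoe@example.org",  # made-up address for illustration
  description   = "An example study"
)

# Serialize only if jsonlite is installed; auto_unbox keeps scalars as scalars.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  json <- jsonlite::toJSON(body, auto_unbox = TRUE, pretty = TRUE)
}
```

The resulting json string could then be passed as the body argument of create_dataset(), as in the reprex above, though whether the server accepts it will depend on the installation's required fields.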

Danny-dK avatar Apr 11 '22 07:04 Danny-dK