[feature request] YAML header as a list

Open talegari opened this issue 6 years ago • 1 comments

Hi,

This is a great package!

I was surprised to find that csvy::get_yaml_header returns a character vector rather than a list. It might be a good idea to offer list output so the header can be worked with programmatically, either from this function or from a separate one (say, csvy::get_header).

pacman::p_load("magrittr")
csvy::write_csvy(iris, "iris.csvy")

# this gives a character vector, not readable!
csvy::get_yaml_header("iris.csvy")
#>  [1] "profile: tabular-data-package" "name: iris"                   
#>  [3] "fields:"                       "- name: Sepal.Length"         
#>  [5] "  type: number"                "- name: Sepal.Width"          
#>  [7] "  type: number"                "- name: Petal.Length"         
#>  [9] "  type: number"                "- name: Petal.Width"          
#> [11] "  type: number"                "- name: Species"              
#> [13] "  type: string"                "  levels:"                    
#> [15] "  - setosa"                    "  - versicolor"               
#> [17] "  - virginica"                 "--- "

# metadata is a recursive structure, so a list might be better
metadata_list <- csvy::get_yaml_header("iris.csvy") %>% 
  textConnection() %>% 
  yaml::read_yaml()

metadata_list
#> $profile
#> [1] "tabular-data-package"
#> 
#> $name
#> [1] "iris"
#> 
#> $fields
#> $fields[[1]]
#> $fields[[1]]$name
#> [1] "Sepal.Length"
#> 
#> $fields[[1]]$type
#> [1] "number"
#> 
#> 
#> $fields[[2]]
#> $fields[[2]]$name
#> [1] "Sepal.Width"
#> 
#> $fields[[2]]$type
#> [1] "number"
#> 
#> 
#> $fields[[3]]
#> $fields[[3]]$name
#> [1] "Petal.Length"
#> 
#> $fields[[3]]$type
#> [1] "number"
#> 
#> 
#> $fields[[4]]
#> $fields[[4]]$name
#> [1] "Petal.Width"
#> 
#> $fields[[4]]$type
#> [1] "number"
#> 
#> 
#> $fields[[5]]
#> $fields[[5]]$name
#> [1] "Species"
#> 
#> $fields[[5]]$type
#> [1] "string"
#> 
#> $fields[[5]]$levels
#> [1] "setosa"     "versicolor" "virginica"
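
As a rough illustration of why the list form is convenient, here is a minimal sketch that pulls values straight out of the structure printed above. To keep it self-contained, it rebuilds a trimmed copy of the list by hand (two of the five fields) instead of reading iris.csvy. The names factor_fields and species_levels are made up for this example.

```r
# Hand-built, trimmed copy of the list printed above (assumption: only two
# of the five fields are reproduced, to keep the example short)
metadata_list <- list(
  profile = "tabular-data-package",
  name = "iris",
  fields = list(
    list(name = "Sepal.Length", type = "number"),
    list(name = "Species", type = "string",
         levels = c("setosa", "versicolor", "virginica"))
  )
)

# Keep only the fields that carry a `levels` entry (factor-like columns)
factor_fields <- Filter(function(f) !is.null(f$levels), metadata_list$fields)

# Levels of the first such field; with the character-vector header this
# would need manual string parsing
species_levels <- factor_fields[[1]]$levels
print(species_levels)
```

With the full list from yaml::read_yaml() the same two calls should work unchanged, since they rely only on the name and levels entries shown in the output above.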

Created on 2018-12-18 by the reprex package (v0.2.0).

Session info
devtools::session_info()
#> Session info -------------------------------------------------------------
#>  setting  value                       
#>  version  R version 3.5.1 (2018-07-02)
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  tz       Asia/Kolkata                
#>  date     2018-12-18
#> Packages -----------------------------------------------------------------
#>  package    * version date       source                      
#>  backports    1.1.3   2018-12-14 cran (@1.1.3)               
#>  base       * 3.5.1   2018-07-05 local                       
#>  compiler     3.5.1   2018-07-05 local                       
#>  csvy         0.3.0   2018-12-18 Github (leeper/csvy@af0aa8d)
#>  data.table   1.11.8  2018-09-30 cran (@1.11.8)              
#>  datasets   * 3.5.1   2018-07-05 local                       
#>  devtools     1.13.6  2018-06-27 CRAN (R 3.5.0)              
#>  digest       0.6.18  2018-10-10 cran (@0.6.18)              
#>  evaluate     0.11    2018-07-17 CRAN (R 3.5.0)              
#>  graphics   * 3.5.1   2018-07-05 local                       
#>  grDevices  * 3.5.1   2018-07-05 local                       
#>  htmltools    0.3.6   2017-04-28 CRAN (R 3.5.0)              
#>  jsonlite     1.5     2017-06-01 CRAN (R 3.5.0)              
#>  knitr        1.20    2018-02-20 CRAN (R 3.5.0)              
#>  magrittr   * 1.5     2014-11-22 CRAN (R 3.5.0)              
#>  memoise      1.1.0   2017-04-21 CRAN (R 3.5.0)              
#>  methods    * 3.5.1   2018-07-05 local                       
#>  pacman       0.4.6   2017-05-14 CRAN (R 3.5.0)              
#>  Rcpp         0.12.19 2018-10-01 cran (@0.12.19)             
#>  rmarkdown    1.10    2018-06-11 CRAN (R 3.5.0)              
#>  rprojroot    1.3-2   2018-01-03 CRAN (R 3.5.0)              
#>  stats      * 3.5.1   2018-07-05 local                       
#>  stringi      1.2.4   2018-07-20 CRAN (R 3.5.0)              
#>  stringr      1.3.1   2018-05-10 CRAN (R 3.5.0)              
#>  tools        3.5.1   2018-07-05 local                       
#>  utils      * 3.5.1   2018-07-05 local                       
#>  withr        2.1.2   2018-03-15 CRAN (R 3.5.0)              
#>  yaml         2.2.0   2018-07-25 CRAN (R 3.5.0)

talegari, Dec 18 '18 09:12

I was also surprised that there was no function for this. I am using the code below, which is based on code in read_csvy, to get the metadata in list form. Here md is the metadata as read from the file, md_list is the result of parsing it with yaml, and md_vec is a character vector of column types whose names are the column names.

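library(csvy)  # provides get_yaml_header()
library(yaml)
md <- get_yaml_header("df.csvy")                  # raw header lines
md_list <- yaml.load(paste(md, collapse = "\n"))  # parsed into a nested list
# column types named by column; look up by key rather than by position
md_vec <- sapply(md_list$fields, function(x) setNames(x$type, x$name))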
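
The steps above could be wrapped into small reusable functions. The following is only a sketch: header_to_list and field_types are hypothetical names, not part of csvy. The sketch assumes the header lines have the shape that get_yaml_header returns in the iris example earlier in this thread, and those lines are pasted in literally so it runs without a .csvy file.

```r
library(yaml)

# Hypothetical helper (not in csvy): parse header lines into a nested list.
header_to_list <- function(header_lines) {
  # Drop bare document-marker lines such as "---" or "--- " before parsing
  body <- header_lines[!grepl("^---[[:space:]]*$", header_lines)]
  yaml::yaml.load(paste(body, collapse = "\n"))
}

# Hypothetical helper: character vector of column types named by column
field_types <- function(header) {
  types <- vapply(header$fields, function(f) f$type, character(1))
  names(types) <- vapply(header$fields, function(f) f$name, character(1))
  types
}

# Header lines copied from the iris example earlier in this thread
iris_header <- c(
  "profile: tabular-data-package", "name: iris", "fields:",
  "- name: Sepal.Length", "  type: number",
  "- name: Sepal.Width", "  type: number",
  "- name: Petal.Length", "  type: number",
  "- name: Petal.Width", "  type: number",
  "- name: Species", "  type: string", "  levels:",
  "  - setosa", "  - versicolor", "  - virginica", "--- "
)

header <- header_to_list(iris_header)
types <- field_types(header)
print(types)
```

In real use the literal vector would be replaced by get_yaml_header("df.csvy"). Setting the names separately is deliberate, because vapply would not keep names set inside the function for length-one results.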

ggrothendieck, Nov 28 '21 15:11