
Warning while creating the train/test/valid split in the BCCD example

Open. borkowski1110 opened this issue 4 years ago • 1 comment

Describe the bug This bug is associated with the BCCD example: https://github.com/maju116/platypus/blob/yolo3_fix/examples/Blood%20Cell%20Detection/Blood-Cell-Detection.md?fbclid=IwAR1-c-JKTEj6rCad5uCdDh84zzQ7Hv7rdXKZclQQZpAUOGiFyXNpwxj8p-Y

To Reproduce There is a possibility that running this code:

walk2(c("train", "valid", "test"), list(train_ids, valid_ids, test_ids), ~ {
  annots <- annot_paths[.y]
  images <- images_paths[.y]
  dir_name <- .x
  annots %>% walk(~ file.copy(., gsub("(BCCD)", paste0("BCCD/", dir_name), .)))
  images %>% walk(~ file.copy(., gsub("(BCCD)", paste0("BCCD/", dir_name), .)))
})

will result in a set of the following warnings:

49: In file.create(to[okay]) :
  cannot create the file '~/train_Dataset-master/BCCD/train/Annotations/BloodImage_00229.xml', reason: 'No such file or directory'

In my opinion this happens because gsub replaces every occurrence of 'BCCD' in the path, not only the intended directory component, so the name of the dataset root folder is rewritten as well. This creates paths of the form:

~/train_Dataset-master/BCCD/train/Annotations/BloodImage_00229.xml

Instead of

~/BCCD_Dataset-master/BCCD/train/Annotations/BloodImage_00229.xml
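The over-replacement can be reproduced on a single string, without touching any files. The sketch below uses a hypothetical relative path modelled on the layout in the warning (the real paths in `annot_paths` and `images_paths` may differ, so the exact rewritten string will differ too). It only shows that the pattern matches twice and that the result is not the intended destination.

```r
# Hypothetical relative path, modelled on the layout in the warning above
src <- "BCCD_Dataset-master/BCCD/Annotations/BloodImage_00229.xml"

# "BCCD" occurs both in the root folder name and in the data folder name
n_matches <- lengths(regmatches(src, gregexpr("BCCD", src, fixed = TRUE)))

# Same call as in the example: gsub() rewrites every occurrence
rewritten <- gsub("(BCCD)", paste0("BCCD/", "train"), src)

# Intended destination: only the data folder gains the "train" level
intended <- "BCCD_Dataset-master/BCCD/train/Annotations/BloodImage_00229.xml"

print(n_matches)
print(rewritten)
print(identical(rewritten, intended))
```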

Session information (please complete the following information):

  • OS: MS Windows 8.1 64 bit
  • R version: 4.0.2
  • Python version: 3.7.6
  • TensorFlow (Python) version (tensorflow::tf_version()): 2.0
  • R session information (sessionInfo()):
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8.1 x64 (build 9600)

Matrix products: default

Random number generation:
 RNG:     Mersenne-Twister 
 Normal:  Inversion 
 Sample:  Rounding 
 
locale:
[1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250    LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C                  
[5] LC_TIME=Polish_Poland.1250    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] here_0.1         abind_1.4-5      platypus_0.1.1   keras_2.3.0.0    tensorflow_2.2.0 forcats_0.5.0    stringr_1.4.0   
 [8] dplyr_1.0.2      purrr_0.3.4      readr_1.4.0      tidyr_1.1.2      tibble_3.0.3     ggplot2_3.3.2    tidyverse_1.3.0 

loaded via a namespace (and not attached):
 [1] progress_1.2.2     reticulate_1.16    tidyselect_1.1.0   haven_2.3.1        lattice_0.20-41    colorspace_1.4-1  
 [7] vctrs_0.3.4        generics_0.0.2     base64enc_0.1-3    XML_3.99-0.5       blob_1.2.1         rlang_0.4.7       
[13] pillar_1.4.6       glue_1.4.2         withr_2.3.0        DBI_1.1.0          dbplyr_1.4.4       RColorBrewer_1.1-2
[19] modelr_0.1.8       readxl_1.3.1       lifecycle_0.2.0    munsell_0.5.0      gtable_0.3.0       cellranger_1.1.0  
[25] rvest_0.3.6        tfruns_1.4         fansi_0.4.1        broom_0.7.1        Rcpp_1.0.5         scales_1.1.1      
[31] backports_1.1.10   jsonlite_1.7.1     fs_1.5.0           gridExtra_2.3      hms_0.5.3          stringi_1.5.3     
[37] rprojroot_1.3-2    grid_4.0.2         cli_2.0.2          tools_4.0.2        magrittr_1.5       crayon_1.3.4      
[43] whisker_0.4        pkgconfig_2.0.3    zeallot_0.1.0      ellipsis_0.3.1     Matrix_1.2-18      prettyunits_1.1.1 
[49] xml2_1.3.2         reprex_0.3.0       lubridate_1.7.9    assertthat_0.2.1   httr_1.4.2         rstudioapi_0.11   
[55] R6_2.4.1           compiler_4.0.2    

Additional context A possible solution is to build the destination paths with a dedicated path-manipulation package, e.g. pathlibr, instead of pattern replacement.
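As a rough illustration of the same idea without an extra dependency, the destination can also be assembled from path components with base R (`dirname`, `basename`, `file.path`). This is only a sketch: `split_destination` is a hypothetical helper, not part of platypus, and it assumes each file sits exactly one folder below the dataset folder (e.g. `.../BCCD/Annotations/<file>`).

```r
# Hypothetical helper: insert a split folder ("train", "valid", "test") between
# the dataset folder and the "Annotations"/"JPEGImages" folder.
# No pattern matching is involved, so other parts of the path stay untouched.
split_destination <- function(path, split_name) {
  kind_dir <- dirname(path)      # e.g. .../BCCD/Annotations
  data_dir <- dirname(kind_dir)  # e.g. .../BCCD
  file.path(data_dir, split_name, basename(kind_dir), basename(path))
}

# Assumed example path (relative, to avoid tilde expansion by dirname())
src <- "BCCD_Dataset-master/BCCD/Annotations/BloodImage_00229.xml"
dest <- split_destination(src, "train")
print(dest)
```

Before copying, something like `dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)` would make sure the target folder exists.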

borkowski1110, Oct 18 '20 22:10

Replacing that walk2 call with the following, which matches the whole dataset path (BCCD_path) instead of the bare string 'BCCD', seems to do the trick.

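# Pair each split name with its own id vector (one list element per split)
walk2(c("train", "valid", "test"), list(train_ids, valid_ids, test_ids), ~ {
  annots <- annot_paths[.y]
  images <- images_paths[.y]
  dir_name <- .x
  # fixed = TRUE treats BCCD_path as a literal string; sub() rewrites only its first occurrence
  annots %>% walk(~ file.copy(., sub(BCCD_path, paste0(BCCD_path, '/', dir_name), ., fixed = TRUE)))
  images %>% walk(~ file.copy(., sub(BCCD_path, paste0(BCCD_path, '/', dir_name), ., fixed = TRUE)))
})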
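The effect of anchoring the replacement on the full dataset path can be checked on one string before copying anything. The value of `BCCD_path` below is an assumption made for illustration (the real one is defined in the example's setup code), and `sub()` with `fixed = TRUE` is used here so that the path is treated as a literal string rather than as a regular expression.

```r
# Assumed value; the real BCCD_path comes from the example's setup code
BCCD_path <- "BCCD_Dataset-master/BCCD"
src <- file.path(BCCD_path, "Annotations", "BloodImage_00229.xml")

# Only the first literal occurrence of the full dataset path is rewritten,
# so the root folder name "BCCD_Dataset-master" is left alone
dest <- sub(BCCD_path, paste0(BCCD_path, "/", "train"), src, fixed = TRUE)
print(dest)
```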

borkowski1110, Oct 18 '20 23:10