GBM-tune
Could you share a link to where the data can be downloaded?
As usual, great code and ideas, but for d0_train <- fread(paste0("/var/data/airline/",yr-1,".csv")) could you please share a link to where the data can be downloaded?
Is it one of these? https://www.kaggle.com/usdot/flight-delays or https://www.kaggle.com/giovamata/airlinedelaycauses or https://openflights.org/data.html or https://www.stat.purdue.edu/~sguha/rhipe/doc/html/airline.html or http://web.mit.edu/airlinedata/www/default.html
here:
https://github.com/szilard/GBM-tune/blob/7047ae1d8a133aefcc338665ffbee3864cc529ff/1-train_test-same_yr/run-tuning.R#L9-L24
(see the commented out lines with wget for the URLs)
Thanks for the quick answer. Something goes wrong with this code; could you please clarify what can be done?
set.seed(123)
for yr in 1990 1991; do
Error: unexpected symbol in "for yr"
wget http://stat-computing.org/dataexpo/2009/$yr.csv.bz2
Error: unexpected symbol in " wget http"
bunzip2 $yr.csv.bz2
Error: object 'bunzip2' not found
wget http://stat-computing.org/dataexpo/2009/$1990.csv.bz2
Error: unexpected symbol in "wget http"
yr <- 1990
wget http://stat-computing.org/dataexpo/2009/$yr.csv.bz2
Error: unexpected symbol in "wget http"
install.packages("wget")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/sndr/Documents/R/win-library/3.5’ (as ‘lib’ is unspecified)
Warning in install.packages : package ‘wget’ is not available (for R version 3.5.1)

install.packages("Rtools ")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/sndr/Documents/R/win-library/3.5’ (as ‘lib’ is unspecified)
Warning in install.packages : package ‘Rtools ’ is not available (for R version 3.5.1)

install.packages("Rtools")
WARNING: Rtools is required to build R packages but is not currently installed. Please download and install the appropriate version of Rtools before proceeding:
https://cran.rstudio.com/bin/windows/Rtools/
Installing package into ‘C:/Users/sndr/Documents/R/win-library/3.5’ (as ‘lib’ is unspecified)
Warning in install.packages : package ‘Rtools’ is not available (for R version 3.5.1)
Well, that's not R code, it's a bash (Unix shell) script, so it has to be run in a terminal rather than in the R console.
You can also just download those files manually, e.g. http://stat-computing.org/dataexpo/2009/1991.csv.bz2
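For reference, here is a minimal sketch of how those commented-out lines could be run from a terminal (bash or another POSIX shell) instead of the R console. The function names `airline_url` and `fetch_year` are made up for illustration and are not part of the repository; the base URL is the one quoted in this thread and may no longer serve the files. It assumes `wget` and `bunzip2` are installed.

```shell
#!/bin/sh
# Sketch only: run this in a terminal, not inside R.
# Base URL as quoted in this thread; it may be offline.
base_url="http://stat-computing.org/dataexpo/2009"

# Hypothetical helper: print the download URL for one year, e.g. 1990.
airline_url() {
  printf '%s/%s.csv.bz2\n' "$base_url" "$1"
}

# Hypothetical helper: download one year and decompress it,
# leaving <year>.csv in the current directory.
fetch_year() {
  wget "$(airline_url "$1")" && bunzip2 "$1.csv.bz2"
}

# Usage (uncomment to actually download):
# for yr in 1990 1991; do fetch_year "$yr"; done
```

On Windows this would need a Unix-like shell (for example Git Bash or WSL); otherwise downloading the files in a browser and unpacking them with any tool that handles .bz2 gives the same result.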
Great, thanks for the quick answer,
but I get this
and this
curl("http://stat-computing.org/dataexpo/2009/$yr.csv.bz2")
A connection with
description "http://stat-computing.org/dataexpo/2009/$yr.csv.bz2"
class       "curl"
mode        "r"
text        "text"
opened      "closed"
can read    "yes"
can write   "no"
yr
[1] 1990
and this
Yeah, I see. It seems the provider has deleted the data.
http://stat-computing.org/dataexpo/2009/the-data.html
You might be able to find a copy somewhere else, though.
E.g. here: https://github.com/h2oai/h2o-2/wiki/Hacking-Airline-DataSet-with-H2O
Airlines all years 1987-2008: https://s3.amazonaws.com/h2o-airlines-unpacked/allyears.csv (12 GB)
Though I'm not 100% sure it is exactly the same data (that is, the same rows and the same columns).
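Since the tuning script reads one file per year (e.g. /var/data/airline/1990.csv), a combined file like the one above would first have to be split by year. A rough sketch with awk follows, under some assumptions: the file has a header row containing a column named `Year`, fields are comma-separated, and no quoted field before that column contains a comma. The function name `split_by_year` is hypothetical, and I have not checked this against the actual allyears.csv.

```shell
#!/bin/sh
# Hypothetical helper: split a combined CSV into one file per year.
#   $1 = combined CSV with a header row (assumed to contain a "Year" column)
#   $2 = output directory; files are written as <year>.csv
split_by_year() {
  mkdir -p "$2"
  awk -F',' -v out="$2" '
    NR == 1 {
      header = $0
      # Locate the Year column by name, ignoring surrounding quotes.
      for (i = 1; i <= NF; i++) {
        h = $i; gsub(/"/, "", h)
        if (h == "Year") col = i
      }
      if (!col) { print "no Year column found" > "/dev/stderr"; exit 1 }
      next
    }
    {
      y = $col; gsub(/"/, "", y)
      f = out "/" y ".csv"
      # Write the header once per output file, then every matching row.
      if (!(f in seen)) { print header > f; seen[f] = 1 }
      print > f
    }' "$1"
}

# Usage (paths are examples):
# split_by_year allyears.csv /var/data/airline
```

Comparing the header line and the row counts of the resulting per-year files against what the script expects would be one way to check whether it really is the same data.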
Szilard, thanks a lot for the help, very kind of you. I will try to download this data.