Kaggle-Competition-Favorita

the data problem.

Open zhanchey opened this issue 5 years ago • 7 comments

Hi, I'm trying to run the code on my Mac. There seem to be many problems when I use the data downloaded from Kaggle. Could you please upload a version of the data that makes this model work? Thank you very much!

zhanchey • May 05 '19 02:05

Me too. I'm having a very hard time with the dataset. I would appreciate any help.

59ranjbar • May 09 '19 21:05

The data is a few gigabytes in size, so it's not possible to upload it to GitHub. Did you guys try the data cleaning instructions in the "How to Run" section?

LenzDu • May 10 '19 21:05

Also see #2

LenzDu • May 10 '19 21:05

Following the instructions, I finally got the code to work. Thank you very much. I have another question: what does the parameter n_range in the code mean, and why is it set to 16? I just cannot understand it.

zhanchey • May 11 '19 06:05

Hi Zhanchey, I believe your problem and mine are almost the same. Could you explain how you resolved your issues? Thanks.

59ranjbar • May 13 '19 17:05

n_range controls how many different periods of data the generator yields. It works together with day_skip to decide the start date in each batch. E.g., if n_range=16 and day_skip=7, it collects data starting at day 0, 7, 14, ..., 7*16. I believe 16 is an arbitrary number.
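
The start-day arithmetic described here can be sketched as follows. This is only an illustration, not the repository's actual generator: the helper names `period_offsets` and `period_start_dates`, the anchor date, and the forward direction of the offsets are all assumptions. It also assumes the periods are enumerated zero-based, as `range(n_range)` would do; in that case 16 periods end at offset 7*15, while an endpoint of 7*16 would correspond to 17 start points. Which of the two the code uses is not confirmed in this thread.

```python
from datetime import date, timedelta


def period_offsets(n_range, day_skip):
    """Return the day offset of each period (hypothetical helper).

    Assumes zero-based enumeration, i.e. offsets day_skip * 0 .. day_skip * (n_range - 1).
    """
    return [day_skip * i for i in range(n_range)]


def period_start_dates(anchor, n_range, day_skip):
    """Turn the offsets into calendar dates counted forward from an anchor date.

    The forward direction is an assumption made for illustration only.
    """
    return [anchor + timedelta(days=off) for off in period_offsets(n_range, day_skip)]


offsets = period_offsets(n_range=16, day_skip=7)
print(len(offsets))              # 16 periods
print(offsets[:3], offsets[-1])  # [0, 7, 14] 105

# Hypothetical anchor date, chosen only to make the example concrete.
starts = period_start_dates(date(2017, 1, 1), n_range=16, day_skip=7)
print(starts[0], starts[-1])
```

Under these assumptions, a larger n_range gives the generator more distinct start dates to draw from, and day_skip=7 keeps every start date on the same weekday.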

LenzDu • May 14 '19 05:05

Re "Also see #2": regarding the data preparation, could you please explain how you did it in Python? I am still confused about how to do it. Thanks in advance.

59ranjbar • May 15 '19 17:05