CPT icon indicating copy to clipboard operation
CPT copied to clipboard

question on load_files() function

Open jermaine1ronquillo opened this issue 7 years ago • 7 comments

looking at the code for the load_files() function, why did you append the values of the test file in the data list?

for index, row in test.iterrows(): data.append(row.values) target.append(list(row.values))

jermaine1ronquillo avatar Apr 27 '18 08:04 jermaine1ronquillo

Given, the sequential nature of the problem, test set can be merged with the train set to create a bigger training dataset. Hence the append.

NeerajSarwan avatar Apr 27 '18 09:04 NeerajSarwan

Thank you for the response.

Can the merging be optional? And I'm looking forward for the python implementation of the other algorithms (when can this be?). And one last thing, if you have any idea on time series that can be made into an example using CPT is a great help.

Again thank you

jermaine1ronquillo avatar Apr 28 '18 03:04 jermaine1ronquillo

The merging is currently optional. Merging happens only when you supply a test file to the load_files. Currently modifying the structure. Keep an eye out.

NeerajSarwan avatar Apr 28 '18 18:04 NeerajSarwan

Thank you.

I have a question, why do you add the list() command in test.append(list(row.values)), while you did not use it in 'train.append(row.values)' ?

jermaine1ronquillo avatar Apr 29 '18 07:04 jermaine1ronquillo

Given, the sequential nature of the problem, test set can be merged with the train set to create a bigger training dataset. Hence the append. it is mistaken to add test data for training, they call it data leaking. can you test your code without such a bug?

Sandy4321 avatar Apr 30 '18 18:04 Sandy4321

@Sandy4321 I am well aware about what data leaking is. But this isn't the case of data leaking as far as I know. Also, there merging of train and test is not mandatory as I have mentioned above. Feel free to fork the repository and test out the code.

NeerajSarwan avatar May 03 '18 13:05 NeerajSarwan

@jermaine1ronquillo, I have removed the inconsistency. The train and test append operation should be consistent now. Thanks for pointing it out.

NeerajSarwan avatar May 03 '18 13:05 NeerajSarwan