CPT question on load_files() function

looking at the code for the load_files() function, why did you append the values of the test file in the data list?

for index, row in test.iterrows(): data.append(row.values) target.append(list(row.values))

Apr 27 '18 08:04 jermaine1ronquillo

Given, the sequential nature of the problem, test set can be merged with the train set to create a bigger training dataset. Hence the append.

Apr 27 '18 09:04 NeerajSarwan

Thank you for the response.

Can the merging be optional? And I'm looking forward for the python implementation of the other algorithms (when can this be?). And one last thing, if you have any idea on time series that can be made into an example using CPT is a great help.

Again thank you

Apr 28 '18 03:04 jermaine1ronquillo

The merging is currently optional. Merging happens only when you supply a test file to the load_files. Currently modifying the structure. Keep an eye out.

Apr 28 '18 18:04 NeerajSarwan

Thank you.

I have a question, why do you add the list() command in test.append(list(row.values)), while you did not use it in 'train.append(row.values)' ?

Apr 29 '18 07:04 jermaine1ronquillo

Given, the sequential nature of the problem, test set can be merged with the train set to create a bigger training dataset. Hence the append. it is mistaken to add test data for training, they call it data leaking. can you test your code without such a bug?

Apr 30 '18 18:04 Sandy4321

@Sandy4321 I am well aware about what data leaking is. But this isn't the case of data leaking as far as I know. Also, there merging of train and test is not mandatory as I have mentioned above. Feel free to fork the repository and test out the code.

May 03 '18 13:05 NeerajSarwan

@jermaine1ronquillo, I have removed the inconsistency. The train and test append operation should be consistent now. Thanks for pointing it out.

May 03 '18 13:05 NeerajSarwan