
Error on the fifth shard

Open clemley opened this issue 4 years ago • 7 comments

On the first request to the fifth shard, the run fails with what appears to be an index error. All of the other shards run properly; only the fifth one fails. Is there a way to fix this?

clemley avatar Dec 11 '20 02:12 clemley

I found that purchase2_train.npy generated by running init.sh contained 249215 samples, which did not match the count in the datasetfile. I worked around this by modifying this line in prepare_data.py (changing `test_size` from 0.2 to 0.1): `X_train, X_test, y_train, y_test = train_test_split(data, label, test_size=0.1)`

Hope that helps
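For reference, the change can be sketched like this. The `data`/`label` arrays below are synthetic stand-ins (the real prepare_data.py loads the purchase dataset); only the `test_size` argument is the point:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the arrays in prepare_data.py; the real script
# loads the purchase dataset, so these shapes are purely illustrative.
data = np.random.rand(100, 600)
label = np.random.randint(0, 100, size=100)

# The change described above: test_size=0.2 -> test_size=0.1
X_train, X_test, y_train, y_test = train_test_split(data, label, test_size=0.1)
print(X_train.shape[0], X_test.shape[0])  # -> 90 10
```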

huxi2 avatar Jul 02 '21 08:07 huxi2

Following the suggestion above, I changed 0.2 to 0.1 but still hit the same problem.

swagStar123-code avatar Dec 12 '22 13:12 swagStar123-code

Same here. Even after changing 0.2 to 0.1, I still get the index error: `IndexError: index 280367 is out of bounds for axis 0 with size 280367`.

Any suggestions so far?

KatieHYT avatar Jun 08 '23 11:06 KatieHYT

Any solution found regarding this issue?

nimeshagrawal avatar Aug 11 '23 11:08 nimeshagrawal

The problem is in datasets/purchase/datasetfile, where the train and test sample sizes are hard-coded. prepare_data.py splits with test_size = 0.2, but "datasetfile" lists sample sizes corresponding to test_size = 0.1. So change the train and test sample sizes in "datasetfile" instead (set nb_train = 249215 and nb_test = 62304).
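A quick sanity check makes the mismatch visible. sklearn's train_test_split rounds the test count up, i.e. n_test = ceil(n_total * test_size); plugging in the 311519 total samples implied by the counts above reproduces both the sizes in this fix and the 280367 from the IndexError reported earlier in the thread:

```python
import math

def split_sizes(n_total, test_size):
    # sklearn's default rounding: n_test = ceil(n_total * test_size)
    n_test = math.ceil(n_total * test_size)
    return n_total - n_test, n_test

n_total = 249215 + 62304  # 311519 total samples, implied by the counts above

print(split_sizes(n_total, 0.2))  # -> (249215, 62304): the values for datasetfile
print(split_sizes(n_total, 0.1))  # -> (280367, 31152): 280367 is the array size in the IndexError
```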

nimeshagrawal avatar Aug 11 '23 12:08 nimeshagrawal

Thanks for your solution. It solved my problem perfectly.


scottshufe avatar Sep 14 '23 10:09 scottshufe

The datasetfile fix above works. And remember to run python prepare_data.py after making this change.
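Putting the steps together (paths assumed from this thread's references to the repo layout; the datasetfile edit itself is done by hand):

```shell
# Assumed layout: datasetfile and prepare_data.py live under
# datasets/purchase/ in the repository checkout.
cd datasets/purchase
# Edit datasetfile by hand so that nb_train = 249215 and nb_test = 62304,
# then regenerate the split files:
python prepare_data.py
```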

GM-git-dotcom avatar Nov 15 '23 03:11 GM-git-dotcom