MP-SPDZ
SGDLogistic for millions of data samples
Hello, here is my code:

import numpy as np
from Compiler import ml

N = 10000

# load each party's plaintext training data, skipping the header row
data1 = np.loadtxt('./party0_train.csv', delimiter=',', skiprows=1)
data2 = np.loadtxt('./party1_train.csv', delimiter=',', skiprows=1)
data1 = data1[1:N, 1:]
data2 = data2[1:N, 1:]

# secret-share the inputs: party 1 provides the labels and part of
# the features, party 0 provides the remaining features
X_train_guest = sfix.input_tensor_via(1, data1[:, 1:])
Y_train_guest = sfix.input_tensor_via(1, data1[:, 0])
X_train_host = sfix.input_tensor_via(0, data2)

# join the two feature matrices column-wise
X_train = X_train_guest.concat_columns(X_train_host)

# 3 epochs with a single batch covering the whole dataset (N-1 rows)
log = ml.SGDLogistic(3, N - 1)
log.fit(X_train, Y_train_guest)
I have data at the million-sample scale that I want to use for logistic regression. I found that once the dataset reaches a certain size (e.g. N = 10000), this message appears. What does it mean?
tensor-0-begin-loop-1 blowing up rounds: (2999 / 2999) ** 3 < 2999
tensor-0-begin-loop-5 blowing up rounds: (2999 / 2999) ** 3 < 2999
It is also very slow. I would like to know approximately how long it takes to train logistic regression on a dataset with millions of samples, and whether there are any optimization methods available.
You can safely ignore the message. What protocol are you using? Is the time more than linear in the number of samples?
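For reference, ml.SGDLogistic takes the number of epochs and the batch size as its first two arguments, so one thing worth trying is mini-batch training instead of a single batch that covers almost the whole dataset. A minimal sketch, assuming the rest of the program above stays unchanged (the values 10 and 128 are illustrative assumptions, not tuned recommendations):

# sketch: mini-batch SGD instead of one near-full-dataset batch;
# 10 epochs and batch size 128 are illustrative assumptions
log = ml.SGDLogistic(10, 128)
log.fit(X_train, Y_train_guest)

Smaller batches reduce the computation per iteration, but the overall trade-off against the number of epochs needed for convergence depends on your data and the protocol.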