MP-SPDZ

difference between predict() and predict_proba()

Open cnt0 opened this issue 10 months ago • 1 comment

Hello, the following program

program.use_edabit(True)
from Compiler import ml

ml.set_n_threads(5)

X_test = sfix.Matrix(200, 8)
X_test.assign_all(0)

optimizer = ml.SGDLogistic(1)
optimizer.init(X_test)

start = 0
for var in optimizer.opt.trainable_variables:
    start = var.read_from_file(start)

optimizer.predict(X_test).print_reveal_nested()

launched with lowgear-party.x gives the following results:

Time = 13.1989 seconds 
Data sent = 1375.96 MB in ~30151 rounds (party 1 only)
Global data sent = 2751.91 MB (all parties)

When I replace the last line with

optimizer.predict_proba(X_test).print_reveal_nested()

the results are like this:

Time = 60.2574 seconds 
Data sent = 9306.22 MB in ~202503 rounds (party 1 only)
Global data sent = 18612.4 MB (all parties)

What causes such a difference between the two benchmarks? I'd expect them to be about the same. I'm using release 0.3.7.

cnt0, Apr 22 '24 10:04

predict_proba() applies a sigmoid to compute the probabilities, which is much more expensive than just comparing the output of the dense layer to 0. That comparison is all predict() needs, because it suffices to output the 0/1 required for the best guess.
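
To illustrate why the comparison is enough, here is a minimal plaintext sketch in ordinary Python (not MP-SPDZ code, and not the library's implementation). It relies only on the fact that the sigmoid exceeds 0.5 exactly when its input is positive. The function names, the sample values, and the strict `>` threshold are assumptions made for this illustration.

```python
import math

def sigmoid(z):
    # Logistic function; in secure computation this is the costly step.
    return 1.0 / (1.0 + math.exp(-z))

def label_from_logit(z):
    # Cheap path: a single comparison of the dense-layer output to 0.
    return 1 if z > 0 else 0

def label_from_proba(z):
    # Expensive path: compute the probability first, then threshold at 0.5.
    return 1 if sigmoid(z) > 0.5 else 0

# Hypothetical dense-layer outputs, chosen only for illustration.
logits = [-3.0, -0.5, 0.25, 4.0]

labels_cheap = [label_from_logit(z) for z in logits]
labels_via_sigmoid = [label_from_proba(z) for z in logits]

# Both paths give the same 0/1 labels.
assert labels_cheap == labels_via_sigmoid
print(labels_cheap)
```

Both paths produce the same labels, so the sigmoid only has to be paid for when the probabilities themselves are needed.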

mkskeller, Apr 22 '24 11:04