emp-agmpc

AG-MPC with FerretCOT?

Open weikengchen opened this issue 4 years ago • 6 comments

It seems that the FerretCOT architecture has stabilized. Will AG-MPC use FerretCOT by default? Any suggestions on the number of threads to use? (How significant is the computation cost in FerretCOT?)

I would cc @carlweng here!

weikengchen avatar Dec 04 '20 19:12 weikengchen

By the way, I am also working to plug it in. May push a PR if I make some good progress.

weikengchen avatar Dec 04 '20 19:12 weikengchen

Some update.

I have implemented a prototype that replaces IKNP with FerretCOT: https://github.com/weikengchen/emp-agmpc/commit/7814c9ea29f80979324a57f4367095f25f14fa76.

It is not ready for a PR or deployment, as the code intermittently segfaults.

There are still three challenges in using the current implementation of FerretCOT by default in emp-agmpc:

  1. FerretCOT needs multiple NetIO instances for multithreading, so emp-agmpc would have to be changed to create them and pass them in. A useful alternative might be to multithread only the computation and not the communication (which ideally would not be the bottleneck?); that would simplify the code.

  2. FerretCOT generates a very large number of COTs at once: roughly n per call, as the output of the LPN. In practice, this might not suit one-time garbling of a small circuit (many COTs are wasted), but it would be useful if the circuit is large or many circuits need to be garbled. One potential solution is to compute only part of the LPN output, depending on how many COTs are needed. From the code and the construction, this seems possible.

  3. FerretCOT is very efficient when each batch of COTs gets many cores, yet in AG-MPC we need a pair of COT batches between every two parties. So in a multiparty setting, fully exploiting FerretCOT would need a lot of cores (num_of_party * num_COT_core).
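To make points 2 and 3 concrete, here is a rough back-of-the-envelope sketch in C++. It is not part of emp-agmpc or emp-ot: the struct, the function name, and the assumption that each party runs one sender and one receiver instance toward every other party are all hypothetical, chosen only to illustrate the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of what one party would need (illustration only).
struct FerretBudget {
    int instances_per_party;  // COT instances this party runs
    int threads_per_party;    // total threads if every instance is multithreaded
    std::int64_t calls;       // FerretCOT calls needed per instance
    double wasted_fraction;   // share of generated COTs that go unused
};

// Assumption: one sender and one receiver instance toward each other party.
FerretBudget estimate_ferret_budget(int parties, int threads_per_instance,
                                    std::int64_t cot_per_call,
                                    std::int64_t cot_needed) {
    assert(parties >= 2 && threads_per_instance >= 1);
    assert(cot_per_call > 0 && cot_needed > 0);
    FerretBudget b;
    b.instances_per_party = 2 * (parties - 1);
    b.threads_per_party = b.instances_per_party * threads_per_instance;
    // Round up: a call always yields a full batch of cot_per_call COTs.
    b.calls = (cot_needed + cot_per_call - 1) / cot_per_call;
    b.wasted_fraction = 1.0 - static_cast<double>(cot_needed) /
                                  static_cast<double>(b.calls * cot_per_call);
    return b;
}
```

With 3 parties, 4 threads per instance, and a circuit that needs only a quarter of one batch, this gives 4 instances and 16 threads per party, with three quarters of the generated COTs unused.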

weikengchen avatar Dec 06 '20 03:12 weikengchen

In regard to FerretCOT:

  1. It is possible to use only 1 NetIO, but some parts of the code would have to be restructured. It is definitely reasonable to offer this choice in the future.
  2. I think for now we can assume it will be used for large circuits. If the circuit is small, the user can use IKNP directly.
  3. The performance of FerretCOT does not deteriorate much when I go from 4 threads to only 1 thread, so it does not rely heavily on multiple cores. You can use 1 thread for now; that way only 1 NetIO is actually needed.
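The advice above (IKNP for small circuits, single-threaded FerretCOT otherwise) could be captured in a small selection helper. This is only a sketch: the enum, the struct, and the `min_utilization` cutoff are hypothetical and do not exist in emp-agmpc or emp-ot, and the thread suggests no concrete threshold.

```cpp
#include <cassert>
#include <cstdint>

enum class CotBackend { IKNP, Ferret };

// Hypothetical configuration record (illustration only).
struct CotChoice {
    CotBackend backend;
    int threads;  // threads per COT instance
    int net_ios;  // NetIO objects per COT instance
};

// Pick a backend from how much of one FerretCOT batch the circuit would use.
// min_utilization is a made-up tuning knob, not a value from the thread.
CotChoice choose_cot(std::int64_t cot_needed, std::int64_t ferret_batch,
                     double min_utilization) {
    assert(cot_needed > 0 && ferret_batch > 0);
    assert(min_utilization > 0.0 && min_utilization <= 1.0);
    double utilization = static_cast<double>(cot_needed) /
                         static_cast<double>(ferret_batch);
    if (utilization < min_utilization) {
        // Small circuit: most of a FerretCOT batch would be wasted.
        return {CotBackend::IKNP, 1, 1};
    }
    // Large circuit: FerretCOT with 1 thread, so a single NetIO suffices.
    return {CotBackend::Ferret, 1, 1};
}
```

Any real cutoff would have to come from benchmarking both backends on the target circuit sizes.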

carlweng avatar Dec 06 '20 16:12 carlweng

emp now supports opening many NetIO on the same port, so there is not much harm in having many NetIO, right?

wangxiao1254 avatar Dec 06 '20 16:12 wangxiao1254

Thanks! I updated my prototype so that it approximates the amortized FUNC_IND cost from how many OTs are ready and how many have been used, with the following code:

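// Only party 1 prints the adjusted figure.
if (party == 1) {
	// COTs made ready by FerretCOT, and COTs consumed so far.
	int cot_limit = mpc->fpre->abit->abit1[2]->ot_limit;
	int cot_used = mpc->fpre->abit->abit1[2]->ot_used;

	cout << "COT limit/used: " << cot_limit << "\t" << cot_used << " \n" << flush;
	// Scale the measured FUNC_IND time t2 by the fraction of COTs actually used.
	cout << "FUNC_IND adjusted:\t" << party << "\t" << t2 * (1.0 * cot_used / cot_limit) << " \n" << flush;
}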

which would help people who want to do a benchmark.
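As a standalone illustration of that adjustment, the scaling can be written as a small function. The function name and the sample numbers are hypothetical; only the formula `t2 * (cot_used / cot_limit)` comes from the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Amortized cost: charge the circuit only for the share of the
// precomputed COTs that it actually consumed.
double adjusted_func_ind(double t2, std::int64_t cot_used,
                         std::int64_t cot_limit) {
    assert(cot_limit > 0);
    assert(cot_used >= 0 && cot_used <= cot_limit);
    return t2 * (static_cast<double>(cot_used) /
                 static_cast<double>(cot_limit));
}
```

For example, if the measured time is 2.0 and 250 of 1000 ready COTs were used, the adjusted figure is 0.5 in the same time unit.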

The corresponding PR is here: https://github.com/weikengchen/emp-agmpc/commit/364453cccdde71511585250c52e9550e0aadefeb

This seems quite similar to HE-based SPDZ, in that the offline phase may produce many more triples than a specific program needs. In HE-based SPDZ, this comes from packing FHE ciphertexts and batching multiple ciphertexts into one network packet to reduce the effect of network latency. Here, it comes from the LPN.

weikengchen avatar Dec 06 '20 23:12 weikengchen

(And the prototype segfaulted occasionally because all the FerretCOT instances I used tried to read from and write to the same pre_ot_data_reg_recv/send files. ~Will fix soon.~ It has been fixed.)
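One generic way to avoid this kind of clash is to give every instance its own file name. The helper below is a hypothetical sketch of that idea; the thread does not say how the prototype was actually fixed, and the naming scheme is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: derive a distinct pre-OT file name for each
// (local party, remote party, role) triple so instances never share a file.
std::string pre_ot_file_name(int party, int peer, bool is_sender) {
    assert(party != peer);
    const std::string base =
        is_sender ? "pre_ot_data_reg_send" : "pre_ot_data_reg_recv";
    // The underscore separators keep names such as 1_12 and 11_2 distinct.
    return base + "_" + std::to_string(party) + "_" + std::to_string(peer);
}
```

Each instance would then be pointed at its own name instead of the shared default, assuming the COT implementation allows the file path to be configured.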

weikengchen avatar Dec 06 '20 23:12 weikengchen