Can arm64 also be optimized for field arithmetic operation?

Open · brew0722 opened this issue 11 months ago · 1 comment

I am developing a program using arkworks' Groth16 SNARK library. Proof verification benchmarks were fast enough in my local development environment, but very slow in the embedded environment.

Profiling showed that most of the overhead comes from ark-ff's field arithmetic (mul_assign). The current arithmetic implementation in ark-ff appears to have inline assembly optimization only for x86_64.

The embedded environment uses the arm64 architecture on low-performance hardware such as a Raspberry Pi. The weak hardware is of course the main cause, but considering mobile environments in general, I think arm64 optimization support is necessary.

I would like to ask whether you have any plans to support arm64 arithmetic optimization.

brew0722 avatar Mar 12 '24 03:03 brew0722

I found the related article linked below, and I think I now understand why optimizing Montgomery multiplication for the arm64 ISA is difficult. Unless arm64 provides instructions equivalent to Intel ADX and BMI2, assembly optimization probably won't help much.
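
To make the point above concrete, here is a minimal sketch of the multiply-accumulate step that dominates multi-limb field multiplication. It is not ark-ff's actual code: the names `mac_with_carry` and `mul_wide` and the 4-limb (256-bit) size are assumptions for illustration, and it shows only the schoolbook product, not the Montgomery reduction. On x86_64, BMI2's MULX leaves the flags untouched and ADX's ADCX/ADOX provide two independent carry chains for this pattern. arm64 has MUL/UMULH and ADDS/ADCS with a single carry flag, so a compiler would typically already generate reasonable code from portable Rust like this.

```rust
/// Computes a + b * c + carry and returns (low 64 bits, high 64 bits).
/// The sum never overflows u128: (2^64 - 1)^2 + 2 * (2^64 - 1) = 2^128 - 1.
fn mac_with_carry(a: u64, b: u64, c: u64, carry: u64) -> (u64, u64) {
    let wide = (a as u128) + (b as u128) * (c as u128) + (carry as u128);
    (wide as u64, (wide >> 64) as u64)
}

/// Schoolbook 256-bit x 256-bit -> 512-bit product on little-endian u64 limbs.
/// Hypothetical helper for illustration; the reduction step is omitted.
fn mul_wide(x: &[u64; 4], y: &[u64; 4]) -> [u64; 8] {
    let mut out = [0u64; 8];
    for i in 0..4 {
        let mut carry = 0u64;
        for j in 0..4 {
            // Accumulate x[i] * y[j] into the running partial product.
            let (lo, hi) = mac_with_carry(out[i + j], x[i], y[j], carry);
            out[i + j] = lo;
            carry = hi;
        }
        // out[i + 4] has not been written yet in this pass, so store the carry.
        out[i + 4] = carry;
    }
    out
}
```

Whether hand-written arm64 assembly would beat the compiler's output for this loop is exactly the open question in this issue; the sketch only shows the operation being discussed.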

However, my knowledge of cryptography is limited and I am not sure this conclusion is accurate, so please close the issue if no one has a different opinion after a final review.

https://research.nccgroup.com/2021/09/10/optimizing-pairing-based-cryptography-montgomery-multiplication-in-assembly/

brew0722 avatar Mar 12 '24 08:03 brew0722