curve25519-dalek icon indicating copy to clipboard operation
curve25519-dalek copied to clipboard

Problem with large stack frames on BPF target

Open brson opened this issue 3 years ago • 9 comments

I recently tried to compile curve25519-dalek for a Solana program, which targets BPF. This platform has stack frames limited to 4KB, and when I build for it I see these errors:

Error: Function _ZN117_$LT$curve25519_dalek..edwards..EdwardsBasepointTableRadix128$u20$as$u20$curve25519_dalek..traits..BasepointTable$GT$6create17h27a4aab4
d6ab2304E Stack offset of -7680 exceeded max offset of -4096 by 3584 bytes, please minimize large stack variables
Error: Function _ZN117_$LT$curve25519_dalek..edwards..EdwardsBasepointTableRadix256$u20$as$u20$curve25519_dalek..traits..BasepointTable$GT$6create17h399b7a66
709f3a4bE Stack offset of -15360 exceeded max offset of -4096 by 11264 bytes, please minimize large stack variables
Error: Function _ZN16curve25519_dalek5field79_$LT$impl$u20$curve25519_dalek..backend..serial..u64..field..FieldElement51$GT$8pow2250117hba5efddbdd192654E Sta
ck offset of -4496 exceeded max offset of -4096 by 400 bytes, please minimize large stack variables
Error: Function _ZN209_$LT$curve25519_dalek..window..NafLookupTable8$LT$curve25519_dalek..backend..serial..curve_models..ProjectiveNielsPoint$GT$$u20$as$u20$
core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17h77ae8748c9a7483fE Stack offset of -10920 exceeded max offset of -4096 by 68
24 bytes, please minimize large stack variables
Error: Function _ZN205_$LT$curve25519_dalek..window..NafLookupTable8$LT$curve25519_dalek..backend..serial..curve_models..AffineNielsPoint$GT$$u20$as$u20$core
..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17h24d4257e81f959d3E Stack offset of -8320 exceeded max offset of -4096 by 4224 by
tes, please minimize large stack variables
Error: Function _ZN212_$LT$curve25519_dalek..window..LookupTableRadix64$LT$curve25519_dalek..backend..serial..curve_models..ProjectiveNielsPoint$GT$$u20$as$u
20$core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17hb02d6ad1deb177f8E Stack offset of -5648 exceeded max offset of -4096 by
1552 bytes, please minimize large stack variables
Error: Function _ZN208_$LT$curve25519_dalek..window..LookupTableRadix64$LT$curve25519_dalek..backend..serial..curve_models..AffineNielsPoint$GT$$u20$as$u20$c
ore..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17hb0139d965a2acb9dE Stack offset of -4328 exceeded max offset of -4096 by 232
bytes, please minimize large stack variables
Error: Function _ZN213_$LT$curve25519_dalek..window..LookupTableRadix128$LT$curve25519_dalek..backend..serial..curve_models..ProjectiveNielsPoint$GT$$u20$as$
u20$core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17h4544329dbf12acddE Stack offset of -10768 exceeded max offset of -4096 b
y 6672 bytes, please minimize large stack variables
Error: Function _ZN209_$LT$curve25519_dalek..window..LookupTableRadix128$LT$curve25519_dalek..backend..serial..curve_models..AffineNielsPoint$GT$$u20$as$u20$
core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17hf4e99ee7b010e2c8E Stack offset of -8168 exceeded max offset of -4096 by 407
2 bytes, please minimize large stack variables
Error: Function _ZN213_$LT$curve25519_dalek..window..LookupTableRadix256$LT$curve25519_dalek..backend..serial..curve_models..ProjectiveNielsPoint$GT$$u20$as$
u20$core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17h7e641802879453f1E Stack offset of -21008 exceeded max offset of -4096 b
y 16912 bytes, please minimize large stack variables
Error: Function _ZN209_$LT$curve25519_dalek..window..LookupTableRadix256$LT$curve25519_dalek..backend..serial..curve_models..AffineNielsPoint$GT$$u20$as$u20$
core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17h1e9be112f934ec3bE Stack offset of -15848 exceeded max offset of -4096 by 11
752 bytes, please minimize large stack variables

It appears that having the lookup tables on the stack causes the stack frames to be too large for the BPF target. At runtime this results in access violations.

Would it be feasible to have a compile flag to put these tables in boxes off the stack?

brson avatar May 28 '21 22:05 brson

@brson is there a specific reason you need Ed25519?

If you are trying to target minimal stack size, you might be better off with an X25519-based signature algorithm instead. There aren't published standards for one, but there are papers and running code.

tarcieri avatar May 29 '21 00:05 tarcieri

@tarcieri no I just need something easy and as foolproof as reasonable. I spent the last bit switching to ecdsa with the k256 crate, and it seems to avoid the overlarge stack frames, but I did find the api more challenging, lots of options there. I need to read more about ecdsa to understand what I'm doing better. I'll google x25519. Thanks for the advice.

Edit: Just to be a bit more helpful, the challenges I had with the k256 crate were mostly about chasing down and figuring out which trait methods I should be using.

brson avatar May 29 '21 02:05 brson

Edit: Just to be a bit more helpful, the challenges I had with the k256 crate were mostly about chasing down and figuring out which trait methods I should be using.

Yeah, we could definitely use better docs. I've been trying to add breadcrumbs as I encounter people having problems.

This signature algorithm will probably give you the best bang for your buck code-side wise, but this implementation is in C. It could be reimplemented on top of x25519-dalek (but I'm generally gathering you're looking for something ready-to-use):

https://sourceforge.net/p/strobe/code/ci/master/tree/x25519.c

tarcieri avatar May 29 '21 02:05 tarcieri

Hi @brson! Thanks for bringing this up. May I ask how you were using ed25519-dalek such that you ended up using the larger precomputed lookup tables? I have an experimental branch for using that in ed25519-dalek, but it's not included in any release yet.

In any case, the lookup tables are quite large. The default (smallest) one using radix-16 windows produces about 30KB of precomputed data, and it just goes up from there, so ideally these should be placed on the heap somewhere. Is that possible on your target system?

isislovecruft avatar Aug 03 '21 23:08 isislovecruft

@isislovecruft This is how I had ed25519-dalek configured in Cargo.toml:

ed25519-dalek = { version = "1.0.1", default-features = false, features = ["u64_backend"] }

The target was Solana's bpf interpreter via their custom toolchain. With 4k stacks I think the tables would need to be on the heap, and Solana's target does have allocator support.

I note though that this is no longer a problem for me personally, as there are other reasons not to run crypto code on solana's platform: the instruction count limit is too low to verify signatures at all.

brson avatar Aug 05 '21 18:08 brson

Any solution for that? I have been active in developing a zero-knowledge verifier on the Solana program and ed25519-dalek is the best lib for that. However, the lookup table in ed25519-dalek is too large for the 4k limit of stack in Solana programs. Really appreciate it if anyone could direct me around this issue.

tuphan-dn avatar Mar 05 '22 08:03 tuphan-dn

@tuphan-dn Solana has an ed25519 "native" program that can supposedly be used for this purpose. Unfortunately:

  1. ~It is not activated yet on mainnet~ this appears to have changed.
  2. It is a "precompile" and can't be directly called from another Solana program. Instead it has to be called from an instruction included in a transaction, so the runtime can run the signature verification before the rest of the transaction executes. Another program can then look through the other instructions to verify the signature was verified.

FWIW I tried another signature verification scheme (ECDSA) in my toy project for which I wanted to used ed25519 and there was not enough CPU budget, so there's probably no point changing ed25519-dalek for the sake of solana programs. There is though another native solana program that can be used to request more CPU budget, though I don't know if it is activated on mainnet either.

brson avatar Mar 05 '22 18:03 brson

Oh @tuphan-dn the ed25519 program appears to have been activated on mainnet recently.

brson avatar Mar 05 '22 19:03 brson

@tuphan-dn Solana has an ed25519 "native" program that can supposedly be used for this purpose. Unfortunately:

  1. ~It is not activated yet on mainnet~ this appears to have changed.
  2. It is a "precompile" and can't be directly called from another Solana program. Instead it has to be called from an instruction included in a transaction, so the runtime can run the signature verification before the rest of the transaction executes. Another program can then look through the other instructions to verify the signature was verified.

FWIW I tried another signature verification scheme (ECDSA) in my toy project for which I wanted to used ed25519 and there was not enough CPU budget, so there's probably no point changing ed25519-dalek for the sake of solana programs. There is though another native solana program that can be used to request more CPU budget, though I don't know if it is activated on mainnet either.

Thank you for a detailed suggestion.

As I have mentioned, I would like to build a zk verifier, then I need to use point operations by ed25519-dalek, not actually the signature scheme. I have pinged mvines, one of the active solana contributors, and unfortunately, solana programs is not ready for onchain zk proofs. Even there are some syscalls, but it's far from mass adoption.

tuphan-dn avatar Mar 07 '22 07:03 tuphan-dn

@brson if you have any way of testing, I'd be curious if #489 might address this issue if you disable the basepoint-tables feature added in that PR

tarcieri avatar Dec 27 '22 22:12 tarcieri

@tarcieri I did a cursory test with curve25519-dalek commit 267961b7ee23602d080773188e47694de4d02df6 (current master).

If I turn off both "precomputed-tables" and "alloc" I don't see any warnings about frame size. I assume "basepoint-tables" changed to "precomputed-tables".

With "alloc" I still see:

Error: Function _ZN209_$LT$curve25519_dalek..window..NafLookupTable8$LT$curve25519_dalek..backend..serial..curve_models..ProjectiveNielsPoint$GT$$u20$as$u20$core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17h80a3443e349df2b3E Stack offset of 10952 exceeded max offset of 4096 by 6856 bytes, please minimize large stack variables
Error: Function _ZN205_$LT$curve25519_dalek..window..NafLookupTable8$LT$curve25519_dalek..backend..serial..curve_models..AffineNielsPoint$GT$$u20$as$u20$core..convert..From$LT$$RF$curve25519_dalek..edwards..EdwardsPoint$GT$$GT$4from17ha0516bf3a9f62ca8E Stack offset of 8352 exceeded max offset of 4096 by 4256 bytes, please minimize large stack variables

I did have to munge the manifest to remove ["zeroize?/alloc"] as the solana toolchain uses rust 1.59, which doesn't seem to support that syntax.

brson avatar May 23 '23 21:05 brson

It looks like alloc is enabling NafLookupTable8, which is too big for eBPF. Is there something in particular you need alloc for?

rozbb avatar May 25 '23 06:05 rozbb

@rozbb I don't need the alloc feature. I do not currently need to compile dalek for ebpf at all, just following up on the issue.

brson avatar May 25 '23 16:05 brson

Cool, I think we can call this fixed then. Glad to get reports that (disabling) the precomputed-tables feature is working out!

tarcieri avatar May 25 '23 16:05 tarcieri