solana icon indicating copy to clipboard operation
solana copied to clipboard

Programs are unaware of non-default heap sizes

Open tiago18c opened this issue 2 years ago • 8 comments

Problem

When transactions request extra heap size via the ComputeBudgetInstruction::RequestHeapFrame, programs still run out of memory at the 32Kb default heap size.

The culprit is the default allocator that is unaware of the extra heap and always defaults to the default heap size as linked bellow. When changing the default value to match the higher heap frame size, everything works, but still stuck with a static value that is not useful for programs with multiple instructions which only some require custom heap sizes. https://github.com/solana-labs/solana/blob/cd7f03bc88c0f32aa2583029572c73a26955242b/sdk/program/src/entrypoint.rs#L168C13-L168C50

Proposed Solution

There should be a way for the program allocator to become aware of the current heap size allocated by the VM, but I'm not comfortable with the VM to propose actual solutions.

tiago18c avatar Jul 24 '23 15:07 tiago18c

could fetch limit from instruction introspection during bump alloc initialization, would be a relatively quick fix without addition of a new syscall

mschneider avatar Nov 20 '23 00:11 mschneider

We removed the error handling, for a quick workaround. Not ideal dev-ex, but gets the job done: https://github.com/blockworks-foundation/mango-v4/pull/801

mschneider avatar Dec 01 '23 13:12 mschneider

It looks like declaring custom-heap feature fixes this issue. As far as I understand, with the feature defined, Solana doesn’t define a global allocator and if the smart contract doesn’t define one either, whatever Rust allocator for the target is used. I’m however struggling to figure out what that allocator is.

mina86 avatar Jan 04 '24 12:01 mina86

Looks like the allocator is set at https://github.com/solana-labs/solana/blob/a2d2eb3ba80824f55c9cbd267ab881a285ca324a/programs/bpf_loader/src/lib.rs#L256 and it gets the heap size from the vm configuration. On first glance it seems that the global allocator defined by #[entrypoint] is pointless since what runtime provides us has superset of features.

mina86 avatar Jan 04 '24 15:01 mina86

you suggest deleting the global allocator implementation altogether? i think the issue with the default runtime allocator is that it's not a bump allocator and hence uses more compute

mschneider avatar Jan 07 '24 00:01 mschneider

i think the issue with the default runtime allocator is that it's not a bump allocator and hence uses more compute

If BpfAllocator is the default runtime allocator, than it is a bump allocator. Plus it’s part of the runtime so it should execute more efficiently than smart contract code. The one problem I’ve noticed is that sol_alloc_free_ doesn’t support arbitrary alignment and maybe that’s why it has been deprecated?

I may be wrong about sol_alloc_free_ being the default allocator. If it is though, the most convenient resolution for users would be to extend that syscall with alignment argument and than use it rather than defining global allocator in #[entrypoint].

However, at the moment I’m not really suggesting anything. I’m trying to understand what’s going on.

mina86 avatar Jan 08 '24 14:01 mina86

could fetch limit from instruction introspection during bump alloc initialization, would be a relatively quick fix without addition of a new syscall

This requires access to instruction sysvar AccountInfo. It’s hardly a quick fix since it requires changing the API of the smart contract call. Or do you mean something else than load_instruction_at_checked?

mina86 avatar Feb 07 '24 16:02 mina86

Also note that solana_program::entrypoint::deserialize and load_instruction_at_checked perform allocations. Current BumpAllocator implementation allocates from the end of the heap (which is correct and faster for bump allocators) which means the allocator needs to know the size of heap at the moment of first allocation. If heap size is to be read from the call to ComputeBudgetInstruction than:

  1. BumpAllocator need to be changed into one allocating from the beginning of heap (which is slightly less performant),
  2. parsing done by the two functions needs to be reimplemented in a new bit of code (though advantage is that a lot of the features can be stripped down) or
  3. if solana_program never Box::leaks, after figuring out size of heap the allocator can be reset with the new size.

Still, IMO the actual issue is the need to pass the instructions sysvar program. (Why instructions sysvar program is even a thing I don’t know. I don’t understand why Solana does things they way it does to be honest).

mina86 avatar Feb 08 '24 11:02 mina86