move [Bug] Support for generic entry functions

We will revisit this in stage 2 or later, so I am keeping this task open to track support for generic entry functions.

@nvjle suggested that one approach would be to have the compiler generate a single generic function and pass some form of type descriptor at runtime. The type descriptor could be the same as our hidden rtty desc, or it could instead be language_storage::TypeTag.

We probably need to put some checks in place:

- preventing users from abusing the generic entry points
- limiting the kinds of type descriptors allowed for entry functions
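
The checks listed above could look roughly like the following Rust sketch. `TypeDesc`, `EntryError`, `MAX_DEPTH`, and `check_entry_type_arg` are hypothetical names invented for illustration. They are neither the compiler's rtty descriptor nor `language_storage::TypeTag`, and the policy shown (primitives and vectors allowed, structs rejected, bounded nesting) is only an assumption about what "limiting" might mean.

```rust
/// Hypothetical, simplified runtime type descriptor (an assumption for
/// illustration; not the real rtty descriptor or TypeTag).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TypeDesc {
    U8,
    U64,
    Address,
    Vector(Box<TypeDesc>),
    Struct { module: String, name: String },
}

#[derive(Debug, PartialEq, Eq)]
pub enum EntryError {
    /// The descriptor nests deeper than the assumed limit.
    TooDeep,
    /// The descriptor names a kind of type the assumed policy rejects.
    DisallowedType,
}

/// Assumed nesting limit, so a caller cannot pass an arbitrarily deep descriptor.
const MAX_DEPTH: usize = 4;

/// Validates one type argument of a generic entry function before dispatch.
pub fn check_entry_type_arg(desc: &TypeDesc, depth: usize) -> Result<(), EntryError> {
    if depth > MAX_DEPTH {
        return Err(EntryError::TooDeep);
    }
    match desc {
        TypeDesc::U8 | TypeDesc::U64 | TypeDesc::Address => Ok(()),
        TypeDesc::Vector(inner) => check_entry_type_arg(inner, depth + 1),
        // Assumed policy: struct type arguments are not accepted at entry points.
        TypeDesc::Struct { .. } => Err(EntryError::DisallowedType),
    }
}
```

Under this assumed policy, `check_entry_type_arg(&TypeDesc::U64, 0)` succeeds, while any descriptor containing a `Struct` is rejected.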

Jun 01 '23 15:06 ksolana

This is part of #161.

Jun 01 '23 15:06 nvjle

moving it for later as this can wait

Sep 08 '23 17:09 ksolana

Many Move smart contracts are parameterized by the type of the asset they handle, e.g. Coin<T>, where T can be any particular token type. That is to say, generic structs and functions in Move binary packages are quite common, not an exotic feature. Move bytecode supports type parameterization, with instantiation performed at run-time by the VM. I'm not sure how we can generate generic LLVM code and determine struct layout at compile time if the actual type parameters can be other structs, possibly defined in other modules or even packages.

Sep 10 '23 13:09 dmakarov

We already handle generics (i.e., where function signatures or structs are parameterized by type) in all cases with the exception of Move entry functions. Currently these are implemented as follows:

1. In the case of ordinary Move function generic signatures, we expand ("monomorphize"/concretize) each function into a concrete instantiation. This is essentially the same as is done by C++ template expansion or Rust generics.
2. In the case of native Move functions, we leave them generic and pass an implicit MoveType parameter for each generic type parameter. The move-native library routines must interpret these types at runtime. This interpretation is what accounts for much of the complexity in what is otherwise a simple library.
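
To make the contrast between the two strategies concrete, here is a minimal Rust sketch. It is not the move-native code: `TypeDesc` is a hypothetical, much reduced stand-in for the implicit MoveType parameter, and `borrow_elem` is an invented routine that only shows what "interpreting the type at runtime" means for a packed vector of primitives.

```rust
/// Strategy 1: monomorphization. The Rust compiler emits a separate concrete
/// copy of this function for every `T` it is actually called with.
pub fn generic_id<T>(v: T) -> T {
    v
}

/// Strategy 2: a single untyped routine plus a runtime type descriptor.
/// Hypothetical stand-in for the implicit type parameter (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TypeDesc {
    U8,
    U32,
    U64,
}

impl TypeDesc {
    /// Size in bytes of one element of this type.
    pub fn size(self) -> usize {
        match self {
            TypeDesc::U8 => 1,
            TypeDesc::U32 => 4,
            TypeDesc::U64 => 8,
        }
    }
}

/// Returns the raw bytes of element `index` in a packed buffer. The element
/// layout is not known at compile time; it is read from `ty` at runtime.
pub fn borrow_elem<'a>(ty: TypeDesc, data: &'a [u8], index: usize) -> Option<&'a [u8]> {
    let size = ty.size();
    let start = index.checked_mul(size)?;
    let end = start.checked_add(size)?;
    // `get` returns None when the range falls outside the buffer.
    data.get(start..end)
}
```

The first form costs code size per instantiation but needs no runtime type information. The second form has one body for all types but must carry and inspect a descriptor on every call.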

Similarly, Move entry functions could be implemented as either #1 or #2 above, or as a combination of the two. If all (or many) of the expected type parameters for a given entry point are known in advance (e.g., we know a function typically takes SOL, or u32), then #1 applies: we'd concretize/expand the function as usual. If, on the other hand, we do not necessarily know the expected types (or there are "too many")[*], then we could do something similar, if not identical, to #2. We also concretize generic structs, which is considerably trickier than concretizing functions.
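
A combination of the two approaches might be sketched in Rust as below. Everything here is hypothetical: `TypeDesc`, `Path`, and the `sum_*` functions are invented names, and the "entry function" simply sums a packed little-endian buffer. The point is only the dispatch: an expected type (u64 in this sketch) gets a pre-expanded body, and every other type falls back to a body that interprets the descriptor at runtime.

```rust
/// Hypothetical runtime type descriptor (assumption for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TypeDesc {
    U8,
    U32,
    U64,
}

/// Which implementation handled the call; returned only to make the
/// dispatch visible in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum Path {
    Concretized,
    Interpreted,
}

fn width(ty: TypeDesc) -> usize {
    match ty {
        TypeDesc::U8 => 1,
        TypeDesc::U32 => 4,
        TypeDesc::U64 => 8,
    }
}

/// Approach #1: a body expanded ahead of time for the expected type u64.
fn sum_u64(bytes: &[u8]) -> u128 {
    bytes
        .chunks_exact(8)
        .map(|c| u128::from(u64::from_le_bytes([c[0], c[1], c[2], c[3], c[4], c[5], c[6], c[7]])))
        .sum()
}

/// Approach #2: one body that reads the element width from the descriptor.
/// Trailing bytes that do not fill a whole element are ignored.
fn sum_interpreted(ty: TypeDesc, bytes: &[u8]) -> u128 {
    bytes
        .chunks_exact(width(ty))
        // Decode each little-endian element by folding from its last byte.
        .map(|c| c.iter().rev().fold(0u128, |acc, b| (acc << 8) | u128::from(*b)))
        .sum()
}

/// Entry point: use the concretized body when one exists, else interpret.
pub fn sum_entry(ty: TypeDesc, bytes: &[u8]) -> (Path, u128) {
    match ty {
        TypeDesc::U64 => (Path::Concretized, sum_u64(bytes)),
        other => (Path::Interpreted, sum_interpreted(other, bytes)),
    }
}
```

Which types deserve a concretized body is a policy decision. The sketch hard-codes u64, whereas a real compiler would need some source of "expected types", such as the annotations mentioned in the footnote.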

The Coin example is a place where #2 would make sense. Note, however, that Coin is part of the vendor/chain-specific framework library. We do not have to do tokens (or anything else) the same way (i.e., generically). I don't recall most Solana functions being generic like this (but I do need to go back and review, e.g., a Solana token "transfer" API). This is why, in the meeting on Friday, I opined that this functionality does not seem to be high priority. But I do agree that it adds expressiveness to the language/interface and is very nice and convenient. In either case, if someone is interested in implementing it, the "simplest" way to do it is to follow what we already do for #2.

Also, for Coin specifically, I think we would likely need to use Solana's SPL Token or Token-2022 API instead. In at least one of these APIs, the functions are not generic/parameterized in the way that Coin is. For example, below is transfer:

pub fn transfer(
    token_program_id: &Pubkey,
    source_pubkey: &Pubkey,
    destination_pubkey: &Pubkey,
    authority_pubkey: &Pubkey,
    signer_pubkeys: &[&Pubkey],
    amount: u64,
) -> Result<Instruction, ProgramError>

[*] As is done in some systems, the expected or likely types can be annotated in comments or via language-specific annotation mechanisms. In some cases, the set of expected/possible types is obvious (often from type inference itself), e.g., type T is an integer primitive, of which there are only a handful in the Move language.

Sep 10 '23 17:09 nvjle

Sounds good to me. We definitely can postpone support of generic entry functions. I added a small example using a generic entry function in #355. It seems currently this function is not included in function declarations that declare_functions_walk creates. I'll probably debug this a bit, just to understand what we do.

Sep 10 '23 20:09 dmakarov

One reason it is not included in declare_functions_walk is that, in your example, it (public entry fun bar<T: store>(coin: &Coin<T>): u64) is never instantiated (that is, it is never called and it is not native). Please see the rbpf test examples I wrote for this sort of thing in tests/rbpf-tests/rgeneric-func* and tests/rbpf-tests/generic_struct*. For the native case, see any testcase that uses stdlib, such as vector or option. Essentially all of those require generics (the MoveType case I mentioned earlier). A simple example follows:

module 0x100::MX {
    public fun generic_id<T>(v: T): T {
        v
    }
}

module 0x200::MX {
    use 0x100::MX::generic_id;

    public fun square32(n: u32): u32 {
        generic_id<u32>(n) * generic_id<u32>(n)
    }

    public fun square8(n: u8): u8 {
        generic_id<u8>(n) * generic_id<u8>(n)
    }
}

script {
    fun main() {
        let x2 = 0x200::MX::square8(8);
        assert!(x2 == 64, 0xf00);

        let y2 = 0x200::MX::square32(32);
        assert!(y2 == 1024, 0xf01);

        // Second instantiation of generic_id<u8>.
        let z = 0x100::MX::generic_id<u8>(33);
        assert!(z == 33, 0xf02);
    }
}

I did try to copiously document what I implemented in declare_functions_walk (and related), and likewise for generic structs (declare_structs and related) in the comments in the source code.
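
As a rough model of why an uncalled generic function produces no declaration, consider the Rust sketch below. It is not the actual declare_functions_walk code: `Call` and `collect_instantiations` are hypothetical, and the model assumes that concrete instantiations are discovered purely from call sites that supply type arguments.

```rust
use std::collections::BTreeSet;

/// A call site: the callee's name and the concrete type arguments supplied
/// there. Hypothetical model for illustration only.
pub struct Call {
    pub callee: &'static str,
    pub type_args: Vec<&'static str>,
}

/// Collects the distinct concrete instantiations implied by a list of calls.
/// A generic function that is never called contributes nothing, so nothing
/// is declared for it.
pub fn collect_instantiations(calls: &[Call]) -> BTreeSet<String> {
    calls
        .iter()
        // Calls without type arguments target non-generic functions.
        .filter(|c| !c.type_args.is_empty())
        .map(|c| format!("{}<{}>", c.callee, c.type_args.join(", ")))
        .collect()
}
```

Feeding in the call sites from the Move example above (generic_id<u32> from square32, generic_id<u8> from square8 and from main) yields exactly two instantiations, generic_id<u32> and generic_id<u8>. A function like bar<T> that is never called appears in no call site and therefore never enters the set.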

Sep 10 '23 21:09 nvjle

Makes sense. My mistake, not paying due attention.

Sep 10 '23 21:09 dmakarov

An aside (splitting out from above): Regarding Coin specifically-- this does not seem to be the killer app for generics in the Solana context. That is, as near as I can tell, SPL Token APIs are all concrete (see, e.g., transfer above).

If I understand correctly, we will need to interoperate with existing Solana programs (including system programs). At least for token creation and transfers, I think we'd have to adhere to SPL Token APIs-- and likewise for any other standard Solana framework items. On the other hand, if we are free to design our own framework along the lines of Aptos or Sui, then we could support generic APIs, such as Aptos' legacy Coin, etc.

I certainly think redoing them generically would be better long term from the expressiveness and modernity point of view, but we may not have that option just now.

Sep 10 '23 21:09 nvjle