ethcontract-rs
ethcontract-rs copied to clipboard
Add strategy for name collision.
Since we 'rustify' the contract identifiers (i.e. balanceOf -> balance_of) we can run into name collisions. This can also happen if some contract functions names are the same as default function names (for example address() to retrieve contract instance address, then the contract currently can't define a function address(...) as the identifiers will collide). Also solidity supports (? confirm) function overloading which rust does not.
We need to come up with a strategy for name collisions.
Indeed, it's possible to define overloaded functions in solidity:
contract ShoppingList {
function add(string memory name) public returns (bool) {
// ...
}
function add(string memory name, uint16 count) public returns (bool) {
// ...
}
}
The code above has roughly the following abi:
{
"contractName": "ShoppingList",
"abi": [
{
"constant": false,
"inputs": [
{ "internalType": "string", "name": "name", "type": "string" },
{ "internalType": "uint16", "name": "count", "type": "uint16" }
],
"name": "add",
"outputs": [
{ "internalType": "bool", "name": "", "type": "bool" }
],
"payable": false,
"stateMutability": "nonpayable",
"type": "function"
},
{
"constant": false,
"inputs": [
{ "internalType": "string", "name": "name", "type": "string" }
],
"name": "add",
"outputs": [
{ "internalType": "bool", "name": "", "type": "bool" }
],
"payable": false,
"stateMutability": "nonpayable",
"type": "function"
}
]
}
In some cases, it's possible to generate unique names for overloaded functions, but I feel developers would often have troubles "guessing" correct names without the help of IDE or generated docs.
Would it make sense to restrict these situations on the compile-time?
Currently there are a couple of issues around this. The first being how should ethcontract handle name collisions. I'm not sure what the best strategy is, but we have a couple of options:
- Provide a compile time error which states the name collision and allow callers to map function signatures to other method names. This is a more manual process, as the developer is responsible for manually providing new method names for each of the offending functions. It is, however, a nice feature to have (even outside the scope colliding function names) and a segway into other neat features like overloading input or output types (which can be useful for data that is manually encoded). I think this is a good starting point and should be implemented regardless of what name collision strategy we end up choosing.
- For contract function collisions we can be cleaver and use something like
Either, so in this case the method signature in Rust would befn add(&self, params: Either<Vec<u8>, (Vec<u8>, u16) >)-> MethodBuilder<bool>;. The limitation with this would be with name collisions with methods that are defined for all generated contracts (likefn address(&self) -> Address;). Personally I'm kind of fond of this approach, but I haven't considered all the cases yet and it might not be feasible. - Yet another solution would be to do the dead-simple way of just generating
add_0andadd_1methods, but as you said you would have to guess the correct names and which is less than ideal. Also, I am not suresolcguarantees an order to the functions in its JSON encoded ABI; if it doesn't, thenadd_0andadd_1could unexpectedly change across multipleShippingListcontract builds which is also not ideal. - Alternatively, as a spin-off of the previous solution, we could go the "OpenGL way" (I don't know if it is OpenGL specific but its where I've seen it) of specifying colliding function names by post-fixing them with type-dependent notation. For example
add(bytes,uint16)would becomeadd_buandadd(bytes)would becomeadd_b. This eliminates some of the guesswork (as the names would be predictable). However, I am not sure if this is enough to uniquely identify methods. For example in solidity are two methodsadd(uint16,uint16)andadd(uint256,uint256)allowed? If so then theuintsize would have to be part of the notation and might make method names cumbersome.
I think starting with 1 is a good-enough interim solution for now. I would like to evaluate the feasibility of 2 as well as a long-term solution.
The other side of the equation is ethabi-crate for allowing function name collisions. Currently we are pegged at [email protected] because of web3 which does not support functions with colliding names. We would have to wait for a new web3 release or figure out a way to de-couple the web3 version of ethabi from ethcontract's version.
Started by updating ethcontract's version of ethabi to support multiple functions with the same name #203
I've been thinking about the possible approaches for some time.
At first, I was seriously considering the 1st option, the one that requires manual work from the developer. But I can't come up with an idea about efficient and convenient UX to specify the method names.
I really like the idea with Either, although it might not be enough. In the case when overloaded functions have a different return type, we almost certainly would need to separate the methods on Rust side. There can be exceptions for this statement when the return types are not the same, but "compatible". uint32 and uint256 technically can be converted to a bigger type (uint256 in this case), but practically the conversion is questionable.
If we want an automatic and convenient we might consider taking the 4th option with falling back to the 2nd option when possible.
There are bright examples of what I mean:
// Fallback to Option is possible here.
function foo(string memory name, bool force) public returns (bool); // fn foo(String, Option<u16>, bool) -> bool
function foo(string memory name, uint16 count, bool force) public returns (bool); // fn foo(String, Option<u16>, bool) -> bool
// Fallback to Either perfectly fits the case.
function bar(string memory name) public returns (bool); // fn bar(Either<String, u16>) -> bool
function bar(uint16 price) public returns (bool); // fn bar(Either<String, u16>) -> bool
// The return type doesn't match - so we have to separate them.
function baz(string memory name) public returns (string memory); // fn baz_with_name(String) -> String
function baz(uint16 price) public returns (bool); // fn baz_with_price(u16) -> bool
// Counter example - mixing Either and Option is not allowed here.
// fn qux(String, Either<u16, u32>, Option<bool>) -> bool
// The above "obvious" translation is not correct, it should be either (u16 AND bool) OR u32.
// So, in this case we need to separate the methods.
function qux(string memory name, uint16 price, bool force) public returns (bool); // fn qux_with_force(String, u16, bool) -> bool
function qux(string memory name, uint32 price) public returns (bool); // fn qux(String, u32) -> bool
// Complex example - type information in the method signature is needed to distinguish them.
function quux(string memory name, uint16 price, bool force) public returns (bool); // fn quux_with_force(Option<String>, Either<u16, u32>, bool) -> bool
function quux(uint32 price, bool force) public returns (bool); // fn quux_with_force(Option<String>, Either<u16, u32>, bool) -> bool
function quux(uint32 price) public returns (bool); // fn quux_with_uint32(u32) -> bool
function quux(uint64 price) public returns (bool, bool); // fn quux_with_uint64(u64) -> (bool, bool)
I don't think the examples represent a full set of possible use cases, but at least (subjectively) common ones.
Very likely, it would make sense to use parameter name information when it's available to generate function names. The algorithm supposes to extract information from ABI based on parameters list and return type, and generate sane Rust methods.
Also, the example is a little simplified. In the real world, instead of Either, it might be better to generate own enumeration for each method:
enum FooParam {
T0(String),
T1(bool),
T2(u32),
}
fn foo<P>(force: bool, param: P) -> bool
where
P: Into<FooParam>,
{
// ...
}
impl From<bool> for FooParam {
fn from(other: bool) -> Self {
FooParam::T1(other)
}
}
// ... remaining conversion implementation
This will allow to simply type foo(true, 0) or explore passing borrowed values.
At first, I was seriously considering the 1st option, the one that requires manual work from the developer. But I can't come up with an idea about efficient and convenient UX to specify the method names.
Its definitely the least ideal of the options. That being said, I think it should be implemented regardless as a way to manually override a contract method name analogous to how we can manually override the contract name.
P: Into<FooParam>,
I like this idea a lot as it makes things more ergonomic.
Thanks for the excellent write up! I also agree with starting with option 4 would be a great start and then adding some cleaver features like using Option where it applies or using a parameter enum where it works.
I would suggest, however, to just use an Either enum as to not have to generate parameter types, as I think the code can be completely generic (and probably implemented with a macro):
enum Either<A,B> {
A(A),
B(B),
}
Impl From<A> for Either<A, _> {}
Impl From<B> for Either<_, B> {}
enum Either3<A, B, C> {...}
enum Either4<A, B, C, D> {...}
enum Either5<A, B, C, D, E> {...}
// ...
And then
// Maybe include a type alias for convenience? Then you can run into other
// name collisions in the current Rust module where the code is generated into.
type FooParam = Either3<String, bool, u32>;
fn foo<P>(force: bool, param: P) -> bool
where
P: Into<FooParam>,
{
// ...
}
With #238 and #246, this issue can now be worked around. There is an example on how to deal with duplicate method signature here: overloaded.rs.
I'm going to keep this issue open, as I think we can do a bit better (and implement some of the ideas discussed above), but this is a good first step!
Hello any work around or solution for ethcontract-generator when there are overloaded functions in the abi file. Most of ERC721 abi now will have two safeTransferForm function, which will lead to compile error when using the ethcontract-generator. Currently I have to remove one of two functions to avoid the compile error.