uniffi-rs icon indicating copy to clipboard operation
uniffi-rs copied to clipboard

Params without a default appearing after params with a default will cause Python syntax errors

Open mhammond opened this issue 2 years ago • 9 comments

As seen in https://github.com/mozilla/application-services/issues/5738 - nothing in Uniffi prevents you from specifying a param without a default value after a param with one. In this scenario, the Python bindings will generate code causing:

SyntaxError: non-default argument follows default argument

mhammond avatar Jul 21 '23 20:07 mhammond

Would the correct solution here be to throw an error if that is detected in the UDL or to reorder the arguments behind the scenes or something else?

jeddai avatar Jul 21 '23 20:07 jeddai

I thought this was just related to functions, but it's also a problem for records with default values. Eg, consider this app-services record - Python generates a constructor for this:

class InsertableBookmark:

    def __init__(self, guid = DEFAULT, parent_guid, position, date_added = DEFAULT, last_modified = DEFAULT, url, title = DEFAULT):
        ...

which blows up in the same way.

mhammond avatar Nov 03 '23 19:11 mhammond

I think this would be solvable if we insisted on kwargs - eg, if the sig was __init__(self, **kwargs): and lots of boilerplate that unpacked the kwargs dict etc. This would however be a breaking change for Python consumers. How much appetite do we have for that kind of breaking change? CC @jeddai @badboy who are the only consumers of Python bindings I know...

(Edit to add: I think the current situation is a bit of a footgun. Eg, InsertableBookmark(a, b, c, d) isn't very readable, whereas InsertableBookmark(guid=a, parent_guid=b, position=c, date_added=d) seems like a significant improvement. While that's possible today IIUC, it's not mandatory, which is what using kwargs would cause)

mhammond avatar Nov 03 '23 20:11 mhammond

I think this would be solvable if we insisted on kwargs - eg, if the sig was __init__(self, **kwargs): and lots of boilerplate that unpacked the kwargs dict etc. This would however be a breaking change for Python consumers. How much appetite do we have for that kind of breaking change? CC @jeddai @badboy who are the only consumers of Python bindings I know...

The downside of that is that we lose the self-documentation: argument names act as documentation if properly chosen. Glean would probably not affected much, given our limited use of Glean Python, though we would need to still ensure it works going forward.

(Edit to add: I think the current situation is a bit of a footgun. Eg, InsertableBookmark(a, b, c, d) isn't very readable, whereas InsertableBookmark(guid=a, parent_guid=b, position=c, date_added=d) seems like a significant improvement. While that's possible today IIUC, it's not mandatory, which is what using kwargs would cause)

Aparently one can make named parameters mandatory:

def func(*, guid, parent_guid, position):
    pass

See https://docs.python.org/3.5/reference/compound_stmts.html#function-definitions (via https://stackoverflow.com/questions/2965271/how-can-we-force-naming-of-parameters-when-calling-a-function).

This also allows default values before non-defaulted arguments

def func(*, guid, parent_guid=123, position):
    pass

I'm not really sure about the way forward though. None of the available options are objectively the right thing all the time (and I dislike the kwargs solution the most). Maybe instead UniFFI should just error out in this case, as Python does?

>>> def func(a=1, b): pass
  File "<stdin>", line 1
    def func(a=1, b): pass
                  ^
SyntaxError: non-default argument follows default argument

Then it's up to the dev to define it correctly and they are still free to write a wrapper the way they want in pure Python.

badboy avatar Nov 07 '23 15:11 badboy

Thanks!

The downside of that is that we lose the self-documentation: argument names act as documentation if properly chosen.

Right - but I'm saying that's a downside of the current situation - it's currently fine to say InsertableBookmark(a, b, c) whereas IMO it's preferable to insist on being forced to specify the argument names. Reading InsertableBookmark(a, b, c) means you need to cross-reference the record definition to work out what a is, etc.

(Unless you mean self-documenting when reading the constructor implementation - I don't really think that would end up being a big deal - the names would still be there, just shifted. As you note though, the * syntax seems to have the best of both worlds)

(To be clear, we'd never just take kwargs and stick it in dict - we'd need to extract the named elements individually, so there's be ample opportunity to add comments and docs)

Aparently one can make named parameters mandatory:

Ah, I wasn't aware of that syntax, so yeah, that would be better than dealing with kwargs explicitly!

None of the available options are objectively the right thing all the time (and I dislike the kwargs solution the most).

Can you please explain what you would hate about that syntax. I think we'd end up with:

class InsertableBookmark:
    def __init__(self, *, guid = DEFAULT, parent_guid, position, date_added = DEFAULT, last_modified = DEFAULT, url, title = DEFAULT):
        ...

which seems quite perfect to me, and IMO is objectively the right thing. What does the right thing look like to you?

Maybe instead UniFFI should just error out in this case, as Python does?

>>> def func(a=1, b): pass
  File "<stdin>", line 1
    def func(a=1, b): pass
                  ^
SyntaxError: non-default argument follows default argument

My problem with this is that I don't think UniFFI should enforce this arbitrary restriction just because of Python. app-services has used a record like this for many years and no one has ever considered it "bad" - I think it's been seen as perfectly natural for record with many fields and works perfectly fine with Swift and Kotlin.

It also means it would be impossible to evolve the record to introduce a new default without also reordering the elements.

mhammond avatar Nov 07 '23 16:11 mhammond

(and to be clear, I'm talking here about just constructors for records and not for arbitrary functions. I think there's a case to be made to do this for functions too, but I think we should split the concerns out as it also seems reasonable to just do this for record constructors.

mhammond avatar Nov 07 '23 16:11 mhammond

(and to be clear, I'm talking here about just constructors for records and not for arbitrary functions. I think there's a case to be made to do this for functions too, but I think we should split the concerns out as it also seems reasonable to just do this for record constructors.

I think this was the missing piece for me. I thought mostly about functions, less so about constructors only. I'm ok with chosing this for constructors then. but see below.

Right - but I'm saying that's a downside of the current situation - it's currently fine to say InsertableBookmark(a, b, c) whereas IMO it's preferable to insist on being forced to specify the argument names. Reading InsertableBookmark(a, b, c) means you need to cross-reference the record definition to work out what a is, etc.

(Unless you mean self-documenting when reading the constructor implementation - I don't really think that would end up being a big deal - the names would still be there, just shifted. As you note though, the * syntax seems to have the best of both worlds)

I thought about the generated documentation for that piece of code. The docs will list the function name and its arguments. With the kwargs-approach all you see is the function name and **kwargs. Fields then need to be documented manually (which currently isn't even possible, the docs PRs hopefully would solve that). Of course the self, *, param1, ... solution side-steps that.

Can you please explain what you would hate about that syntax. I think we'd end up with:

class InsertableBookmark:
    def __init__(self, *, guid = DEFAULT, parent_guid, position, date_added = DEFAULT, last_modified = DEFAULT, url, title = DEFAULT):
        ...

which seems quite perfect to me, and IMO is objectively the right thing. What does the right thing look like to you?

It probably is with that many parameters. But is it when there's only two parameters x, y?

Maybe instead UniFFI should just error out in this case, as Python does?

>>> def func(a=1, b): pass
  File "<stdin>", line 1
    def func(a=1, b): pass
                  ^
SyntaxError: non-default argument follows default argument

My problem with this is that I don't think UniFFI should enforce this arbitrary restriction just because of Python. app-services has used a record like this for many years and no one has ever considered it "bad" - I think it's been seen as perfectly natural for record with many fields and works perfectly fine with Swift and Kotlin.

It also means it would be impossible to evolve the record to introduce a new default without also reordering the elements.

Instead UniFFI would enforce the pattern of requiring names. It's still an explicit (arbitrary) design decision UniFFI makes on how people need to structure their code. (Side-note: Got a link to that app-services record? I'm curious)

I'm leaning towards that this is a good pattern and that it's ok to enforce that for constructors, so I won't block if we're going forward with it

but I am curious to hear @jeddai's thought on this too

badboy avatar Nov 08 '23 10:11 badboy

(Side-note: Got a link to that app-services record? I'm curious)

The 3 structs here all do something similar. I only noticed this when I tried to create Python bindings for that component.

mhammond avatar Nov 08 '23 15:11 mhammond

After talking yesterday I went ahead today and at least wrote a fixture to capture this and implemented the *, args way. It's easy. Then I realized:

  1. We don't even set default values for Ruby :O
  2. I don't think there's a similar way in Ruby to do the equivalent of *, args.

badboy avatar Nov 10 '23 12:11 badboy