Named Unions as function parameters
What's up?
From the discussion on #4499, we propose allowing functions to take parameters that are typed by named unions (aka enums, #4038).
The core syntax would look like this:
type InvoiceStatus = (
Paid = 0 ||
Unpaid = 1 ||
Canceled = 2 ||
)
let filter_status = func status <InvoiceStatus> tbl <relation> -> (
filter (this.status == status) tbl
)
from invoices
filter_status InvoiceStatus.Paid
If the type of the function parameter can be resolved, it should be possible to drop the explicit reference to the type name when specifying the parameter value.
from invoices
filter_status Paid
This would be handled by prepending the parameter type to the name resolution order, so InvoiceStatus would be searched first, followed then by the remaining namespace scopes.
The same would also apply for named parameters:
let filter_status = func status <InvoiceStatus>:Paid tbl <relation> -> (
filter (this.status == status) tbl
)
from invoices
filter_status # defaults to InvoiceStatus.Paid
# or
from invoices
filter_status status:Unpaid
# or, explicitly
from invoices
filter_status status:InvoiceStatus.Unpaid
With this, the join transformation in the standard library would have the updated definition:
type join_side = (
inner = "inner" ||
outer = "outer" ||
left = "left" ||
right = "right" ||
)
let join = func
`default_db.with` <relation>
condition <bool>
side <join_side>:inner
tbl <relation>
-> <relation> internal join
A similar update would also be made for from_text format:...
As an aside, it would also be worth considering a similar feature of named union namespace resolution for typed relations, when those are implemented. I could see something like this working very well:
from invoices (customer_id = text || amount = float || status = InvoiceStatus)
filter status == Paid
# or, from #2930
filter (match this.status {
Paid => true,
Unpaid { amount } => amount < 10.0,
_ => false,
})
I think the syntax in other languages such as Rust, Python and C# is much nicer.
In Rust it would be so much cleaner:
enum InvoiceStatus {
Paid,
Unpaid,
Canceled,
}
I think this would be a very nice addition to the language!
--
Do we need the mapping to ints / strings?
Or could we just have the variants like type InvoiceStatus = [Paid, Unpaid, Canceled]?
(possibly the syntax would need to be slightly different — possibly Paid etc doesn't refer to anything despite being on the right-hand side, but conceptually...)
I found the discussion of named unions in #2930 compelling, not to mention that the parsing of that syntax is already done (#4038). I would guess that we want to have just one syntax for named unions / enums, not that the existing one is perfect, but it might have certain advantages when parsing... I'm not sure, though, and I don't feel inspired to dive into the parser 😄
As for the necessity of mapping named union members to values, I think it's probably better to make it explicitly defined by users rather than implicit. I don't think there's a single convention that users would expect -- I can imagine that one user would expect [Paid, Unpaid, Canceled] to map to [0, 1, 2] while another would expect string mapping to ["Paid", "Unpaid", "Canceled"].
Requiring explicit definitions would also help a lot if named unions (as enums) end up getting cast to other data types. This casting is probably going to be very common, especially if named unions are used for typing relations, where the underlying database types are disconnected from the types defined in the PRQL context.
Requiring explicit definitions would also help a lot if named unions (as enums) end up getting cast to other data types. This casting is probably going to be very common, especially if named unions are used for typing relations, where the underlying database types are disconnected from the types defined in the PRQL context.
Yes, if we cast them to other types then I totally agree. type InvoiceStatus is indeed going to get cast, since there's no Enum type in the underlying data. Whereas type JoinDirection is only used in PRQL — but maybe there aren't many of these...
This is a nice feature overview of exactly how I think PRQL should approach sum types!
We will need to invest some time into how this will be implemented in SQL and unfortunately I won't we able to help here for some time, as I have other priorities right now.