prql
prql copied to clipboard
filter based on a list of values
Hi team,
Is the SQL IN method currently supported in PRQL? I didn't manage to find a single example in the book.
WHERE column_name IN (value1, value2, ...)
This would be great.
We probably have to add it into the compiler (can't be done in stdlib); either specific to the in function, or to compile a PRQL list into a SQL list in arbitrary places.
I think either would be fine to start, but maybe @aljazerzen or others have examples where e.g. we'd want to treat a list differently depending on the context.
This might be a reasonable 2nd PR, or a 1st PR for someone familiar with rust.
Thanks for this @max-sixty. I'm using the s string as a workaround for now :). I'm not quite familiar with rust to write a PR, but would definitely like to use PRQL more often to provide useful feedbacks. Really appreciate your effort in this amazing project
From #2170
Currently there's almost a clear split:
Symbols, such as
+,*, are all infix — i.e. they go between operands,5 + 8,4 * 3
- Some are prefix, such as
!or-when they're unaryFunctions are all prefix — i.e. they go before operands — i.e.
sum foo,in 5..10 bar
- And then the pipe symbol lets them go after operands — i.e.
foo | sum,bar | in 5..10, which is how we normally use them...except for
and&or, which are infix.
I am wondering if this perspective affects the in operator. (i.e., I have a feeling that the in operator cannot be symbolized)
In R, the in operator is %in%, is there a possibility of something like that?
If it's gonna be named in, this should probably be a function as you note. Which means it would have syntax like this:
from a
filter (x | in [1, 2, 3])
If we come up with a symbol for it to be a binary operator, syntax would be:
from a
filter x % [1, 2, 3]
I really wouldn't like to go with the third option of having %in% that you proposed - it would be hard to parse and I don't find it more readable than the pipeline syntax.
Thanks, in function x | in [1, 2, 3] looks good!
Oh, I didn't realize that in function already exists for another use. (#1330)
from employees
filter (age | in ..50)
Perhaps the current in function may need to be renamed?
Perhaps the current
infunction may need to be renamed?
I think ideally we should support both usages, with the same name...
I also agree with supporting both uses of in with the same name and syntax like x | in [1, 2, 3].
Considering #2612, it would be filter (x | in {1, 2, 3})?
Or, it seems [1, 2, 3] is an array, so filter (x | in [1, 2, 3])?
We want an array (all entries are the same type, no names) which would now be filter (x | in [1, 2, 3]).
We could also support filter (x | in {true, 2, 'hello'}), but that's much harder to implement.