edgedb-cli
edgedb-cli copied to clipboard
In repl implement variables of non-string types
Currenly, only strings are accepted
(hint: you can cast strings to other types in the edgeql itself)
(hint: you can cast strings to other types in the edgeql itself)
Maybe this should be the preferred way of how we implement the feature -- add a compiler flag to accept all arguments as text, basically implicitly compiling SELECT <int32>$0
as SELECT <int32><str>$0
We already have json_parameters
, wouldn't that work?
Another question: how do we input strings? Currently there's no requirement for quotes, e.g.
Parameter <str>$foo> blah blah
translates into "blah blah"
string.
This is problematic as it's unclear how to enter an array of strings.
IMO, while it's slightly inconvenient, we should require the user to enter EdgeQL literals, so string/json/datetime quotes should be required.
IMO, they should be proper literals.
IMO, they should be proper literals.
Yep. And then we can syntax highlight the input.
That sounds like over-engineering: the primary use case for the feature is to quickly past query from the code an fill in the gaps. It's unlikely that there would be any complex texts that can't be typed directly. If they are in 1% of cases, you can just edit the original query and put complex constant there.
That said, we can do something like this, if you think this is a crucial feature:
Parameter <str>$var (use Ctrl+P for quoted string):
Maybe this should be the preferred way of how we implement the feature -- add a compiler flag to accept all arguments as text
As I've said in the zoom, we can have much shorter feedback loop if we parse/validate things directly in repl: we can highlight invalid value as you type, and show error and repeat input right after return key. In case you have a query with more than one parameter, it could be a huge time saver.
That sounds like over-engineering: the primary use case for the feature is to quickly past query from the code an fill in the gaps.
Yes, but if the query has an <array<str>>$param
it becomes unclear how to enter it. The uncertainty goes away if we stick with EdgeQL literals syntax. It's not over-engineering, it's consistent design.
Parameter
$var (use Ctrl+P for quoted string):
Not sure I understand the idea. Do you want to add quotes/escapes once Ctrl+P is pressed? I'm not sure we need this unless a lot of users request that.
As I've said in the zoom, we can have much shorter feedback loop if we parse/validate things directly in repl: we can highlight invalid value as you type, and show error and repeat input right after return key. In case you have a query with more than one parameter, it could be a huge time saver.
Yeah, I hear you loud and clear and I agree. There are upsides and some downsides, mainly:
-
The implementation is way more complex than simply letting EdgeDB server / Postgres to do the encoding. While this leads to potentially better UX it adds extra code for what is a relatively minor feature.
-
(and this it the main one) When we have extensions API we will allow extensions to implement their own scalars. This is only a question of "when" because we need to add support for GIS. Extensions will be required to provide JSON/str encoders and decoders for their added scalar types. When this happens, it would be painful if at all possible to implement support for those custom scalars in our REPL.
IMO https://github.com/edgedb/edgedb-cli/issues/15#issuecomment-631242953 is the way to go.
Yes, but if the query has an <array
>$param it becomes unclear how to enter it. The uncertainty goes away if we stick with EdgeQL literals syntax. It's not over-engineering, it's consistent design.
And also convenience goes away with it.
I would go with line per item. With Ctrl+D or alike to end the list.
Parameter $var (use Ctrl+P for quoted string):
Not sure I understand the idea. Do you want to add quotes/escapes once Ctrl+P is pressed? I'm not sure we need this unless a lot of users request that.
Yes, I mean "adding quotes and switch to quoted string version", so you can use escape codes if you need to. And yes, I think this features is not needed for 99% use cases, and is rare enough that you can edit you query if you need escape codes.
The implementation is way more complex than simply letting EdgeDB server / Postgres to do the encoding. While this leads to potentially better UX it adds extra code for what is a relatively minor feature.
The whole point of having REPL is UX. We could go with python -c 'import edgedb; edgedb.connect()
otherwise.
So making tool harder to use in 99% use cases is counter-intuitive. You can insert values into placeholders yourself if you have enough time. Editing is even better because you don't have to go up in the history for each variable after you made a typo. No quotes, and immediate feedback means less typos, and means you can test queries from the code right away.
When we have extensions API we will allow extensions to implement their own scalars. This is only a question of "when" because we need to add support for GIS. Extensions will be required to provide JSON/str encoders and decoders for their added scalar types. When this happens, it would be painful if at all possible to implement support for those custom scalars in our REPL.
How do we format those types? How do we support those types in language bindings?
And also convenience goes away with it.
Not really, I'm fine with typing proper string literals in REPLs. In EdgeDB repl, in Python repl, etc. I really don't understand why entering specifically string arguments should be any different.
I would go with line per item. With Ctrl+D or alike to end the list.
What if we add two dimensional arrays in EdgeDB 2.0?
The whole point of having REPL is UX. We could go with python -c 'import edgedb; edgedb.connect() otherwise.
UX also benefits from consistency. If I use proper string literals in my queries why would I type them differently when they are arguments? I bet some people would input them as literals and then would be confused why their input is double quoted. Do I need to escape some symbols in this special mode? What if I want to type a regexp? Is my input going to be classified as a raw string or not?
So IMO what you're suggesting is actually bad UX.
No quotes, and immediate feedback means less typos, and means you can test queries from the code right away.
You can provide immediate feedback and highlight syntax with quotes too.
How do we format those types? How do we support those types in language bindings?
We'll likely need to add another option to the protocol specifically for our REPL to render non-native scalars as strings and accept them as strings. Working with them in language binding would require a native extension to that binding.
"Special cases aren't special enough to break the rules." The more we are discussing this the more I believe that having a special input mode for string literals for query arguments would be a mistake.
Personally, I don't find having to type (single) quotes around a literal inconvenient or even noticeable. Especially so, if the editor adds the closing quote automatically like modern IDEs do.
In general, I agree with @1st1 and @elprans. I'm especially worried about escaping characters in strings as muscle memory will be at odds with what the eyes are seeing in the input prompt. One obvious use of REPL is to test some queries before putting them into code. Testing often implies trying at least one or two "tricky" inputs. So stuff like 'I\'m "happy"'
and the like. Copy-pasting these values from code will also be annoyingly messy if the syntax is not consistent with actual literals.
And regexps become a nightmare if escaping is ambiguous (but @1st1 already mentioned regexp).
No quotes, and immediate feedback means less typos, and means you can test queries from the code right away.
You can provide immediate feedback and highlight syntax with quotes too.
No, if the only thing that edgedb tells me is "it's a json value" (I can't say its int or a date or a string).
What if we add two dimensional arrays in EdgeDB 2.0?
There are certainly things that are specific for a type. For date/time types we should have something like Ctrl+N
for now. For some types, e.g. for bytes, we might always require escaped input.
So we can either make the edgeql-style default. Or JSON-style input, or whatever we decide for that specific type.
Testing often implies trying at least one or two "tricky" inputs. So stuff like 'I'm "happy"' and the like.
I don't understand a use case. 90% times there is an ID and all I do is putting different ids, and other 9% is probably some filter which is perhaps enum. What do you test? That variables properly work or that postgres is able to store a quote in a string?
And regexps become a nightmare if escaping is ambiguous (but @1st1 already mentioned regexp).
Well, there aren't enough use cases for regexps in DB. It's probably going to be used often in constraints. I don't think parametrizing DDL is a common use case.
Even when I'm using regexps in queries (which is very very rare, because there are no indexes for regexps), I don't parametrize them (giving the chance for DB to have compiled version of it cached, I'm not sure if that actually works, though). While I admit, there are still reasons of why it can be parameterized by chance...
Anyway, being "raw by default" is easier for regexps not vice versa. So I don't understand the argument.
Two more things:
- If we go with "json_parameters" we:
- We don't have type info about parameter (so can't provide a shortcut for
datetime_current()
for example) - The input syntax is JSON not EdgeQL. E.g.
1_000
will not be supported when we support this in EdgeQL. - We can't switch syntax by some Ctrl + something shortcut
- We don't have type info about parameter (so can't provide a shortcut for
- What if we have a type that has quite bad representation in JSON? E.g. if we represent
cal::relativedelta
as a{"days": 12, "seconds": 120}
(which looks quite good for JSON use-cases for me), we can't support nicer12 days 2 minutes
syntax that is a way more human-writable.
At the end of the day:
I'm +1 okay with being able to use EdgeQL syntax for constants
I'm -1 to have it default for strings and string-like types like dates (anything having to_xxx(s: str)
can accept that format, perhaps everything else might default to EdgeQL syntax)
I'm -1 on hiding types from repl behind json_parameters
And I'm -0 on supporting JSON syntax for inputting variable at all (even if not default)
If we go with "json_parameters" we: We don't have type info about parameter (so can't provide a shortcut for datetime_current() for example)
We can have type info about parameter. We first Parse
the query once normally -- we now have a descriptor for query input. All types are known. We then Parse
the query again with json_paramaters
. Now values can be submitted as JSON but you can do some additional validation/syntax highlight.
Alternatively we can possibly even extend our typedesc (it supports type annotations) so that only one Parse
is needed, but I'd keep it simple.
The input syntax is JSON not EdgeQL. E.g. 1_000 will not be supported when we support this in EdgeQL.
That's true and a bit unfortunate. But using JSON is our only safe bet to support future EdgeDB third-party extensions (including our own GIS).
We can't switch syntax by some Ctrl + something shortcut
If you follow the approach with parsing the query two times you can still implement that.
What if we have a type that has quite bad representation in JSON? E.g. if we represent cal::relativedelta as a {"days": 12, "seconds": 120} (which looks quite good for JSON use-cases for me), we can't support nicer 12 days 2 minutes syntax that is a way more human-writable.
Well, again, given that we use JSON as encoding protocol and if we parse the query 2 times we can have alternative syntax mode too, it would be parsed and encoded in JSON for some types, e.g.:
db> SELECT <cal::relativedelta>$foo;
Please provide values for query parameters in JSON.
Press Crtl+P for alternative input methods.
Parameter <cal::relativedelta>$foo (as JSON) > {"days": 12, "seconds": 120}
or if the use presses Ctrl+P:
db> SELECT <cal::relativedelta>$foo;
Please provide values for query parameters as prompted.
Press Crtl+P for alternative input methods.
Parameter <cal::relativedelta>$foo (as DAYS/SECONDS) > 12/120
But I'm not sure this all is needed -- this feature is very niche. We want the feature to be extendable in the future, but it's not like we have to invest significant resources to make the best possible UX for it.
What if we have a type that has quite bad representation in JSON?
We have a rule that every scalar data type must have a valid text representation and hence can be transmitted as a JSON string. This applies to all current and future data types, including those from third party extensions. If we choose to parse input on the client side we will have to force users to make their arguments <str>
, which somewhat defeats the point of being able to quickly paste and run a query unmodified.
Having thought about this more and Yuy's idea of doing Parse
twice we can actually try and implement a hybrid solution: the CLI can handle all known types as EdgeQL constants (including fancy input like 1_000_000), require text input for unsupported types, and run the query with json_parameters
if any text inputs were done. This requires the ability to render known types into text on the part of the CLI.
Ultimately, I agree with Yury. Although this functionality us a nice quality of life improvement, it's not mission-cirtical. Let's just make sure we only accept valid EdgeQL literals as input (i.e. no hacks allowing omission of quotes etc), and revisit this when we have the opportunity.
We can have type info about parameter. We first
Parse
the query once normally -- we now have a descriptor for query input. All types are known. We thenParse
the query again withjson_paramaters
. Now values can be submitted as JSON but you can do some additional validation/syntax highlight.
That sounds like more work, not less. I.e. with current approach we need text -> native type
converter (we need native type -> binary
for protocol impl anyway). Now with your approach we need text -> native type -> json
conversion.
If we're doing more work, I'd say we may have a partial json_parameters mode that only wraps unknown types (either explicitly selectively or implicitly all unknown types). Not sure how hard is that in edgedb. But technically we can do that in client as we already have query rewrite mechanism in Rust (and we know that query is syntactically correct by first parse).
The good thing about the latter approach is that we can skip it until we implement actual extensions (yes the danger is that extension types will not be supported as the input in old version of CLI, but that's acceptable thing in my opinion).
And yes, I think we should implement packers/unpackers for all our own extensions anyway. Since this is an order of magnitude easier comparing to all other stuff in extension, like figuring out semantics, indexes, an so forth.
I.e. we actually spent more time on continuation prompt (and going to spent more, as second PR is not yet landed), which adds miserable UX difference compared to this.
I'd say we may have a partial json_parameters mode that only wraps unknown types (either explicitly selectively or implicitly all unknown types). Not sure how hard is that in edgedb.
-1 on implementing "partial json mode" in the protocol. Again, I stand by my suggestion in https://github.com/edgedb/edgedb-cli/issues/15#issuecomment-633766403 to use 2 prepares.
So basically here are three options:
- Accept EdgeDB literals
- Accept JSON literals
- Have custom input for each type
Basically (1), (3) both require custom input for every type so are generally have the same amount of work.
With (2) There might be more work on conversions text -> native type -> json
, but that only for each type we want to have nicer editing support (i.e. validation, insert "now" for date, or underscores in numbers), so I'm not sure what the overall difference is.
So what I think, sooner or later we want the following:
- Native literals, but not default for simple types like numbers and strings
- Copy-pasteable JSON into variable (so extra quoting is harder)
- Validation and hinting
- In studio we also want datepicker for dates and copy-pasteable strings, so converting for JSON will not help studio.
And yes, for unknown types two Prepare
approach + rewriting query (<custom_type>$x
becomes <custom_type>(<json>$x)
) doesn't sound too bad for me.
Afterthought: for implementing native literals we probably can also use rewriting query approach (and rely on constant extraction for optimization), but I'm not sure this is a great idea.
Native literals, but not default for simple types like numbers and strings
I still think that strings should require explicit quotes, e.g.
$foo > 'aaa'
Basically (1), (3) both require custom input for every type so are generally have the same amount of work.
Not for every type, but the most common ones. Although I think this is such a minor feature and it's an open question what percentage of users will ever even discover it. I wouldn't even mind if we explicitly prompted users to enter JSON literals and be done with this.
Native literals, but not default for simple types like numbers and strings
I still think that strings should require explicit quotes, e.g.
Like I've already said:
- This makes it harder to paste something (i.e. should I open single or double quote before pasting? Escaping?)
- This requires client-side parser anyway (see below)
Basically (1), (3) both require custom input for every type so are generally have the same amount of work.
Not for every type, but the most common ones. Although I think this is such a minor feature and it's an open question what percentage of users will ever even discover it. I wouldn't even mind if we explicitly prompted users to enter JSON literals and be done with this.
How do you expect this to work if only "most common ones"? E.g. while datetime literal is just '2019-07-05Т01:01:01'
, the actual thing we must send is milliseconds since epoch. So there should be not just parser of the string literal but the parser of the actual date, regardless of whether we ask user to include quotes or not.
So basically here are three options: Accept EdgeDB literals
I think this is the best and most consistent approach. I don't think it would be too much trouble for the CLI (and other frontends) to render numerics and booleans into JSON and assume that everything else is a string literal representing a valid type rendering (no interpretation necessary).
Oh, and since we know the type of the argument, we don't have to force users to enter quotes around the value, although it's unclear what to do if they do enter quotes, i.e do we double-quote them or assume the value as-is?
unclear what to do if they do enter quotes, i.e do we double-quote them or assume the value as-is?
This is why I don't like the idea of "simplifying" the string literal input. IMO we should require quotes always, by default. We can expose a REPL config setting to disable it though.
This is why I don't like the idea of "simplifying" the string literal input.
Well, the motivation for this feature is user convenience, and not having to type quotes is a nice QOL improvement. I think we can detect if the passed value is already quoted (starts with '
or "
) and not quote it. If you really need to send '"foo"'
(nested quotes) you can still do it by typing all those quotes literally.
Think about it this way: you don't enter string values into forms in quotes, and, well, this prompt is sort-of a form (and will be a form in GUI).