trurl
Please support zero-sized fragment and query
scheme://host/path/# and scheme://host/path/ are different, and so is scheme://host/path/?.
It would be nice if --get let you differentiate them and --set let you produce them.
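To make the three cases concrete, here is a minimal sketch in plain C (no libcurl) that classifies the query and fragment of a URL string as absent, empty, or non-empty. The names part_state, query_state and fragment_state are hypothetical and only for illustration; the splitting rule assumed here is the generic one from RFC 3986, where the fragment starts at the first '#' and the query starts at the first '?' before it.

```c
#include <stddef.h>
#include <string.h>

/* The three states a query or fragment can be in. */
enum part_state { PART_ABSENT, PART_EMPTY, PART_NONEMPTY };

/* Hypothetical helper: the fragment is everything after the first '#'. */
enum part_state fragment_state(const char *url)
{
    const char *hash = strchr(url, '#');
    if (!hash)
        return PART_ABSENT;
    return hash[1] ? PART_NONEMPTY : PART_EMPTY;
}

/* Hypothetical helper: the query runs from the first '?' (searched only
 * in the part before any '#') up to the '#' or the end of the string. */
enum part_state query_state(const char *url)
{
    size_t end = strcspn(url, "#");          /* offset where a fragment would start */
    const char *q = memchr(url, '?', end);   /* first '?' before that offset */
    if (!q)
        return PART_ABSENT;
    return (q + 1 == url + end) ? PART_EMPTY : PART_NONEMPTY;
}
```

With these helpers, scheme://host/path/? has an empty query and no fragment, while scheme://host/path/# has no query and an empty fragment, which is the distinction this issue asks trurl to preserve.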
I think this is a limitation of libcurl's urlapi.
From the manual for curl_url_get:
CURLUPART_QUERY
The initial question mark that denotes the beginning of the query part is a delimiter only. It is not part of the query contents.
A not-present query will lead part to be set to NULL. A zero-length query will lead part to be set to a zero-length string.
The query part will also get pluses converted to space when asked to URL decode on get with the CURLU_URLDECODE bit.
I was mainly referring to it normalising scheme://host/path/? as scheme://host/path/ even though it can distinguish no query from an empty query. But yeah, also note that it can only do that for queries, not fragments.
$ ./foo 'scheme://host/path/'
in: scheme://host/path/
out: scheme://host/path/
query: NULL
frag: NULL
$ ./foo 'scheme://host/path/?'
in: scheme://host/path/?
out: scheme://host/path/
query:
frag: NULL
$ ./foo 'scheme://host/path/#'
in: scheme://host/path/#
out: scheme://host/path/
query: NULL
frag: NULL
$ ./foo 'scheme://host/path/?#'
in: scheme://host/path/?#
out: scheme://host/path/
query:
frag: NULL
$ ./foo 'scheme://host/path/?#hello'
in: scheme://host/path/?#hello
out: scheme://host/path/#hello
query:
frag: hello
$ ./foo 'scheme://host/path/?hello#'
in: scheme://host/path/?hello#
out: scheme://host/path/?hello
query: hello
frag: NULL
libcurl always normalises empty query/fragment as no query/fragment; and it does not provide a way to distinguish empty fragment from no fragment.
#include <stdio.h>
#include <curl/curl.h>

int main(int const argc, char const *const argv[])
{
    if (argc != 2)
        return 1;

    CURLU *const uh = curl_url();
    if (!uh)
        return 1;

    /* Parse the URL given on the command line. */
    if (curl_url_set(uh, CURLUPART_URL, argv[1],
                     CURLU_NON_SUPPORT_SCHEME|
                     CURLU_GUESS_SCHEME|
                     CURLU_URLENCODE) != CURLUE_OK) {
        curl_url_cleanup(uh);
        return 1;
    }

    /* Start from NULL so the NULL checks below are safe even if a get fails. */
    char *url = NULL;
    curl_url_get(uh, CURLUPART_URL, &url, CURLU_DEFAULT_PORT);
    char *query = NULL;
    curl_url_get(uh, CURLUPART_QUERY, &query, CURLU_DEFAULT_PORT);
    char *frag = NULL;
    curl_url_get(uh, CURLUPART_FRAGMENT, &frag, CURLU_DEFAULT_PORT);

    /* Print "NULL" for a part that is not present. */
    printf("in:\t%s\n"
           "out:\t%s\n"
           "query:\t%s\n"
           "frag:\t%s\n",
           argv[1], url ? url : "NULL",
           query ? query : "NULL", frag ? frag : "NULL");

    curl_free(url);
    curl_free(query);
    curl_free(frag);
    curl_url_cleanup(uh);
    return 0;
}
Yes, I was hoping that it could be reflected in trurl as well. (and curl
itself is in the good bucket for that :))
Thank you for providing also the full demo code :)
See https://github.com/curl/curl/pull/13396
With libcurl supporting empty queries and fragments now, how do you think we should enable this in trurl?
It would probably be useful to have --unset to clean up query and frag, to make trurl scheme://host/path --set query="" return scheme://host/path?, and to have --get {component} return nothing if the component is not present and an empty line if it is present but empty.
But those would be breaking changes.
> But those would be breaking changes.
I think we are free to do breaking changes if we want, at least before an official version one. I think the bigger problem is that they would work differently depending on what the underlying libcurl in use supports...
Another (clunky) solution may be a new flag, --allow-empty, for --get and --set?
Or something clever with the modifiers in the --get brackets, something like --get "{empty:query} {empty:fragment}"? I'm not sure how this would work with setting empty fields though.
> I'm not sure how this would work with setting empty fields though.
I figure we might need to do some other syntax extension/change for that. Maybe:
- Without assign: --set fragment
- Or with another character instead of assign, like a colon: --set fragment:
Of course, this would only work for query and fragment. Maybe path?
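For illustration, here is a rough sketch of how the two proposed spellings could be told apart from the existing assignment forms when parsing a --set argument. It assumes the existing forms are name=value and name:=value; the bare and trailing-colon forms are only the proposals from this comment, and the names set_form, set_arg and parse_set_arg are hypothetical, not trurl's actual code.

```c
#include <stddef.h>
#include <string.h>

enum set_form {
    SET_ASSIGN,   /* name=value or name:=value (existing forms, as assumed here) */
    SET_BARE,     /* name   (proposed: set the component to empty) */
    SET_COLON,    /* name:  (proposed alternative spelling) */
    SET_INVALID
};

struct set_arg {
    enum set_form form;
    size_t name_len;     /* length of the component name at the start of arg */
    const char *value;   /* points into arg; NULL unless form is SET_ASSIGN */
};

/* Hypothetical parser for a single --set argument. */
struct set_arg parse_set_arg(const char *arg)
{
    struct set_arg r = { SET_INVALID, 0, NULL };
    size_t n = strcspn(arg, ":=");     /* the name ends at the first ':' or '=' */

    if (n == 0)
        return r;                      /* no component name at all */
    r.name_len = n;

    if (arg[n] == '\0') {
        r.form = SET_BARE;             /* "fragment" */
    } else if (arg[n] == '=') {
        r.form = SET_ASSIGN;           /* "fragment=value" */
        r.value = arg + n + 1;
    } else if (arg[n + 1] == '=') {
        r.form = SET_ASSIGN;           /* "fragment:=value" */
        r.value = arg + n + 2;
    } else if (arg[n + 1] == '\0') {
        r.form = SET_COLON;            /* "fragment:" */
    }
    return r;                          /* anything else stays SET_INVALID */
}
```

A caller could then treat SET_BARE or SET_COLON as "set to the empty string" for query and fragment only, and reject it for other components.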
There is also the shell problem: how would a script differentiate between a blank query and a non-existing one?
trurl example.com/? -g a{query}a
vs
trurl example.com/ -g a{query}a
What is the expected output for a zero length query vs a non-existing one?
I guess the latter has to report an error somehow, maybe by adding a {fail:component} modifier so both behaviors are supported?
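A small sketch of the two behaviors, using the a{query}a template from the example above. It is not trurl's implementation: render_wrapped is a hypothetical function, and its must_exist flag merely stands in for the proposed {fail:component} modifier. A NULL value represents a component that does not exist, and "" represents one that exists but is empty.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: render "a{component}a" into out.
 * Returns 0 on success, or -1 if the component is missing while
 * must_exist is set, or if the result does not fit in out. */
int render_wrapped(char *out, size_t outlen, const char *value, int must_exist)
{
    if (!value) {
        if (must_exist)
            return -1;      /* strict mode: a missing component is an error */
        value = "";         /* lenient mode: missing renders like empty */
    }
    int n = snprintf(out, outlen, "a%sa", value);
    if (n < 0 || (size_t)n >= outlen)
        return -1;          /* encoding error or truncated output */
    return 0;
}
```

In lenient mode both example.com/? and example.com/ would render as aa, so a script cannot tell them apart; in strict mode the second one fails, which gives the script an exit status to test.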
#336 at least partly satisfies this.
@lu-zero does this satisfy your use case or is there anything more you want/need to differentiate empty/missing components for?
I think it is enough, thank you :)