
Make URL.include_query_params() support multiple query string parameters with the same name

Open lexabug opened this issue 3 years ago • 11 comments

The URL.include_query_params() overwrites query string parameters with the same name. For example:

from starlette.datastructures import URL, MultiDict

url = URL('my_test_url_example')
query_params = MultiDict([('my_id', '143155'), ('language', 'en'), ('list_p[]', 'item1'), ('list_p[]', 'original_name'), ('list_p[]', 'item_3')])
new_url = url.include_query_params(**query_params)

So the new_url will be:

URL('my_test_url_example?my_id=143155&language=en&list_p%5B%5D=item_3')

This is incorrect. The URL specification allows multiple query string parameters with the same name, so a backend that processes the query string must treat such parameters as arrays/lists.

This change adds a new optional argument to URL.include_query_params() that is handled as a container of query string parameters; those parameters are appended to the original query string.
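To illustrate why the duplicates disappear, here is a minimal sketch that uses only the standard library rather than Starlette itself. It assumes that unpacking a multi-valued mapping with ** behaves like a plain dict (one value per key, last one wins), which matches the output shown above; the shortened pair list is made up for the example.

```python
from urllib.parse import urlencode

# A shortened version of the parameters from the report above.
pairs = [
    ("my_id", "143155"),
    ("language", "en"),
    ("list_p[]", "item1"),
    ("list_p[]", "item_3"),
]

# Keyword arguments can carry only one value per name, so converting
# the pairs to a dict models what **query_params passes along.
collapsed = urlencode(dict(pairs))

# Encoding the list of pairs directly keeps every occurrence.
preserved = urlencode(pairs)

print(collapsed)  # my_id=143155&language=en&list_p%5B%5D=item_3
print(preserved)  # my_id=143155&language=en&list_p%5B%5D=item1&list_p%5B%5D=item_3
```

The second form is what an appending argument would need to produce.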

lexabug avatar Jul 06 '22 08:07 lexabug

I've applied @adriangb's suggestions. Take a look.

lexabug avatar Jul 07 '22 11:07 lexabug

I think it would make sense to include an explode flag for serialization, so with explode=False we would get ?id=1,2,3 instead of ?id=1&id=2&id=3.

aminalaee avatar Jul 08 '22 07:07 aminalaee

I think it would make sense to include an explode flag for serialization, so with explode=False we would get ?id=1,2,3 instead of ?id=1&id=2&id=3.

This is also achievable by joining the items outside the include_query_params() call.
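A rough sketch of that joining step, shown with the standard library's urlencode so it runs without Starlette. The ids list is made up for the example, and the safe="," argument is only there to keep the comma readable; without it the comma is percent-encoded, which decodes to the same value.

```python
from urllib.parse import urlencode

ids = ["1", "2", "3"]  # hypothetical values

# Join first, then pass a single value under a single name.
joined = ",".join(ids)

readable = urlencode({"id": joined}, safe=",")
escaped = urlencode({"id": joined})

print(readable)  # id=1,2,3
print(escaped)   # id=1%2C2%2C3
```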

lexabug avatar Jul 12 '22 07:07 lexabug

@lexabug I would like to take a step back and try to clarify the original use case. It sounds like you are using URL as a standalone structure to parse/manipulate URLs. Is that correct?

adriangb avatar Jul 12 '22 07:07 adriangb

@lexabug I would like to take a step back and try to clarify the original use case. It sounds like you are using URL as a standalone structure to parse/manipulate URLs. Is that correct?

The original use case that led me to this suggestion is this: I have a web service that works like an API gateway. It receives requests, processes them with different middlewares, and then sends them to a target service. Sometimes the original URLs contain query string parameters representing arrays/lists, like user_ids[]=111&user_ids[]=222&user_ids[]=333. In such cases my API gateway service failed to proxy the request correctly, because include_query_params() couldn't accept multiple args with the same name. I hope that is clear.
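For reference, this is roughly the round trip the gateway needs, sketched with the standard library only. It is not the gateway's actual code; the variable names are made up for the example.

```python
from urllib.parse import parse_qsl, urlencode

incoming = "user_ids[]=111&user_ids[]=222&user_ids[]=333"

# parse_qsl returns a list of (name, value) pairs, so repeated names survive.
pairs = parse_qsl(incoming, keep_blank_values=True)

# Re-encoding the pairs keeps all three values; the brackets are
# percent-encoded but decode back to the same names.
forwarded = urlencode(pairs)

print(pairs)
print(forwarded)
```

Parsing forwarded again yields the same three pairs, which is the property that gets lost when the parameters go through keyword arguments.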

lexabug avatar Jul 12 '22 09:07 lexabug

Could you use an external (i.e. non-Starlette) library to do the URL parsing/building, and then hand the result off to Starlette? URL parsing is complicated and full of dragons. I would be a bit concerned that this sort of change would set a precedent for Starlette providing this functionality (currently it only really provides the minimum required for other functionality in Starlette to work).

adriangb avatar Jul 12 '22 17:07 adriangb

The proposed solution has some drawbacks:

  1. adding a new argument breaks the current API contract (people are not used to *args but know **kwargs, which is very common across frameworks)
  2. what would be the result if both __items and **kwargs are passed?
  3. it becomes cumbersome to use include_query_params and replace_query_params in jinja templates because we don't have the flexibility to build a MultiDict in the template (you can do that, but the template code quickly becomes unreadable). Instead, users would have to write custom jinja plugins or prepare the value somewhere else.

I would like to propose to explore an alternative, where include_query_params and replace_query_params can see iterables (sets, lists, tuples) in kwargs as multi params. This is less invasive and should not break anything:

url.include_query_params(page=1, search='my query', tags=['tag1', 'tag2', 'tag3'])
# ?page=1&search=my%20query&tags=tag1&tags=tag2&tags=tag3
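A minimal sketch of how that expansion could behave, using urlencode(doseq=True) from the standard library. build_query is a hypothetical helper, not an existing Starlette function, and quote_via=quote is used only so that spaces come out as %20 like in the comment above.

```python
from urllib.parse import quote, urlencode


def build_query(**kwargs):
    """Hypothetical helper: sequence values become repeated parameters."""
    # doseq=True emits one name=value pair per element of a list or tuple;
    # scalars such as ints are converted with str().
    return urlencode(kwargs, doseq=True, quote_via=quote)


query = build_query(page=1, search="my query", tags=["tag1", "tag2", "tag3"])
print(query)  # page=1&search=my%20query&tags=tag1&tags=tag2&tags=tag3
```

Sets would also be expanded, but their iteration order is not guaranteed, so lists or tuples are the safer choice when parameter order matters.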

Also, keys in kwargs cannot contain special characters like brackets, which makes tags[] impossible to use. This leads to another idea: accept a dict.

url.include_query_params({
    'page': 1,
    'search': 'my query',
    'tags[]': ['tag1', 'tag2', 'tag3'],
})
# ?page=1&search=my%20query&tags[]=tag1&tags[]=tag2&tags[]=tag3

url.replace_query_params({
    'page': 2,
})
# ?page=2&search=my%20query&tags[]=tag1&tags[]=tag2&tags[]=tag3

This is the most flexible solution I know of, but it is also the most complicated and definitely a breaking change. It may make sense to introduce a third method, update_query_params, to do the same thing.


Another point worth mentioning is that no common naming convention for multi-value params exists. Some frameworks expect tags[]=tag1&tags[]=tag2, some prefer tags=tag1&tags=tag2, others use tags=tag1,tag2,tag3. Starlette would need to choose one.
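To make the difference concrete, here is a small sketch that serializes the same values in each of the three styles. serialize and the style names are made up for this example; the safe argument only keeps brackets and commas unescaped for readability.

```python
from urllib.parse import urlencode


def serialize(name, values, style):
    """Hypothetical helper showing three multi-value conventions."""
    if style == "brackets":
        return urlencode([(f"{name}[]", v) for v in values], safe="[]")
    if style == "repeat":
        return urlencode([(name, v) for v in values])
    if style == "comma":
        return urlencode({name: ",".join(values)}, safe=",")
    raise ValueError(f"unknown style: {style}")


tags = ["tag1", "tag2", "tag3"]
print(serialize("tags", tags, "brackets"))  # tags[]=tag1&tags[]=tag2&tags[]=tag3
print(serialize("tags", tags, "repeat"))    # tags=tag1&tags=tag2&tags=tag3
print(serialize("tags", tags, "comma"))     # tags=tag1,tag2,tag3
```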

alex-oleshkevich avatar Aug 16 '22 21:08 alex-oleshkevich

@adriangb while it is achievable with extra code, I am sure that URL manipulation is one of the basic features of a web framework and should be in Starlette's core. We already have a URL class, which is incomplete in this sense.

alex-oleshkevich avatar Aug 16 '22 21:08 alex-oleshkevich

Do Django and/or Flask have in-depth URL manipulation utilities?

adriangb avatar Aug 16 '22 22:08 adriangb

I don't know. What I wanted to say is that if Starlette provides a tool to manipulate URLs it should be complete. Lists in query parameters are a pretty common thing.

alex-oleshkevich avatar Aug 17 '22 09:08 alex-oleshkevich

Given all the discussion and back-and-forth about include_query_params, I would suggest that we leave include_query_params() alone and add an append_query_params method that takes (e.g.) an Iterable[Tuple[str, str]] as an argument. (I am pondering whether such a method should use *args or not.)
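As a rough sketch of the semantics I have in mind, here is a standalone function working on URL strings with the standard library. It is not a Starlette implementation; the name mirrors the proposed method and the example URL is made up.

```python
from typing import Iterable, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def append_query_params(url: str, params: Iterable[Tuple[str, str]]) -> str:
    """Append pairs to the query string without replacing existing names."""
    parts = urlsplit(url)
    # Keep the existing pairs in order, including repeated names.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.extend(params)
    return urlunsplit(parts._replace(query=urlencode(pairs)))


result = append_query_params(
    "https://example.org/items?id=1", [("id", "2"), ("id", "3")]
)
print(result)  # https://example.org/items?id=1&id=2&id=3
```

Taking an iterable of pairs sidesteps both the duplicate-key limit of **kwargs and the restriction on characters such as brackets in keyword names.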

jhominal avatar Aug 31 '22 21:08 jhominal

After thinking about it a lot, I agree that append_query_params is a nice addition that fits with the rest of the URL class's methods.

alex-oleshkevich avatar Oct 04 '22 08:10 alex-oleshkevich

How did you folks overcome this? Is there still a need for it? 👀

Kludex avatar Mar 09 '23 19:03 Kludex

I haven't found anything better than this yet. https://github.com/alex-oleshkevich/ohmyadmin/blob/master/ohmyadmin/ordering.py#L78

alex-oleshkevich avatar Mar 09 '23 21:03 alex-oleshkevich

I haven't found anything better than this yet. alex-oleshkevich/ohmyadmin@master/ohmyadmin/ordering.py#L78

2 lines... Problem solved? 👀

Kludex avatar Mar 09 '23 21:03 Kludex

I haven't found anything better than this yet. alex-oleshkevich/ohmyadmin@master/ohmyadmin/ordering.py#L78

2 lines... Problem solved? 👀

Doing it this way in templates is very inconvenient.

alex-oleshkevich avatar Mar 09 '23 22:03 alex-oleshkevich

Do you still think the append method is the best solution here?

Kludex avatar Mar 09 '23 22:03 Kludex

Do you still think the append method is the best solution here?

Yes, it solves the problem.

alex-oleshkevich avatar Mar 10 '23 08:03 alex-oleshkevich

Do you still think the append method is the best solution here?

Yes, it solves the problem.

PR welcome for append_query_params.

Thanks for the discussion everybody, and the PR @lexabug . 🙏

Kludex avatar Jun 20 '23 19:06 Kludex