Open-Assistant icon indicating copy to clipboard operation
Open-Assistant copied to clipboard

537: Endpoint to list frontend users

Open Vechtomov opened this issue 2 years ago • 5 comments

Resolves #537

Vechtomov avatar Jan 08 '23 21:01 Vechtomov

I've left a couple of questions here #537, don't forget to consider them during review

Vechtomov avatar Jan 08 '23 23:01 Vechtomov

  • I'm really not a fan of the username string comparisons here. as far as I can tell, we don't even have an index to support a prefix query here (although maybe the composite index would do the work). I'd be much more in favor of a classic limit and offset pagination, instead of the lt, gt, and max_count. @olliestanley is there a particular reason we're doing the prefix queries?

@andreaskoepf would have to confirm on the reasoning for this, but I assumed it was for cases like wanting to lookup a user by username without knowing auth_method or api_client_id, so you could use these criteria to list all users with a certain username regardless of auth_method, api_client_id

olliestanley avatar Jan 10 '23 08:01 olliestanley

@andreaskoepf would have to confirm on the reasoning for this, but I assumed it was for cases like wanting to lookup a user by username without knowing auth_method or api_client_id, so you could use these criteria to list all users with a certain username regardless of auth_method, api_client_id

but then let's add a prefix filter (w/ corresponding index). probably the majority of use cases of the endpoint will be listing and paging, and it seems quite cumbersome to do that with the current implementation, especially when we impose some upper limit on the number of users returned (which we have to).

yk avatar Jan 10 '23 09:01 yk

Hi, @yk. Thanks for the good review. I've made changes but I still don't understand why we need to pass api_client_id for untrusted clients. We pass api_key with the request and it uniquely defines api_client.

Vechtomov avatar Jan 10 '23 09:01 Vechtomov

Hi, @yk. Thanks for the good review. I've made changes but I still don't understand why we need to pass api_client_id for untrusted clients. We pass api_key with the request and it uniquely defines api_client.

I think in your question, you referred to another endpoint where this is done. The reasoning there was the following: a trusted api key should be able to also see users of other keys, and would pass those keys along, i.e. saying "give me users of this other key". I honestly don't see a super good reason why we might do this, so I'd be comfortable leaving that away and just saying trusted frontends can just read globally.

yk avatar Jan 10 '23 10:01 yk

I am bit irritated by the paging discussion here.

I deliberately asked to select by interval-starting point + limit. Naive paging with offset and limit is not scalable. Selecting result elements BY OFFSET (!) means in many cases the db-engine runs a linear search over the results until it reaches the indices of the index-window to return. For larger datasets and high offsets this has very bad performance characteristics.

Regarding indices: BTREE indices store strings ordered (based on the configured collation) and are always used string comparisions == <= < > >= etc are used .. query engine will also use them for LIKE "prefix%" conditions.

andreaskoepf avatar Jan 12 '23 21:01 andreaskoepf

:x: pre-commit failed. Please run pre-commit run --all-files locally and commit the changes. Find more information in the repository's CONTRIBUTING.md

github-actions[bot] avatar Jan 12 '23 21:01 github-actions[bot]