graphql-relay-py
graphql-relay-py copied to clipboard
Paginate without requiring length
The current implementation requires you to pass the total length of the iterable, and this usually means making an extra query to find out the count. There's no need for this, since we can actually accomplish the same results by trying to fetch n+1 results, which will mean that there are more to be consumed, without the need for the real total count.
This is the custom implementation we have applied to spare us the count query, which can be costly and adds an unnecessary overhead to every query. We only need the real total count, when that value is requested (using a custom CountableConnection
)
class CountableConnection(graphene.relay.Connection):
class Meta:
abstract = True
total_count = graphene.Int()
@staticmethod
def resolve_total_count(root, info, *args, **kwargs):
try:
t_count = root.iterable.count()
except:
t_count = len(root.iterable)
return t_count
def connection_from_list_slice(list_slice, args=None, connection_type=None,
edge_type=None, pageinfo_type=None):
'''
Given an iterator it consumes the needed amount based on the pagination
params, and also tries to figure out if there are more results to be
consumed. We do so by trying to fetch one more element than the specified
amount, if we were able to fetch n+1 it means there are more to be consumed.
This spares the caller passing the total count of results, which
usually means making an extra query just to find out that number.
'''
from graphql_relay.utils import base64, unbase64, is_str
from graphql_relay.connection.connectiontypes import Connection, PageInfo, Edge
connection_type = connection_type or Connection
edge_type = edge_type or Edge
pageinfo_type = pageinfo_type or PageInfo
args = args or {}
before = args.get('before')
after = args.get('after')
first = args.get('first')
last = args.get('last')
if first:
after = get_offset_with_default(after, -1) + 1
_slice = list_slice[after: max(after, 0) + first + 1] # fetch n+1
items = _slice[:-1]
if len(items) < first:
items = _slice[:] # If there are not enough, get them all
edges = [
edge_type(
node=node,
cursor=offset_to_cursor(after + i)
)
for i, node in enumerate(items)
]
elif last:
if before:
before = get_offset_with_default(before)
_slice = list_slice[max(before-last-1, 0):before] # fetch n+1
else:
# beware that not specifying before results in the need
# to calculate the total amount
_slice = list_slice[(last*-1)-1:]
try:
before = list_slice.count()
except:
before = len(list_slice)
items = _slice[1:]
if len(items) < last:
items = _slice[:] # If there are not enough, get them all
edges = [
edge_type(
node=node,
cursor=offset_to_cursor(before - last -1 + i)
)
for i, node in enumerate(items)
]
else: # we are not allowing to pass after/before without first/last
items = list_slice[:]
edges = [
edge_type(
node=node,
cursor=offset_to_cursor(i)
)
for i, node in enumerate(items)
]
first_edge_cursor = edges[0].cursor if edges else None
last_edge_cursor = edges[-1].cursor if edges else None
has_previous_page = False
if (isinstance(last, int) and len(_slice) > last) or after > 0:
has_previous_page = True
return connection_type(
edges=edges,
page_info=pageinfo_type(
start_cursor=first_edge_cursor,
end_cursor=last_edge_cursor,
has_previous_page=has_previous_page,
has_next_page=len(_slice) > first if isinstance(first, int) else False
)
)
@syrusakbary Not sure if we want this as the default or at least have this as an optional stragey. If this sounds reasonable I can make a PR for this.
@syrusakbary @stefanoulis This is our updated version that fixes some bugs
def connection_from_list_slice(list_slice, args=None, connection_type=None,
edge_type=None, pageinfo_type=None):
'''
Given an iterator it consumes the needed amount based on the pagination
params, and also tries to figure out if there are more results to be
consumed. We do so by trying to fetch one more element than the specified
amount, if we were able to fetch n+1 it means there are more to be consumed.
This spares the caller passing the total count of results, which
usually means making an extra query just to find out that number.
'''
from graphql_relay.utils import base64, unbase64, is_str
from graphql_relay.connection.connectiontypes import Connection, PageInfo, Edge
connection_type = connection_type or Connection
edge_type = edge_type or Edge
pageinfo_type = pageinfo_type or PageInfo
args = args or {}
before = args.get('before')
after = args.get('after')
first = args.get('first')
last = args.get('last')
first_is_negative = type(first) == int and first < 0
last_is_negative = type(last) == int and last < 0
# Validations
if first_is_negative or last_is_negative:
raise ValidationException("first and last can't be negative values")
if first and last:
raise ValidationException(
"Including a value for both first and last is strongly discouraged,"
" as it is likely to lead to confusing queries and results"
)
if before and last is None:
raise ValidationException(
"before without last is not supported"
)
if after and first is None:
raise ValidationException(
"after without first is not supported"
)
if before and after:
raise ValidationException(
"Mixing before and after is not supported"
)
if type(first) == int:
after_offset = get_offset_with_default(after, -1)
start_offset = after_offset + 1
end_offset = start_offset + first
if type(last) == int:
if before:
before_offset = get_offset_with_default(before)
else:
try:
before_offset = int(list_slice.count())
except (TypeError, AttributeError):
before_offset = len(list_slice)
end_offset = before_offset
start_offset = before_offset - last if before_offset - last > 0 else 0
_slice = list_slice[start_offset:end_offset]
edges = [
edge_type(
node=node,
cursor=offset_to_cursor(start_offset + index)
)
for index, node in enumerate(_slice)
]
# To know if there are more pages we just check if there is n+1 element
# In this case it would just be the end_offset element because array indexes
# start from zero
try:
list_slice[end_offset]
has_next_page = True
except IndexError:
has_next_page = False
has_previous_page = start_offset > 0
first_edge_cursor = edges[0].cursor if edges else None
last_edge_cursor = edges[-1].cursor if edges else None
return connection_type(
edges=edges,
page_info=pageinfo_type(
start_cursor=first_edge_cursor,
end_cursor=last_edge_cursor,
has_previous_page=has_previous_page,
has_next_page=has_next_page
)
)
@sebastiandev Thanks for the snippets. Just quickly tested and it seems that there is an error when no pagination arguments are passed on, otherwise it seems to work well.
All in all It would be great if you could follow up with a PR - even if it is not merged quickly it would make it much easier for others to integrate a custom version into their code and comment to improve it.
@sebastiandev @sliverc Keep in mind that the idea of this project is to copy the functionality from graphql-relay-js over to Python. I think it is wise to stick to that idea and just replicate what is there.
First, because it is easier to switch from one language to the other if you need. Second, because it is easier to follow the updates in the JS land. If we deviate too much, it is hard to get the latest changes back into the Python version.
However, I understand sometimes it may be desirable to improve the existing functionality or add stuff that may make more sense in Python. My first attempt would be to try to get improvements into the JS library. If that's not possible we should include them in a way that is easy to maintain as an add-on. E.g. by not changing the function we inherited from JS, but adding another function, with separate tests.
I'm currently preparing v2 and v3 of this library. The goal of v2 is to achieve compatibility with GraphQL-Core v2 and not do much more, it should stay compatible and bot break things. The goal of v3 is to achieve compatibility with GraphQL-Core v3 = GraphQL-Core-next, and also to bring it up to date with graphql-relay-py. We can introduce breaking changes in this version.
Currently, I have v3 more or less finished, but the best Python version of this function is also causing me headaches. Maybe we can discuss how we should proceed here.
Note that v3 also introduces type hints, and I'm wondering what the best equivalent type for an array in JS would be in Python. I think list
may be too restrictive. Maybe a typingCollection or a typing.Sequence? Also should we rename the term "list" in the functions and parameters? It looks too restrictive to me. You could just as well pass a tuple or a SQLAlchemy query or numpy array or anything. Maybe just rename it back to "array" like in the JS version?
@sliverc created the PR here
@sliverc created the PR here
@sliverc Can you check his PR and try to merge? Your code makes QuerySet into list and it will cause problems.
for the record @ironyee @sliverc added a fix to avoid an extra query when checking the n+1
@sebastiandev
Is this still an open issue? Are there new ways to avoid querying all data when paginating with relay.Connection
?
@Cito do you know if it's something that will be fixed in the future?
@ayyoubelamrani4 Again (see comment above), the scope of this project is to be a replicate of graphql-relay-js in Python. So as long as we faithfully replicate that functionality, nothing is "broken" on this side that needs to be "fixed".
If you are missing features that would be really useful for everyone, then the preferred way to get them into graphql-relay-py is to request them on the issue tracker of graphql-relay-js. This has many advantages:
- There is a larger developer community that could give feedback for the best implementation and API
- If it is implemented there in the reference implementation, both communities can benefit from the feature, not only the Python developers, maybe it even gets ported to other languages
- We make sure that relay-py stays compatible with relay-js when we don't implement something that later relay-js may solve in a different way
That said, I will release a new version of relay-py after core 3.3 has been released, and will then also update relay-py and consider this and some other open issues. Still, if you want to push this or other new features, the most effective way forward is to lobby for it on the relay-js issue tracker.