
Paginate without requiring length

Open · sebastiandev opened this issue · 9 comments

The current implementation requires you to pass the total length of the iterable, which usually means making an extra query just to find out the count. There's no need for this: we can accomplish the same result by trying to fetch n+1 items. If the extra item comes back, we know there are more results to be consumed, without ever knowing the real total count.
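
To make the idea concrete before the full implementation below, here is a minimal, hypothetical sketch of the trick (fetch_page and its parameters are made up for illustration):

def fetch_page(query, offset, page_size):
    # Fetch one item more than the page size; if the extra item comes
    # back, there is at least one more page, without ever counting.
    rows = list(query[offset:offset + page_size + 1])
    has_next_page = len(rows) > page_size
    return rows[:page_size], has_next_page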

This is the custom implementation we have applied to spare us the count query, which can be costly and adds unnecessary overhead to every request. We only need the real total count when that value is explicitly requested (via a custom CountableConnection):

class CountableConnection(graphene.relay.Connection):
    class Meta:
        abstract = True

    total_count = graphene.Int()

    @staticmethod
    def resolve_total_count(root, info, *args, **kwargs):
        # Prefer the iterable's own count() (e.g. a Django QuerySet),
        # falling back to len() for plain sequences.
        try:
            t_count = root.iterable.count()
        except (TypeError, AttributeError):
            t_count = len(root.iterable)

        return t_count
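
For context, a concrete connection would subclass it the usual graphene way; Ship here stands in for any hypothetical node type:

class ShipConnection(CountableConnection):
    class Meta:
        node = Ship  # any graphene ObjectType; Ship is hypothetical

total_count is then only resolved, and the count query only issued, when a query actually asks for that field.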

def connection_from_list_slice(list_slice, args=None, connection_type=None,
                               edge_type=None, pageinfo_type=None):
    '''
    Given an iterable, consume the needed amount based on the pagination
    params, and also try to figure out whether there are more results to
    be consumed. We do so by trying to fetch one more element than the
    requested amount; if we were able to fetch n+1, there are more to be
    consumed.

    This spares the caller from passing the total count of results, which
    usually means making an extra query just to find out that number.
    '''
    from graphql_relay.connection.arrayconnection import (
        get_offset_with_default, offset_to_cursor)
    from graphql_relay.connection.connectiontypes import Connection, PageInfo, Edge

    connection_type = connection_type or Connection
    edge_type = edge_type or Edge
    pageinfo_type = pageinfo_type or PageInfo

    args = args or {}

    before = args.get('before')
    after = args.get('after')
    first = args.get('first')
    last = args.get('last')

    if first:
        after = get_offset_with_default(after, -1) + 1
        _slice = list_slice[after:max(after, 0) + first + 1]  # fetch n+1

        items = _slice[:-1]
        if len(items) < first:
            items = _slice[:]  # If there are not enough, get them all

        edges = [
            edge_type(
                node=node,
                cursor=offset_to_cursor(after + i)
            )
            for i, node in enumerate(items)
        ]

    elif last:
        if before:
            before = get_offset_with_default(before)
            _slice = list_slice[max(before-last-1, 0):before]  # fetch n+1

        else:
            # Beware: not specifying `before` means we still need to
            # calculate the total amount to anchor the cursors.
            _slice = list_slice[-last - 1:]

            try:
                before = list_slice.count()
            except (TypeError, AttributeError):
                before = len(list_slice)

        items = _slice[1:]
        if len(items) < last:
            items = _slice[:]  # If there are not enough, get them all

        edges = [
            edge_type(
                node=node,
                cursor=offset_to_cursor(before - last - 1 + i)
            )
            for i, node in enumerate(items)
        ]

    else:  # we are not allowing to pass after/before without first/last
        items = list_slice[:]
        edges = [
            edge_type(
                node=node,
                cursor=offset_to_cursor(i)
            )
            for i, node in enumerate(items)
        ]

    first_edge_cursor = edges[0].cursor if edges else None
    last_edge_cursor = edges[-1].cursor if edges else None
    
    has_previous_page = False
    # `after` holds an int offset only when `first` was given; guard
    # against comparing a cursor string or None against 0.
    if (isinstance(last, int) and len(_slice) > last) or \
            (isinstance(after, int) and after > 0):
        has_previous_page = True

    return connection_type(
        edges=edges,
        page_info=pageinfo_type(
            start_cursor=first_edge_cursor,
            end_cursor=last_edge_cursor,
            has_previous_page=has_previous_page,
            has_next_page=len(_slice) > first if isinstance(first, int) else False
        )
    )

@syrusakbary Not sure if we want this as the default, or at least offered as an optional strategy. If this sounds reasonable I can make a PR for it.

sebastiandev · Nov 04 '18

@syrusakbary @stefanoulis This is our updated version, which fixes some bugs:

def connection_from_list_slice(list_slice, args=None, connection_type=None,
                               edge_type=None, pageinfo_type=None):
    '''
    Given an iterable, consume the needed amount based on the pagination
    params, and also try to figure out whether there are more results to
    be consumed. We do so by trying to fetch one more element than the
    requested amount; if we were able to fetch n+1, there are more to be
    consumed.

    This spares the caller from passing the total count of results, which
    usually means making an extra query just to find out that number.
    '''
    from graphql_relay.connection.arrayconnection import (
        get_offset_with_default, offset_to_cursor)
    from graphql_relay.connection.connectiontypes import Connection, PageInfo, Edge

    connection_type = connection_type or Connection
    edge_type = edge_type or Edge
    pageinfo_type = pageinfo_type or PageInfo

    args = args or {}

    before = args.get('before')
    after = args.get('after')
    first = args.get('first')
    last = args.get('last')

    # ValidationException is assumed to be defined by the application.
    first_is_negative = isinstance(first, int) and first < 0
    last_is_negative = isinstance(last, int) and last < 0

    # Validations
    if first_is_negative or last_is_negative:
        raise ValidationException("first and last can't be negative values")

    if first and last:
        raise ValidationException(
            "Including a value for both first and last is strongly discouraged,"
            " as it is likely to lead to confusing queries and results"
        )

    if before and last is None:
        raise ValidationException(
            "before without last is not supported"
        )

    if after and first is None:
        raise ValidationException(
            "after without first is not supported"
        )

    if before and after:
        raise ValidationException(
            "Mixing before and after is not supported"
        )

    if isinstance(first, int):
        after_offset = get_offset_with_default(after, -1)
        start_offset = after_offset + 1
        end_offset = start_offset + first

    if isinstance(last, int):
        if before:
            before_offset = get_offset_with_default(before)
        else:
            try:
                before_offset = int(list_slice.count())
            except (TypeError, AttributeError):
                before_offset = len(list_slice)
        end_offset = before_offset
        start_offset = max(before_offset - last, 0)

    _slice = list_slice[start_offset:end_offset]

    edges = [
        edge_type(
            node=node,
            cursor=offset_to_cursor(start_offset + index)
        )
        for index, node in enumerate(_slice)
    ]

    # To know whether there are more pages, we just check for the n+1
    # element, which is simply the element at end_offset, since array
    # indexes start from zero.
    try:
        list_slice[end_offset]
        has_next_page = True
    except IndexError:
        has_next_page = False

    has_previous_page = start_offset > 0
    first_edge_cursor = edges[0].cursor if edges else None
    last_edge_cursor = edges[-1].cursor if edges else None

    return connection_type(
        edges=edges,
        page_info=pageinfo_type(
            start_cursor=first_edge_cursor,
            end_cursor=last_edge_cursor,
            has_previous_page=has_previous_page,
            has_next_page=has_next_page
        )
    )
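
For reference, a quick sanity check of this version against a plain Python list (assuming the default Connection, Edge and PageInfo types expose their fields as attributes):

data = list(range(10))
conn = connection_from_list_slice(data, args={'first': 3})
nodes = [edge.node for edge in conn.edges]  # [0, 1, 2]
# has_next_page is True here, because data[3] exists (the n+1 element)

No count of data is ever taken on this code path; only the slice and the single n+1 index lookup touch the data.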

sebastiandev · Jul 03 '19

@sebastiandev Thanks for the snippets. I just quickly tested them, and it seems there is an error when no pagination arguments are passed in; otherwise it seems to work well.

All in all, it would be great if you could follow up with a PR. Even if it is not merged quickly, it would make it much easier for others to integrate a custom version into their code and comment on improvements.

sliverc · Jul 08 '19

@sebastiandev @sliverc Keep in mind that the idea of this project is to copy the functionality from graphql-relay-js over to Python. I think it is wise to stick to that idea and just replicate what is there.

First, because it is easier to switch from one language to the other if you need to. Second, because it is easier to follow the updates in JS land. If we deviate too much, it is hard to get the latest changes back into the Python version.

However, I understand it may sometimes be desirable to improve the existing functionality or add things that make more sense in Python. My first attempt would be to try to get improvements into the JS library. If that's not possible, we should include them in a way that is easy to maintain as an add-on, e.g. by not changing the function we inherited from JS but adding another function with separate tests.

I'm currently preparing v2 and v3 of this library. The goal of v2 is to achieve compatibility with GraphQL-Core v2 and not do much more; it should stay compatible and not break things. The goal of v3 is to achieve compatibility with GraphQL-Core v3 (= GraphQL-Core-next), and also to bring it up to date with graphql-relay-js. We can introduce breaking changes in that version.

Currently, I have v3 more or less finished, but the best Python version of this function is also causing me headaches. Maybe we can discuss how we should proceed here.

Note that v3 also introduces type hints, and I'm wondering what the best Python equivalent of a JS array would be. I think list may be too restrictive. Maybe a typing.Collection or a typing.Sequence? Also, should we rename the term "list" in the functions and parameters? It looks too restrictive to me: you could just as well pass a tuple, a SQLAlchemy query, a NumPy array, or anything. Maybe just rename it back to "array" as in the JS version?
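
For comparison, a sketch of what a Sequence-based signature could look like; the names here are illustrative, not a final API. typing.Sequence guarantees __getitem__ (and thus slicing by convention) plus __len__, while typing.Collection only guarantees __len__, __iter__ and __contains__:

from typing import Any, Optional, Sequence

def connection_from_array_slice(
    array_slice: Sequence[Any],
    args: Optional[dict] = None,
) -> 'Connection':
    # Slicing-based pagination only needs __getitem__ and __len__,
    # which Sequence provides; a plain Iterable or Collection does not.
    ...

A structural Protocol requiring only __getitem__ and __len__ would be even more permissive, since e.g. a Django QuerySet supports both without being registered as a Sequence.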

Cito · Jul 08 '19

@sliverc created the PR here

sebastiandev · Sep 23 '19

> @sliverc created the PR here

@sliverc Can you check his PR and try to merge it? Your code turns the QuerySet into a list, and that will cause problems.

ironyee · Jul 06 '20

For the record, @ironyee @sliverc: a fix was added to avoid an extra query when checking for the n+1 element.
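
Concretely, the idea of such a fix (sketched here, not the literal patch) is to over-fetch by one inside the same slice instead of indexing list_slice[end_offset] separately:

_slice = list(list_slice[start_offset:end_offset + 1])  # one extra row
has_next_page = len(_slice) > (end_offset - start_offset)
_slice = _slice[:end_offset - start_offset]  # trim the n+1 element again

This way a single query serves both the page items and the has_next_page check.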

sebastiandev · Jul 06 '20

@sebastiandev

Is this still an open issue? Are there new ways to avoid querying all the data when paginating with relay.Connection?

ayyoubelamrani4 · Dec 30 '22

@Cito do you know if it's something that will be fixed in the future?

ayyoubelamrani4 · Jan 07 '23

@ayyoubelamrani4 Again (see the comment above), the scope of this project is to be a replica of graphql-relay-js in Python. So as long as we faithfully replicate that functionality, nothing is "broken" on this side that needs to be "fixed".

If you are missing features that would be really useful for everyone, then the preferred way to get them into graphql-relay-py is to request them on the issue tracker of graphql-relay-js. This has many advantages:

  • There is a larger developer community that could give feedback for the best implementation and API
  • If it is implemented there in the reference implementation, both communities can benefit from the feature, not only the Python developers, maybe it even gets ported to other languages
  • We make sure that relay-py stays compatible with relay-js by not implementing something that relay-js may later solve in a different way

That said, I will release a new version of relay-py after GraphQL-Core 3.3 has been released and will then consider this and some other open issues. Still, if you want to push this or other new features, the most effective way forward is to lobby for them on the relay-js issue tracker.

Cito · Jan 08 '23