neo4j-apoc-procedures icon indicating copy to clipboard operation
neo4j-apoc-procedures copied to clipboard

[3.2] Implement an ordered, top-n collect aggregation function

Open jexp opened this issue 7 years ago • 7 comments

Implement an ordered, top-n collect

e.g. apoc.agg.collectTop(n, [count], ["^name","age"])

so that we can solve the question below like this:

MATCH (a:A)<-[:IN]-(b:B)
RETURN a, apoc.agg.collectTop(b,3,["num"])

or

MATCH (a:A)<-[:IN]-(b:B)
WITH a, apoc.agg.collectTop(b,3,["num"]) as bs
UNWIND bs as b
RETURN a.num, b.name, b.num

Sergio Murillo [19:49] Hello. I have a seemingly simple query thats evading me. I have node types A, B with A.name , B.name, B.num properties. I have this model

A.name ranges from say 1 - 3 I want to get the top 3 B ORDER BY B.num DESC for each node A

At the end I would like something like got get the top 3 B.num for each A. The reuslt would look like

1   name1   100
1   name2   75
1   name3   50
2   name4   100
2   name5   90
2   name6   40
3   name7   200
3   name8   100
3   name9   50

Any pointers?

Michael Hunger [22:08]

WITH a,b ORDER BY b.num DESC
WITH a, collect(b)[0..3] as bs
UNWIND bs as b
RETURN a.name, b.name, b.num```

jexp avatar Apr 01 '17 20:04 jexp

I'd like to give this one a try

jastax avatar Apr 04 '17 21:04 jastax

Please do, that would be great.

jexp avatar Apr 05 '17 15:04 jexp

Exactly in Neo4j 3.2 there are user defined aggregation functions.

My idea was to use the/something like the TopK Select approach that we also have in Cypher https://github.com/neo4j/neo4j/blob/3.2/community/cypher/cypher-compiler-3.2/src/main/scala/org/neo4j/cypher/internal/compiler/v3_2/pipes/TopPipe.scala#L87

I also have a Java Implementation of that which only works for distinct results, and has to be fixed

On Wed, Apr 5, 2017 at 11:37 PM, Jan St. [email protected] wrote:

I might need some help though ;-)

MATCH (a:A)<-[:IN]-(b:B) WITH a, apoc.agg.collectTop(b,3,["num"]) as bs UNWIND bs as b RETURN a.num, b.name, b.num

wouldn't that need to be

MATCH (a:A)<-[:IN]-(b:B) WITH a, apoc.agg.collectTop(collect(b),3,["num"]) as bs UNWIND bs as b RETURN a.num, b.name, b.num

or is it possible to collect the b's inside the user function?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/358#issuecomment-292004933, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEHYyodZvAI-D9aA22zgTBO47h6TIgCks5rtAmigaJpZM4Mwmeg .

jexp avatar Apr 06 '17 06:04 jexp

package org.neo4j.util;

import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

/**
 * @author mh
 * @since 02.04.14
 */
public class TopKHeap {
    public interface IntValue<T> {
        int get(T item);
    }

    static class ReverseIntComparator<T> implements Comparator<T> {
        private final IntValue<T> intValue;

        ReverseIntComparator(IntValue<T> intValue) {
            this.intValue = intValue;
        }

        @Override
        public int compare(T o1, T o2) {
            return Integer.compare(intValue.get(o2), intValue.get(o1));
        }
    }
    public static <T> List<T> findTopKHeap(Iterable<T> items, int topK, IntValue<T> intValue) {
        ReverseIntComparator<T> reverseIntComparator = new ReverseIntComparator<T>(intValue);
        T[] heap = (T[])new Object[topK];
        int count=0;
        int minValue=Integer.MAX_VALUE;
        for (T item : items) {
            if (count < topK || intValue.get(item) > minValue) {
                int idx = Arrays.binarySearch(heap, 0, count, item, reverseIntComparator);
                idx = (idx < 0) ? -idx : idx + 1;
                int length = topK - idx;
                if (length > 0 && idx < topK) System.arraycopy(heap,idx-1,heap,idx, length);
                heap[idx-1]=item;
                if (count<topK) count++;
                minValue = intValue.get(heap[count-1]);
            }
        }
        return (count < heap.length) ? Arrays.asList(heap).subList(0,count) : Arrays.asList(heap);
    }
}

jexp avatar Apr 06 '17 06:04 jexp

@jansta did you find any time to look into this?

jexp avatar Apr 25 '17 21:04 jexp

I'm still interested in doing this, but I'll be pretty busy over the next weeks. So this would take me some time

jastax avatar Apr 29 '17 14:04 jastax

@jexp is this still necessary?

conker84 avatar May 12 '21 14:05 conker84