neo4j-apoc-procedures
neo4j-apoc-procedures copied to clipboard
[3.2] Implement an ordered, top-n collect aggregation function
Implement an ordered, top-n collect
e.g. apoc.agg.collectTop(n, [count], ["^name","age"])
so that we can solve the question below like this:
MATCH (a:A)<-[:IN]-(b:B)
RETURN a, apoc.agg.collectTop(b,3,["num"])
or
MATCH (a:A)<-[:IN]-(b:B)
WITH a, apoc.agg.collectTop(b,3,["num"]) as bs
UNWIND bs as b
RETURN a.num, b.name, b.num
Sergio Murillo [19:49] Hello. I have a seemingly simple query thats evading me. I have node types A, B with A.name , B.name, B.num properties. I have this model
A.name ranges from say 1 - 3 I want to get the top 3 B ORDER BY B.num DESC for each node A
At the end I would like something like got get the top 3 B.num for each A. The reuslt would look like
1 name1 100
1 name2 75
1 name3 50
2 name4 100
2 name5 90
2 name6 40
3 name7 200
3 name8 100
3 name9 50
Any pointers?
Michael Hunger [22:08]
WITH a,b ORDER BY b.num DESC
WITH a, collect(b)[0..3] as bs
UNWIND bs as b
RETURN a.name, b.name, b.num```
I'd like to give this one a try
Please do, that would be great.
Exactly in Neo4j 3.2 there are user defined aggregation functions.
My idea was to use the/something like the TopK Select approach that we also have in Cypher https://github.com/neo4j/neo4j/blob/3.2/community/cypher/cypher-compiler-3.2/src/main/scala/org/neo4j/cypher/internal/compiler/v3_2/pipes/TopPipe.scala#L87
I also have a Java Implementation of that which only works for distinct results, and has to be fixed
On Wed, Apr 5, 2017 at 11:37 PM, Jan St. [email protected] wrote:
I might need some help though ;-)
MATCH (a:A)<-[:IN]-(b:B) WITH a, apoc.agg.collectTop(b,3,["num"]) as bs UNWIND bs as b RETURN a.num, b.name, b.num
wouldn't that need to be
MATCH (a:A)<-[:IN]-(b:B) WITH a, apoc.agg.collectTop(collect(b),3,["num"]) as bs UNWIND bs as b RETURN a.num, b.name, b.num
or is it possible to collect the b's inside the user function?
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/neo4j-contrib/neo4j-apoc-procedures/issues/358#issuecomment-292004933, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEHYyodZvAI-D9aA22zgTBO47h6TIgCks5rtAmigaJpZM4Mwmeg .
package org.neo4j.util;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
/**
* @author mh
* @since 02.04.14
*/
public class TopKHeap {
public interface IntValue<T> {
int get(T item);
}
static class ReverseIntComparator<T> implements Comparator<T> {
private final IntValue<T> intValue;
ReverseIntComparator(IntValue<T> intValue) {
this.intValue = intValue;
}
@Override
public int compare(T o1, T o2) {
return Integer.compare(intValue.get(o2), intValue.get(o1));
}
}
public static <T> List<T> findTopKHeap(Iterable<T> items, int topK, IntValue<T> intValue) {
ReverseIntComparator<T> reverseIntComparator = new ReverseIntComparator<T>(intValue);
T[] heap = (T[])new Object[topK];
int count=0;
int minValue=Integer.MAX_VALUE;
for (T item : items) {
if (count < topK || intValue.get(item) > minValue) {
int idx = Arrays.binarySearch(heap, 0, count, item, reverseIntComparator);
idx = (idx < 0) ? -idx : idx + 1;
int length = topK - idx;
if (length > 0 && idx < topK) System.arraycopy(heap,idx-1,heap,idx, length);
heap[idx-1]=item;
if (count<topK) count++;
minValue = intValue.get(heap[count-1]);
}
}
return (count < heap.length) ? Arrays.asList(heap).subList(0,count) : Arrays.asList(heap);
}
}
@jansta did you find any time to look into this?
I'm still interested in doing this, but I'll be pretty busy over the next weeks. So this would take me some time
@jexp is this still necessary?