
Avoid collecting entire partition(s), when no Comparator is provided with a window

Open lukaseder opened this issue 9 years ago • 9 comments

When using only the frame clause (and possibly the partition function), subsequent window function evaluations can be done lazily, as they do not necessarily depend on the whole partition(s) being collected in advance.

For instance, when doing a sliding sum like this:

Seq.of(1, 2, 3, 4, 5, 6, 7...)
   .window(-1, 1)
   .map(w -> w.sum())
   .limit(3)
   .toList();

The result here is:

[Optional[3], Optional[6], Optional[9]]

There is no need to go beyond item no. 4 (limit + frame upper bound) in collecting the stream.
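
As a rough illustration of that laziness, here is a standalone sketch in plain Java. It is not jOOλ's implementation: `LazySlidingSum`, `CountingNaturals`, `slidingSum` and `firstSums` are hypothetical names, the frame is assumed to satisfy lower <= 0 <= upper, and the sums are plain `long` values rather than the `Optional` results shown above. Only the rows inside the current frame are buffered:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class LazySlidingSum {

    /** Endless source 1, 2, 3, ... that records how many items were pulled. */
    static final class CountingNaturals implements Iterator<Integer> {
        int pulled = 0;
        public boolean hasNext() { return true; }
        public Integer next() { return ++pulled; }
    }

    /** For row i, lazily sums rows i+lower .. i+upper, clipped to existing rows. */
    static Iterator<Long> slidingSum(final Iterator<Integer> source,
                                     final int lower, final int upper) {
        return new Iterator<Long>() {
            final Deque<Integer> frame = new ArrayDeque<>(); // holds rows [dropped, read)
            int row = 0, read = 0, dropped = 0;

            public boolean hasNext() {
                // Pull only as far as the current row's upper frame bound.
                while (read <= row + upper && source.hasNext()) {
                    frame.addLast(source.next());
                    read++;
                }
                return row < read;
            }

            public Long next() {
                if (!hasNext()) throw new NoSuchElementException();
                // Evict rows that fell below the lower frame bound.
                while (dropped < row + lower) {
                    frame.removeFirst();
                    dropped++;
                }
                long sum = 0;
                for (int v : frame) sum += v;
                row++;
                return sum;
            }
        };
    }

    /** Takes the first n sliding sums over a (-1, 1) frame. */
    static List<Long> firstSums(Iterator<Integer> source, int n) {
        Iterator<Long> sums = slidingSum(source, -1, 1);
        List<Long> result = new ArrayList<>();
        for (int i = 0; i < n && sums.hasNext(); i++) result.add(sums.next());
        return result;
    }

    public static void main(String[] args) {
        CountingNaturals source = new CountingNaturals();
        System.out.println(firstSums(source, 3)); // [3, 6, 9]
        System.out.println(source.pulled);        // 4
    }
}
```

With three requested results and a frame upper bound of 1, the source is pulled exactly four times, even though it is endless.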

lukaseder, Jan 09 '16 19:01

This would be awesome :)

johnmcclean, Jan 10 '16 23:01

Yes, it would indeed, although further research is needed because the semantics of some ranking functions are independent of the window frame, e.g. rowNumber(), rank(), or denseRank().

Also, thus far, I've documented only the convenience window() method. There is also the possibility of creating up to 16 windows at the same time, in which case the stream needs to be buffered as soon as at least one window specification uses ordering.

By the way, I've seen your response on Stack Overflow: http://stackoverflow.com/a/34712153/521799

Great to hear you're building on top of jOOλ! Would you be interested in publishing a guest post about this work on the jOOQ blog?

lukaseder, Jan 11 '16 06:01

Yeah sure, that sounds good. cyclops-streams adds reactive-streams type features on top of jOOλ (hotStreams, reactive-streams support, async execution etc.) and then simple-react adds concurrency to that. There are a few operators that would probably be at home in Seq though (e.g. single & singleOptional, see http://stackoverflow.com/questions/22694884/filter-java-stream-to-1-and-only-1-element/34715168#34715168).

Your windowing is incredibly feature rich, I'm looking forward to getting a better handle on how to make full use of it.

johnmcclean, Jan 11 '16 12:01

> There are a few operators that would probably be at home in Seq though (e.g. single & singleOptional

Pull requests are very welcome! Although, in that case, what's the exact use-case?

> Your windowing is incredibly feature rich, I'm looking forward to getting a better handle on how to make full use of it.

Yep, they're very versatile. Usually, there are more generic FP constructs to do the same things, but I have not yet seen anything as concise as SQL window functions.

lukaseder, Jan 11 '16 12:01

I've put the reason I added it below (manipulating a single value asynchronously), but I think there are more general use cases: when you want to enforce that exactly one element in a dataset meets a criterion (by throwing an exception if that is not the case), or to provide a default if it is not.

       List<Footballer> players;
       // Seq.seq(Iterable) streams the list's elements; Seq.of(players) would yield a Seq of one List
       Goalkeeper goalie = Seq.seq(players)
                              .ofType(Goalkeeper.class)
                              .single();

       KeyController critical = Seq.of(suppliedPlugins)
                                   .ofType(KeyController.class)
                                   .singleOptional() // misconfigured if Optional.empty
                                   .orElse(safeModeController);

       Seq.of(host1, host2, host3, host4, host5)
          .filter(host -> memberOfMajorityCluster(host))
          .single(host -> host.isElectedLeader());

The reason we have it: the API in simple-react is in many cases (at least for the authors) both simpler and more powerful than the direct CompletableFuture API, so it's useful to be able to manipulate a Stream of a single value. If you require that there is exactly one result, it is safer to call

        seq.single();

than

     seq.toList().get(0);

With the latter, even unit tests could hide the fact that the Stream contains more than one result. Otherwise the equivalent code is something like

   List<T> values = seq.toList();
   if (values.size() == 1)
       return values.get(0);
   else
       throw new Exception();
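
To make the proposed semantics concrete, here is a minimal sketch on top of `java.util.stream.Stream`. It is neither the cyclops nor the jOOλ implementation: `SingleSketch` is a hypothetical name, the exception types are arbitrary choices, and it assumes that singleOptional is empty both for zero and for several elements:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.stream.Stream;

class SingleSketch {

    /** Returns the only element; throws if there are zero or several elements. */
    static <T> T single(Stream<T> stream) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext())
            throw new NoSuchElementException("expected exactly one element, found none");
        T value = it.next();
        if (it.hasNext())
            throw new IllegalStateException("expected exactly one element, found several");
        return value;
    }

    /** Returns the only element, or Optional.empty() unless there is exactly one (assumed semantics). */
    static <T> Optional<T> singleOptional(Stream<T> stream) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext())
            return Optional.empty();
        T value = it.next();
        return it.hasNext() ? Optional.empty() : Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        System.out.println(single(Stream.of("goalie")));     // goalie
        System.out.println(singleOptional(Stream.of(1, 2))); // Optional.empty
        System.out.println(Arrays.asList(1, 2).get(0));      // 1, the extra element goes unnoticed
    }
}
```

Both helpers look at no more than two elements, so they also terminate on very long or endless streams.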

johnmcclean, Jan 11 '16 14:01

Convinced! Although, I'll implement this subtly differently: #178

lukaseder, Jan 11 '16 18:01

I'm curious to learn more from you - if you have any other methods that you can see in jOOλ, just open up feature requests and we'll see if we can add something.

lukaseder, Jan 11 '16 18:01

Sure will do, I think onEmptySwitch(Supplier<Seq<T>> switchTo) would also fit well in jOOλ for example.
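
One possible reading of that operator, sketched on plain `java.util.stream.Stream` rather than Seq: if the stream turns out to be empty, continue with the supplied one instead. `OnEmptySwitchSketch` is a hypothetical name, the semantics are inferred from the method name and signature only, and this version checks for a first element eagerly at call time:

```java
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class OnEmptySwitchSketch {

    /** Returns the original elements, or the supplied stream if there are none. */
    static <T> Stream<T> onEmptySwitch(Stream<T> stream, Supplier<Stream<T>> switchTo) {
        Iterator<T> it = stream.iterator();
        if (!it.hasNext())
            return switchTo.get(); // the supplier is only invoked when needed
        // Re-wrap the iterator; the element peeked by hasNext() is not lost.
        return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(it, Spliterator.ORDERED), false);
    }

    public static void main(String[] args) {
        List<Integer> kept = onEmptySwitch(Stream.of(1, 2), () -> Stream.of(9))
            .collect(Collectors.toList());
        List<Integer> switched = onEmptySwitch(Stream.<Integer>empty(), () -> Stream.of(9))
            .collect(Collectors.toList());
        System.out.println(kept);     // [1, 2]
        System.out.println(switched); // [9]
    }
}
```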

johnmcclean, Jan 11 '16 21:01

(excuse my edit, needed to visualise the generics)

Hmm, yeah, I can see the point. I've created #179 with some criticism.

lukaseder, Jan 11 '16 21:01