eep-js
CEP Language
I think that to make this useful, a declarative language and parser are needed (compare Esper, for example). In any case, this is the proper way to start making such a framework generic. Of course, the non-trivial part is making stream joins efficient.
EEP currently focuses on window operations with aggregate functions. Branching and combinators (such as join) will likely be a separate project. I've investigated a DSL based on Node.js streams and pipes (beam-js repository) and this provides branch, union with filter and transformation functions. I've also looked into obvious omissions in EEP such as group by and order by - but I disliked the results. I found it easier to arrive at nice abstractions in OO languages and harder in FP languages. Grouping and ordering will be added 'soon'.
As it stands, EEP (and beam-js) are useful to me. What I wanted was something simple, lightweight and embeddable. If you are looking for a full-blown CEP engine, with language, parser, tools and IDE integration, then there are many good open source (e.g. Esper) and commercial (StreamBase, Apama, Oracle CEP) solutions out there. Arguably one of the Rx implementations would suffice for many algorithms. RxJs and Netflix's Rx for Java are both great projects. I use them myself.
With EEP I wanted something small with a simple model. Something I can run on a server, or run on an embedded chip such as the AVR or an Arduino or mBed. Something easy to port, and easy to fork.
About 80% of the features of a modern CEP engine are used rarely. The more exotic a feature, the more baggage it carries and the more it varies across CEP engines. About 20% is generally useful.
I will check the BeamJS project. Unfortunately my connection to http://doc.beamjs.org/ keeps getting interrupted at the moment (Error 101).
Having a simple and lightweight thing like EEP can already help many programmers. The only thing is that a DSL may help with thinking in an abstract and consistent way (one already has to think of a 'rich' language). On the other side are modern CEP engines with many unused features.
Regarding the simple model: to make it suitable for embedded usage (e.g. beaglebone), memory consumption may be an issue. A few months ago I did something similar with an average calculation over a fixed (but moving) window, but I did not include time explicitly. Once an average was available (after 3600 numbers), a callback was fired.
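To make the idea concrete, here is a minimal sketch of a count-based window average that fires a callback once 3600 numbers have arrived. It assumes a tumbling window (reset after each average) and keeps only a running sum; `makeWindowAverage` and its parameters are hypothetical names, not part of EEP.

```javascript
// Hypothetical helper (not an EEP API): averages every `size` values
// and hands the result to `onAverage`, then starts a new window.
function makeWindowAverage(size, onAverage) {
  let sum = 0;   // running sum of the current window
  let count = 0; // how many values the current window has seen
  return function push(value) {
    sum += value;
    count += 1;
    if (count === size) {
      const avg = sum / size;
      sum = 0;
      count = 0;
      onAverage(avg);
    }
  };
}

// Usage: collect the averages of two consecutive windows of 3600 values.
const results = [];
const push = makeWindowAverage(3600, (avg) => results.push(avg));
for (let i = 1; i <= 7200; i++) push(i);
```

Because only the sum and a counter are kept, memory use does not grow with the window size. A sliding window would need the individual values (or some approximation of them).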
That part is straightforward to implement, but the trouble comes when having multiple streams at once. I like your idea of having composite functions to perform calculations efficiently (how much better or worse is that than plain object calculations?). However, storing numbers consumes a serious amount of memory (e.g. for 24 hours at a sample rate of 1 number per second: 3600*24 = 86400 numbers). So I decided to optimize this by having one function that calculates the 1 minute average, another that calculates the 1 hour average and one that calculates the 24 hour average. Then I only need to store 144 numbers.
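A rough sketch of that cascade, assuming each stage buffers its inputs and passes their mean on to the next stage, so the 144 numbers are presumably 60 + 60 + 24. `makeStage` is a hypothetical helper, not an EEP function.

```javascript
// Hypothetical helper (not an EEP API): one averaging stage that buffers
// `size` inputs, emits their mean to `onAverage`, then reuses the buffer.
function makeStage(size, onAverage) {
  const buffer = new Array(size).fill(0);
  let count = 0;
  return function push(value) {
    buffer[count++] = value;
    if (count === size) {
      let sum = 0;
      for (let i = 0; i < size; i++) sum += buffer[i];
      count = 0;
      onAverage(sum / size);
    }
  };
}

// Chain the stages: seconds -> 1 minute -> 1 hour -> 24 hours.
const dailyAverages = [];
const day = makeStage(24, (avg) => dailyAverages.push(avg)); // 24 slots
const hour = makeStage(60, day);                             // 60 slots
const minute = makeStage(60, hour);                          // 60 slots

// Feed 24 hours of one-per-second samples (86400 values).
for (let s = 0; s < 86400; s++) minute(s % 2 === 0 ? 10 : 20);
```

The average of averages equals the overall average here only because every group at a given stage has the same size. With uneven groups (for example, missing samples), each stage would need to carry a count or weight along with the mean.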
Oh, I didn't realize there was a beamjs project and domain, I meant: https://github.com/darach/beam-js. On embedded usage memory will be an issue yes, but memory is always an issue on embedded devices. I might see what makes sense on an AVR or ATTiny, for example, as I use those regularly. Also, you don't need to store the numbers in memory, you could dump/log events and/or use a persistent window backed by flash storage, for example.
Update. Ordering is now a 'solved problem' in EEP windows.