logisland
Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform does complex ev...
Also used to store the current Kafka offset while writing to the structured stream sink
Pipeline objects (aka processors) are instantiated at each micro-batch by Spark executors. This leads to excessive object creation and garbage collection. We could instead lazily create a pool of...
Implement an in-memory cache service based on https://github.com/ben-manes/caffeine, to improve LRU caching with eviction policies (useful for the web analytics use case) and to get better performance.
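To make the idea of LRU caching with eviction concrete, here is a minimal sketch of a bounded cache that evicts the least recently used entry once it is full. It is not Caffeine's API and not LogIsland code: the class name `LruCache`, its `max_size` parameter, and the `get`/`put` methods are hypothetical names chosen for illustration only.

```python
from collections import OrderedDict


class LruCache:
    """Hypothetical bounded in-memory cache with least-recently-used eviction."""

    def __init__(self, max_size):
        if max_size <= 0:
            raise ValueError("max_size must be positive")
        self.max_size = max_size
        # Insertion order doubles as recency order: oldest entry first.
        self._entries = OrderedDict()

    def get(self, key, default=None):
        if key not in self._entries:
            return default
        # A read counts as a use, so mark the entry as most recent.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict the least recently used entry when the bound is exceeded.
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

With `max_size=2`, putting `a` and `b`, reading `a`, then putting `c` evicts `b`, because `b` is the entry that was used least recently. A production cache service would also need thread safety and probably time-based expiry, which this sketch leaves out.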
Issue #466 introduced a refactoring of the documentation. This work has been merged into the develop branch with PR #471. But between the time where this PR...
# Expected behavior and actual behavior. YAML files become very large and are difficult to read. Having an include mechanism would make things more modular and easier to manage. We...
# Expected behavior and actual behavior. As of today our pipelines have only chained processors (one processor passes records to the next, eventually filtering records). This is a bit of...
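The chained model described in this issue can be sketched as follows: each processor receives the records produced by the previous one and may drop some of them. This is an illustrative sketch only, not LogIsland's processor API. The names `run_chain`, `drop_empty_messages`, and `add_length`, and the `message` and `length` record fields, are all hypothetical.

```python
from typing import Callable, Dict, List

# Assumption: a record is modelled as a plain dict, and a processor is any
# function that maps a list of records to a list of records.
Record = Dict[str, object]
Processor = Callable[[List[Record]], List[Record]]


def drop_empty_messages(records: List[Record]) -> List[Record]:
    """Filtering processor: keep only records with a non-empty message."""
    return [r for r in records if r.get("message")]


def add_length(records: List[Record]) -> List[Record]:
    """Enriching processor: add the message length to each record."""
    return [{**r, "length": len(str(r["message"]))} for r in records]


def run_chain(processors: List[Processor], records: List[Record]) -> List[Record]:
    """Pass the records through each processor in order, strictly linearly."""
    for processor in processors:
        records = processor(records)
    return records
```

In this shape every record follows the same single path through the list of processors, and there is no way to route a record to one branch or another.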
# Expected behavior and actual behavior. We need to check the status of all our components with respect to deployments in Kubernetes. Stateful components are hard to deploy in Kubernetes....
# Expected behavior and actual behavior. We should have dashboards with the lists of running LogIsland jobs with their progression in real time. We should be able to click on...
# Expected behavior and actual behavior. We need to clean up the monitoring of LogIsland so that errors and metrics are properly sent to Prometheus and Grafana. Maybe we should...