rubix
Cache File System optimized for columnar formats and object stores
Metrics currently capture only local reads and remote reads. Add a metric that also captures non-local reads.
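One way such a metric could sit alongside the existing two is sketched below. This is only an illustration, not the Rubix API: the `ReadMetrics` class, the `ReadType` enum, and the method names are all hypothetical, and it assumes that "non-local" means a read served from another node's cache, as opposed to a "remote" read from the object store.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of per-read-type counters; not part of Rubix. */
final class ReadMetrics {
    /** Assumed meanings: LOCAL = this node's cache, NON_LOCAL = another node's cache, REMOTE = object store. */
    enum ReadType { LOCAL, NON_LOCAL, REMOTE }

    // One thread-safe counter per read type, indexed by enum ordinal.
    private final LongAdder[] counts = new LongAdder[ReadType.values().length];

    ReadMetrics() {
        for (int i = 0; i < counts.length; i++) {
            counts[i] = new LongAdder();
        }
    }

    /** Records a single read of the given type. */
    void record(ReadType type) {
        counts[type.ordinal()].increment();
    }

    /** Returns the number of reads recorded for the given type. */
    long count(ReadType type) {
        return counts[type.ordinal()].sum();
    }

    /** Returns the number of reads recorded across all types. */
    long total() {
        long sum = 0;
        for (LongAdder c : counts) {
            sum += c.sum();
        }
        return sum;
    }

    /** Returns the share of reads of the given type, or 0.0 if nothing was recorded. */
    double fraction(ReadType type) {
        long t = total();
        return t == 0 ? 0.0 : (double) count(type) / t;
    }
}
```

A caller would invoke `record(ReadMetrics.ReadType.NON_LOCAL)` at the point where a read is satisfied by a peer node, and could then report `fraction(...)` for each type next to the existing local and remote figures.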
In the README and docs, add a resources section that points to blog posts and slide decks. The docs should also be more comprehensive.
Current README has the following markdown: ### Supported Engines and Cloud Stores - Presto: Amazon S3 - Spark: Amazon S3 - Any engine using hadoop-2 or hadoop-1, e.g. Hive can...
Current README has the following markdown text: ### Monitoring Client side monitoring is set up right now, stats are published to MBean named `rubix:name=stats` Engines which provide interface to view...
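For readers who want to inspect the MBean named above, the following is a minimal sketch using the standard JMX API. It assumes the code runs inside the same JVM that registered `rubix:name=stats`; the `RubixStatsDump` class and `readAttributes` method are hypothetical names, and the attribute names the bean exposes are not assumed here, they are discovered at run time.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.JMException;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical helper that lists the readable attributes of an MBean in this JVM. */
final class RubixStatsDump {
    /** Returns "name = value" lines, or an empty list if the MBean is not registered. */
    static List<String> readAttributes(String objectName) {
        List<String> lines = new ArrayList<>();
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName(objectName);
            if (!server.isRegistered(name)) {
                return lines; // nothing published under this name in this JVM
            }
            for (MBeanAttributeInfo attr : server.getMBeanInfo(name).getAttributes()) {
                if (attr.isReadable()) {
                    lines.add(attr.getName() + " = " + server.getAttribute(name, attr.getName()));
                }
            }
        } catch (JMException e) {
            // Covers a malformed name and lookup or read failures.
            throw new IllegalStateException("Could not read MBean " + objectName, e);
        }
        return lines;
    }
}
```

Calling `RubixStatsDump.readAttributes("rubix:name=stats")` from within the engine process would return one line per published statistic. Reading the bean from a separate process would need a remote JMX connection instead, which this sketch does not cover.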
@wishnick's experiments have revealed an interesting dependence between the ratio of local to non-local reads and the cluster size. We have not studied the effect of cluster size on these metrics....
Historically, we have focused on comparing the performance of local and remote reads. Non-local reads are an important mode, especially in large clusters. Study the performance of non-local reads and...
Setup: Rubix version: 0.3.1, Presto: 0.172, EMR AMI: 4.9.3. I've installed Rubix into Presto by using a custom build of Presto and overriding the configuration in [Presto's HdfsConfigurationUpdater](https://github.com/prestodb/presto/blob/0.172/presto-hive/src/main/java/com/facebook/presto/hive/HdfsConfigurationUpdater.java#L163-L165)...
Imagine File1 is already partially (or completely) cached on the local disk: ls -lrt File1* -rw-rw-rw- 1 yarn yarn 13631488 Feb 8 20:03 File1 -rw-r--r-- 1 yarn yarn 12 Feb 8 20:03...