Oliver Kennedy

Results 161 issues of Oliver Kennedy

Should be evaluated by Spark.
```
mimir> select row_number(), cast(h as float)*60*60+cast(m as float)*60+cast(s as float)-75359.239 from extract_warm_start;
java.lang.RuntimeException: Error Decoding ROW_NUMBER (int)
    at mimir.exec.result.LazyRow.apply(LazyRow.scala:22)
    at mimir.exec.PrettyOutputFormat$$anonfun$print$4$$anonfun$apply$1.apply$mcVI$sp(OutputFormat.scala:81)
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
    at...
```

bug

Challenge: Ordering matters for storage providers. Need to figure out how to specify a priority order for the spark provider. MimirVizier should use a command-line parameter to figure out whether...

bug

As of right now, [CTExplainer](https://github.com/UBOdin/mimir/blob/master/src/main/scala/mimir/ctables/CTExplainer.scala)'s explainRow and explainCell methods rely on a hack to compute statistical metrics for values. Now that we have TupleBundler and compileForSamples() (at least in the...

enhancement
lenses
compiler
explain/analyze

It would be nice if the parser could read in JSON. This would help with inline testing and with passing configuration parameters to Lenses.

enhancement
parser/sql

The shape watcher lens currently runs a Count Distinct query during the training phase to discover categorical attributes. This is not great for large datasets. Fortunately, we don't care about...

enhancement

It would be nifty if we could have some way to easily define data validation expressions. For example:
```
ASSERT A + B + C = TOTAL IN grades
```
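
To make the intent of the assertion above concrete, here is a minimal Python sketch of the check such an expression might compile to. It is an illustration only, not Mimir's implementation: the function name `failing_rows`, the dictionary-per-row representation, and the sample `grades` data are all assumptions made for this example.

```python
import math
from typing import Dict, List

Row = Dict[str, float]


def failing_rows(rows: List[Row]) -> List[int]:
    """Return the indexes of rows where A + B + C does not equal TOTAL.

    Hypothetical helper: math.isclose is used so that floating-point
    rounding does not produce spurious violations.
    """
    return [
        i
        for i, row in enumerate(rows)
        if not math.isclose(row["A"] + row["B"] + row["C"], row["TOTAL"])
    ]


# Made-up sample data standing in for the `grades` table.
grades = [
    {"A": 30.0, "B": 30.0, "C": 40.0, "TOTAL": 100.0},  # satisfies the assertion
    {"A": 10.0, "B": 20.0, "C": 30.0, "TOTAL": 70.0},   # violates it (sum is 60)
]

violations = failing_rows(grades)
```

Reporting the offending row indexes, rather than a single pass/fail flag, is one possible design choice; it would let a lens annotate only the rows that break the constraint.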

enhancement
lenses
eventually
parser/sql
662 Project

Possibly required for #361.
Would subsume #333.

Mimir has a much cruder internal type system than [Spark](https://github.com/apache/spark/tree/master/sql/catalyst/src/main/scala/org/apache/spark/sql/types). In addition to lacking collection types, there are a lot of capabilities (e.g., integer...

enhancement
eventually

```
[9] | LOAD DATASET deposits FROM https://odin.cse.buffalo.edu/public_data/Deposits_2018.csv
```
Detect headers is enabled, and indeed, the column headers are all extracted correctly. However, the first row remains included in the...

bug

The following query behaves differently depending on whether `CAST` is evaluated in Mimir or in Spark.
```
SELECT CAST('.' AS int)
```
In Mimir it returns `NULL`, while in Spark...
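
For reference, the `NULL`-on-failure behavior described for Mimir can be sketched in Python as follows. This is only an illustration of that semantics, with Python's `None` standing in for SQL `NULL`. The helper name `lenient_cast_int` is hypothetical, and the sketch makes no claim about how either Mimir or Spark implements `CAST`.

```python
from typing import Optional


def lenient_cast_int(text: str) -> Optional[int]:
    """Parse text as an integer, returning None when it is malformed.

    Hypothetical helper: None plays the role of SQL NULL here.
    """
    try:
        return int(text.strip())
    except ValueError:
        # Malformed input such as '.' yields None instead of an error.
        return None


dot_result = lenient_cast_int(".")
number_result = lenient_cast_int("42")
```

A regression test for this issue could pin down one agreed result for `CAST('.' AS int)` and assert it on both evaluation paths.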

bug