keystone

Simplifying robust end-to-end machine learning on Apache Spark.

Results 39 keystone issues

Currently, so that they chain easily with the rest of the code, block operators (such as block solves and block transformers) take a single complete RDD and manually split...

Both SIFT and Daisy feature extractors output a `DenseMatrix` of a special shape that contains a bag of image descriptors. These image descriptors also carry metadata (e.g. original...

enhancement

The chosen (x,y,z) addressing scheme for images assumes a C or Matlab multidimensional array coordinate system as opposed to an (arguably more natural) Cartesian coordinate system. We should (pictorially!) document...
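
To make the distinction concrete, here is a minimal sketch of the two conventions for a 2D image plane. It assumes that array-style addressing puts the origin at the top-left with the first index counting rows downward, and that Cartesian addressing puts the origin at the bottom-left with y increasing upward. `CoordinateSketch`, `arrayToCartesian` and `cartesianToArray` are hypothetical names for illustration only, not part of KeystoneML, and the actual axis order used by the library may differ.

```scala
// Hypothetical helper, not part of KeystoneML: contrasts array-style
// (row, col) addressing with Cartesian (x, y) addressing.
object CoordinateSketch {
  // Assumption: array-style origin is top-left and rows grow downward;
  // Cartesian origin is bottom-left and y grows upward.
  def arrayToCartesian(row: Int, col: Int, numRows: Int): (Int, Int) =
    (col, numRows - 1 - row)

  // Inverse mapping: a Cartesian (x, y) point back to (row, col).
  def cartesianToArray(x: Int, y: Int, numRows: Int): (Int, Int) =
    (numRows - 1 - y, x)
}
```

Under these assumptions, the top-left pixel of a 4-row image is (row 0, col 0) in array terms and (x 0, y 3) in Cartesian terms, which is the kind of mismatch a picture in the docs would make obvious.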

Right now, Image DataLoaders partition on the file names (which works well for ImageNet-style collections). While it's possible to repartition manually after images have been loaded, the API indicates...

Right now it assumes small image sizes and doesn't work when the number of pixels is > `blockSize`. cc @ericmjonas

pipelines

We're running into a common issue where nodes that should be able to support either `DenseVector[Double]` or `DenseVector[Float]` generically cannot, because Breeze supports these two datatypes inconsistently and converting...

nodes
math

Call into MLlib

nodes
learning

There are some concepts we've talked about (and in some cases touched on using FunctionNodes, #121) that we need to figure out how to cleanly integrate into our pipelines interfaces....

enhancement

We have included with KeystoneML a number of data loaders tailored to standard academic datasets, but it would be good to include general-purpose WAV and image loaders in the...

dataloader

We should have OSX builds as part of our CI process. In the current build environment, this is blocked until AMP infra gets an OSX worker set up. CC @shaneknapp...

admin
build