Shivaram Venkataraman
Yeah I think that sounds reasonable.
cc @thisisdhaas @sjyk who are also interested in general purpose data loaders for data that comes from SampleClean
cc @thisisdhaas @sjyk who have a use case for a `split` operator that goes from a SparkSQL query output to a transformer
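For concreteness, a rough sketch of what such a `split` could look like; the schema (all-numeric feature columns plus an integer label in the last column) and the `Split` object are assumptions for illustration, not an existing API:

```
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

// Peel a SparkSQL DataFrame into the (features, labels) pair that a
// downstream transformer would consume. The schema is assumed here:
// numeric feature columns followed by an integer label column.
object Split {
  def apply(df: DataFrame): (RDD[Array[Double]], RDD[Int]) = {
    val features = df.rdd.map { row =>
      (0 until row.length - 1).map(i => row.getDouble(i)).toArray
    }
    val labels = df.rdd.map(row => row.getInt(row.length - 1))
    (features, labels)
  }
}
```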
FYI, here is the open Spark PR to convert DataFrames to typed RDDs: https://github.com/apache/spark/pull/5713
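Until something like that lands, a minimal sketch of the manual route, assuming a toy schema with an `id` and a `text` column (the `Record` case class is just for illustration):

```
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame

// Hand-rolled DataFrame -> typed RDD conversion: map each Row into a
// case class by pulling out named fields. The schema here is assumed.
case class Record(id: Long, text: String)

def toTypedRDD(df: DataFrame): RDD[Record] =
  df.rdd.map(row => Record(row.getAs[Long]("id"), row.getAs[String]("text")))
```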
I don't think this is important for 0.3 - I'm going to keep this bug open but remove the milestone.
Just to add, the idea here is that, similar to `Pipeline.gather`, we can express data augmentation as something like:

```
val data = CifarLoader(trainLocation)
val featurePipeline = Pipeline.concat {
  RandomPatch()...
```
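To make the intended semantics concrete, here is a minimal sketch of the same idea expressed directly on RDDs (not an implementation of the proposed `Pipeline.concat`): each augmentation produces a transformed copy of the training set, and the copies are unioned into one augmented dataset.

```
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Each augmentation (e.g. a random patch or a flip) maps over the data,
// and the augmented copies are unioned into a single training set.
def augment[T: ClassTag](data: RDD[T], augmentations: Seq[T => T]): RDD[T] =
  augmentations.map(f => data.map(f)).reduce(_ union _)
```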
I misremembered things a little bit - we've never done multi-lambda for the old `BlockWeightedLeastSquares`. But for the new `PerClassWeightedLeastSquares` it should be pretty simple to add it by just...
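For what it's worth, the naive caller-side version of a multi-lambda sweep is just this (the `fit` and `score` arguments are placeholders for the per-class solve and a validation metric; an in-solver version would presumably share work across lambdas):

```
// Train one model per lambda and keep the one that scores best on
// held-out data. `fit` and `score` stand in for the
// PerClassWeightedLeastSquares solve and an evaluation metric.
def sweepLambdas[M](lambdas: Seq[Double], fit: Double => M, score: M => Double): (Double, M, Double) =
  lambdas.map { lambda =>
    val model = fit(lambda)
    (lambda, model, score(model))
  }.maxBy(_._3)
```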
So there are a couple of options here -- do we want to use Bruckner's code, or do we want to try to integrate with JMagick?
Well, it's another JNI library like OpenCV, so it has its own .so, .dylib files, etc. that we need to carry around. The shell scripts worked fine for experiments but...
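One possible replacement for the shell scripts, sketched under the assumption that we bundle the platform-specific library inside the jar (the resource path is made up): extract it to a temp file and `System.load` it.

```
import java.io.{File, FileOutputStream}

// Copies a native library bundled on the classpath (e.g. "/native/libfoo.so",
// a made-up path) into a temp file and loads it, so no external shell
// script has to set up the library path.
object NativeLoader {
  def loadFromJar(resourcePath: String): Unit = {
    val in = getClass.getResourceAsStream(resourcePath)
    require(in != null, s"$resourcePath not found on classpath")
    val suffix = resourcePath.substring(resourcePath.lastIndexOf('.'))
    val tmp = File.createTempFile("native-lib", suffix)
    tmp.deleteOnExit()
    val out = new FileOutputStream(tmp)
    try {
      val buf = new Array[Byte](8192)
      var n = in.read(buf)
      while (n != -1) {
        out.write(buf, 0, n)
        n = in.read(buf)
      }
    } finally {
      out.close()
      in.close()
    }
    System.load(tmp.getAbsolutePath)
  }
}
```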
I don't have the latest numbers, but it was 1-2% in the runs from 2-3 weeks back. My problem is not with the benchmark per se (as we have pre-scaled images...