Evan Sparks
Removing the 0.3.0 milestone because we optimized standard convolution and we don't yet have use cases that would warrant an FFT-based convolution.
Should also contain a link to the Spark style guide (which is our coding standard) and a brief note on how docs are expected to be formatted and the desired test coverage.
Hi there. There are a couple of weaknesses with our current use of CoreNLP: 1) Performance: While CoreNLP is pretty quick, it does take some time to initialize and given...
The ScalaNLP stuff is great - and comes out of David Hall/Dan Klein's work, so I expect it to be quite modern. Kind of a bummer that it doesn't include...
To be fair - I have no problem with `new` - it's just a dislike for boilerplate. Guidelines about where and when to use objects vs. case classes could also...
This looks awesome, thanks so much for the contribution Manish! The big question I have is whether you looked at using MLTable and its API for your input? Were there...
This is awesome, thanks Manish - we'll plan to test your code for scalability on a cluster this week.
This is terrific work! Basic functionality is there and it scales well for large datasets based on my tests. However, I don't see special logic differentiating between continuous and categorical features....
Thinking through this change today, I'm not so sure it's necessary at the moment. `SparkSession` is part of the SparkSQL namespace and primarily designed to support `Dataset` access. We need...
One thing that makes the block solves tricky is that the blocks are not independent. That is - we pass a `Seq[RDD[T]]` because the solution to the second block depends...