Josh Rosen

Results 51 issues of Josh Rosen

Even though our dataset only classifies images into 10 classes, the examples could be further divided into additional subcategories. For example, consider the images of planes. We have images of...

Imagine that at every timestamp, we had access to the cumulative responses of every component of the network for all images in each category. This would let us color...

new feature

It would be nice if Vrome supported vim's `incsearch` option for performing [incremental search](https://en.wikipedia.org/wiki/Incremental_search).

Feature

I learned about a neat trick in `sbt-assembly` which allows you to build separate JARs for your application's code and its dependencies: https://github.com/sbt/sbt-assembly#splitting-your-project-and-deps-jars For example, if my `build.sbt` file contains...

enhancement

See report of error at https://stackoverflow.com/questions/32503059/how-to-read-avro-files-generated-from-java-class-using-spark-shell-when-the-so It seems that class cast exceptions can occur when loading Avro files that were generated from Java objects: ``` java A cannot be cast...

bug

Avro 1.7.7 added support for Decimal type: https://issues.apache.org/jira/browse/AVRO-1402 `spark-avro` currently writes decimals as strings. We should explore whether we can use Avro's decimal support instead when running against a supported...

enhancement

We should use `Long` instead of `Int` for numeric options in order to support larger values (e.g. huge numbers of records), and verify that the data generators work for huge...

enhancement
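To illustrate why the `Int` to `Long` change matters, here is a minimal Python sketch of what happens when a large record count is squeezed into a 32-bit signed integer. The helper names `to_int32` and `parse_numeric_option` are hypothetical and not part of the project; the project itself is Scala, so this only mimics JVM `Int` wrap-around.

```python
INT32_MAX = 2**31 - 1  # largest value a JVM Int can hold


def to_int32(value):
    """Wrap an integer the way a 32-bit signed JVM Int would (illustration only)."""
    return (value + 2**31) % 2**32 - 2**31


def parse_numeric_option(raw, bits=64):
    """Parse a numeric option string, rejecting values outside a signed `bits`-wide range.

    Hypothetical helper: bits=64 corresponds to Long, bits=32 to Int.
    """
    value = int(raw)
    limit = 2 ** (bits - 1)
    if not -limit <= value < limit:
        raise ValueError(f"{raw} does not fit in a signed {bits}-bit integer")
    return value


# Three billion records fits in a Long but silently wraps in an Int.
num_records = parse_numeric_option("3000000000")
wrapped = to_int32(num_records)
```

With a 64-bit parse, the option keeps its value; with a 32-bit one it would either wrap to a negative number or, as in this sketch, be rejected up front.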

When we observe small performance regressions, it would be helpful to have a way to automatically run multiple trials of each test in some random order; this would help us...

enhancement
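One way the randomized multi-trial idea could look is sketched below in Python. This is an assumption-laden illustration, not the project's harness: `randomized_trial_schedule` and `run_trials` are hypothetical names, and each test is assumed to be a zero-argument callable.

```python
import random
import time


def randomized_trial_schedule(test_names, num_trials, seed=None):
    """Return (test_name, trial_index) pairs for every trial, shuffled.

    Interleaving trials of different tests spreads slow drift (warm-up,
    noisy neighbours) across all tests instead of penalising one of them.
    """
    rng = random.Random(seed)  # a fixed seed makes the order reproducible
    schedule = [(name, trial) for name in test_names for trial in range(num_trials)]
    rng.shuffle(schedule)
    return schedule


def run_trials(tests, num_trials, seed=None):
    """Run each test `num_trials` times in random order; collect wall-clock times.

    `tests` maps a test name to a zero-argument callable (hypothetical shape).
    """
    timings = {name: [] for name in tests}
    for name, _trial in randomized_trial_schedule(list(tests), num_trials, seed):
        start = time.perf_counter()
        tests[name]()
        timings[name].append(time.perf_counter() - start)
    return timings
```

Comparing the per-test timing lists from two builds (for example by median) then gives a less order-sensitive signal than a single run of each test.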

As noted in #43, PySpark doesn't handle the HDFS persistence type for the KVS tests: https://github.com/databricks/spark-perf/blame/master/pyspark-tests/core_tests.py#L31 We should probably add support for this or raise a louder warning if that...

enhancement

Performing large bisections would be even faster if we used an S3-backed build cache that stored the compressed build archives.

enhancement