Sue Ann Hong

15 comments by Sue Ann Hong

Yeah, RDDs are used in TensorFrames, which is what underlies the Featurizer. UDFs can be used as a workaround, but we don't have that productionized in this package.

Did you attach the h5py library to the cluster? You can see what libraries are attached by going to the "Clusters" tab and clicking on your cluster, and then clicking...

You shouldn't need to attach py4j for the example notebook to run. What is your cluster configuration - do you know the Spark version or the Databricks Runtime version? Are...

We are looking into adding Scala APIs for readImages and DeepImageFeaturizer, which would enable transfer learning for images in Scala. @Ayush-iitkgp, might that work for your Java workflow?

Hi @redsofa, there is currently no Scala API. Are there any particular parts / workflows that would be useful for you in Scala?

That totally makes sense. A Python API was prioritized because most deep learning work happens in Python - e.g. Keras, a popular deep learning framework, is only...

Hi @jvmancuso, yeah I think that certainly makes sense. There is a series of PRs out for adding support for 1-d numeric tensor (i.e. vector) inputs (https://github.com/databricks/spark-deep-learning/pull/49, https://github.com/phi-dbq/spark-deep-learning/pull/9, etc). I...

Do you still see nulls if you inspect the DataFrame itself (before writing out)? (e.g. df.show()) cc @thunterdb -- should writing out the binary data as csv work here?

Thanks, @allwefantasy, that makes a lot of sense. In fact, we've been collaborating with the TensorFlowOnSpark (TFoS) team to see how we can bring all these ideas together...! They have...