Nick Pentreath comments

Results: 52 comments by Nick Pentreath

Compilation doesn't work due to forward reference of variable

Ah, sorry about that - it was a rush to push it out in time for the Spark Summit talk. It should be fixed in https://github.com/MLnick/glint-fm/commit/19de327879b8d2645b60436ac93afd8b16f62dec

Compilation doesn't work due to forward reference of variable

By the way, the code here is very rough, more a PoC than anything near production-ready :)

maven-shade-plugin version issue

Hi, I'm afraid I'm not able to actively maintain this at the moment, but any PR would be welcome :)

Java heap space issue with large cardinality / bigger datasets

Hmm, will take a look. It could be a memory leak, or with large datasets it could be something around merging all the intermediate HLL instances. Unfortunately Hive UDFs are really tricky to...
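For context on why merging could add memory pressure: in standard HyperLogLog, two sketches built with the same precision are merged by taking the register-wise maximum, so every intermediate instance has to be held in full until it is merged. The snippet below is a minimal sketch of that idea only. It is not the code of the Hive UDF discussed here, and `merge_registers` is a hypothetical helper name.

```python
def merge_registers(a, b):
    """Merge two HyperLogLog register arrays of equal length.

    Assumes both sketches use the same precision and hash function;
    the merged register is the element-wise maximum.
    """
    if len(a) != len(b):
        raise ValueError("register arrays must have the same length")
    return [max(x, y) for x, y in zip(a, b)]


# Toy register arrays (real sketches have 2**p registers).
left = [1, 0, 3, 2]
right = [0, 2, 3, 5]
merged = merge_registers(left, right)
print(merged)
```

Each partial result carries a full register array, so the number of arrays waiting to be merged grows with the number of tasks producing them.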

Java heap space issue with large cardinality / bigger datasets

Something that might work is to try increasing the split size in Hadoop, thus decreasing the effective number of mappers - this could potentially alleviate the intermediate merging pressure (in...
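To illustrate the arithmetic behind this suggestion: for a splittable input, the number of map tasks is roughly the input size divided by the split size, so a larger split size means fewer mappers and fewer intermediate results to merge. The sketch below uses a hypothetical 100 GiB input, and `estimated_mappers` is an illustrative helper, not part of Hadoop or Hive. In Hadoop 2 and later the split size is usually controlled through `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize`, though which settings take effect depends on the input format in use, so treat that as something to verify for your cluster.

```python
def estimated_mappers(input_bytes, split_bytes):
    """Rough number of input splits (one map task each) for a splittable input."""
    if split_bytes <= 0:
        raise ValueError("split_bytes must be positive")
    # Integer ceiling division: a partial trailing split still needs a mapper.
    return -(-input_bytes // split_bytes)


MIB = 1024 ** 2
GIB = 1024 ** 3

input_size = 100 * GIB  # hypothetical dataset size, for illustration only

for split_size in (128 * MIB, 512 * MIB, 1024 * MIB):
    mappers = estimated_mappers(input_size, split_size)
    print(f"split={split_size // MIB} MiB -> ~{mappers} mappers")
```

This is only an estimate: actual split counts also depend on file boundaries, block sizes, and whether the files are splittable at all.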

Java heap space issue with large cardinality / bigger datasets

Hi, sorry for the long delay - I've been swamped and I no longer use Hive these days. I'll try to take a look if I get time. But it seems...

How to scale TFRS?

The basics for Spark-TF interop [here](https://github.com/tensorflow/ecosystem/blob/master/spark/spark-tensorflow-connector/README.md) may be helpful?

How to scale TFRS?

Indeed, it's a problem that still has not been solved very well. I think TF-on-Spark is arguably one of the better options (it still leaves a lot to be desired though)...

How to scale TFRS?

Also: https://github.com/tensorflow/ecosystem/tree/master/spark/spark-tensorflow-distributor?

@dgoldenberg-audiomack I agree, the experience is far from seamless currently. The closest-looking thing is actually https://analytics-zoo.readthedocs.io/en/latest/doc/Orca/Overview/orca.html (from the BigDL team at Intel, I think). I plan to give it...