Nicholas Chammas

229 comments by Nicholas Chammas

> Since this currently doesn't work in SQL, can we remove the SQL examples from the public docs for now? https://docs.delta.io/1.0.0/delta-batch.html#syntax

+1 on this. It's very confusing to look at...

If you're publishing a Spark library, I would follow the lead of [GraphFrames](https://github.com/graphframes/graphframes). Both Scala and Python code live in the same repo, and people load the library for use...

`--packages` must work with Databricks since it's one of the oldest ways of loading additional libraries for Spark, but I haven't checked myself. Publishing to PyPI should be fine. Perhaps...
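For anyone unfamiliar with the `--packages` route mentioned above, here is a minimal sketch of how a launch command using it might be assembled. The helper name `spark_submit_command`, the Maven coordinate, and the script name are all hypothetical placeholders, not taken from any real project; only the `spark-submit --packages` flag itself, which accepts a comma-separated list of Maven coordinates, is real.

```python
import shlex


def spark_submit_command(app_path, packages):
    """Build a spark-submit argument list that loads extra libraries via --packages.

    `packages` is a list of Maven coordinates (group:artifact:version).
    This helper is illustrative only; it does not run Spark.
    """
    if not packages:
        return ["spark-submit", app_path]
    # --packages takes a single comma-separated string of coordinates.
    return ["spark-submit", "--packages", ",".join(packages), app_path]


# Hypothetical coordinate and script name, for illustration only.
cmd = spark_submit_command("job.py", ["com.example:my-spark-lib_2.12:0.1.0"])
print(shlex.join(cmd))
```

Whether a given managed platform honors this flag the same way is something to verify on that platform rather than assume from this sketch.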

> @nchammas - do you know the Python versions that a modern Spark app should be supporting? I think [supporting > 2.7](https://github.com/MrPowers/chispa/blob/main/pyproject.toml#L9) is too lenient cause Python 2 is EOL....

Hello and sorry about the late reply here. I have been working on some improvements to Flintrock that will make cleanup of failed launches much more reliable and complete. I'm...

That looks pretty weird. So instead of the link pointing to an IP address or host name, it literally points to a block of HTML? Can you share your Flintrock...

Is it just the UI that's broken? I would expect something to be wrong with the cluster too. Can you post the full contents of the files under `spark/conf` on...

OK, it sounds like we need to understand how to set `SPARK_PUBLIC_DNS` when launching into a private VPC. Do things work if it's just left unset?

OK great. Maybe we don't need this config at all anymore, or maybe we only need it when launching into a public VPC.

Oh neat. I see that the EC2 API added this ability [in 2017](https://stackoverflow.com/a/43723682/877069), a couple of years after I wrote the initial cuts of `launch()` and `_create_instances()`. I agree, it's...