Rationale for this change

Add `make notebook` to spin up a Jupyter notebook.

With Spark Connect (#2491) and our testing setup, we can quickly spin up a local env with:

Spark
Iceberg REST catalog
Hive Metastore
MinIO

make test-integration-exec
make notebook

In the Jupyter notebook, connecting to Spark is straightforward:

from pyspark.sql import SparkSession

# Create SparkSession against the remote Spark Connect server
spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()
spark.sql("SHOW CATALOGS").show()
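
If the notebook is opened before the Spark Connect server has finished starting, the first query can fail with a connection error. A minimal sketch of a readiness check, assuming the server listens on `localhost:15002` as in the snippet above; `wait_for_port` is a hypothetical helper for illustration and is not part of this PR:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP connect only shows that something is listening,
            # not that Spark Connect is fully initialised.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In a notebook cell this could guard the session creation, e.g. call `wait_for_port("localhost", 15002)` first and only build the `SparkSession` if it returns `True`.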

Are these changes tested?

Are there any user-facing changes?

Sep 26 '25 02:09 kevinjqliu

With spark connect (https://github.com/apache/iceberg-python/pull/2491) and our testing setup, we can quickly spin up a local env with

I agree, and that's great, but should we also spin up the resources as part of this effort? We could even inject a notebook that imports Spark-connect, etc. (which won't be installed in a fresh install? I think this is a dev dependency, so we probably want to double-check that to avoid scaring newcomers to the project).

Sep 26 '25 09:09 Fokko

Bonus idea: what if make notebook or some other CLI entry point spun up pyspark + catalog configured via pyiceberg.yaml so users could immediately start querying their data?
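
A rough sketch of the translation step this would need, assuming the catalog entry is shaped like a REST catalog block from `pyiceberg.yaml`. `spark_conf_from_catalog` is a hypothetical helper, and the one-to-one key mapping is an assumption: not every PyIceberg property has a Spark-side equivalent, and these settings would have to be passed when the Spark session or server is started.

```python
def spark_conf_from_catalog(name: str, props: dict[str, str]) -> dict[str, str]:
    """Map REST catalog properties to `spark.sql.catalog.<name>.*` settings.

    Hypothetical helper: it copies every property through unchanged, which is
    an assumption and only sensible for keys both sides understand (e.g. `uri`).
    """
    prefix = f"spark.sql.catalog.{name}"
    conf = {
        # Iceberg's Spark catalog implementation, backed by a REST catalog.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
    }
    for key, value in props.items():
        if key == "type":
            continue  # already set above
        conf[f"{prefix}.{key}"] = value
    return conf
```

For example, `spark_conf_from_catalog("rest", {"type": "rest", "uri": "http://localhost:8181"})` yields three settings, where the URI is a placeholder for whatever the local setup exposes.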

Sep 26 '25 15:09 jayceslesar

We could even inject a notebook that imports Spark-connect

We could do the getting-started guide as a notebook! https://py.iceberg.apache.org/#getting-started-with-pyiceberg

Sep 26 '25 15:09 kevinjqliu

Bonus idea: what if make notebook or some other CLI entry point spun up pyspark + catalog configured via pyiceberg.yaml so users could immediately start querying their data?

Yeah, we could do that. The integration test setup gives us two different catalogs (REST and HMS).
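
A minimal sketch of how a notebook cell could pick between the two, defaulting to REST. The hosts, ports, and credentials are assumptions about the local setup and should be checked against the compose file; `CATALOGS`, `catalog_properties`, and `load_local_catalog` are hypothetical names, while `load_catalog` is PyIceberg's own entry point:

```python
# Connection properties for the two catalogs in the local integration setup.
# Every value below is an assumption; adjust to what the compose file exposes.
CATALOGS = {
    "rest": {
        "type": "rest",
        "uri": "http://localhost:8181",
        "s3.endpoint": "http://localhost:9000",
        "s3.access-key-id": "admin",
        "s3.secret-access-key": "password",
    },
    "hive": {
        "type": "hive",
        "uri": "thrift://localhost:9083",
        "s3.endpoint": "http://localhost:9000",
        "s3.access-key-id": "admin",
        "s3.secret-access-key": "password",
    },
}


def catalog_properties(kind: str = "rest") -> dict:
    """Return a copy of the properties for the requested catalog kind."""
    try:
        return dict(CATALOGS[kind])
    except KeyError:
        raise ValueError(f"unknown catalog kind: {kind!r}") from None


def load_local_catalog(kind: str = "rest"):
    """Load the chosen catalog with PyIceberg (imported lazily)."""
    from pyiceberg.catalog import load_catalog

    return load_catalog(kind, **catalog_properties(kind))
```

With the services running, `load_local_catalog().list_namespaces()` would be a quick way to confirm the REST catalog is reachable.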

Sep 26 '25 15:09 kevinjqliu

@kevinjqliu I would keep it simple, and go with the preferred catalog; REST :)

Sep 30 '25 19:09 Fokko

dev: add `make notebook`
