datafusion-ballista icon indicating copy to clipboard operation
datafusion-ballista copied to clipboard

Ergonomic way to setup/configure `SessionContextExt`

Open milenkovicm opened this issue 1 year ago • 1 comments

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

SessionContextExt at the moment does not expose methods which could be used to set up SessionState and/or configuration.

let ctx = SessionContext::remote(url &str)
let ctx = SessionContext::standalone()

Describe the solution you'd like

Extend SessionStateExt implementation to support configuring SessionContext like DataFusion does. Provide a methods like

let state : SessionState = SessionStateBuilder::build()

let ctx = SessionContext::remote_with_state(url &str)
let ctx = SessionContext::standalone_with_state()

to enable custom session state setup, as well as providing custom ballista related configuration as part of the SessionState.

To provide ballista specific configuration we could provide a SessionConfigExt which would. expose with_ballista_config(&str, &str). Hopefully this way we would not need to expose BallistaConfig to end user (making it private and simplifying it).

An alternative approach would be to provide SessionConfigExt::new_with_ballista(), this way we could set ballista config with helper method which would:

let state = ctx.state_ref();
let mut state = state.write();
let config = state.config_mut().options_mut();

config.set(&key.key, key.value.as_deref().unwrap_or(""));

later approach could be better with python support. We would need to try them out

Describe alternatives you've considered

An alternative approach would be keeping BallistaConfiguration which would slightly increase code base, and have a bit different approach than vanilla datafusion context set up. This would imply slidhtly different user facing SessionContextExt API where additional configuration property config: BallistaConfig should be exposed.

Additional context

If effort is successful we would need to expose similar SessionState configuration API (session state factory), after which we could deprecate ballista_core::object_store::with_object_store_registry(config: RuntimeConfig) giving ability to users to set up their own context. Something very similar to ballista-core::utils::default_session_builder(config: SessionConfig). If we go extra mile, maybe for datafusion api we would be able to provide a cloudpickle serialised session state factory where users would be able to configure their onw SessionsState across cluster

task relates to #1088 and #1091 (kind off)

milenkovicm avatar Oct 20 '24 13:10 milenkovicm

will have a take on it (once #1088 is merged)

milenkovicm avatar Oct 20 '24 13:10 milenkovicm