Proposal: ML Experience Working Group
Hi everyone! 👋
After brainstorming with some community members about how to improve the Kubeflow User/Developer Experience for Data Scientists and ML practitioners, I decided to go one step further: start a formal discussion and propose a new IDE working group along with its initial roadmap.
The IDE Working Group (potentially, Kubeflow Jupyter Extension WG) will be responsible for developing and integrating IDE-based tools and extensions to provide a streamlined user experience to data scientists and machine learning practitioners on Kubeflow.
WG IDE Charter
The IDE Working Group is responsible for developing and integrating IDE-based tools and extensions to provide a streamlined user experience to data scientists and machine learning practitioners on Kubeflow.
This charter adheres to the conventions, roles, and organization management outlined in wg-governance.
Scope
The IDE Working Group focuses on developing, maintaining, and improving tools and extensions that support data science and machine learning practitioners' workflows within Kubeflow. The group is dedicated to delivering a high-level, seamless experience integrated with the IDE of choice across multiple Kubeflow components.
In scope
Code, Binaries, and Services
- Development of Kubeflow JupyterLab extensions that provide simple abstractions and UX to interact with the most common Kubeflow components (e.g., pipelines, hyperparameter tuning) and shorten the time to value for practitioners comfortable with Jupyter. These extensions will focus on the most used Kubeflow components, such as:
  - Pipelines;
  - Training Operator & Katib;
  - Model Registry;
  - Model Serving (KServe);
  - Feast
- Promote the reusability of UI components from other Kubeflow UIs into the IDE (e.g., rendering a pipeline graph inside the JupyterLab environment) by establishing a shared contract between the IDE WG and the wider Kubeflow community.
- Develop a Python SDK to simplify operationalization across Kubeflow components and provide a "one-stop-shop" for practitioners who want easy access to Kubeflow services. The SDK also provides the groundwork for the IDE extension automation and workflows.
  - Create a single installation and configuration layer for users interacting programmatically with the Kubeflow ecosystem via SDKs.
  - The "common" SDK is not meant to replace individual components' SDKs but rather to offer a unified access layer to simplify dependency management and shared configuration (like authorization).
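To make the "unified access layer" idea more concrete, here is a minimal sketch of how shared configuration (such as authorization) could be defined once and handed to per-component clients. Every name in it (`KubeflowConfig`, `ComponentClient`, `Kubeflow`, the host URL, the default namespace) is hypothetical and only illustrates the shape; it is not an existing Kubeflow API, and real component SDKs would take the place of the stand-in client.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class KubeflowConfig:
    """Hypothetical shared settings that every component client would reuse."""
    host: str
    namespace: str = "kubeflow-user"  # assumed default, for illustration only
    token: str = ""

    def auth_headers(self) -> Dict[str, str]:
        # Build authorization headers in one place for all components.
        return {"Authorization": f"Bearer {self.token}"} if self.token else {}


@dataclass
class ComponentClient:
    """Stand-in for an individual component SDK client (hypothetical)."""
    name: str
    config: KubeflowConfig

    def endpoint(self) -> str:
        return f"{self.config.host.rstrip('/')}/{self.name}"


class Kubeflow:
    """Facade that passes the same configuration to each component client."""

    def __init__(self, config: KubeflowConfig):
        self.config = config
        self._clients: Dict[str, ComponentClient] = {}

    def client(self, name: str) -> ComponentClient:
        # Create each client lazily and reuse it on later calls.
        if name not in self._clients:
            self._clients[name] = ComponentClient(name, self.config)
        return self._clients[name]


kf = Kubeflow(KubeflowConfig(host="https://kubeflow.example.com/", token="abc"))
pipelines = kf.client("pipelines")
print(pipelines.endpoint())             # https://kubeflow.example.com/pipelines
print(pipelines.config.auth_headers())  # {'Authorization': 'Bearer abc'}
```

The point of the facade is that the component clients stay separate, in line with the goal of not replacing the individual SDKs, while installation and configuration happen once.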
Guiding Principles
- Synergy among Kubeflow Working Groups: Collaborate with other WGs to promote reusability of UI components from other Kubeflow UIs and create a single UX between the Jupyter IDE and the Kubeflow Central Dashboard.
- Collaboration with other open-source IDE projects (like Jupyter and VSCode) to promote the creation and reusability of open standards for AI/ML tools (protocols, communication exchange, file formats, etc.) and plugins. The aim of this group is to actively participate in the development of these standards to include Kubeflow in a broader ecosystem of interoperable tools.
Cross-cutting and Externally Facing Processes
- Collaboration with other Kubeflow WGs, including WG Notebooks, WG Pipelines, WG Training, and WG Serving, ensures that IDE tools are interoperable across different stages of the ML lifecycle.
- Coordination with the release teams to align updates in IDE tools with broader Kubeflow release schedules.
Out of scope
- Building and maintaining Notebook/Workspaces images (this falls under the WG Notebooks).
Working Group Roadmap Proposal
Vision
Development of Kubeflow JupyterLab extensions that provide simple abstractions and UX to interact with the most common Kubeflow components (e.g., pipelines, hyperparameter tuning) and shorten the time to value for practitioners comfortable with Jupyter. These extensions will focus on the most used Kubeflow components, such as Pipelines, Training Operator & Katib, Model Registry, Model Serving (KServe), Feast, etc.
Phase 1 - Establish baseline (XX Months)
Goal: Baseline/starting point for Kubeflow IDE Extension
This phase will consist of three main tasks:
- Work on kubeflow-kale/kale to make it functional with KFP v2. The goal is to demo a successful notebook run with the latest version of KFP.
- Re-introduce Elyra add-on support in Kubeflow. The goal is to demo visual pipeline authoring compatible with the latest version of KFP.
- Explore the synergy between the Kubeflow Jupyter Extension and Jupyter Scheduler. We strive to build a close partnership between this working group and Jupyter upstream, and even consolidate our efforts.
Task breakdown:
Kale: Note: @StefanoFioravanzo started this issue https://github.com/kubeflow/community/issues/730 and got great feedback and traction from the community.
- Create a map of existing features and capabilities.
- Upgrade dependencies to resolve CVEs and update deprecated modules
- Align the internal API with KFP v2
- Update Jupyter notebook Docker images
- Demo!
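For readers unfamiliar with the notebook-to-pipeline idea behind Kale, the sketch below groups tagged notebook cells into ordered pipeline steps. It is a toy under stated assumptions, not Kale's implementation: the `step:<name>` tag format, the `group_cells_into_steps` helper, and the sample notebook are all hypothetical, and only the general `.ipynb` layout (cells with `cell_type`, `source`, and `metadata.tags`) is taken as given.

```python
from typing import Dict, List

# Hypothetical notebook content, following the general .ipynb JSON layout.
notebook = {
    "cells": [
        {"cell_type": "code", "source": "df = load()", "metadata": {"tags": ["step:load"]}},
        {"cell_type": "code", "source": "df = clean(df)", "metadata": {"tags": []}},
        {"cell_type": "markdown", "source": "## Training", "metadata": {}},
        {"cell_type": "code", "source": "model = fit(df)", "metadata": {"tags": ["step:train"]}},
    ]
}


def group_cells_into_steps(nb: dict) -> Dict[str, List[str]]:
    """Group code cells into steps; untagged cells join the previous step."""
    steps: Dict[str, List[str]] = {}
    current = None
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue  # markdown cells never become pipeline code
        tags = cell.get("metadata", {}).get("tags", [])
        # "step:<name>" is an assumed tag convention for this sketch.
        names = [t.split(":", 1)[1] for t in tags if t.startswith("step:")]
        if names:
            current = names[0]
        if current is None:
            continue  # code before the first tagged cell belongs to no step
        steps.setdefault(current, []).append(cell["source"])
    return steps


steps = group_cells_into_steps(notebook)
for name, sources in steps.items():
    print(name, sources)
```

Each resulting group would then need to be wrapped as a pipeline component by whatever KFP v2 integration the working group settles on; that part is deliberately left out here.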
Elyra: Note: This work is already in progress by my group at Red Hat, together with the Elyra community.
- (Done) Upgrade dependencies to resolve CVEs and update deprecated modules on Jupyter 4.x
- (Done) Fix Elyra 4.x build
- (Done) Migrate Elyra extensions to support JupyterLab 4.2.5
- PR and part of Elyra 4
- (WIP) Align Elyra 4.x with KFP v2 (PR)
- https://github.com/elyra-ai/elyra/pull/3273
- As soon as Elyra releases 4.x, update Kubeflow docs to support the add-on https://www.kubeflow.org/docs/external-add-ons/elyra/introduction/
- Integrate Elyra with Jupyter Notebook docker images on Kubeflow Notebooks.
- Demo!
Jupyter Scheduler
- Demonstrate the capability of Jupyter Scheduler extension for Notebook Workflows.
- Discuss how we can consolidate efforts to build a unified solution for Notebook Workflows.
Phase 2 - Code Migration (XX Months)
Goal: code consolidated within the Kubeflow GitHub organization with proper code structure and naming
Phase 1 focused on establishing a baseline by demoing Kale and Elyra integrations successfully. In this phase we want to consolidate the Kale codebase under the Kubeflow organization. This new structure will allow us to work on top of Kale and iteratively build the new IDE experience for Kubeflow. Elyra will continue to be the interim solution for low-code visual pipeline authoring.
- Migrate kubeflow-kale/kale to kubeflow/XXX (the repository name is to be discussed with the Kubeflow community). This new repository will house everything related to Kubeflow IDE plugins and extensions.
Phase 3 - Enhance IDE extension (XX Months)
Goal: Add visual authoring and runtime pipeline visualization to the Kale baseline. With these new features, Kubeflow can provide both a notebook-based and a visual, drag-and-drop pipeline authoring experience. We also plan to provide the same visualization look and feel in both the IDE and the Kubeflow Central Dashboard.
Long-term plan
Goal: Kubeflow JupyterLab Extension MVP will provide a streamlined user experience to data scientists and machine learning practitioners across all components of the Kubeflow ecosystem.
CC @kubeflow/kubeflow-steering-committee @StefanoFioravanzo @andreyvelich
This proposal submission is a collaboration between @StefanoFioravanzo, @andreyvelich, and myself. We also got helpful feedback from multiple other community members.
This proposal is also related to the 'SDK discussion' on https://github.com/kubeflow/training-operator/issues/2402#issuecomment-2619160006
@ederign thanks for migrating our notes and creating the issue! Looking forward to starting these efforts and can't wait to hear feedback from the community
cc @zsailer @bigsur0 @shravan-achar @akshaychitneni
Thanks for the well-written proposal. Some of these align very well with the mission of the Elyra project. Given the synergy, it might be a good idea to explore how we could make some of these in the context of Jupyter/Elyra in particular as we are all projects related to the Linux Foundation. Please let me know if any specific meetings are happening in this area.
cc @caponetto @shalberd @romeokienzler
@lresende absolutely! We still need to wait for broader feedback from the community about the proposal, but if we agree to proceed, I'll make sure to invite Elyra folks to the discussions.
I think this is a great idea and will enhance the overall UX with Kubeflow! I'd be happy to help out with any of the initiatives.
Really detailed proposal, thank you very much for that! From my experience at Pepsico, Data Scientists often struggle to get familiar with Kubeflow, and companies typically need to develop a tool or library to help them use it effectively. Once implemented, this could definitely accelerate adoption.
I think it's a really great initiative that will improve Kubeflow usability. And thank you so much for the detailed explanation, great work! I would really like to help with this initiative.
Hi folks, I propose a new name for this Working Group: ML Experience, given that we will develop many tools (Jupyter extensions, SDK, reusable UI components) that streamline the ML engineer experience. What does the community think about this?
@andreyvelich before focusing on the name itself - do you confirm you are ok with the charter and the proposed action plan? Don't want to get hung up on naming in case there are aspects of the proposal that need to be discussed.
If the proposal looks ok, then let's discuss naming.
Sure, that sounds good to me @StefanoFioravanzo!
In any case, let's talk about it at the next Kubeflow Community Call and convert this proposal to a PR in kubeflow/community.
Thanks, everyone, for all the input here. I just submitted a proposal to the Kubeflow community: https://github.com/kubeflow/community/pull/824
@ederign thank you for working on this proposal. I love the idea of a user-centric approach that looks at how different tools can make the user's journey easier, whether by integrating existing tools or building new ones. I'm interested in joining.
@varodrig great! I would love your feedback at https://github.com/kubeflow/community/pull/824
@ederign , could you provide some initial guidance or key resources to help me gain a better understanding of the project?
@ederign I would like to join the WG if possible, please.
@RonakSingh55 @szaher, that is great! We are discussing the official proposal of the working group here: https://github.com/kubeflow/community/pull/824
Let's keep it open until we finalize the scope of the ML Experience WG.
/retitle Proposal: ML Experience Working Group
I just raised a new PR with a follow-up of the requested changes on https://github.com/kubeflow/community/pull/824.
Hi @ederign , @StefanoFioravanzo
I came across the opportunity to develop a JupyterLab Plugin for Kubeflow, and I’m highly interested in contributing to this project. With my experience in JavaScript, React, Python, and API integrations, I believe I can help create a seamless JupyterLab extension that integrates with Kubeflow Pipelines, Notebooks, Model Registry, and Training Operator.
I have experience in JupyterLab extensions, backend API development, and have worked on projects involving data processing, AI tools, and web applications. I am eager to modernize and consolidate existing solutions like Elyra, Kale, and Jupyter Scheduler into a unified plugin to enhance the Kubeflow ecosystem.
I would love to discuss how I can contribute effectively. Please let me know the next steps or if there’s any documentation I should review to get started.
Looking forward to your response!
Best regards, Abhishek kaul
Hi @Abhsihekkaul! That is great, and I'm looking forward to collaborating with you! We are in the process of setting up a place for us to start gathering! As soon as I have the Slack channel, I'll let everybody here know!
Ok @ederign
In the meantime, shall I create a prototype of the implementation, draft my GSoC proposal, and get a review from the team?
Hi @ederign , @StefanoFioravanzo
I’m interested in contributing to the JupyterLab Plugin for Kubeflow and have started drafting my proposal for GSoC 2025 on this project.
Currently, I am pursuing my Master’s in Computer Science and am skilled in JavaScript, React, Python, and API integrations, with a strong focus on building scalable applications and intuitive user experiences.
What I have done to learn about Kubeflow:
- [x] Completed reading Kubeflow documentation (Introduction, Architecture, and Components).
- [x] Deployed a Kubernetes cluster locally using kind.
- [x] Deployed Kubeflow using Kubeflow manifests.
Work in progress:
- [x] Developing a sample Jupyter extension & learning about Jupyter widgets.
- [x] Reviewing existing plugins/extensions like Elyra, Kale, and Jupyter Scheduler.
Looking forward to contributing to this project!
Hi @ederign, @StefanoFioravanzo, and everyone,
I’m Abdulrahman Omar, a data science student interested in improving my experience as an ML practitioner. I’ve reviewed the proposal in depth and explored the related technologies "Elyra, Kale, and Jupyter Scheduler" to understand how they currently interact within the Kubeflow ecosystem.
I'm excited about the vision of developing a unified JupyterLab plugin for Kubeflow. I would love to contribute to this effort, particularly in areas related to extension development and integration with Kubeflow components.
Please add me to the Slack channel or mailing list so I can stay in the loop and collaborate with the team.
Looking forward to working with you all!
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.