[KED-1148] Kedro-Server
Description
Kedro-Server is a RESTful API for Kedro pipelines allowing you to:
- Trigger a pipeline run using a platform- or code-agnostic POST request
- Query execution status or progress with GET requests
- List the pipelines available to run with GET requests
- Trigger tests or linting from other applications or browsers
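To make the endpoints above concrete, here is a minimal sketch of what such an API could look like, using only the Python standard library. It is not the Kedro-Server implementation, which is not public. The routes (`/pipelines`, `/runs`, `/runs/<id>`), the `PIPELINES` registry, and the status strings are all assumptions made for illustration, and the registered pipeline is a stub that does nothing.

```python
import json
import threading
import uuid
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical registry: pipeline name -> zero-argument callable that runs it.
# A real service would call into Kedro here; this stub does nothing.
PIPELINES = {"__default__": lambda: None}
RUNS = {}  # run_id -> {"pipeline": name, "status": "running"|"succeeded"|"failed"}
LOCK = threading.Lock()


def _execute(run_id, pipeline):
    """Run the pipeline in a worker thread and record the final status."""
    try:
        PIPELINES[pipeline]()
        status = "succeeded"
    except Exception:
        status = "failed"
    with LOCK:
        RUNS[run_id]["status"] = status


def handle(method, path, body=None):
    """Route a request and return (http_status, json_payload)."""
    if method == "GET" and path == "/pipelines":
        return 200, {"pipelines": sorted(PIPELINES)}
    if method == "GET" and path.startswith("/runs/"):
        with LOCK:
            run = RUNS.get(path[len("/runs/"):])
            return (200, dict(run)) if run else (404, {"error": "unknown run"})
    if method == "POST" and path == "/runs":
        pipeline = (body or {}).get("pipeline", "__default__")
        if pipeline not in PIPELINES:
            return 400, {"error": "unknown pipeline"}
        run_id = uuid.uuid4().hex
        with LOCK:
            RUNS[run_id] = {"pipeline": pipeline, "status": "running"}
        # Return immediately; the caller polls GET /runs/<id> for progress.
        threading.Thread(
            target=_execute, args=(run_id, pipeline), daemon=True
        ).start()
        return 202, {"run_id": run_id}
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    """Thin HTTP adapter: parse JSON, delegate to handle(), write JSON."""

    def _reply(self, method):
        length = int(self.headers.get("Content-Length") or 0)
        body = json.loads(self.rfile.read(length)) if length else None
        status, payload = handle(method, self.path, body)
        data = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._reply("GET")

    def do_POST(self):
        self._reply("POST")


def serve(port=8000):
    """Blocking helper that serves the sketch on localhost."""
    ThreadingHTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Because the routing lives in the plain `handle` function, any client that can send JSON over HTTP (a browser, a front-end in another language, `curl`) can trigger a run and then poll for its status, which is the language-agnostic property described above.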
Context
Our users have struggled with two things:
- Determining the status of Kedro runs beyond looking at logs
- Triggering pipeline runs from a front-end when a different language is required
These changes will have implications for kedro-viz, our data pipeline visualisation plugin.
Hi Guys,
This looks very exciting, and the main page already mentions "Coming soon". Any thoughts on when we can have a chance to give it a try?
Hi @BartekSzpak , apologies for the delayed response!! We're in the process of exposing this to our internal users to get targeted feedback to iterate on, so you could say we're in "beta" phase. Timelines are hard to judge at this point, but my guess is it'll be another couple of months at least? @yetudada can keep me honest here.
The alpha release of Kedro-Server goes live this week @BartekSzpak! We'll get focussed user testing to get this one out sooner and will update you here when we're ready for the beta open-source release.
@yetudada Great work! This is a feature I have really been looking forward to using; it will make building applications for clients so much easier!
We're going to be rebuilding Kedro-Server as part of our roadmap. I'll close this ticket in the meantime and update it with the changes. Until then, once you have the model, you are free to choose any model serving mechanism to expose it via an API. Popular options include MLflow Serving and rolling your own thin API wrapper with FastAPI.
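For readers who want to expose a pipeline run rather than a trained model, a thin wrapper could be sketched roughly as below. This is an illustration under assumptions, not a recommended implementation: it assumes a Kedro 0.18/0.19-style API (`bootstrap_project`, `KedroSession.create`, `session.run(pipeline_name=...)`), and the names `run_kedro_pipeline`, `make_run_handler`, and the response fields are hypothetical. The handler is framework-agnostic, so it could be called from a FastAPI route or any other web framework.

```python
from pathlib import Path


def run_kedro_pipeline(project_path, pipeline_name=None):
    """Run one pipeline in a fresh KedroSession (requires Kedro installed).

    Assumes a Kedro 0.18/0.19-style API; other versions may differ.
    """
    # Imported lazily so the rest of this module works without Kedro.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    bootstrap_project(Path(project_path))
    # A new session is created per call, since a session maps to one run.
    with KedroSession.create(project_path=project_path) as session:
        return session.run(pipeline_name=pipeline_name)


def make_run_handler(project_path, runner=run_kedro_pipeline):
    """Build a handler that maps a JSON-like dict to a JSON-like dict.

    The project path is fixed on the server side rather than taken from
    the request, so callers can only choose which pipeline to run.
    """

    def handle(payload):
        pipeline = payload.get("pipeline")  # None means the default pipeline
        try:
            runner(project_path, pipeline)
        except Exception as exc:
            return {"pipeline": pipeline, "status": "failed", "detail": str(exc)}
        return {"pipeline": pipeline, "status": "succeeded", "detail": ""}

    return handle
```

The `runner` argument exists so the wrapper can be exercised without a Kedro project, for example by passing a fake function in tests. Note that this handler blocks until the run finishes; long-running pipelines would need a background worker or task queue instead.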
@yetudada Is there any followup on this? It sounds like maybe you shipped a proof of concept privately, but then never got it into a position to be open sourced?
A couple of similar projects are kedro_fast_api and kedro_serving.
Hello @cshaley, everyone,
A long time has passed since there was activity in this issue, and I'm trying to figure out the status of our internal Kedro-Server. Regardless, it's unlikely that this becomes a priority in the short term, and if you need a solution now, I'd recommend looking into @Galileo-Galilei's kedro-serving (which seems more up to date than kedro_fast_api).
I'm mindful that this is one of the most upvoted issues of this repository, so I'll keep an eye on it and if you're missing anything from kedro-serving or the alternatives, feel free to ping me.
Hi, my two cents on the topic (as the author of kedro-serving):
- I wouldn't recommend using kedro-serving for any production application at this stage. This is really an alpha version and will likely remain so for a while, since I have other priorities. It is probably hard to use because it is not documented anyway 😅, but it can give some ideas on how to serve a pipeline for anyone digging into the code (please forgive the very bad Python, this is a one-shot prototype 😄).
- I guess what is at stake here is quite different, @astrojuanlu: I understand "kedro-server" to be more of an orchestrator specifically designed for Kedro pipelines. Eventually, one day, users will be able to run pipelines interactively with a button in kedro-viz and see the last runs from the logs, and this will solve part of the original issue, I guess.
Thanks @astrojuanlu and @Galileo-Galilei!
I'm looking into using or building something similar. @astrojuanlu - I understand the internal Kedro-Server you have is closed source, but would you or someone on the Kedro team be able to talk through some of the high level aspects of its design?
@Galileo-Galilei Thanks for your work on Kedro-Serving. I was able to put together a brief proof of concept using it and learned a ton. I'll raise an additional PR with at least some improved docs on how to use it.
More evidence that users are trying this https://linen-slack.kedro.org/t/14135700/hello-everyone-kedro-fast-api-works-i-am-testing-it-for-this#688e645c-13d8-48bb-a186-2ca53dc5c2c6
Reopening this issue to make it clear that we're doing something about it, sooner or later. Cannot promise any dates yet.
More https://github.com/kedro-org/kedro/discussions/3368
I am looking for this feature too. This and a logging feature seem crucial.
People ask about this over and over again https://linen-slack.kedro.org/t/16597575/hi-is-kedro-fastapi-plugin-the-current-solution-to-create-ap#a50f3124-9d49-4f04-b1d1-a38caf78b7ee
kedro-boot (https://github.com/takikadiri/kedro-boot/) made some improvements to the session to enable this use case:
https://github.com/takikadiri/kedro-boot/blob/main/README.md#consuming-kedro-pipeline-through-rest-api
There's a bit more context in #3540 and #2169
OK well it seems like an important feature.
I felt Kedro lacks a few key features such as this, so I reluctantly created my own data science pipeline module called Django Flow Forge. I find it more flexible and secure; it plays nicely with Celery and Kubernetes, has task logging and monitoring, and is now working great in production. It was actually heavily inspired by the great work on Kedro, but I wrote it in <1% of the code line count that Kedro + Kedro Viz has (according to GitHub statistics) by using HTMX for the front end instead of a heavy React interface and relying on mature code from the Django framework.