Install kedro-datasets without the Kedro dependency

Open npfp opened this issue 7 months ago • 3 comments

Description

We'd be interested in using the amazing Kedro datasets without pulling the whole Kedro dependency into a project, so that we can benefit from all the dataset implementations.

Context

kedro-datasets has a clear interface and comes with many useful implementations. It would be really nice to use it in non-Kedro projects.

So I wonder whether it would be possible to remove the Kedro dependency.

npfp avatar Sep 04 '25 07:09 npfp

Hi @npfp good question!

As far as I can remember, the datasets package needs Kedro core for imports like this one from the io package:


from kedro.io.core import AbstractDataset, DatasetError

There are other similar pieces too, such as versioning.

This tight coupling is annoying for your use case, but it is also a little difficult to undo without duplicating code in both packages.

On a conceptual level, we've tried to keep Kedro core quite lightweight when it comes to dependencies. Scanning the pyproject.toml of Kedro core, the dependency footprint is mostly configuration utilities and is relatively small. Part of the rationale for decoupling kedro-datasets in the first place was that the connectors themselves were the heavy part!
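For anyone who wants to check that footprint in their own environment, here is a minimal sketch using only the standard library. The helper names `declared_dependencies` and `core_dependencies` are made up for this illustration, and filtering on the substring "extra ==" is a rough assumption about how optional requirements are marked rather than a proper marker parser.

```python
from importlib.metadata import PackageNotFoundError, requires


def declared_dependencies(distribution: str) -> list[str]:
    """Return the requirement strings a distribution declares ([] if not installed)."""
    try:
        reqs = requires(distribution)
    except PackageNotFoundError:
        return []
    # requires() returns None when the metadata lists no requirements.
    return sorted(reqs or [])


def core_dependencies(distribution: str) -> list[str]:
    """Hypothetical helper: drop requirements that only apply to optional extras."""
    return [r for r in declared_dependencies(distribution) if "extra ==" not in r]


if __name__ == "__main__":
    # Prints nothing if kedro is not installed in the current environment.
    for requirement in core_dependencies("kedro"):
        print(requirement)
```

Running the same helper against "kedro-datasets" would list what the datasets package itself declares, which is one way to compare the two footprints.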

Could you tell us a little about your situation and where this would make an impact?

datajoely avatar Sep 04 '25 08:09 datajoely

Hi @npfp, what @datajoely mentioned about AbstractDataset is correct, and moving it to kedro-datasets or duplicating it across packages wouldn't be a trivial choice. It would be helpful to hear more about your use case and, as Joel asked, where this would make an impact for you.

merelcht avatar Oct 01 '25 14:10 merelcht

Relevant issue/comment: https://github.com/kedro-org/kedro/issues/1936#issuecomment-2264776456

Another request: https://github.com/kedro-org/kedro/issues/1758#issuecomment-3279394202

Anecdotally, this has come up many times over the years (users have mentioned Kedro-Datasets being one of the most useful/powerful parts of Kedro). There also seemed to be general alignment around introducing DatasetProtocol as a next step.
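To illustrate what a protocol-based interface could look like, here is a rough sketch using `typing.Protocol`. Only the name DatasetProtocol comes from the discussion; the `load`/`save` method set, the `InMemoryDataset` class and the `round_trip` helper are assumptions made for this example, and whatever Kedro defines may well look different.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class DatasetProtocol(Protocol):
    """Hypothetical structural interface: anything exposing load() and save()."""

    def load(self) -> Any: ...

    def save(self, data: Any) -> None: ...


class InMemoryDataset:
    """Toy dataset that does not inherit from any Kedro class."""

    def __init__(self) -> None:
        self._data: Any = None

    def load(self) -> Any:
        return self._data

    def save(self, data: Any) -> None:
        self._data = data


def round_trip(dataset: DatasetProtocol, data: Any) -> Any:
    """Save data through the protocol, then load it back."""
    dataset.save(data)
    return dataset.load()
```

The point of the sketch is that a class satisfies the protocol purely by shape, so a dataset implementation would not need to import a base class from Kedro core in order to be usable wherever the protocol is expected.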

deepyaman avatar Oct 02 '25 10:10 deepyaman

This issue has been closed due to lack of information. Feel free to re-open this issue if you're facing a similar problem. Please provide as much information as possible so we can help resolve your issue.

github-actions[bot] avatar Oct 30 '25 09:10 github-actions[bot]