lakeFS
Proposal: declarative views
This is pending a few user interviews, but it is based on prior user conversations spanning 3 different use cases:
- Manage inputs for ML and DS experiments on the data consumer side (as opposed to commits, which are on the producer side)
- Support use cases where users might want to access the same branch across repos
- Possibly allow a server-side, router-fs-like solution where data is composed from external sources
I like the idea. Here are some thoughts I had while reading it (in no particular order, and the items are not connected):
- Views, like images, can be static or live. Live views can help with integrations or Boto-like use cases. This is very similar to the idea of handling the "Boto" routing on the server side.
- The suggested idea could be relevant as a standalone data tool, not just as part of lakeFS. Such a tool would copy data to build the view, while in lakeFS we would get it with zero copy.
- Instead of a Lakefile, we could have a manifest-like JSON file that is processed as part of repository creation. This gives a fast way to create a repository with an initial state. If we include an option to create an immutable repository and/or a branchless repository (like git without a master branch), we can leverage the current lakeFS capabilities without introducing a new entity. The repository could then be accessed without branch information, and it would contain the view we built.
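To make the manifest idea in the last item a bit more concrete, here is a rough sketch of what such a JSON file and its processing might look like. Nothing here is an existing lakeFS format or API: the field names (`repository`, `immutable`, `branchless`, `entries`, `source`), the repository names, and the `parse_manifest` helper are all made up for illustration. The sketch only parses and validates the manifest, and resolves each entry to a `lakefs://<repository>/<ref>/<path>` style URI; actually creating the repository is left out.

```python
import json
from dataclasses import dataclass

# Hypothetical manifest. Every field name and value is an assumption
# made for illustration; this is not an existing lakeFS file format.
MANIFEST = """
{
  "repository": "experiment-42-inputs",
  "immutable": true,
  "branchless": true,
  "entries": [
    {"path": "train/",
     "source": {"repository": "events", "ref": "main", "path": "daily/2021/"}},
    {"path": "labels/",
     "source": {"repository": "annotations", "ref": "v3", "path": "labels/"}}
  ]
}
"""


@dataclass(frozen=True)
class Entry:
    path: str        # where the data appears inside the new repository
    source_uri: str  # where the data comes from (no copy implied)


@dataclass(frozen=True)
class RepoManifest:
    repository: str
    immutable: bool
    branchless: bool
    entries: tuple


def parse_manifest(text):
    """Parse the hypothetical manifest and reject duplicate target paths."""
    raw = json.loads(text)
    seen = set()
    entries = []
    for item in raw["entries"]:
        path = item["path"]
        if path in seen:
            raise ValueError(f"duplicate view path: {path}")
        seen.add(path)
        src = item["source"]
        uri = f"lakefs://{src['repository']}/{src['ref']}/{src['path']}"
        entries.append(Entry(path, uri))
    return RepoManifest(
        repository=raw["repository"],
        immutable=bool(raw.get("immutable", False)),
        branchless=bool(raw.get("branchless", False)),
        entries=tuple(entries),
    )


manifest = parse_manifest(MANIFEST)
for entry in manifest.entries:
    print(f"{entry.path} <- {entry.source_uri}")
```

Keeping the manifest as plain data means the same file could drive either a zero-copy repository creation inside lakeFS or a copy-based standalone tool, which is the distinction raised in the second item above.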