lakeFS
lakeFS - Data version control for your data lake | Git for data
Tracking this thread: https://lakefs.slack.com/archives/C016726JLJW/p1658771052984169 Use case: We're seeing if we can handle data reformatting within lakeFS - so if someone uploads a CSV we can automatically convert it to Parquet...
Signed-off-by: AdamKorcz This PR adds a simple fuzzer and the infrastructure to integrate [ClusterFuzzLite](https://google.github.io/clusterfuzzlite/) into lakeFS. ClusterFuzzLite will run the fuzzers when PRs are opened.
In this task, we will configure a working lakeFS-Iceberg setup. The task's output should be: - docs page for how to configure Iceberg to work with lakeFS, and the use...
## lakeFS Hadoop Filesystem

Our direct-access Hadoop Filesystem implementation currently depends on some `hadoop-*` libraries at version 2.7.7. This was done with the decision to support (only) Spark version 2.4.7...
When collecting logs from JSON format into Grafana we get logs such as these (reformatted for beauty and legibility):
```json
{
  "file": "usr/local/go/src/net/http/server.go:2047",
  "func": "net/http.HandlerFunc.ServeHTTP",
  "host": "treeverse.eu-central-1.lakefscloud.io",
  "level": "info",
  "log_audit":...
```
DO NOT MERGE in current form, it will _remove_ support for Hadoop2 :horror:
In KV, deleting an item is an idempotent operation that succeeds whether or not the item existed in the store. In DB we relied on the Postgres `DELETE` command, which failed...
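To illustrate the idempotent-delete behaviour described above, here is a minimal sketch in Go. It uses a hypothetical in-memory `MemStore` type (not the lakeFS KV interface, whose actual signatures are not shown here) and only demonstrates that deleting a missing key returns no error.

```go
package main

import (
	"fmt"
	"sync"
)

// MemStore is a hypothetical in-memory key-value store used only for
// illustration; it is not the lakeFS KV store.
type MemStore struct {
	mu   sync.Mutex
	data map[string][]byte
}

// NewMemStore returns an empty store.
func NewMemStore() *MemStore {
	return &MemStore{data: make(map[string][]byte)}
}

// Set stores value under key, overwriting any previous value.
func (s *MemStore) Set(key string, value []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// Has reports whether key is present.
func (s *MemStore) Has(key string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	_, ok := s.data[key]
	return ok
}

// Delete removes key if it exists. Deleting an absent key is not an
// error, so calling Delete twice has the same effect as calling it once.
func (s *MemStore) Delete(key string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.data, key) // the built-in delete is a no-op for absent keys
	return nil
}

func main() {
	s := NewMemStore()
	s.Set("branch/main", []byte("commit-1"))
	fmt.Println(s.Delete("branch/main")) // key existed
	fmt.Println(s.Delete("branch/main")) // key already gone, still succeeds
}
```

A caller that needs to know whether the item existed would have to check with something like `Has` first; the delete itself does not report it.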
The error should not be wrapped with key information. Instead, add logging to the module.
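One way to read the suggestion above, sketched in Go with the standard `log` package: return the sentinel error unchanged and record the key in a log line instead. The names `ErrNotFound` and `lookup` are hypothetical and stand in for whichever module the issue refers to.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// ErrNotFound is a hypothetical sentinel error for this sketch.
var ErrNotFound = errors.New("not found")

// lookup returns the value stored under key. On a miss it logs the key
// and returns ErrNotFound as-is, without wrapping key details into it.
func lookup(data map[string]string, key string) (string, error) {
	v, ok := data[key]
	if !ok {
		log.Printf("lookup failed: key=%q", key) // key goes to the log only
		return "", ErrNotFound
	}
	return v, nil
}

func main() {
	data := map[string]string{"a": "1"}
	_, err := lookup(data, "missing")
	// The caller sees the plain sentinel error; the key is in the log.
	fmt.Println(errors.Is(err, ErrNotFound))
}
```

Keeping the returned error free of key details means callers can compare it directly, while the log line still carries the context needed for debugging.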
Though not part of graveler, this functionality is heavily dependent on graveler and its usage of the underlying store. Having tests for these will add to our coverage and performance...
Follow the [proposal](https://github.com/treeverse/lakeFS/blob/master/design/accepted/metadata_kv/index.md#caching-branch-pointers-and-amortized-reads) to implement the amortized read/write of a branch