lakeFS
[Documentation] document Hadoop/Spark support matrix on lakeFSFileSystem
lakeFS Hadoop Filesystem
Our direct-access Hadoop Filesystem implementation currently depends on several hadoop-* libraries at version 2.7.7.
This follows from the decision to support only Spark 2.4.7 and 3.0.1, both of which depend on Hadoop 2.7.7.
Metadata client
The metadata client is compiled and published twice: once for Spark 2.4.7 and once for Spark 3.0.1.
The problem
These Spark versions are getting old. We need to research how to support more versions. Users should be able to tell easily which version to use and how to get it.
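As a rough illustration of what an easy-to-read support matrix could look like, here is a minimal sketch that encodes only the two Spark versions named above and the Hadoop version they depend on. The names `SUPPORTED` and `hadoop_version_for` are hypothetical and not part of lakeFS; the point is only that an unsupported version should produce a clear message listing the supported ones.

```python
# Support matrix taken from this issue: each supported Spark version
# mapped to the Hadoop version it depends on.
# (Hypothetical helper for illustration; not part of lakeFS.)
SUPPORTED = {
    "2.4.7": "2.7.7",
    "3.0.1": "2.7.7",
}


def hadoop_version_for(spark_version: str) -> str:
    """Return the Hadoop version for a supported Spark version.

    Raises ValueError, listing the supported versions, for anything else.
    """
    try:
        return SUPPORTED[spark_version]
    except KeyError:
        supported = ", ".join(sorted(SUPPORTED))
        raise ValueError(
            f"Spark {spark_version} is not supported; "
            f"supported versions: {supported}"
        ) from None
```

Supporting another Spark version would then amount to adding a row to the table, which is the same information the documentation would need to present.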
Related issues
#2336 - describes the need to support Hadoop 3.1 specifically.
#2731 - caused by a missing feature in Hadoop 2.7.7.
#2597 - task to test flows based on the metadata client (export/GC) in Spark 3.1 on EMR.
duplicate of #2800