hdfs-deprecated
Support for non-CDH HDFS distributions
Currently the framework declares the Hadoop/HDFS 2.4.0 dependencies in the pom, but the build-hdfs packaging script packages hadoop-2.3.0-cdh5.1.0 from Cloudera.
- The pom dependency version and the packaged binary version should match.
- We should support Apache and other arbitrary HDFS distros/versions, as long as the CLI is compatible.
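A minimal sketch of how the version mismatch could be detected automatically, for example as a pre-packaging check. It assumes distribution names follow the `hadoop-<major>.<minor>.<patch>[-<vendor suffix>]` pattern seen above; the function names `upstream_hadoop_version` and `versions_match` are hypothetical and not part of the framework or of build-hdfs.

```python
import re


def upstream_hadoop_version(dist_name):
    """Extract the upstream Hadoop version from a distribution name.

    Assumes names such as 'hadoop-2.3.0-cdh5.1.0' or 'hadoop-2.4.0';
    any vendor suffix after the three-part version is ignored.
    """
    match = re.match(r"^hadoop-(\d+\.\d+\.\d+)(?:-.+)?$", dist_name)
    if match is None:
        raise ValueError("unrecognized distribution name: %r" % dist_name)
    return match.group(1)


def versions_match(pom_version, dist_name):
    """Return True if the pom dependency version equals the packaged one."""
    return upstream_hadoop_version(dist_name) == pom_version


if __name__ == "__main__":
    # The combination described in this issue: pom at 2.4.0, package at 2.3.0.
    print(versions_match("2.4.0", "hadoop-2.3.0-cdh5.1.0"))  # prints False
```

Comparing only the upstream version keeps the check distribution-agnostic, which fits the goal of supporting non-CDH packages; a stricter check could also compare the vendor suffix.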
Note that the dependencies are Apache-licensed, the packaged scripts from hadoop-2.3.0-cdh5.1.0 are also Apache-licensed, and the framework source code is Apache-licensed as well.
To match versions, perhaps we should download hadoop-2.4.0-cdh5.1.0 instead of the 2.3.0 package.
Is it necessary for HDFS to be present in the pom at all?
Note that we may want to pull down the packaged Hadoop scripts and binaries from a pure Apache source. We may also want to modify the pom to support multiple versions.