hadoop-lzo-packager
Packaging utilities for GPL compression libraries in Hadoop
Overview
This project provides a convenient way to package the hadoop-gpl-compression project from Google Code.
It has three basic steps:
- Perform an svn export of the most recent revision of hadoop-gpl-compression
- Create and build an RPM
- Create and build a Debian package
Requirements:
- subversion
- java (preferably Sun's JDK)
- JAVA_HOME must be set in your environment
- appropriate package building tools and lzo libs for your platform
  - yum install rpm-build lzo-devel (RedHat based)
  - apt-get install devscripts liblzo2-dev (Debian based)
- ant version 1.7.0 or greater (RedHat will require some fiddling[1])
When you build for your platform, build dependency errors will also tell you which other packages you need to install (e.g. lzo2 devel packages, ant).
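The requirements above can be checked before a build with a short pre-flight script. This is a minimal sketch, not part of run.sh: the helper name missing_tools is hypothetical, and the script only checks that svn, java, and ant are on PATH and that JAVA_HOME is set. It does not verify the Ant version or the lzo development libraries.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each given command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Tools named in the requirements list.
missing=$(missing_tools svn java ant)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
fi

# JAVA_HOME must be set in the environment.
if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set"
fi
```

If the script prints nothing, the basic tools are in place. Platform-specific build dependencies are still reported by the package build itself.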
[1]: First, install "ant" and "ant-nodeps" with yum. Then download the binary Apache Ant distribution from the Apache Ant website and extract the tarball somewhere. Set ANT_HOME in your environment to point to the newly extracted directory, and put $ANT_HOME/bin in your path, before /usr/bin. Running "ant" on the command line should then run $ANT_HOME/bin/ant, not /usr/bin/ant.
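The environment setup in this note might look like the following in a shell profile. This is a sketch under one assumption: /opt/apache-ant is a placeholder for wherever you extracted the Ant tarball.

```shell
# Assumed extraction path; replace with wherever you unpacked the Ant tarball.
ANT_HOME=/opt/apache-ant
export ANT_HOME

# Put Ant's bin directory ahead of /usr/bin so it wins the lookup.
PATH="$ANT_HOME/bin:$PATH"
export PATH

# Show which ant binary the shell resolves (prints a note if none is found).
command -v ant || echo "ant not found on PATH"
```

If the last command prints a path under /usr/bin, the PATH ordering is wrong. Running "ant -version" afterwards confirms that the resolved binary is 1.7.0 or greater.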
Usage:
To build packages, simply run the included shell script.
./run.sh
We recommend running this on the same platform as your tasktrackers to ensure the built libraries are compatible.
Various options are available. To list them, run:
./run.sh -h
To skip building the RPM or the Debian package, run one of:
./run.sh --no-rpm
./run.sh --no-deb
If you'd like to check out a particular revision, you can do:
./run.sh --svn-rev=46
If the downloads fail because of certificate problems, you can do:
WGET_OPTS=--no-check-certificate ./run.sh
If the build fails and you find a file named build/master, your version of wget does not use the filename from the redirected URL. You can work around it with:
WGET_OPTS=--trust-server-names=on ./run.sh
Or with both options:
WGET_OPTS="--no-check-certificate --trust-server-names=on" ./run.sh
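The build/master symptom described above can be detected after a failed run with a small check. This is a minimal sketch: the function name has_stray_master is hypothetical, and it only tests for the file. It does not rerun the build.

```shell
# Hypothetical helper: succeeds if the stray download file "master" exists
# in the given build directory (defaults to ./build).
has_stray_master() {
    [ -f "${1:-build}/master" ]
}

if has_stray_master build; then
    echo "Found build/master; retry with: WGET_OPTS=--trust-server-names=on ./run.sh"
fi
```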
Other variables can also be overridden; see the top section of run.sh for the full list.
After running the script, you should be able to find debs in the build/deb directory and RPMs in the build/topdir/RPMS directory.
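To confirm what was produced, the two output directories can be listed in one step. This is a sketch, with list_packages as a hypothetical helper. It assumes only the build/deb and build/topdir/RPMS locations named above, and searches below them because RPMs are typically placed in per-architecture subdirectories.

```shell
# Hypothetical helper: list .deb and .rpm files under a build directory
# (defaults to ./build). Prints nothing if the directories do not exist.
list_packages() {
    build_dir="${1:-build}"
    for dir in "$build_dir/deb" "$build_dir/topdir/RPMS"; do
        if [ -d "$dir" ]; then
            find "$dir" -type f \( -name '*.deb' -o -name '*.rpm' \) | sort
        fi
    done
}

list_packages build
```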
Contributing
To contribute to this project, please clone its repository from http://github.com/toddlipcon/hadoop-lzo-packager/ and commit patches to your GitHub repository. When you would like to submit your contribution for inclusion, send a pull request to the Cloudera repository.