data-science
Transaction Check Error During EMR Bootstrapping
If you attempt to run activities/common-crawl on Amazon EMR, the job terminates with an error:
On the master instance (i-xxxxxx), bootstrap action 1 returned a non-zero return code
This is related to changes Amazon made to the package environment and its dependencies: installing chkconfig during bootstrap conflicts with packages already present on the instance.
Log:
Transaction check error:
file /etc/init.d from install of chkconfig-1.3.49.3-2.14.amzn1.x86_64 conflicts with file from package EmrMetrics-1.0-1.noarch
file /etc/init.d from install of chkconfig-1.3.49.3-2.14.amzn1.x86_64 conflicts with file from package service-nanny-1.0-1.noarch
file /etc/init.d from install of chkconfig-1.3.49.3-2.14.amzn1.x86_64 conflicts with file from package instance-controller-1.0-1.noarch
file /etc/init.d from install of chkconfig-1.3.49.3-2.14.amzn1.x86_64 conflicts with file from package hadoop-state-pusher-1.0-1.noarch
Here are two workarounds, both applied in mrjob.conf:
- Change the EMR AMI version to 3.5.0 or later.
- Change the yum installation line to pin the release version, using this format:
yum --releasever=2014.09 install <package_name>
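
As a rough sketch, the two workarounds might look like this in mrjob.conf. The option names `ami_version` and `bootstrap` are assumptions based on older mrjob releases and may differ in your version (newer releases rename the AMI option), so check them against the mrjob documentation. The `sudo` prefix and `-y` flag are also additions here, on the assumption that bootstrap commands run non-interactively; `<package_name>` remains a placeholder.

```yaml
# mrjob.conf (sketch; option names assumed from older mrjob releases)
runners:
  emr:
    # Workaround 1: use EMR AMI version 3.5.0 or later.
    ami_version: 3.5.0
    # Workaround 2: pin the yum release version in the bootstrap install line.
    # Replace <package_name> with the package your bootstrap step installs.
    bootstrap:
    - sudo yum --releasever=2014.09 install -y <package_name>
```

The workarounds are listed as alternatives, so you would typically keep only one of the two settings.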