mrjob parse EMR controller step log

Open coyotemarin opened this issue 6 years ago • 1 comment

Sometimes the error is in the controller log, for example:

2018-11-02T20:47:28.392Z INFO Ensure step 1 jar file /home/hadoop/hadoop-examples.jar
INFO Failed to download: /home/hadoop/hadoop-examples.jar
java.io.FileNotFoundException: File '/home/hadoop/hadoop-examples.jar' does not exist on local filesystem
        at aws157.instancecontroller.util.S3Wrapper.getLocalHadoopFile(S3Wrapper.java:382)
        at aws157.instancecontroller.util.S3Wrapper.fetchHadoopFileToLocal(S3Wrapper.java:349)
        at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner$Runner.<init>(HadoopJarStepRunner.java:243)
        at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:152)
        at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:146)
        at aws157.instancecontroller.master.steprunner.StepExecutor.runStep(StepExecutor.java:136)
        at aws157.instancecontroller.master.steprunner.StepExecutor.run(StepExecutor.java:70)
        at aws157.instancecontroller.master.steprunner.StepExecutionManager.enqueueStep(StepExecutionManager.java:248)
        at aws157.instancecontroller.master.steprunner.StepExecutionManager.doRun(StepExecutionManager.java:195)
        at aws157.instancecontroller.master.steprunner.StepExecutionManager.access$000(StepExecutionManager.java:33)
        at aws157.instancecontroller.master.steprunner.StepExecutionManager$1.run(StepExecutionManager.java:94)
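
For illustration, here is a minimal sketch of how a trace like the one above could be pulled out of a controller log. It assumes the log is plain text in which a Java exception header line is followed by indented "at ..." frames, as in this excerpt. The name find_controller_errors is hypothetical and is not part of mrjob's API.

```python
import re

# Matches a Java exception header such as
# "java.io.FileNotFoundException: File '...' does not exist ..."
_JAVA_EXCEPTION_RE = re.compile(
    r'^(?P<exception>[\w.$]+(?:Exception|Error))(?::\s*(?P<message>.*))?$')

# Matches a stack frame such as "    at pkg.Class.method(File.java:123)"
_STACK_FRAME_RE = re.compile(r'^\s+at\s+(?P<frame>\S.*)$')


def find_controller_errors(lines):
    """Yield one dict per Java exception found in controller log lines.

    Hypothetical helper (an assumption for illustration only).
    Each dict has the exception class, its message, the log line that
    came right before it (often the most readable hint), and the frames.
    """
    current = None
    prev_line = None

    for raw in lines:
        line = raw.rstrip('\r\n')
        header = _JAVA_EXCEPTION_RE.match(line)
        frame = _STACK_FRAME_RE.match(line)

        if header:
            # a new exception starts; emit the previous one, if any
            if current is not None:
                yield current
            current = {
                'exception': header.group('exception'),
                'message': header.group('message') or '',
                'context': prev_line,
                'frames': [],
            }
        elif frame and current is not None:
            current['frames'].append(frame.group('frame'))
        elif current is not None:
            # any other line ends the current stack trace
            yield current
            current = None

        prev_line = line

    if current is not None:
        yield current
```

Calling list(find_controller_errors(open(path))) on a downloaded controller log (path being wherever the file was saved) would give one entry per trace. Keeping the preceding line as context is a design choice: in the excerpt above, "INFO Failed to download: ..." says more at a glance than the stack frames do.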

Nov 02 '18 21:11 coyotemarin

Even stdout can be useful in a pinch. For example, here's what it contained after I misplaced a command-line switch while running hadoop-mapreduce-examples.jar:

Unknown program '-D' chosen.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.

Nov 02 '18 21:11 coyotemarin
