org.apache.pig.backend.hadoop.executionengine.mapreduceExec
Class PigMapReduce
java.lang.Object
org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce
- All Implemented Interfaces:
- Closeable, org.apache.hadoop.mapred.JobConfigurable, org.apache.hadoop.mapred.MapRunnable<org.apache.hadoop.io.WritableComparable,Tuple,org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>, org.apache.hadoop.mapred.Reducer<Tuple,IndexedTuple,org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>
public class PigMapReduce
- extends Object
- implements org.apache.hadoop.mapred.MapRunnable<org.apache.hadoop.io.WritableComparable,Tuple,org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>, org.apache.hadoop.mapred.Reducer<Tuple,IndexedTuple,org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>
This class wraps Pig Map/Reduce jobs and implements both the map side and the reduce side.
Its methods are driven by the following job configuration variables:
- pig.inputs
- A semicolon-separated list of inputs. If an input uses a special parser, the parser is
specified by appending a colon and the parser's name to the input. For example,
/tmp/names.txt;/tmp/logs.dat:com.yahoo.research.pig.parser.LogParser will parse /tmp/names.txt
using the default parser and /tmp/logs.dat using com.yahoo.research.pig.parser.LogParser.
- pig.mapFuncs
- A semicolon-separated list of function specifications to be applied to the inputs in the
Map phase. This list must have the same number of items as pig.inputs because each
function specification is matched to the corresponding input.
- pig.groupFuncs
- A semicolon-separated list of group functions. As with pig.mapFuncs, this list must have
the same number of items as pig.inputs because each group function is matched to the
corresponding input.
- pig.reduceFuncs
- The function specification to be applied to the tuples passed into the Reduce phase.
- Author:
- breed
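The pig.inputs format and the item-count rule described above can be sketched in plain Java. This is only an illustration of the string format, not Pig's actual parsing code: the class PigInputsSketch, the InputSpec type, and both helper methods are hypothetical names introduced here, and the sketch assumes that input paths contain no colon of their own and that individual function specifications contain no semicolon.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: illustrates the pig.inputs string format only.
final class PigInputsSketch {

    /** One entry of pig.inputs: a path plus an optional parser class name. */
    static final class InputSpec {
        final String path;
        final String parser; // null means "use the default parser"

        InputSpec(String path, String parser) {
            this.path = path;
            this.parser = parser;
        }
    }

    /** Splits a pig.inputs value on ';' and then splits each entry on its first ':'. */
    static List<InputSpec> parseInputs(String pigInputs) {
        List<InputSpec> specs = new ArrayList<>();
        for (String entry : pigInputs.split(";")) {
            // Assumption: the path itself contains no ':' (no URI scheme such as hdfs://).
            int colon = entry.indexOf(':');
            if (colon < 0) {
                specs.add(new InputSpec(entry, null));
            } else {
                specs.add(new InputSpec(entry.substring(0, colon), entry.substring(colon + 1)));
            }
        }
        return specs;
    }

    /** Checks the documented rule that a per-input list has one item per input. */
    static boolean matchesInputCount(String pigInputs, String perInputList) {
        // Assumption: no individual item contains a ';' of its own.
        return pigInputs.split(";").length == perInputList.split(";").length;
    }

    public static void main(String[] args) {
        String inputs = "/tmp/names.txt;/tmp/logs.dat:com.yahoo.research.pig.parser.LogParser";
        for (InputSpec spec : parseInputs(inputs)) {
            System.out.println(spec.path + " -> "
                    + (spec.parser == null ? "(default parser)" : spec.parser));
        }
    }
}
```

The same count check would apply to both pig.mapFuncs and pig.groupFuncs, since each must line up item by item with pig.inputs.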
Field Summary
static org.apache.hadoop.mapred.Reporter reporter
Method Summary
void close()
- Nothing happens here.
void closeSideFiles()
void configure(org.apache.hadoop.mapred.JobConf jobConf)
static PigContext getPigContext()
void reduce(Tuple key,
Iterator<IndexedTuple> values,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable> output,
org.apache.hadoop.mapred.Reporter reporter)
void run(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.WritableComparable,Tuple> input,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable> output,
org.apache.hadoop.mapred.Reporter reporter)
- Called by Hadoop's MapTask as the MapRunnable.run() method.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
reporter
public static org.apache.hadoop.mapred.Reporter reporter
PigMapReduce
public PigMapReduce()
run
public void run(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.WritableComparable,Tuple> input,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable> output,
org.apache.hadoop.mapred.Reporter reporter)
throws IOException
- Hadoop's MapTask calls this method as the MapRunnable.run() implementation. It pulls
the tuples from the PigRecordReader (see the ThreadLocal hack), pipes them through the
function pipeline, and then closes the writer.
- Specified by:
run
in interface org.apache.hadoop.mapred.MapRunnable<org.apache.hadoop.io.WritableComparable,Tuple,org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>
- Throws:
IOException
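The pull, pipe, and close sequence described for run can be sketched without Hadoop on the classpath. This is a minimal illustration of the pattern under stated assumptions, not the method's actual body: MapRunSketch and Sink are hypothetical names, plain strings stand in for Tuple, an Iterator stands in for the RecordReader, and a list of java.util.function.Function stages stands in for the function pipeline.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in, not Pig code: plain Java types replace RecordReader,
// Tuple, and OutputCollector so the sketch runs without Hadoop.
final class MapRunSketch {

    /** Minimal sink with a close step, standing in for the output writer. */
    static final class Sink implements AutoCloseable {
        final List<String> collected = new ArrayList<>();
        boolean closed = false;

        void collect(String tuple) {
            collected.add(tuple);
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Pulls every record, pipes it through each stage in order, then closes the sink. */
    static void run(Iterator<String> input, List<Function<String, String>> pipeline, Sink sink) {
        try {
            while (input.hasNext()) {
                String tuple = input.next();
                for (Function<String, String> stage : pipeline) {
                    tuple = stage.apply(tuple);
                }
                sink.collect(tuple);
            }
        } finally {
            sink.close(); // close the sink even if a stage throws
        }
    }

    public static void main(String[] args) {
        List<Function<String, String>> pipeline = new ArrayList<>();
        pipeline.add(String::trim);
        pipeline.add(String::toUpperCase);
        Sink sink = new Sink();
        run(Arrays.asList(" alice ", "bob").iterator(), pipeline, sink);
        System.out.println(sink.collected + " closed=" + sink.closed);
    }
}
```

Closing the sink in a finally block is a design choice of this sketch; the source only states that the writer is closed after the tuples have been piped through.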
reduce
public void reduce(Tuple key,
Iterator<IndexedTuple> values,
org.apache.hadoop.mapred.OutputCollector<org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable> output,
org.apache.hadoop.mapred.Reporter reporter)
throws IOException
- Specified by:
reduce
in interface org.apache.hadoop.mapred.Reducer<Tuple,IndexedTuple,org.apache.hadoop.io.WritableComparable,org.apache.hadoop.io.Writable>
- Throws:
IOException
configure
public void configure(org.apache.hadoop.mapred.JobConf jobConf)
- Specified by:
configure
in interface org.apache.hadoop.mapred.JobConfigurable
close
public void close()
throws IOException
- Nothing happens here.
- Specified by:
close
in interface Closeable
- Throws:
IOException
getPigContext
public static PigContext getPigContext()
closeSideFiles
public void closeSideFiles()
Copyright © The Apache Software Foundation