Class CompositeInputFormat<K extends WritableComparable>
java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<K,TupleWritable>
org.apache.hadoop.mapreduce.lib.join.CompositeInputFormat<K>
@Public
@Stable
public class CompositeInputFormat<K extends WritableComparable>
extends InputFormat<K,TupleWritable>
An InputFormat capable of performing joins over a set of data sources sorted
and partitioned the same way.
A user may define new join types by setting the property
mapreduce.join.define.<ident> to a classname.
In the expression mapreduce.join.expr, the identifier will be
assumed to be a ComposableRecordReader.
mapreduce.join.keycomparator can be a classname used to compare
keys in the join.
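To make the "sorted and partitioned the same way" requirement concrete, here is a minimal sketch in plain Java. It is not Hadoop code and does not reflect how the library is implemented. It walks two key-sorted lists in step and emits one combined record per key present in both, which is only possible in a single pass because both inputs share the same ordering. The class MergeJoinSketch, the method innerJoin, and the use of Map.Entry and String as stand-ins for records and tuples are all hypothetical. The comparator argument plays the role that a class named in mapreduce.join.keycomparator would play.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hadoop.
class MergeJoinSketch {

    // Inner join of two lists that are already sorted by key under cmp.
    // Assumption of this sketch: keys are unique within each list.
    static <K> List<String> innerJoin(List<Map.Entry<K, String>> left,
                                      List<Map.Entry<K, String>> right,
                                      Comparator<? super K> cmp) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int c = cmp.compare(left.get(i).getKey(), right.get(j).getKey());
            if (c < 0) {
                i++; // key appears only on the left, skip it
            } else if (c > 0) {
                j++; // key appears only on the right, skip it
            } else {
                // Same key on both sides: emit one joined record.
                out.add(left.get(i).getKey() + "=[" + left.get(i).getValue()
                        + "," + right.get(j).getValue() + "]");
                i++;
                j++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, String>> a =
                List.of(Map.entry(1, "a1"), Map.entry(2, "a2"), Map.entry(4, "a4"));
        List<Map.Entry<Integer, String>> b =
                List.of(Map.entry(2, "b2"), Map.entry(3, "b3"), Map.entry(4, "b4"));
        // prints [2=[a2,b2], 4=[a4,b4]]
        System.out.println(innerJoin(a, b, Comparator.naturalOrder()));
    }
}
```

If either input were unsorted, or sorted under a different comparator, the single forward pass above would silently miss matches, which is why the sources must agree on ordering.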
Field Summary
Fields: JOIN_COMPARATOR, JOIN_EXPR
Constructor Summary
Constructors: CompositeInputFormat()
Method Summary
Modifier and Type | Method | Description
protected void | addDefaults() | Adds the default set of identifiers to the parser.
static String | compose(Class<? extends InputFormat> inf, String path) | Convenience method for constructing composite formats.
static String | compose(String op, Class<? extends InputFormat> inf, String... path) | Convenience method for constructing composite formats.
static String | compose(String op, Class<? extends InputFormat> inf, Path... path) | Convenience method for constructing composite formats.
RecordReader<K,TupleWritable> | createRecordReader(InputSplit split, TaskAttemptContext taskContext) | Construct a CompositeRecordReader for the children of this InputFormat as defined in the init expression.
List<InputSplit> | getSplits(JobContext job) | Build a CompositeInputSplit from the child InputFormats by assigning the ith split from each child to the ith composite split.
void | setFormat(Configuration conf) | Interpret a given string as a composite expression.
Field Details
JOIN_EXPR
public static final String JOIN_EXPR
JOIN_COMPARATOR
public static final String JOIN_COMPARATOR
Constructor Details
CompositeInputFormat
public CompositeInputFormat()
Method Details
setFormat
public void setFormat(Configuration conf) throws IOException
Interpret a given string as a composite expression.
  func  ::= <ident>([<func>,]*<func>)
  func  ::= tbl(<class>,"<path>")
  class ::= @see java.lang.Class#forName(java.lang.String)
  path  ::= @see org.apache.hadoop.fs.Path#Path(java.lang.String)
Reads the expression from the mapreduce.join.expr property and user-supplied join types from the mapreduce.join.define.<ident> properties. Paths supplied to tbl are given as input paths to the InputFormat class listed.
Throws:
IOException
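The grammar for composite expressions is small enough to read with a hand-written recursive-descent routine. The sketch below is a hypothetical, dependency-free illustration of that grammar and is not Hadoop's parser: the class JoinExprSketch and its method tablePaths are invented names, class names are only scanned and never loaded, and whitespace is not accepted. It returns the paths of all tbl leaves in left-to-right order and rejects malformed input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for the func/tbl grammar; not Hadoop's parser.
class JoinExprSketch {
    private final String s;
    private int pos = 0;
    private final List<String> paths = new ArrayList<>();

    private JoinExprSketch(String s) {
        this.s = s;
    }

    // Returns the path of every tbl(...) leaf, left to right.
    // Throws IllegalArgumentException if expr does not match the grammar.
    static List<String> tablePaths(String expr) {
        JoinExprSketch p = new JoinExprSketch(expr);
        p.func();
        if (p.pos != expr.length()) {
            throw p.error("trailing input");
        }
        return p.paths;
    }

    // func ::= <ident>([<func>,]*<func>)  |  tbl(<class>,"<path>")
    private void func() {
        String ident = ident();
        expect('(');
        if (ident.equals("tbl")) {
            ident(); // class name: scanned but not loaded in this sketch
            expect(',');
            expect('"');
            int end = s.indexOf('"', pos);
            if (end < 0) {
                throw error("unterminated path");
            }
            paths.add(s.substring(pos, end));
            pos = end + 1;
        } else {
            func(); // at least one child is required
            while (pos < s.length() && s.charAt(pos) == ',') {
                pos++;
                func();
            }
        }
        expect(')');
    }

    // Reads a dotted identifier such as inner or org.example.MyFormat.
    private String ident() {
        int start = pos;
        while (pos < s.length()
                && (Character.isJavaIdentifierPart(s.charAt(pos)) || s.charAt(pos) == '.')) {
            pos++;
        }
        if (start == pos) {
            throw error("identifier expected");
        }
        return s.substring(start, pos);
    }

    private void expect(char c) {
        if (pos >= s.length() || s.charAt(pos) != c) {
            throw error("expected '" + c + "'");
        }
        pos++;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at offset " + pos);
    }

    public static void main(String[] args) {
        // org.example.MyFormat and the operator name "inner" are placeholders.
        String expr = "inner(tbl(org.example.MyFormat,\"/a\"),tbl(org.example.MyFormat,\"/b\"))";
        System.out.println(tablePaths(expr)); // prints [/a, /b]
    }
}
```

Because func recurses on its own children, nested expressions such as an operator applied to the result of another operator are handled without extra code.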
addDefaults
protected void addDefaults()
Adds the default set of identifiers to the parser.
getSplits
public List<InputSplit> getSplits(JobContext job) throws IOException, InterruptedException
Build a CompositeInputSplit from the child InputFormats by assigning the ith split from each child to the ith composite split.
Specified by:
getSplits in class InputFormat<K extends WritableComparable,TupleWritable>
Parameters:
job - job configuration.
Returns:
a list of InputSplits for the job.
Throws:
IOException
InterruptedException
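The "ith split from each child to the ith composite split" rule amounts to transposing the children's split lists. The following is a minimal plain-Java sketch of that idea, not the library's code: SplitZipSketch and zipSplits are hypothetical names, splits are represented by an arbitrary type S, and rejecting children with differing split counts is an assumption of this sketch that follows from the requirement that sources be partitioned the same way.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of pairing the ith split of every child.
class SplitZipSketch {

    // childSplits.get(c) holds the splits produced by child c.
    // Element i of the result holds split i of every child, in child order.
    static <S> List<List<S>> zipSplits(List<List<S>> childSplits) {
        List<List<S>> composite = new ArrayList<>();
        if (childSplits.isEmpty()) {
            return composite;
        }
        int n = childSplits.get(0).size();
        for (List<S> child : childSplits) {
            // Assumption of this sketch: identically partitioned children
            // must report the same number of splits.
            if (child.size() != n) {
                throw new IllegalArgumentException(
                        "children disagree on split count: " + n + " vs " + child.size());
            }
        }
        for (int i = 0; i < n; i++) {
            List<S> ith = new ArrayList<>();
            for (List<S> child : childSplits) {
                ith.add(child.get(i));
            }
            composite.add(ith);
        }
        return composite;
    }

    public static void main(String[] args) {
        List<List<String>> children = List.of(
                List.of("a0", "a1"),
                List.of("b0", "b1"));
        System.out.println(zipSplits(children)); // prints [[a0, b0], [a1, b1]]
    }
}
```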
createRecordReader
public RecordReader<K,TupleWritable> createRecordReader(InputSplit split, TaskAttemptContext taskContext) throws IOException, InterruptedException
Construct a CompositeRecordReader for the children of this InputFormat as defined in the init expression. The outermost join need only be composable, not necessarily a composite. Mandating TupleWritable isn't strictly correct.
Specified by:
createRecordReader in class InputFormat<K extends WritableComparable,TupleWritable>
Parameters:
split - the split to be read
taskContext - the information about the task
Returns:
a new record reader
Throws:
IOException
InterruptedException
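compose
public static String compose(Class<? extends InputFormat> inf, String path)
Convenience method for constructing composite formats. Given an InputFormat class (inf) and a path (p), return:
  tbl(<inf>, <p>)
compose
public static String compose(String op, Class<? extends InputFormat> inf, String... path)
Convenience method for constructing composite formats. Given an operation (op), an InputFormat class (inf), and a set of paths (p), return:
  <op>(tbl(<inf>,<p1>),tbl(<inf>,<p2>),...,tbl(<inf>,<pn>))
compose
public static String compose(String op, Class<? extends InputFormat> inf, Path... path)
Convenience method for constructing composite formats. Given an operation (op), an InputFormat class (inf), and a set of paths (p), return:
  <op>(tbl(<inf>,<p1>),tbl(<inf>,<p2>),...,tbl(<inf>,<pn>))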
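The string shapes that compose is documented to return can be reproduced with a few lines of plain Java, which may help when checking an expression by eye. This is a hypothetical stand-in, not the library method: ComposeSketch takes the InputFormat class name as a String so that it runs without Hadoop on the classpath, the exact quoting and absence of spaces follow the tbl(<class>,"<path>") form of the expression grammar and are an assumption here, and both org.example.MyFormat and the operator name "inner" are placeholders.

```java
// Hypothetical stand-in that mimics the documented output shape of compose.
class ComposeSketch {

    // Builds tbl(<inf>,"<p>") for a single path.
    static String tbl(String infClassName, String path) {
        return "tbl(" + infClassName + ",\"" + path + "\")";
    }

    // Builds <op>(tbl(<inf>,"<p1>"),...,tbl(<inf>,"<pn>")).
    static String compose(String op, String infClassName, String... paths) {
        StringBuilder sb = new StringBuilder(op).append('(');
        for (int i = 0; i < paths.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(tbl(infClassName, paths[i]));
        }
        return sb.append(')').toString();
    }

    public static void main(String[] args) {
        // prints inner(tbl(org.example.MyFormat,"/a"),tbl(org.example.MyFormat,"/b"))
        System.out.println(compose("inner", "org.example.MyFormat", "/a", "/b"));
    }
}
```

In a real job, a string of this shape would be the value stored under the mapreduce.join.expr property.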