java.lang.Object
  org.apache.hadoop.mapred.FileOutputFormat<K,V>
    org.apache.hadoop.mapred.lib.MultipleOutputFormat<K,V>
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class MultipleOutputFormat<K,V> extends FileOutputFormat<K,V>
This abstract class extends FileOutputFormat so that a job can write its output data to different output files. There are three basic use cases for this class.
Case one: This class is used for a map reduce job with at least one reducer. The reducer wants to write data to different files depending on the actual keys. It is assumed that a key (or value) encodes both the actual key (value) and the desired location for the actual key (value).
Case two: This class is used for a map-only job. The job wants to use an output file name that is either a part of the input file name of the input data, or some derivation of it.
Case three: This class is used for a map-only job. The job wants to use an output file name that depends on both the keys and the input file name.
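Case one can be illustrated with a small, library-free sketch. It assumes a hypothetical key convention of `<destination>:<actualKey>`; the class name, the separator, and the two helper methods are inventions for illustration and are not part of Hadoop. In a real subclass, the same logic would presumably live in overrides of `generateFileNameForKeyValue` and `generateActualKey`.

```java
// Sketch only: mimics the two hooks a subclass might override for case one.
// The "<destination>:<actualKey>" key layout is an assumption, not a Hadoop convention.
final class KeyRoutingSketch {
    static final char SEP = ':';

    // Plays the role of generateFileNameForKeyValue: choose a file from the key.
    static String fileNameFor(String key, String leafName) {
        int i = key.indexOf(SEP);
        if (i < 0) {
            return leafName; // no destination encoded, keep the leaf name as is
        }
        return key.substring(0, i) + "/" + leafName;
    }

    // Plays the role of generateActualKey: strip the destination prefix.
    static String actualKey(String key) {
        int i = key.indexOf(SEP);
        return i < 0 ? key : key.substring(i + 1);
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("errors:host42", "part-00000")); // errors/part-00000
        System.out.println(actualKey("errors:host42"));                 // host42
    }
}
```

With this convention, a record keyed `errors:host42` would be written under `errors/part-00000` with the key reduced to `host42`.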
Constructor Summary | |
---|---|
MultipleOutputFormat()
|
Method Summary | |
---|---|
protected K | generateActualKey(K key, V value): Generate the actual key from the given key and value. |
protected V | generateActualValue(K key, V value): Generate the actual value from the given key and value. |
protected String | generateFileNameForKeyValue(K key, V value, String name): Generate the output file name based on the given key and the leaf file name. |
protected String | generateLeafFileName(String name): Generate the leaf name for the output file name. |
protected abstract RecordWriter<K,V> | getBaseRecordWriter(FileSystem fs, JobConf job, String name, Progressable arg3) |
protected String | getInputFileBasedOutputFileName(JobConf job, String name): Generate the output file name based on a given name and the input file name. |
RecordWriter<K,V> | getRecordWriter(FileSystem fs, JobConf job, String name, Progressable arg3): Create a composite record writer that can write key/value data to different output files. |
Methods inherited from class org.apache.hadoop.mapred.FileOutputFormat |
---|
checkOutputSpecs, getCompressOutput, getOutputCompressorClass, getOutputPath, getPathForCustomFile, getTaskOutputPath, getUniqueName, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputPath |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public MultipleOutputFormat()
Method Detail |
---|
public RecordWriter<K,V> getRecordWriter(FileSystem fs, JobConf job, String name, Progressable arg3) throws IOException
Create a composite record writer that can write key/value data to different output files.
Specified by: getRecordWriter in interface OutputFormat<K,V>
Specified by: getRecordWriter in class FileOutputFormat<K,V>
Parameters:
fs - the file system to use
job - the job conf for the job
name - the leaf file name for the output file (such as "part-00000")
arg3 - a progressable for reporting progress
Throws:
IOException
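One plausible reading of "composite record writer" is a writer that keeps one underlying writer per output file name and opens each one lazily on first use. The sketch below shows that idea without any Hadoop dependency: in-memory `StringBuilder` buffers stand in for the writers that `getBaseRecordWriter` would create, and routing by the first character of the key is a purely hypothetical rule. All names in it are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: one buffer per output file name, created on demand.
final class CompositeWriterSketch {
    // Stand-in for the per-file base record writers.
    private final Map<String, StringBuilder> writers = new TreeMap<>();
    private final String leafName;

    CompositeWriterSketch(String leafName) {
        this.leafName = leafName;
    }

    // Hypothetical routing rule: group records by the first character of the key.
    String fileNameFor(String key, String value, String name) {
        return key.isEmpty() ? name : key.substring(0, 1) + "/" + name;
    }

    void write(String key, String value) {
        String path = fileNameFor(key, value, leafName);
        // Open the underlying "writer" the first time this path is seen.
        StringBuilder out = writers.computeIfAbsent(path, p -> new StringBuilder());
        out.append(key).append('\t').append(value).append('\n');
    }

    int openFiles() {
        return writers.size();
    }

    String contentsOf(String path) {
        StringBuilder out = writers.get(path);
        return out == null ? "" : out.toString();
    }

    public static void main(String[] args) {
        CompositeWriterSketch w = new CompositeWriterSketch("part-00000");
        w.write("apple", "1");
        w.write("avocado", "2");
        w.write("banana", "3");
        System.out.println(w.openFiles());              // 2
        System.out.print(w.contentsOf("a/part-00000")); // apple and avocado records
    }
}
```

Caching by file name means that records for the same destination share one writer, so the number of open outputs grows with the number of distinct file names rather than with the number of records.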
protected String generateLeafFileName(String name)
Generate the leaf name for the output file name.
Parameters:
name - the leaf file name for the output file
protected String generateFileNameForKeyValue(K key, V value, String name)
Generate the output file name based on the given key and the leaf file name.
Parameters:
key - the key of the output data
name - the leaf file name
protected K generateActualKey(K key, V value)
Generate the actual key from the given key and value.
Parameters:
key - the key of the output data
value - the value of the output data
protected V generateActualValue(K key, V value)
Generate the actual value from the given key and value.
Parameters:
key - the key of the output data
value - the value of the output data
protected String getInputFileBasedOutputFileName(JobConf job, String name)
Generate the output file name based on a given name and the input file name. If the MRJobConfig.MAP_INPUT_FILE config value does not exist (i.e. this is not a map-only job), the given name is returned unchanged. If the config value for "num.of.trailing.legs.to.use" is not set, or is set to 0 or a negative number, the given name is returned unchanged. Otherwise, the method returns a file name consisting of the N trailing legs of the input file name, where N is the config value for "num.of.trailing.legs.to.use".
Parameters:
job - the job config
name - the output file name
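The "N trailing legs" rule can be sketched in plain Java. This is an illustration under stated assumptions, not the Hadoop implementation: the helper name `trailingLegs` is hypothetical, the input file is treated as a simple "/"-separated path with no URI scheme, and a `null` input file stands in for a job where MRJobConfig.MAP_INPUT_FILE is not set.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: keep the last n path components ("legs") of the input file name.
final class TrailingLegsSketch {
    static String trailingLegs(String inputFile, int n, String name) {
        // No input file, or n not positive: return the given name unchanged.
        if (inputFile == null || n <= 0) {
            return name;
        }
        List<String> legs = new ArrayList<>();
        for (String leg : inputFile.split("/")) {
            if (!leg.isEmpty()) {
                legs.add(leg); // skip empty pieces from leading or doubled slashes
            }
        }
        if (legs.isEmpty()) {
            return name;
        }
        // If n exceeds the number of legs, all legs are used.
        int from = Math.max(0, legs.size() - n);
        return String.join("/", legs.subList(from, legs.size()));
    }

    public static void main(String[] args) {
        System.out.println(trailingLegs("/data/2024/logs/part-0.txt", 2, "part-00000")); // logs/part-0.txt
        System.out.println(trailingLegs("/data/2024/logs/part-0.txt", 0, "part-00000")); // part-00000
        System.out.println(trailingLegs(null, 2, "part-00000"));                         // part-00000
    }
}
```

With N set to 2, an input file `/data/2024/logs/part-0.txt` maps to the output name `logs/part-0.txt`, which keeps the immediate parent directory as a grouping level in the output.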
protected abstract RecordWriter<K,V> getBaseRecordWriter(FileSystem fs, JobConf job, String name, Progressable arg3) throws IOException
Parameters:
fs - the file system to use
job - a job conf object
name - the name of the file over which a record writer object will be constructed
arg3 - a progressable object
Throws:
IOException