org.apache.hadoop.examples.terasort
Class TeraGen
java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.hadoop.examples.terasort.TeraGen
- All Implemented Interfaces:
- Configurable, Tool
public class TeraGen
- extends Configured
- implements Tool
Generate the official terasort input data set.
The user specifies the number of rows and the output directory, and this
class runs a map/reduce program to generate the data.
The format of the data is:
- (10 bytes key) (10 bytes rowid) (78 bytes filler) \r \n
- The keys are random characters from the set ' ' .. '~'.
- The rowid is the row id as an int, right-justified.
- The filler consists of 7 runs of 10 characters from 'A' to 'Z'.
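The row layout above can be sketched in plain Java. This is an illustration of the documented format only, not the code TeraGen actually runs. The class name `TeraRowSketch`, the use of `java.util.Random` for the key, the space padding of the row id, and the choice of letter for each filler run are all assumptions. The list accounts for 70 of the 78 filler bytes (7 runs of 10); the sketch assumes a final partial run of 8 characters to reach 78.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Random;

// Hypothetical helper that builds one 100-byte row in the documented layout.
final class TeraRowSketch {
    static final int KEY_LEN = 10;
    static final int ROWID_LEN = 10;
    static final int FILLER_LEN = 78;
    static final int ROW_LEN = KEY_LEN + ROWID_LEN + FILLER_LEN + 2; // + "\r\n"

    static byte[] formatRow(long rowId, Random rnd) {
        if (rowId < 0 || rowId > 9_999_999_999L) {
            throw new IllegalArgumentException("row id must fit in 10 digits");
        }
        StringBuilder sb = new StringBuilder(ROW_LEN);

        // Key: 10 random characters from ' ' (32) through '~' (126).
        for (int i = 0; i < KEY_LEN; i++) {
            sb.append((char) (' ' + rnd.nextInt('~' - ' ' + 1)));
        }

        // Row id: right-justified in 10 columns (space padding is an assumption).
        sb.append(String.format(Locale.ROOT, "%" + ROWID_LEN + "d", rowId));

        // Filler: runs of up to 10 identical letters from 'A'..'Z' until 78 bytes.
        // Which letter each run uses is an assumption made for this sketch.
        int fillerEnd = KEY_LEN + ROWID_LEN + FILLER_LEN;
        for (int run = 0; sb.length() < fillerEnd; run++) {
            char letter = (char) ('A' + (int) ((rowId + run) % 26));
            int n = Math.min(10, fillerEnd - sb.length());
            for (int i = 0; i < n; i++) {
                sb.append(letter);
            }
        }

        sb.append("\r\n");
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] row = formatRow(42L, new Random(0));
        System.out.println(row.length); // prints 100
    }
}
```

Every row has the same fixed length, so a reader can locate row n at byte offset n * 100 without scanning for line terminators.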
To run the program:
bin/hadoop jar hadoop-examples-*.jar teragen 10000000000 in-dir
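As a rough check on the row count in the command above: each row is 10 + 10 + 78 + 2 = 100 bytes by the stated format, so 10000000000 rows should come to 10^12 bytes, about one terabyte. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and are not part of the Hadoop examples.

```java
// Hypothetical helper: estimates the data set size from the documented row layout.
final class TeraGenSizeSketch {
    // key + rowid + filler + "\r\n"
    static final long BYTES_PER_ROW = 10 + 10 + 78 + 2;

    static long totalBytes(long rows) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(rows, BYTES_PER_ROW);
    }

    public static void main(String[] args) {
        System.out.println(totalBytes(10_000_000_000L)); // prints 1000000000000
    }
}
```

This counts only the generated row data; any replication or file system overhead is outside the scope of the estimate.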
Nested Class Summary |
static class |
TeraGen.SortGenMapper
The Mapper class that, given a row number, generates the appropriate
output line. |
Method Summary |
static void |
main(String[] args)
|
int |
run(String[] args)
Execute the command with the given arguments. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
TeraGen
public TeraGen()
run
public int run(String[] args)
throws IOException
- Description copied from interface: Tool
- Execute the command with the given arguments.
- Specified by:
- run in interface Tool
- Parameters:
- args - the command-line arguments
- Returns:
- exit code.
- Throws:
- IOException
main
public static void main(String[] args)
throws Exception
- Throws:
- Exception
Copyright © 2009 The Apache Software Foundation