org.apache.hadoop.util
Class GenericOptionsParser

java.lang.Object
  extended by org.apache.hadoop.util.GenericOptionsParser

public class GenericOptionsParser
extends Object

GenericOptionsParser is a utility for parsing command line arguments that are generic to the Hadoop framework. It recognizes several standard command line arguments, which lets applications easily specify a namenode, a jobtracker, additional configuration resources, and so on.

Generic Options

The supported generic options are:

     -conf <configuration file>                     specify a configuration file
     -D <property=value>                            use value for given property
     -fs <local|namenode:port>                      specify a namenode
     -jt <local|jobtracker:port>                    specify a job tracker
     -files <comma separated list of files>         specify comma separated files to be
                                                    copied to the map reduce cluster
     -libjars <comma separated list of jars>        specify comma separated jar files to
                                                    include in the classpath
     -archives <comma separated list of archives>   specify comma separated archives to be
                                                    unarchived on the compute machines

 

The general command line syntax is:

 bin/hadoop command [genericOptions] [commandOptions]
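
To illustrate the syntax above, the following is a minimal sketch of how leading generic options can be separated from the command options that follow them. It is a simplified model in plain Java, not Hadoop's implementation (which is built on Commons CLI). The class name GenericArgSplitSketch and the method remainingArgs are hypothetical, and the sketch assumes every generic option is followed by exactly one value.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Simplified model for illustration; this is not Hadoop's implementation. */
class GenericArgSplitSketch {
    // Generic option names taken from the list of supported options above.
    static final Set<String> GENERIC = new HashSet<>(Arrays.asList(
        "-conf", "-D", "-fs", "-jt", "-files", "-libjars", "-archives"));

    /** Returns the arguments left after the leading generic options. */
    static String[] remainingArgs(String[] args) {
        int i = 0;
        // Consume "option value" pairs until the first token that is not a generic option.
        while (i + 1 < args.length && GENERIC.contains(args[i])) {
            i += 2;
        }
        return Arrays.copyOfRange(args, i, args.length);
    }

    public static void main(String[] argv) {
        String[] rest = remainingArgs(new String[] {"-fs", "darwin:8020", "-ls", "/data"});
        System.out.println(Arrays.toString(rest)); // prints [-ls, /data]
    }
}
```

In this model the generic options come first and everything from the first non-generic token onward is left for the command, which mirrors the [genericOptions] [commandOptions] ordering.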
 

Generic command line arguments may modify the Configuration object passed to the constructor.

The functionality is implemented using Commons CLI.

Examples:

 $ bin/hadoop dfs -fs darwin:8020 -ls /data
 list /data directory in dfs with namenode darwin:8020
 
 $ bin/hadoop dfs -D fs.default.name=darwin:8020 -ls /data
 list /data directory in dfs with namenode darwin:8020
     
 $ bin/hadoop dfs -conf hadoop-site.xml -ls /data
 list /data directory in dfs with conf specified in hadoop-site.xml
     
 $ bin/hadoop job -D mapred.job.tracker=darwin:50020 -submit job.xml
 submit a job to job tracker darwin:50020
     
 $ bin/hadoop job -jt darwin:50020 -submit job.xml
 submit a job to job tracker darwin:50020
     
 $ bin/hadoop job -jt local -submit job.xml
 submit a job to local runner
 
 $ bin/hadoop jar -libjars testlib.jar \
     -archives test.tgz -files file.txt inputjar args
 job submission with libjars, files and archives
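
The examples above pair -fs with -D fs.default.name=... and -jt with -D mapred.job.tracker=..., which suggests that the shorthand options set those properties. The sketch below models that relationship, with a plain Map standing in for a Hadoop Configuration. The class name GenericOptionToPropertySketch and the method apply are hypothetical, and the real class may normalize values (for example, into URIs) before storing them, so treat this as an approximation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative model only; a Map stands in for a Hadoop Configuration. */
class GenericOptionToPropertySketch {
    /** Applies leading -D, -fs and -jt pairs, using the property names from the examples above. */
    static Map<String, String> apply(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i += 2) {
            String opt = args[i];
            String value = args[i + 1];
            if (opt.equals("-fs")) {
                props.put("fs.default.name", value);
            } else if (opt.equals("-jt")) {
                props.put("mapred.job.tracker", value);
            } else if (opt.equals("-D")) {
                // Split "property=value" at the first '='; ignore malformed pairs.
                int eq = value.indexOf('=');
                if (eq > 0) {
                    props.put(value.substring(0, eq), value.substring(eq + 1));
                }
            } else {
                break; // first token this sketch does not model: stop
            }
        }
        return props;
    }

    public static void main(String[] argv) {
        System.out.println(apply(new String[] {"-fs", "darwin:8020", "-ls", "/data"}));
        System.out.println(apply(new String[] {"-D", "fs.default.name=darwin:8020", "-ls", "/data"}));
    }
}
```

Both calls in main produce the same single-entry map, matching the first two examples, which describe the same outcome.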
 

See Also:
Tool, ToolRunner

Constructor Summary
GenericOptionsParser(Configuration conf, org.apache.commons.cli.Options options, String[] args)
          Create a GenericOptionsParser to parse given options as well as generic Hadoop options.
GenericOptionsParser(Configuration conf, String[] args)
          Create a GenericOptionsParser to parse only the generic Hadoop arguments.
GenericOptionsParser(org.apache.commons.cli.Options opts, String[] args)
          Create an options parser with the given options to parse the args.
GenericOptionsParser(String[] args)
          Create an options parser to parse the args.
 
Method Summary
 org.apache.commons.cli.CommandLine getCommandLine()
          Returns the commons-cli CommandLine object to process the parsed arguments.
 Configuration getConfiguration()
          Get the modified configuration
static URL[] getLibJars(Configuration conf)
          If libjars are set in the conf, parse the libjars.
 String[] getRemainingArgs()
          Returns an array of Strings containing only application-specific arguments.
static void printGenericCommandUsage(PrintStream out)
          Print the usage message for generic command-line options supported.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

GenericOptionsParser

public GenericOptionsParser(org.apache.commons.cli.Options opts,
                            String[] args)
                     throws IOException
Create an options parser with the given options to parse the args.

Parameters:
opts - the options
args - the command line arguments
Throws:
IOException

GenericOptionsParser

public GenericOptionsParser(String[] args)
                     throws IOException
Create an options parser to parse the args.

Parameters:
args - the command line arguments
Throws:
IOException

GenericOptionsParser

public GenericOptionsParser(Configuration conf,
                            String[] args)
                     throws IOException
Create a GenericOptionsParser to parse only the generic Hadoop arguments. The array of string arguments other than the generic arguments can be obtained by getRemainingArgs().

Parameters:
conf - the Configuration to modify.
args - command-line arguments.
Throws:
IOException

GenericOptionsParser

public GenericOptionsParser(Configuration conf,
                            org.apache.commons.cli.Options options,
                            String[] args)
                     throws IOException
Create a GenericOptionsParser to parse given options as well as generic Hadoop options. The resulting CommandLine object can be obtained by getCommandLine().

Parameters:
conf - the configuration to modify
options - options built by the caller
args - User-specified arguments
Throws:
IOException

Method Detail

getRemainingArgs

public String[] getRemainingArgs()
Returns an array of Strings containing only application-specific arguments.

Returns:
array of Strings containing the unparsed arguments, or an empty array if commandLine was not defined.

getConfiguration

public Configuration getConfiguration()
Get the modified configuration.

Returns:
the configuration that has the modified parameters.

getCommandLine

public org.apache.commons.cli.CommandLine getCommandLine()
Returns the commons-cli CommandLine object to process the parsed arguments. Note: If the object is created with GenericOptionsParser(Configuration, String[]), then the returned object will only contain parsed generic options.

Returns:
CommandLine representing list of arguments parsed against Options descriptor.

getLibJars

public static URL[] getLibJars(Configuration conf)
                        throws IOException
If libjars are set in the conf, parse the libjars.

Parameters:
conf - the configuration in which to look for libjars
Returns:
libjar urls
Throws:
IOException
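
As a rough illustration of what parsing libjars involves, the sketch below turns a comma separated list of jar paths (the form accepted by -libjars) into an array of URLs. It is not Hadoop's implementation: the class name LibJarsSketch and the method toUrls are hypothetical, paths are assumed to be local files, and an empty array is returned for missing input, which may differ from what getLibJars does when no libjars are set.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch; assumes local file paths and is not Hadoop's implementation. */
class LibJarsSketch {
    /** Converts a comma separated list of jar paths into file URLs. */
    static URL[] toUrls(String commaSeparatedJars) {
        if (commaSeparatedJars == null || commaSeparatedJars.trim().isEmpty()) {
            return new URL[0]; // assumption: nothing configured yields an empty array
        }
        List<URL> urls = new ArrayList<>();
        for (String path : commaSeparatedJars.split(",")) {
            String trimmed = path.trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty entries such as "a.jar,,b.jar"
            }
            try {
                urls.add(new File(trimmed).toURI().toURL());
            } catch (MalformedURLException e) {
                // Rethrown unchecked to keep the sketch short; the real method declares IOException.
                throw new IllegalArgumentException("Bad jar path: " + trimmed, e);
            }
        }
        return urls.toArray(new URL[0]);
    }

    public static void main(String[] argv) {
        for (URL url : toUrls("testlib.jar,other.jar")) {
            System.out.println(url);
        }
    }
}
```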

printGenericCommandUsage

public static void printGenericCommandUsage(PrintStream out)
Print the usage message for generic command-line options supported.

Parameters:
out - stream to print the usage message to.
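
Taking a PrintStream lets the caller decide where the usage text goes, for example System.err or an in-memory buffer. The sketch below shows that pattern with a stand-in method that prints the generic options listed on this page. The class name GenericUsageSketch and the method printUsage are hypothetical, and the text printed here is not claimed to match the output of printGenericCommandUsage.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

/** Stand-in for illustration; the wording is taken from this page, not from Hadoop's output. */
class GenericUsageSketch {
    /** Prints a short usage message for the generic options to the given stream. */
    static void printUsage(PrintStream out) {
        out.println("The supported generic options are:");
        out.println("  -conf <configuration file>");
        out.println("  -D <property=value>");
        out.println("  -fs <local|namenode:port>");
        out.println("  -jt <local|jobtracker:port>");
        out.println("  -files <comma separated list of files>");
        out.println("  -libjars <comma separated list of jars>");
        out.println("  -archives <comma separated list of archives>");
        out.println();
        out.println("The general command line syntax is:");
        out.println("  bin/hadoop command [genericOptions] [commandOptions]");
    }

    public static void main(String[] argv) {
        // Write to standard error, as a tool might do after rejecting its arguments.
        printUsage(System.err);

        // Capture the same text in memory, for example to embed it in a larger help message.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        printUsage(new PrintStream(buffer));
        System.out.println(buffer.size() + " bytes of usage text captured");
    }
}
```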


Copyright © 2009 The Apache Software Foundation