org.apache.hadoop.mapreduce
Interface JobContext

All Superinterfaces:
org.apache.hadoop.mapreduce.MRJobConfig
All Known Subinterfaces:
org.apache.hadoop.mapred.JobContext, MapContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>, ReduceContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>, org.apache.hadoop.mapred.TaskAttemptContext, TaskAttemptContext, TaskInputOutputContext<KEYIN,VALUEIN,KEYOUT,VALUEOUT>
All Known Implementing Classes:
Job, org.apache.hadoop.mapreduce.task.JobContextImpl

@InterfaceAudience.Public
@InterfaceStability.Evolving
public interface JobContext
extends org.apache.hadoop.mapreduce.MRJobConfig

A read-only view of the job that is provided to the tasks while they are running.


Field Summary
 
Fields inherited from interface org.apache.hadoop.mapreduce.MRJobConfig
APPLICATION_ATTEMPT_ID, APPLICATION_MASTER_CLASS, CACHE_ARCHIVES, CACHE_ARCHIVES_SIZES, CACHE_ARCHIVES_TIMESTAMPS, CACHE_ARCHIVES_VISIBILITIES, CACHE_FILE_TIMESTAMPS, CACHE_FILE_VISIBILITIES, CACHE_FILES, CACHE_FILES_SIZES, CACHE_LOCALARCHIVES, CACHE_LOCALFILES, CACHE_SYMLINK, CLASSPATH_ARCHIVES, CLASSPATH_FILES, COMBINE_CLASS_ATTR, COMBINE_RECORDS_BEFORE_PROGRESS, COMPLETED_MAPS_FOR_REDUCE_SLOWSTART, COUNTER_GROUP_NAME_MAX_DEFAULT, COUNTER_GROUP_NAME_MAX_KEY, COUNTER_GROUPS_MAX_DEFAULT, COUNTER_GROUPS_MAX_KEY, COUNTER_NAME_MAX_DEFAULT, COUNTER_NAME_MAX_KEY, COUNTERS_MAX_DEFAULT, COUNTERS_MAX_KEY, DEFAULT_JOB_ACL_MODIFY_JOB, DEFAULT_JOB_ACL_VIEW_JOB, DEFAULT_JOB_AM_ACCESS_DISABLED, DEFAULT_JOB_TOKEN_TRACKING_IDS_ENABLED, DEFAULT_LOG_LEVEL, DEFAULT_MAP_CPU_VCORES, DEFAULT_MAP_MEMORY_MB, DEFAULT_MAPRED_ADMIN_JAVA_OPTS, DEFAULT_MAPRED_ADMIN_USER_ENV, DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH, DEFAULT_MAX_SHUFFLE_FETCH_RETRY_DELAY, DEFAULT_MR_AM_ADMIN_COMMAND_OPTS, DEFAULT_MR_AM_COMMAND_OPTS, DEFAULT_MR_AM_COMMIT_WINDOW_MS, DEFAULT_MR_AM_COMMITTER_CANCEL_TIMEOUT_MS, DEFAULT_MR_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT, DEFAULT_MR_AM_CPU_VCORES, DEFAULT_MR_AM_HISTORY_COMPLETE_EVENT_FLUSH_TIMEOUT_MS, DEFAULT_MR_AM_HISTORY_JOB_COMPLETE_UNFLUSHED_MULTIPLIER, DEFAULT_MR_AM_HISTORY_MAX_UNFLUSHED_COMPLETE_EVENTS, DEFAULT_MR_AM_HISTORY_USE_BATCHED_FLUSH_QUEUE_SIZE_THRESHOLD, DEFAULT_MR_AM_IGNORE_BLACKLISTING_BLACKLISTED_NODE_PERCENT, DEFAULT_MR_AM_JOB_CLIENT_THREAD_COUNT, DEFAULT_MR_AM_JOB_REDUCE_PREEMPTION_LIMIT, DEFAULT_MR_AM_JOB_REDUCE_RAMP_UP_LIMIT, DEFAULT_MR_AM_LOG_LEVEL, DEFAULT_MR_AM_MAX_ATTEMPTS, DEFAULT_MR_AM_NUM_PROGRESS_SPLITS, DEFAULT_MR_AM_STAGING_DIR, DEFAULT_MR_AM_TASK_ESTIMATOR_SMOOTH_LAMBDA_MS, DEFAULT_MR_AM_TASK_LISTENER_THREAD_COUNT, DEFAULT_MR_AM_TO_RM_HEARTBEAT_INTERVAL_MS, DEFAULT_MR_AM_TO_RM_WAIT_INTERVAL_MS, DEFAULT_MR_AM_VMEM_MB, DEFAULT_MR_CLIENT_MAX_RETRIES, DEFAULT_MR_CLIENT_TO_AM_IPC_MAX_RETRIES, DEFAULT_MR_JOB_END_NOTIFICATION_TIMEOUT, 
DEFAULT_REDUCE_CPU_VCORES, DEFAULT_REDUCE_MEMORY_MB, DEFAULT_SHELL, DEFAULT_SPLIT_METAINFO_MAXSIZE, GROUP_COMPARATOR_CLASS, HADOOP_WORK_DIR, ID, INDEX_CACHE_MEMORY_LIMIT, INPUT_FORMAT_CLASS_ATTR, IO_SORT_FACTOR, IO_SORT_MB, JAR, JAR_UNPACK_PATTERN, JOB_ACL_MODIFY_JOB, JOB_ACL_VIEW_JOB, JOB_AM_ACCESS_DISABLED, JOB_CANCEL_DELEGATION_TOKEN, JOB_CONF_FILE, JOB_JAR, JOB_JOBTRACKER_ID, JOB_LOCAL_DIR, JOB_NAME, JOB_NAMENODES, JOB_SPLIT, JOB_SPLIT_METAINFO, JOB_SUBMIT_DIR, JOB_SUBMITHOST, JOB_SUBMITHOSTADDR, JOB_TOKEN_TRACKING_IDS, JOB_TOKEN_TRACKING_IDS_ENABLED, JOB_UBERTASK_ENABLE, JOB_UBERTASK_MAXBYTES, JOB_UBERTASK_MAXMAPS, JOB_UBERTASK_MAXREDUCES, JVM_NUMTASKS_TORUN, KEY_COMPARATOR, MAP_CLASS_ATTR, MAP_COMBINE_MIN_SPILLS, MAP_CPU_VCORES, MAP_DEBUG_SCRIPT, MAP_ENV, MAP_FAILURES_MAX_PERCENT, MAP_INPUT_FILE, MAP_INPUT_PATH, MAP_INPUT_START, MAP_JAVA_OPTS, MAP_LOG_LEVEL, MAP_MAX_ATTEMPTS, MAP_MEMORY_MB, MAP_OUTPUT_COLLECTOR_CLASS_ATTR, MAP_OUTPUT_COMPRESS, MAP_OUTPUT_COMPRESS_CODEC, MAP_OUTPUT_KEY_CLASS, MAP_OUTPUT_KEY_FIELD_SEPERATOR, MAP_OUTPUT_VALUE_CLASS, MAP_SKIP_INCR_PROC_COUNT, MAP_SKIP_MAX_RECORDS, MAP_SORT_SPILL_PERCENT, MAP_SPECULATIVE, MAPRED_ADMIN_USER_ENV, MAPRED_ADMIN_USER_SHELL, MAPRED_MAP_ADMIN_JAVA_OPTS, MAPRED_REDUCE_ADMIN_JAVA_OPTS, MAPREDUCE_APPLICATION_CLASSPATH, MAPREDUCE_JOB_CLASSLOADER, MAPREDUCE_JOB_CLASSLOADER_SYSTEM_CLASSES, MAPREDUCE_JOB_CREDENTIALS_BINARY, MAPREDUCE_JOB_DIR, MAPREDUCE_JOB_USER_CLASSPATH_FIRST, MAPREDUCE_V2_CHILD_CLASS, MAX_SHUFFLE_FETCH_RETRY_DELAY, MAX_TASK_FAILURES_PER_TRACKER, MR_AM_ADMIN_COMMAND_OPTS, MR_AM_ADMIN_USER_ENV, MR_AM_COMMAND_OPTS, MR_AM_COMMIT_WINDOW_MS, MR_AM_COMMITTER_CANCEL_TIMEOUT_MS, MR_AM_CONTAINERLAUNCHER_THREAD_COUNT_LIMIT, MR_AM_CPU_VCORES, MR_AM_CREATE_JH_INTERMEDIATE_BASE_DIR, MR_AM_ENV, MR_AM_HISTORY_COMPLETE_EVENT_FLUSH_TIMEOUT_MS, MR_AM_HISTORY_JOB_COMPLETE_UNFLUSHED_MULTIPLIER, MR_AM_HISTORY_MAX_UNFLUSHED_COMPLETE_EVENTS, MR_AM_HISTORY_USE_BATCHED_FLUSH_QUEUE_SIZE_THRESHOLD, 
MR_AM_IGNORE_BLACKLISTING_BLACKLISTED_NODE_PERECENT, MR_AM_JOB_CLIENT_PORT_RANGE, MR_AM_JOB_CLIENT_THREAD_COUNT, MR_AM_JOB_NODE_BLACKLISTING_ENABLE, MR_AM_JOB_RECOVERY_ENABLE, MR_AM_JOB_RECOVERY_ENABLE_DEFAULT, MR_AM_JOB_REDUCE_PREEMPTION_LIMIT, MR_AM_JOB_REDUCE_RAMPUP_UP_LIMIT, MR_AM_JOB_SPECULATOR, MR_AM_LOG_LEVEL, MR_AM_MAX_ATTEMPTS, MR_AM_NUM_PROGRESS_SPLITS, MR_AM_PREFIX, MR_AM_SECURITY_SERVICE_AUTHORIZATION_CLIENT, MR_AM_SECURITY_SERVICE_AUTHORIZATION_TASK_UMBILICAL, MR_AM_STAGING_DIR, MR_AM_TASK_ESTIMATOR, MR_AM_TASK_ESTIMATOR_EXPONENTIAL_RATE_ENABLE, MR_AM_TASK_ESTIMATOR_SMOOTH_LAMBDA_MS, MR_AM_TASK_LISTENER_THREAD_COUNT, MR_AM_TO_RM_HEARTBEAT_INTERVAL_MS, MR_AM_TO_RM_WAIT_INTERVAL_MS, MR_AM_VMEM_MB, MR_APPLICATION_TYPE, MR_CLIENT_MAX_RETRIES, MR_CLIENT_TO_AM_IPC_MAX_RETRIES, MR_JOB_END_NOTIFICATION_MAX_ATTEMPTS, MR_JOB_END_NOTIFICATION_MAX_RETRY_INTERVAL, MR_JOB_END_NOTIFICATION_PROXY, MR_JOB_END_NOTIFICATION_TIMEOUT, MR_JOB_END_NOTIFICATION_URL, MR_JOB_END_RETRY_ATTEMPTS, MR_JOB_END_RETRY_INTERVAL, MR_PREFIX, NUM_MAP_PROFILES, NUM_MAPS, NUM_REDUCE_PROFILES, NUM_REDUCES, OUTPUT, OUTPUT_FORMAT_CLASS_ATTR, OUTPUT_KEY_CLASS, OUTPUT_VALUE_CLASS, PARTITIONER_CLASS_ATTR, PRESERVE_FAILED_TASK_FILES, PRESERVE_FILES_PATTERN, PRIORITY, QUEUE_NAME, RECORDS_BEFORE_PROGRESS, REDUCE_CLASS_ATTR, REDUCE_CPU_VCORES, REDUCE_DEBUG_SCRIPT, REDUCE_ENV, REDUCE_FAILURES_MAXPERCENT, REDUCE_INPUT_BUFFER_PERCENT, REDUCE_JAVA_OPTS, REDUCE_LOG_LEVEL, REDUCE_MARKRESET_BUFFER_PERCENT, REDUCE_MARKRESET_BUFFER_SIZE, REDUCE_MAX_ATTEMPTS, REDUCE_MEMORY_MB, REDUCE_MEMORY_TOTAL_BYTES, REDUCE_MEMTOMEM_ENABLED, REDUCE_MEMTOMEM_THRESHOLD, REDUCE_MERGE_INMEM_THRESHOLD, REDUCE_SKIP_INCR_PROC_COUNT, REDUCE_SKIP_MAXGROUPS, REDUCE_SPECULATIVE, SETUP_CLEANUP_NEEDED, SHUFFLE_CONNECT_TIMEOUT, SHUFFLE_FETCH_FAILURES, SHUFFLE_INPUT_BUFFER_PERCENT, SHUFFLE_MEMORY_LIMIT_PERCENT, SHUFFLE_MERGE_PERCENT, SHUFFLE_NOTIFY_READERROR, SHUFFLE_PARALLEL_COPIES, SHUFFLE_READ_TIMEOUT, SKIP_OUTDIR, SKIP_RECORDS, 
SKIP_START_ATTEMPTS, SPECULATIVE_SLOWNODE_THRESHOLD, SPECULATIVE_SLOWTASK_THRESHOLD, SPECULATIVECAP, SPLIT_FILE, SPLIT_METAINFO_MAXSIZE, STDERR_LOGFILE_ENV, STDOUT_LOGFILE_ENV, TASK_ATTEMPT_ID, TASK_CLEANUP_NEEDED, TASK_DEBUGOUT_LINES, TASK_ID, TASK_ISMAP, TASK_MAP_PROFILE_PARAMS, TASK_OUTPUT_DIR, TASK_PARTITION, TASK_PROFILE, TASK_PROFILE_PARAMS, TASK_REDUCE_PROFILE_PARAMS, TASK_TEMP_DIR, TASK_TIMEOUT, TASK_TIMEOUT_CHECK_INTERVAL_MS, TASK_USERLOG_LIMIT, USER_LOG_RETAIN_HOURS, USER_NAME, WORKDIR, WORKFLOW_ADJACENCY_PREFIX_PATTERN, WORKFLOW_ADJACENCY_PREFIX_STRING, WORKFLOW_ID, WORKFLOW_NAME, WORKFLOW_NODE_NAME, WORKFLOW_TAGS, WORKING_DIR
 
Method Summary
 Path[] getArchiveClassPaths()
          Get the archive entries in classpath as an array of Path
 String[] getArchiveTimestamps()
          Get the timestamps of the archives.
 URI[] getCacheArchives()
          Get cache archives set in the Configuration
 URI[] getCacheFiles()
          Get cache files set in the Configuration
 Class<? extends Reducer<?,?,?,?>> getCombinerClass()
          Get the combiner class for the job.
 Configuration getConfiguration()
          Return the configuration for the job.
 org.apache.hadoop.security.Credentials getCredentials()
          Get credentials for the job.
 Path[] getFileClassPaths()
          Get the file entries in classpath as an array of Path
 String[] getFileTimestamps()
          Get the timestamps of the files.
 RawComparator<?> getGroupingComparator()
          Get the user-defined RawComparator comparator for grouping keys of inputs to the reduce.
 Class<? extends InputFormat<?,?>> getInputFormatClass()
          Get the InputFormat class for the job.
 String getJar()
          Get the pathname of the job's jar.
 JobID getJobID()
          Get the unique ID for the job.
 String getJobName()
          Get the user-specified job name.
 boolean getJobSetupCleanupNeeded()
          Get whether job-setup and job-cleanup are needed for the job.
 Path[] getLocalCacheArchives()
          Deprecated. The array returned only includes the items that were downloaded. There is no way to map this to what is returned by getCacheArchives().
 Path[] getLocalCacheFiles()
          Deprecated. The array returned only includes the items that were downloaded. There is no way to map this to what is returned by getCacheFiles().
 Class<?> getMapOutputKeyClass()
          Get the key class for the map output data.
 Class<?> getMapOutputValueClass()
          Get the value class for the map output data.
 Class<? extends Mapper<?,?,?,?>> getMapperClass()
          Get the Mapper class for the job.
 int getMaxMapAttempts()
          Get the configured number of maximum attempts that will be made to run a map task, as specified by the mapred.map.max.attempts property.
 int getMaxReduceAttempts()
          Get the configured number of maximum attempts that will be made to run a reduce task, as specified by the mapred.reduce.max.attempts property.
 int getNumReduceTasks()
          Get the configured number of reduce tasks for this job.
 Class<? extends OutputFormat<?,?>> getOutputFormatClass()
          Get the OutputFormat class for the job.
 Class<?> getOutputKeyClass()
          Get the key class for the job output data.
 Class<?> getOutputValueClass()
          Get the value class for job outputs.
 Class<? extends Partitioner<?,?>> getPartitionerClass()
          Get the Partitioner class for the job.
 boolean getProfileEnabled()
          Get whether the task profiling is enabled.
 String getProfileParams()
          Get the profiler configuration arguments.
 org.apache.hadoop.conf.Configuration.IntegerRanges getProfileTaskRange(boolean isMap)
          Get the range of maps or reduces to profile.
 Class<? extends Reducer<?,?,?,?>> getReducerClass()
          Get the Reducer class for the job.
 RawComparator<?> getSortComparator()
          Get the RawComparator comparator used to compare keys.
 boolean getSymlink()
          Deprecated. 
 boolean getTaskCleanupNeeded()
          Get whether task-cleanup is needed for the job
 String getUser()
          Get the reported username for this job.
 Path getWorkingDirectory()
          Get the current working directory for the default file system.
 

Method Detail

getConfiguration

Configuration getConfiguration()
Return the configuration for the job.

Returns:
the shared configuration object

getCredentials

org.apache.hadoop.security.Credentials getCredentials()
Get credentials for the job.

Returns:
credentials for the job

getJobID

JobID getJobID()
Get the unique ID for the job.

Returns:
the object with the job id

getNumReduceTasks

int getNumReduceTasks()
Get the configured number of reduce tasks for this job. Defaults to 1.

Returns:
the number of reduce tasks for this job.

getWorkingDirectory

Path getWorkingDirectory()
                         throws IOException
Get the current working directory for the default file system.

Returns:
the directory name.
Throws:
IOException

getOutputKeyClass

Class<?> getOutputKeyClass()
Get the key class for the job output data.

Returns:
the key class for the job output data.

getOutputValueClass

Class<?> getOutputValueClass()
Get the value class for job outputs.

Returns:
the value class for job outputs.

getMapOutputKeyClass

Class<?> getMapOutputKeyClass()
Get the key class for the map output data. If it is not set, use the (final) output key class. This allows the map output key class to be different than the final output key class.

Returns:
the map output key class.

getMapOutputValueClass

Class<?> getMapOutputValueClass()
Get the value class for the map output data. If it is not set, use the (final) output value class. This allows the map output value class to be different than the final output value class.

Returns:
the map output value class.
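
The fallback described for getMapOutputKeyClass and getMapOutputValueClass can be pictured with a small stand-in. This is a minimal sketch, not Hadoop code: the class MapOutputClassSketch, its property names, and the use of a plain Map in place of Configuration are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for the fallback rule described above; this is not Hadoop code. */
class MapOutputClassSketch {
    // Hypothetical property names, invented for this sketch.
    static final String MAP_OUTPUT_KEY_CLASS = "sketch.map.output.key.class";
    static final String OUTPUT_KEY_CLASS = "sketch.output.key.class";

    /** Returns the map output key class name, or the final output key class name if unset. */
    static String mapOutputKeyClass(Map<String, String> conf) {
        String explicit = conf.get(MAP_OUTPUT_KEY_CLASS);
        return explicit != null ? explicit : conf.get(OUTPUT_KEY_CLASS);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(OUTPUT_KEY_CLASS, "org.apache.hadoop.io.Text");
        // Map output key class not set: falls back to the final output key class.
        System.out.println(mapOutputKeyClass(conf));
        conf.put(MAP_OUTPUT_KEY_CLASS, "org.apache.hadoop.io.LongWritable");
        // Set explicitly: the map output class may now differ from the final one.
        System.out.println(mapOutputKeyClass(conf));
    }
}
```

The same rule would apply to the value class, with the final output value class as the fallback.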

getJobName

String getJobName()
Get the user-specified job name. This is only used to identify the job to the user.

Returns:
the job's name, defaulting to "".

getInputFormatClass

Class<? extends InputFormat<?,?>> getInputFormatClass()
                                                      throws ClassNotFoundException
Get the InputFormat class for the job.

Returns:
the InputFormat class for the job.
Throws:
ClassNotFoundException

getMapperClass

Class<? extends Mapper<?,?,?,?>> getMapperClass()
                                                throws ClassNotFoundException
Get the Mapper class for the job.

Returns:
the Mapper class for the job.
Throws:
ClassNotFoundException

getCombinerClass

Class<? extends Reducer<?,?,?,?>> getCombinerClass()
                                                   throws ClassNotFoundException
Get the combiner class for the job.

Returns:
the combiner class for the job.
Throws:
ClassNotFoundException

getReducerClass

Class<? extends Reducer<?,?,?,?>> getReducerClass()
                                                  throws ClassNotFoundException
Get the Reducer class for the job.

Returns:
the Reducer class for the job.
Throws:
ClassNotFoundException

getOutputFormatClass

Class<? extends OutputFormat<?,?>> getOutputFormatClass()
                                                        throws ClassNotFoundException
Get the OutputFormat class for the job.

Returns:
the OutputFormat class for the job.
Throws:
ClassNotFoundException

getPartitionerClass

Class<? extends Partitioner<?,?>> getPartitionerClass()
                                                      throws ClassNotFoundException
Get the Partitioner class for the job.

Returns:
the Partitioner class for the job.
Throws:
ClassNotFoundException

getSortComparator

RawComparator<?> getSortComparator()
Get the RawComparator comparator used to compare keys.

Returns:
the RawComparator comparator used to compare keys.

getJar

String getJar()
Get the pathname of the job's jar.

Returns:
the pathname

getGroupingComparator

RawComparator<?> getGroupingComparator()
Get the user-defined RawComparator comparator for grouping keys of inputs to the reduce.

Returns:
comparator set by the user for grouping values.
See Also:
Job.setGroupingComparatorClass(Class) for details.
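
To illustrate how a sort comparator and a grouping comparator can cooperate, the sketch below uses java.util.Comparator as a stand-in for RawComparator. The Key class, the SORT and GROUP comparators, and the group method are hypothetical names for this example only; the sketch assumes that consecutive sorted keys which compare equal under the grouping comparator belong to the same reduce group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Stand-in illustration using java.util.Comparator; not Hadoop's RawComparator. */
class GroupingSketch {
    /** Hypothetical composite key: a natural key plus a secondary field. */
    static final class Key {
        final String user;
        final int timestamp;
        Key(String user, int timestamp) { this.user = user; this.timestamp = timestamp; }
    }

    // Plays the role of the sort comparator: orders by user, then by timestamp.
    static final Comparator<Key> SORT =
        Comparator.comparing((Key k) -> k.user).thenComparingInt(k -> k.timestamp);

    // Plays the role of the grouping comparator: keys with the same user compare equal.
    static final Comparator<Key> GROUP = Comparator.comparing((Key k) -> k.user);

    /** Sorts the keys, then starts a new group whenever GROUP reports a difference. */
    static List<List<Key>> group(List<Key> keys) {
        List<Key> sorted = new ArrayList<>(keys);
        sorted.sort(SORT);
        List<List<Key>> groups = new ArrayList<>();
        Key previous = null;
        for (Key k : sorted) {
            if (previous == null || GROUP.compare(previous, k) != 0) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(k);
            previous = k;
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Key> keys = Arrays.asList(
            new Key("bob", 2), new Key("alice", 5), new Key("bob", 1), new Key("alice", 3));
        for (List<Key> g : group(keys)) {
            StringBuilder line = new StringBuilder(g.get(0).user).append(':');
            for (Key k : g) {
                line.append(' ').append(k.timestamp);
            }
            System.out.println(line);
        }
    }
}
```

In this sketch each group corresponds to one user, and within a group the timestamps arrive in sorted order.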

getJobSetupCleanupNeeded

boolean getJobSetupCleanupNeeded()
Get whether job-setup and job-cleanup are needed for the job.

Returns:
true if job-setup and job-cleanup are needed, false otherwise

getTaskCleanupNeeded

boolean getTaskCleanupNeeded()
Get whether task-cleanup is needed for the job.

Returns:
true if task-cleanup is needed, false otherwise

getProfileEnabled

boolean getProfileEnabled()
Get whether the task profiling is enabled.

Returns:
true if some tasks will be profiled

getProfileParams

String getProfileParams()
Get the profiler configuration arguments. The default value for this property is "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s"

Returns:
the parameters to pass to the task child to configure profiling
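
The default value ends in file=%s. The sketch below assumes that this placeholder stands for the name of the profiling output file and shows the substitution with String.format. The class ProfileParamsSketch, the resolve method, and the file name profile.out are hypothetical and not part of the Hadoop API.

```java
/** Stand-in showing how the %s placeholder could be filled in; not Hadoop code. */
class ProfileParamsSketch {
    // Default value quoted in the getProfileParams() description above.
    static final String DEFAULT_PARAMS =
        "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s";

    /** Substitutes a (hypothetical) profile output file name for the %s placeholder. */
    static String resolve(String params, String outputFile) {
        return String.format(params, outputFile);
    }

    public static void main(String[] args) {
        System.out.println(resolve(DEFAULT_PARAMS, "profile.out"));
    }
}
```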

getProfileTaskRange

org.apache.hadoop.conf.Configuration.IntegerRanges getProfileTaskRange(boolean isMap)
Get the range of maps or reduces to profile.

Parameters:
isMap - is the task a map?
Returns:
the task ranges

getUser

String getUser()
Get the reported username for this job.

Returns:
the username

getSymlink

@Deprecated
boolean getSymlink()
Deprecated. 

Originally intended to check if symlinks should be used, but currently symlinks cannot be disabled.

Returns:
true

getArchiveClassPaths

Path[] getArchiveClassPaths()
Get the archive entries in classpath as an array of Path


getCacheArchives

URI[] getCacheArchives()
                       throws IOException
Get cache archives set in the Configuration

Returns:
A URI array of the caches set in the Configuration
Throws:
IOException

getCacheFiles

URI[] getCacheFiles()
                    throws IOException
Get cache files set in the Configuration

Returns:
A URI array of the files set in the Configuration
Throws:
IOException

getLocalCacheArchives

@Deprecated
Path[] getLocalCacheArchives()
                             throws IOException
Deprecated. The array returned only includes the items that were downloaded. There is no way to map this to what is returned by getCacheArchives().

Return the path array of the localized caches

Returns:
A path array of localized caches
Throws:
IOException

getLocalCacheFiles

@Deprecated
Path[] getLocalCacheFiles()
                          throws IOException
Deprecated. The array returned only includes the items that were downloaded. There is no way to map this to what is returned by getCacheFiles().

Return the path array of the localized files

Returns:
A path array of localized files
Throws:
IOException

getFileClassPaths

Path[] getFileClassPaths()
Get the file entries in classpath as an array of Path


getArchiveTimestamps

String[] getArchiveTimestamps()
Get the timestamps of the archives. Used by internal DistributedCache and MapReduce code.

Returns:
a string array of timestamps
Throws:
IOException

getFileTimestamps

String[] getFileTimestamps()
Get the timestamps of the files. Used by internal DistributedCache and MapReduce code.

Returns:
a string array of timestamps
Throws:
IOException

getMaxMapAttempts

int getMaxMapAttempts()
Get the configured number of maximum attempts that will be made to run a map task, as specified by the mapred.map.max.attempts property. If this property is not already set, the default is 4 attempts.

Returns:
the max number of attempts per map task.

getMaxReduceAttempts

int getMaxReduceAttempts()
Get the configured number of maximum attempts that will be made to run a reduce task, as specified by the mapred.reduce.max.attempts property. If this property is not already set, the default is 4 attempts.

Returns:
the max number of attempts per reduce task.
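
The default of 4 attempts described for getMaxMapAttempts and getMaxReduceAttempts can be sketched as a lookup with a fallback. This is only an illustration: java.util.Properties stands in for Configuration, and the class MaxAttemptsSketch and its method are hypothetical. The property names are the ones quoted in the descriptions above.

```java
import java.util.Properties;

/** Stand-in for the default rule described above; this is not Hadoop code. */
class MaxAttemptsSketch {
    // Default stated in the method descriptions above.
    static final int DEFAULT_MAX_ATTEMPTS = 4;

    /** Returns the configured attempt limit, or 4 when the property is unset. */
    static int maxAttempts(Properties conf, String property) {
        String raw = conf.getProperty(property);
        return raw == null ? DEFAULT_MAX_ATTEMPTS : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("mapred.reduce.max.attempts", "2");
        // Map property is unset, so the default applies.
        System.out.println(maxAttempts(conf, "mapred.map.max.attempts"));
        // Reduce property is set explicitly.
        System.out.println(maxAttempts(conf, "mapred.reduce.max.attempts"));
    }
}
```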


Copyright © 2013 Apache Software Foundation. All Rights Reserved.