org.apache.hadoop.mapred
Class JobHistory

java.lang.Object
  extended by org.apache.hadoop.mapred.JobHistory

public class JobHistory
extends Object

Provides methods for writing to and reading from job history. Job history works in append mode; JobHistory and its inner classes provide methods to log job events.

JobHistory is split into multiple files. Each file is plain text in which every line has the format [type (key=value)*], where type identifies the type of the record. Type maps to the UID of one of the inner classes of this class.

Job history is maintained in a master index that contains the start/stop times of all jobs along with a few other job-level properties. Apart from this, each job's history is maintained in a separate history file. The name of a job history file follows the format jobtrackerId_jobid.

For parsing the job history, it supports a listener-based interface where each line is parsed and passed to a listener. The listener can create an object model of the history or look for specific events and discard the rest of the history.

CHANGE LOG:
Version 0: The history has the following format: TAG KEY1="VALUE1" KEY2="VALUE2" and so on. TAG can be Job, Task, MapAttempt or ReduceAttempt. Note that a '"' is the line delimiter.
Version 1: Changes the line delimiter to '.'. Values are now escaped for unambiguous parsing. Added the Meta tag to store version info.
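The Version 1 line format described above (TAG followed by KEY="VALUE" pairs and a trailing '.') can be illustrated with a small standalone sketch. It does not use the Hadoop classes; the class name HistoryLineSketch, the sample job id, and the assumption that values are escaped with a backslash are all illustrative and are not confirmed by this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: splits one history line into its
// record type and its KEY="VALUE" pairs.
final class HistoryLineSketch {
    // KEY="VALUE", where VALUE may contain backslash-escaped characters.
    // Backslash escaping is an assumption made for this sketch.
    private static final Pattern PAIR =
        Pattern.compile("(\\w+)=\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the record type, i.e. the first space-delimited token. */
    public static String recordType(String line) {
        String trimmed = line.trim();
        int space = trimmed.indexOf(' ');
        return space < 0 ? trimmed : trimmed.substring(0, space);
    }

    /** Returns the key/value pairs of the line in the order they appear. */
    public static Map<String, String> pairs(String line) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            // Drop the backslash in front of each escaped character.
            result.put(m.group(1), m.group(2).replaceAll("\\\\(.)", "$1"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample line with a made-up job id and job name.
        String line = "Job JOBID=\"job_200904211745_0002\" JOBNAME=\"word count\" .";
        System.out.println(recordType(line) + " " + pairs(line));
    }
}
```

Because the pairs are matched by pattern and not by splitting on spaces, a value that contains spaces or escaped quotes stays in one piece.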


Nested Class Summary
static class JobHistory.HistoryCleaner
          Delete history files older than one month.
static class JobHistory.JobInfo
          Helper class for logging or reading back events related to job start, finish or failure.
static class JobHistory.Keys
          Job history files contain key="value" pairs, where keys belong to this enum.
static interface JobHistory.Listener
          Callback interface for reading back log events from JobHistory.
static class JobHistory.MapAttempt
          Helper class for logging or reading back events related to start, finish or failure of a Map Attempt on a node.
static class JobHistory.RecordTypes
          Record types are identifiers for each line of log in history files.
static class JobHistory.ReduceAttempt
          Helper class for logging or reading back events related to start, finish or failure of a Reduce Attempt on a node.
static class JobHistory.Task
          Helper class for logging or reading back events related to Task's start, finish or failure.
static class JobHistory.TaskAttempt
          Base class for Map and Reduce TaskAttempts.
static class JobHistory.Values
          This enum contains some of the values commonly used by history log events.
 
Field Summary
static Pattern CONF_FILENAME_REGEX
           
protected static Path DONE
           
protected static FileSystem DONEDIR_FS
           
static int JOB_NAME_TRIM_LENGTH
           
static Pattern JOBHISTORY_FILENAME_REGEX
           
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
JobHistory()
           
 
Method Summary
static String getHistoryFilePath(JobID jobId)
          Given the job id, return the history file path from the cache
static String getTaskLogsUrl(JobHistory.TaskAttempt attempt)
          Return the TaskLogsUrl of a particular TaskAttempt
static void init(JobTracker jobTracker, JobConf conf, String hostname, long jobTrackerStartTime)
          Initialize JobHistory files.
static void parseHistoryFromFS(String path, JobHistory.Listener l, FileSystem fs)
          Parses history file and invokes Listener.handle() for each line of history.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

JOB_NAME_TRIM_LENGTH

public static final int JOB_NAME_TRIM_LENGTH
See Also:
Constant Field Values

DONEDIR_FS

protected static FileSystem DONEDIR_FS

DONE

protected static Path DONE

JOBHISTORY_FILENAME_REGEX

public static final Pattern JOBHISTORY_FILENAME_REGEX

CONF_FILENAME_REGEX

public static final Pattern CONF_FILENAME_REGEX
Constructor Detail

JobHistory

public JobHistory()
Method Detail

getHistoryFilePath

public static String getHistoryFilePath(JobID jobId)
Given the job id, return the history file path from the cache


init

public static void init(JobTracker jobTracker,
                        JobConf conf,
                        String hostname,
                        long jobTrackerStartTime)
                 throws IOException
Initialize JobHistory files.

Parameters:
conf - JobConf of the job tracker.
hostname - jobtracker's hostname
jobTrackerStartTime - jobtracker's start time
Throws:
IOException

parseHistoryFromFS

public static void parseHistoryFromFS(String path,
                                      JobHistory.Listener l,
                                      FileSystem fs)
                               throws IOException
Parses history file and invokes Listener.handle() for each line of history. It can be used for looking through history files for specific items without having to keep whole history in memory.

Parameters:
path - path to history file
l - Listener for history events
fs - FileSystem where history file is present
Throws:
IOException
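The callback style described above, where each line is handed to a listener so the whole history never has to be held in memory, can be sketched without the Hadoop classes. Everything in this block is a hypothetical stand-in: LineListener is not JobHistory.Listener, parse is not parseHistoryFromFS, and the sketch assumes one record per physical line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of listener-based parsing; not the Hadoop API.
final class ListenerSketch {
    /** Stand-in callback that receives one record at a time. */
    public interface LineListener {
        void handle(String recordType, String line) throws IOException;
    }

    /** Reads one line at a time and passes each non-empty line to the listener. */
    public static void parse(BufferedReader in, LineListener listener) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            int space = line.indexOf(' ');
            String type = space < 0 ? line : line.substring(0, space);
            listener.handle(type, line);
        }
    }

    /** Keeps only the lines of the wanted record type and discards the rest. */
    public static List<String> collect(String history, String wantedType) {
        List<String> kept = new ArrayList<>();
        try {
            parse(new BufferedReader(new StringReader(history)),
                  (type, line) -> {
                      if (type.equals(wantedType)) {
                          kept.add(line);
                      }
                  });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up history content for demonstration only.
        String history = "Job JOBID=\"j1\" .\nTask TASKID=\"t1\" .\nTask TASKID=\"t2\" .\n";
        for (String line : collect(history, "Task")) {
            System.out.println(line);
        }
    }
}
```

The listener decides what to keep, so a caller interested in a single record type pays only for the lines it retains.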

getTaskLogsUrl

public static String getTaskLogsUrl(JobHistory.TaskAttempt attempt)
Return the TaskLogsUrl of a particular TaskAttempt

Parameters:
attempt - the TaskAttempt whose task logs URL is requested
Returns:
the task logs URL, or null if the http-port, tracker-name, or task-attempt-id is unavailable.


Copyright © 2009 The Apache Software Foundation