org.apache.hadoop.util
Class QueueProcessingStatistics

java.lang.Object
  extended by org.apache.hadoop.util.QueueProcessingStatistics

public abstract class QueueProcessingStatistics
extends Object

Hadoop has several work queues, such as FSNamesystem.neededReplications. With a properly throttled queue, a worker thread cycles repeatedly, doing a chunk of work each cycle and then resting briefly, until the queue is empty. This class collects statistics about the behavior of such queues and their consumers. It reports the amount of work done and the time taken, both for the first cycle after collection starts and for the total number of cycles needed to flush the queue. A state machine detects when the queue has been flushed, at which point the stats are logged; see QueueProcessingStatistics.State for an enumeration of the states and their meanings.
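The cycle described above can be sketched as follows. This is a minimal illustration, not Hadoop's implementation: CycleStatsSketch is a hypothetical stand-in that mirrors only the method names documented on this page (startCycle, endCycle, preCheckIsLastCycle, postCheckIsLastCycle), and ThrottledWorkerSketch, drain, and the accessor methods are names invented for the example. Timing, logging, and checkRestart are omitted.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in mirroring the documented method names only.
// It counts cycles and work items; the real class also tracks time and logs.
abstract class CycleStatsSketch {
    private int cycles = 0;
    private int totalWork = 0;
    private boolean lastCycleDetected = false;
    private boolean done = false;

    public void startCycle(int maxWorkToProcess) {
        if (done) return;
        lastCycleDetected = preCheckIsLastCycle(maxWorkToProcess);
    }

    public void endCycle(int workFound) {
        if (done) return;
        cycles++;
        totalWork += workFound;
        // Either check may flag the last cycle; collection then ends.
        if (lastCycleDetected || postCheckIsLastCycle(workFound)) {
            done = true; // the real class would write its stats to the log here
        }
    }

    public abstract boolean preCheckIsLastCycle(int maxWorkToProcess);
    public abstract boolean postCheckIsLastCycle(int workFound);

    // Accessors invented for this sketch.
    boolean isDone() { return done; }
    int cycles() { return cycles; }
    int totalWork() { return totalWork; }
}

final class ThrottledWorkerSketch {
    // Drains the queue in chunks and returns {cycles, totalWorkItems}.
    static int[] drain(Queue<String> queue, int maxWorkPerCycle) {
        if (maxWorkPerCycle <= 0) {
            throw new IllegalArgumentException("maxWorkPerCycle must be positive");
        }
        CycleStatsSketch stats = new CycleStatsSketch() {
            @Override public boolean preCheckIsLastCycle(int maxWorkToProcess) {
                return false; // this sketch relies on the post-check alone
            }
            @Override public boolean postCheckIsLastCycle(int workFound) {
                return queue.isEmpty();
            }
        };
        while (!stats.isDone()) {
            stats.startCycle(maxWorkPerCycle);
            int processed = 0;
            while (processed < maxWorkPerCycle && queue.poll() != null) {
                processed++; // "doing a chunk of work"
            }
            stats.endCycle(processed);
            // A real worker would rest here (for example, sleep) before the next cycle.
        }
        return new int[] {stats.cycles(), stats.totalWork()};
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<>();
        for (int i = 0; i < 7; i++) queue.add("item" + i);
        int[] result = drain(queue, 3);
        System.out.println(result[0] + " cycles, " + result[1] + " items");
    }
}
```

In this sketch the worker brackets each chunk of work with startCycle and endCycle, so the statistics object sees every cycle and can decide for itself when the queue has been flushed.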


Nested Class Summary
static class QueueProcessingStatistics.State
          This enum provides the "states" of a state machine for QueueProcessingStatistics.
 
Constructor Summary
QueueProcessingStatistics(String queueName, String workItemsName, org.apache.commons.logging.Log logObject)
           
 
Method Summary
 void checkRestart()
           
 void endCycle(int workFound)
           
abstract  boolean postCheckIsLastCycle(int workFound)
          See preCheckIsLastCycle(int).
abstract  boolean preCheckIsLastCycle(int maxWorkToProcess)
          The termination condition is to identify the last cycle that will empty the queue.
 void startCycle(int maxWorkToProcess)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

QueueProcessingStatistics

public QueueProcessingStatistics(String queueName,
                                 String workItemsName,
                                 org.apache.commons.logging.Log logObject)
Parameters:
queueName - Human-readable name of the queue being monitored, used as the first word in the log messages.
workItemsName - The kind of work items managed on the queue. A plural word works best, since it appears in the log messages.
logObject - The log to which the log messages are sent.
Method Detail

startCycle

public void startCycle(int maxWorkToProcess)

endCycle

public void endCycle(int workFound)

checkRestart

public void checkRestart()

preCheckIsLastCycle

public abstract boolean preCheckIsLastCycle(int maxWorkToProcess)
The termination condition is to identify the last cycle that will empty the queue. Two abstract APIs are called: preCheckIsLastCycle is called at the beginning of each cycle, and postCheckIsLastCycle(int) is called at the end of each cycle. At least one of them must correctly provide the termination condition; the other may always return 'false'. If either of them returns 'true' in a given cycle, then at the end of that cycle the stats are written to the log and stats collection ends.

Parameters:
maxWorkToProcess - If this number is greater than the amount of work remaining at the start of a cycle, then that cycle will be the last.
Returns:
true if the last cycle is detected, else false
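As an illustration of the rule that only one of the two checks needs to carry the termination condition, the sketch below shows both strategies side by side. Everything here is hypothetical: LastCycleChecks is an interface invented to mirror the two abstract method signatures, and the remaining-work supplier stands in for whatever size query the monitored queue offers. Treating "equal to the remaining work" as a last cycle (>= rather than >) is an assumption of this sketch.

```java
import java.util.function.IntSupplier;

// Hypothetical interface mirroring the two abstract methods documented here.
interface LastCycleChecks {
    boolean preCheckIsLastCycle(int maxWorkToProcess);
    boolean postCheckIsLastCycle(int workFound);
}

// Strategy A: decide at the start of the cycle; the post-check always returns false.
final class PreCheckOnly implements LastCycleChecks {
    private final IntSupplier remainingWork; // assumed: reports the current queue size

    PreCheckOnly(IntSupplier remainingWork) {
        this.remainingWork = remainingWork;
    }

    @Override public boolean preCheckIsLastCycle(int maxWorkToProcess) {
        // If this cycle can process at least everything that remains, it is the last one.
        return maxWorkToProcess >= remainingWork.getAsInt();
    }

    @Override public boolean postCheckIsLastCycle(int workFound) {
        return false;
    }
}

// Strategy B: decide at the end of the cycle; the pre-check always returns false.
final class PostCheckOnly implements LastCycleChecks {
    private final IntSupplier remainingWork; // assumed: reports the current queue size

    PostCheckOnly(IntSupplier remainingWork) {
        this.remainingWork = remainingWork;
    }

    @Override public boolean preCheckIsLastCycle(int maxWorkToProcess) {
        return false;
    }

    @Override public boolean postCheckIsLastCycle(int workFound) {
        // workFound is ignored; only the remaining work matters.
        return remainingWork.getAsInt() == 0;
    }
}
```

Strategy A suits queues whose size is cheap to read before work begins; Strategy B suits cases where the remaining work is only known reliably once the cycle has finished.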

postCheckIsLastCycle

public abstract boolean postCheckIsLastCycle(int workFound)
See preCheckIsLastCycle(int).

Parameters:
workFound - May not be useful
Returns:
true if the remaining work is zero at the end of the cycle, else false


Copyright © 2009 The Apache Software Foundation