org.apache.hadoop.mapreduce.lib.partition
Class InputSampler.IntervalSampler<K,V>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.lib.partition.InputSampler.IntervalSampler<K,V>
All Implemented Interfaces:
InputSampler.Sampler<K,V>
Enclosing class:
InputSampler<K,V>

public static class InputSampler.IntervalSampler<K,V>
extends Object
implements InputSampler.Sampler<K,V>

Samples records at regular intervals from up to maxSplitsSampled splits. Useful for sorted data.


Constructor Summary
InputSampler.IntervalSampler(double freq)
          Create a new IntervalSampler sampling all splits.
InputSampler.IntervalSampler(double freq, int maxSplitsSampled)
          Create a new IntervalSampler that examines at most maxSplitsSampled splits.
 
Method Summary
 K[] getSample(InputFormat<K,V> inf, Job job)
          For each split sampled, emit a record's key whenever the ratio of the number of records retained so far to the number of records read so far is less than the specified frequency.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InputSampler.IntervalSampler

public InputSampler.IntervalSampler(double freq)
Create a new IntervalSampler sampling all splits.

Parameters:
freq - The frequency with which records will be emitted, expressed as a fraction (for example, 0.1 retains roughly one record in ten).

InputSampler.IntervalSampler

public InputSampler.IntervalSampler(double freq,
                                    int maxSplitsSampled)
Create a new IntervalSampler that examines at most maxSplitsSampled splits.

Parameters:
freq - The frequency with which records will be emitted, expressed as a fraction (for example, 0.1 retains roughly one record in ten).
maxSplitsSampled - The maximum number of splits to examine.
See Also:
getSample(org.apache.hadoop.mapreduce.InputFormat, org.apache.hadoop.mapreduce.Job)
Method Detail

getSample

public K[] getSample(InputFormat<K,V> inf,
                     Job job)
              throws IOException,
                     InterruptedException
For each split sampled, emit a record's key whenever the ratio of the number of records retained so far to the number of records read so far is less than the specified frequency.
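
The retention rule described above can be illustrated with a minimal, self-contained sketch. It is not the Hadoop implementation: the class IntervalSampleSketch and its sample helper are hypothetical names, and the sketch reads an in-memory list rather than an InputSplit through a RecordReader. It assumes the ratio is recomputed after every record read, and it says nothing about how counts are handled across multiple splits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntervalSampleSketch {

    // Hypothetical helper, not part of Hadoop: applies the documented
    // retention rule to an in-memory list instead of an input split.
    static <T> List<T> sample(List<T> records, double freq) {
        List<T> samples = new ArrayList<>();
        long seen = 0;  // records read so far
        long kept = 0;  // records retained so far
        for (T record : records) {
            ++seen;
            // Retain the record while the retained fraction is below freq.
            if ((double) kept / seen < freq) {
                samples.add(record);
                ++kept;
            }
        }
        return samples;
    }

    public static void main(String[] args) {
        List<Integer> sorted = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sample(sorted, 0.5)); // prints [1, 3, 5, 7, 9]
    }
}
```

With sorted input, the retained keys end up evenly spaced through the data, which is why this style of sampling suits sorted data.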

Specified by:
getSample in interface InputSampler.Sampler<K,V>
Throws:
IOException
InterruptedException


Copyright © 2009 The Apache Software Foundation