Class PathOutputCommitterFactory

java.lang.Object
org.apache.hadoop.conf.Configured
org.apache.hadoop.mapreduce.lib.output.PathOutputCommitterFactory
All Implemented Interfaces:
Configurable
Direct Known Subclasses:
ManifestCommitterFactory

@Public @Evolving public class PathOutputCommitterFactory extends Configured
A factory for committers that implement the PathOutputCommitter methods and so can be used from FileOutputFormat. The base implementation returns FileOutputCommitter instances. Algorithm:
  1. If an explicit committer factory is named, it is used.
  2. The output path is examined. If it is non-null and a factory is explicitly configured for that filesystem's scheme, that factory is instantiated.
  3. Otherwise, an instance of FileOutputCommitter is created.
In FileOutputFormat, the created factory has its method createOutputCommitter(Path, TaskAttemptContext) invoked with a task attempt context and a possibly null path.
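The lookup order above can be modelled without Hadoop on the classpath. The following is a minimal sketch, not the class's actual implementation: a Map stands in for Configuration, a java.net.URI stands in for Path, and the two key strings are assumed values of COMMITTER_FACTORY_CLASS and COMMITTER_FACTORY_SCHEME_PATTERN that should be confirmed against the Constant Field Values page. The class and method names are hypothetical.

```java
import java.net.URI;
import java.util.Map;

/**
 * Stand-alone model of the factory lookup order. It does not use Hadoop:
 * a Map stands in for Configuration and a URI for Path.
 */
final class CommitterFactoryResolutionSketch {

    // Assumed values of COMMITTER_FACTORY_CLASS and
    // COMMITTER_FACTORY_SCHEME_PATTERN; confirm before relying on them.
    static final String FACTORY_CLASS_KEY =
        "mapreduce.outputcommitter.factory.class";
    static final String SCHEME_KEY_PATTERN =
        "mapreduce.outputcommitter.factory.scheme.%s";

    // The default named by COMMITTER_FACTORY_DEFAULT.
    static final String DEFAULT_FACTORY =
        "org.apache.hadoop.mapreduce.lib.output.FileOutputCommitterFactory";

    /** Returns the class name of the factory the three steps would select. */
    static String resolveFactoryName(URI outputPath, Map<String, String> conf) {
        // Step 1: an explicitly named factory always wins.
        String explicit = trimmed(conf.get(FACTORY_CLASS_KEY));
        if (!explicit.isEmpty()) {
            return explicit;
        }
        // Step 2: with a non-null path, look for a factory bound to its scheme.
        if (outputPath != null && outputPath.getScheme() != null) {
            String schemeKey =
                String.format(SCHEME_KEY_PATTERN, outputPath.getScheme());
            String perScheme = trimmed(conf.get(schemeKey));
            if (!perScheme.isEmpty()) {
                return perScheme;
            }
        }
        // Step 3: fall back to the FileOutputCommitter factory.
        return DEFAULT_FACTORY;
    }

    // Treats a missing or blank value as "not set".
    private static String trimmed(String value) {
        return value == null ? "" : value.trim();
    }
}
```

In this model a null path skips step 2 entirely, which is why a null output path can only ever resolve to the explicit factory or the default.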
  • Field Details

    • COMMITTER_FACTORY_CLASS

      public static final String COMMITTER_FACTORY_CLASS
      Name of the configuration option naming the output committer factory to use. As described in the class algorithm, a factory set here is used in preference to any scheme-specific one.
    • COMMITTER_FACTORY_SCHEME

      public static final String COMMITTER_FACTORY_SCHEME
      Scheme prefix for per-filesystem scheme committers.
    • COMMITTER_FACTORY_SCHEME_PATTERN

      public static final String COMMITTER_FACTORY_SCHEME_PATTERN
      String format pattern for per-filesystem scheme committers.
    • FILE_COMMITTER_FACTORY

      public static final String FILE_COMMITTER_FACTORY
      The FileOutputCommitter factory.
    • NAMED_COMMITTER_FACTORY

      public static final String NAMED_COMMITTER_FACTORY
      The named committer factory.
    • NAMED_COMMITTER_CLASS

      public static final String NAMED_COMMITTER_CLASS
      The named output committer. Creates any committer listed in
    • COMMITTER_FACTORY_DEFAULT

      public static final String COMMITTER_FACTORY_DEFAULT
      Default committer factory name: "org.apache.hadoop.mapreduce.lib.output.FileOutputCommitterFactory".
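As an illustration of how the scheme prefix and format pattern relate, the sketch below builds the per-scheme configuration key for one filesystem scheme. The literal value assigned to COMMITTER_FACTORY_SCHEME here is an assumption and should be checked against the Constant Field Values page; the class and method names are hypothetical.

```java
/** Stand-alone sketch of deriving a per-scheme committer factory key. */
final class SchemeKeySketch {

    // Assumed value of the scheme prefix constant; confirm before use.
    static final String COMMITTER_FACTORY_SCHEME =
        "mapreduce.outputcommitter.factory.scheme";

    // The pattern appends the filesystem scheme to the prefix.
    static final String COMMITTER_FACTORY_SCHEME_PATTERN =
        COMMITTER_FACTORY_SCHEME + ".%s";

    /** Builds the configuration key for a filesystem scheme such as "s3a". */
    static String schemeKey(String scheme) {
        return String.format(COMMITTER_FACTORY_SCHEME_PATTERN, scheme);
    }

    public static void main(String[] args) {
        // prints "mapreduce.outputcommitter.factory.scheme.s3a"
        System.out.println(schemeKey("s3a"));
    }
}
```

In code that depends on Hadoop, the constants of this class would be referenced directly rather than redeclared as they are here.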
  • Constructor Details

    • PathOutputCommitterFactory

      public PathOutputCommitterFactory()
  • Method Details

    • createOutputCommitter

      public PathOutputCommitter createOutputCommitter(Path outputPath, TaskAttemptContext context) throws IOException
      Create an output committer for a task attempt.
      Parameters:
      outputPath - output path. This may be null.
      context - context
      Returns:
      a new committer
      Throws:
      IOException - problems instantiating the committer
    • createFileOutputCommitter

      protected final PathOutputCommitter createFileOutputCommitter(Path outputPath, TaskAttemptContext context) throws IOException
      Create an instance of the default committer, a FileOutputCommitter, for a task.
      Parameters:
      outputPath - the task's output path, or null if no output path has been defined.
      context - the task attempt context
      Returns:
      the committer to use
      Throws:
      IOException - problems instantiating the committer
    • getCommitterFactory

      public static PathOutputCommitterFactory getCommitterFactory(Path outputPath, Configuration conf)
      Get the committer factory for a configuration.
      Parameters:
      outputPath - the job's output path. If null, the scheme is unknown and a per-scheme factory cannot be determined.
      conf - configuration
      Returns:
      an instantiated committer factory
    • createCommitter

      public static PathOutputCommitter createCommitter(Path outputPath, TaskAttemptContext context) throws IOException
      Create the committer factory for a task attempt and destination, then create the committer from it.
      Parameters:
      outputPath - the task's output path, or null if no output path has been defined.
      context - the task attempt context
      Returns:
      the committer to use
      Throws:
      IOException - problems instantiating the committer
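The two-step shape of createCommitter (obtain a factory, then ask it for a committer) can be sketched with stand-in types. This is an illustrative model only: the Committer and Factory interfaces, the String parameters, and the class name are hypothetical and replace PathOutputCommitter, PathOutputCommitterFactory, Path, and TaskAttemptContext so that the example compiles without Hadoop.

```java
/** Stand-alone model of "get the factory, then create the committer". */
final class CreateCommitterSketch {

    /** Hypothetical stand-in for PathOutputCommitter. */
    interface Committer {
        String describe();
    }

    /** Hypothetical stand-in for a committer factory; the path may be null. */
    interface Factory {
        Committer createOutputCommitter(String outputPath, String taskAttemptId);
    }

    /** Stand-in for the factory lookup; always returns one simple factory. */
    static Factory getCommitterFactory(String outputPath) {
        return (path, attempt) -> {
            String target = (path == null) ? "<no output path>" : path;
            return () -> "committer for " + attempt + " writing to " + target;
        };
    }

    /** Mirrors the two steps: look up the factory, then create the committer. */
    static Committer createCommitter(String outputPath, String taskAttemptId) {
        Factory factory = getCommitterFactory(outputPath);
        return factory.createOutputCommitter(outputPath, taskAttemptId);
    }
}
```

The same possibly null path is handed to both steps, so a factory chosen without a path must also be able to create a committer without one.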