Class CompressionCodecFactory

java.lang.Object
org.apache.hadoop.io.compress.CompressionCodecFactory

@Public @Evolving public class CompressionCodecFactory extends Object
A factory that will find the correct codec for a given filename.
  • Field Details

    • LOG

      public static final org.slf4j.Logger LOG
  • Constructor Details

    • CompressionCodecFactory

      public CompressionCodecFactory(Configuration conf)
      Find the codecs specified in the config value io.compression.codecs and register them. Defaults to gzip and deflate.
      Parameters:
      conf - configuration.
  • Method Details

    • toString

      public String toString()
      Print the extension map out as a string.
      Overrides:
      toString in class Object
    • getCodecClasses

      public static List<Class<? extends CompressionCodec>> getCodecClasses(Configuration conf)
      Get the list of codecs discovered via a Java ServiceLoader, or listed in the configuration. Codecs specified in configuration come later in the returned list, and are considered to override those from the ServiceLoader.
      Parameters:
      conf - the configuration to look in
      Returns:
      a list of the CompressionCodec classes
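      The ordering rule above can be illustrated with a small stand-alone sketch. It does not use Hadoop: CodecEntry, combine and register are hypothetical names, and keying the registry by filename suffix is an assumption made here to show how a later list entry can override an earlier one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CodecOrderingSketch {
    // Hypothetical stand-in for a codec class: a name plus the suffix it claims.
    static final class CodecEntry {
        final String name;
        final String suffix;

        CodecEntry(String name, String suffix) {
            this.name = name;
            this.suffix = suffix;
        }
    }

    // Discovered entries come first, configured entries come later in the list.
    static List<CodecEntry> combine(List<CodecEntry> discovered, List<CodecEntry> configured) {
        List<CodecEntry> all = new ArrayList<>(discovered);
        all.addAll(configured);
        return all;
    }

    // Registering in list order means a later entry replaces an earlier one
    // that claims the same suffix (assumed registry shape, for illustration only).
    static Map<String, String> register(List<CodecEntry> ordered) {
        Map<String, String> bySuffix = new LinkedHashMap<>();
        for (CodecEntry entry : ordered) {
            bySuffix.put(entry.suffix, entry.name);
        }
        return bySuffix;
    }

    public static void main(String[] args) {
        List<CodecEntry> discovered = new ArrayList<>();
        discovered.add(new CodecEntry("DiscoveredGzip", ".gz"));
        List<CodecEntry> configured = new ArrayList<>();
        configured.add(new CodecEntry("ConfiguredGzip", ".gz"));

        // The configured entry wins because it is registered last.
        System.out.println(register(combine(discovered, configured)).get(".gz"));
    }
}
```

      Because the combined list is processed in order, placing configured codecs last is what lets them take precedence in this sketch.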
    • setCodecClasses

      public static void setCodecClasses(Configuration conf, List<Class> classes)
      Sets a list of codec classes in the configuration. In addition to any classes specified using this method, CompressionCodec classes on the classpath are discovered using a Java ServiceLoader.
      Parameters:
      conf - the configuration to modify
      classes - the list of classes to set
    • getCodec

      public CompressionCodec getCodec(Path file)
      Find the relevant compression codec for the given file based on its filename suffix.
      Parameters:
      file - the filename to check
      Returns:
      the codec object, or null if no registered codec matches the filename suffix
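      A minimal stand-alone sketch of suffix-based lookup follows. It is not the Hadoop implementation: findCodecName is a hypothetical helper, codecs are represented by plain strings, and preferring the longest matching suffix is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SuffixLookupSketch {
    // Returns the codec name registered for the longest suffix that the
    // filename ends with, or null when no registered suffix matches.
    static String findCodecName(Map<String, String> codecsBySuffix, String filename) {
        String bestName = null;
        int bestLength = -1;
        for (Map.Entry<String, String> entry : codecsBySuffix.entrySet()) {
            String suffix = entry.getKey();
            if (filename.endsWith(suffix) && suffix.length() > bestLength) {
                bestName = entry.getValue();
                bestLength = suffix.length();
            }
        }
        return bestName;
    }

    public static void main(String[] args) {
        // Illustrative suffix-to-codec table; the values are just labels here.
        Map<String, String> codecsBySuffix = new LinkedHashMap<>();
        codecsBySuffix.put(".gz", "GzipCodec");
        codecsBySuffix.put(".deflate", "DefaultCodec");

        System.out.println(findCodecName(codecsBySuffix, "logs/part-00000.gz"));  // prints GzipCodec
        System.out.println(findCodecName(codecsBySuffix, "logs/part-00000.txt")); // prints null
    }
}
```

      Callers of a lookup like this should handle the no-match case explicitly, for example by treating the file as uncompressed.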
    • getCodecByClassName

      public CompressionCodec getCodecByClassName(String classname)
      Find the relevant compression codec for the codec's canonical class name.
      Parameters:
      classname - the canonical class name of the codec
      Returns:
      the codec object
    • getCodecByName

      public CompressionCodec getCodecByName(String codecName)
      Find the relevant compression codec by its canonical class name or by a codec alias.

      Codec aliases are case insensitive.

      The codec alias is the short class name (without the package name). If the short class name ends with 'Codec', then the codec has two aliases: the complete short class name and the short class name without the 'Codec' ending. For example, for the 'GzipCodec' codec class name the aliases are 'gzip' and 'gzipcodec'.

      Parameters:
      codecName - the canonical class name or an alias of the codec
      Returns:
      the codec object
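      The alias rule described above can be sketched in plain Java. This is an illustration only, not the Hadoop source: aliasesFor is a hypothetical helper, and lower-casing the aliases is one assumed way to make lookups case insensitive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CodecAliasSketch {
    // Derives the lower-cased aliases for a fully qualified codec class name.
    static List<String> aliasesFor(String className) {
        // Short class name: everything after the last '.' (the whole name if there is none).
        String shortName = className.substring(className.lastIndexOf('.') + 1);

        List<String> aliases = new ArrayList<>();
        aliases.add(shortName.toLowerCase(Locale.ROOT));

        // A name ending in "Codec" also gets an alias without that ending.
        if (shortName.endsWith("Codec")) {
            String trimmed = shortName.substring(0, shortName.length() - "Codec".length());
            aliases.add(trimmed.toLowerCase(Locale.ROOT));
        }
        return aliases;
    }

    public static void main(String[] args) {
        // prints [gzipcodec, gzip]
        System.out.println(aliasesFor("org.apache.hadoop.io.compress.GzipCodec"));
    }
}
```

      With aliases stored in lower case, a lookup would lower-case the requested name before matching, so 'GZIP', 'Gzip' and 'gzip' all resolve to the same entry.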
    • getCodecClassByName

      public Class<? extends CompressionCodec> getCodecClassByName(String codecName)
      Find the relevant compression codec by its canonical class name or by a codec alias, and return its implementation class.

      Codec aliases are case insensitive.

      The codec alias is the short class name (without the package name). If the short class name ends with 'Codec', then the codec has two aliases: the complete short class name and the short class name without the 'Codec' ending. For example, for the 'GzipCodec' codec class name the aliases are 'gzip' and 'gzipcodec'.

      Parameters:
      codecName - the canonical class name or an alias of the codec
      Returns:
      the codec class
    • removeSuffix

      public static String removeSuffix(String filename, String suffix)
      Removes a suffix from a filename, if it has it.
      Parameters:
      filename - the filename to strip
      suffix - the suffix to remove
      Returns:
      the shortened filename
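      The behaviour described for removeSuffix can be sketched without Hadoop on the classpath. The class below is a stand-alone illustration with the same method name and parameters, not the library code.

```java
class RemoveSuffixSketch {
    // Returns filename without the trailing suffix, or filename unchanged
    // when it does not end with that suffix.
    static String removeSuffix(String filename, String suffix) {
        if (filename.endsWith(suffix)) {
            return filename.substring(0, filename.length() - suffix.length());
        }
        return filename;
    }

    public static void main(String[] args) {
        System.out.println(removeSuffix("data.txt.gz", ".gz")); // prints data.txt
        System.out.println(removeSuffix("data.txt", ".gz"));    // prints data.txt
    }
}
```

      A typical use is deriving an output name for decompressed data from the compressed file's name and the suffix of the codec that matched it.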
    • main

      public static void main(String[] args) throws Exception
      A little test program.
      Parameters:
      args - arguments.
      Throws:
      Exception - exception.