java.lang.Object
org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.ManifestSuccessData
All Implemented Interfaces:
Serializable, org.apache.hadoop.fs.statistics.IOStatisticsSource

@Public @Unstable public class ManifestSuccessData extends org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
Summary data saved into a _SUCCESS marker file. This is a copy of the S3A committer success data format, with the goal of remaining compatible with it. This makes it easier for tests in downstream modules to parse the success files from any of the committers.

This should be considered public; it is based on the S3A format, which has proven stable over time. The JSON format SHOULD be considered public and evolving with compatibility across versions.

The Java serialization data is different: it may change across versions, with no stability guarantee other than "manifest summaries MAY be serialized between processes with the exact same version of this binary on their classpaths." That is sufficient for testing in Spark.

To aid Java serialization, the maps and lists are exclusively those which serialize well. IOStatisticsSnapshot has a lot of complexity in marshalling; this class doesn't worry about concurrent access, so is simpler.
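The Java serialization constraints described above can be illustrated with a minimal sketch. It is not the Hadoop implementation: `SummarySketch`, its fields, and the base value of `serialVersionUID` are all hypothetical. It assumes the UID is derived from a version constant, so that a version bump invalidates older serialized instances, and it uses concrete `TreeMap` and `ArrayList` fields as examples of collections that serialize well.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.TreeMap;

/** Hypothetical stand-in for a summary class; not the Hadoop implementation. */
final class SummarySketch implements Serializable {
  static final int VERSION = 1;

  // Assumption: tying the UID to VERSION means a version change also changes
  // the UID, so stale serialized data is rejected. The base value is arbitrary.
  private static final long serialVersionUID = 1000L + VERSION;

  // Concrete serializable collection types rather than Map/List interfaces.
  final TreeMap<String, Long> metrics = new TreeMap<>();
  final ArrayList<String> filenames = new ArrayList<>();

  /** Serialize and deserialize in one JVM, so both sides share the class version. */
  static SummarySketch roundTrip(SummarySketch source) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(source);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SummarySketch) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }

  public static void main(String[] args) {
    SummarySketch data = new SummarySketch();
    data.metrics.put("files_committed", 2L);
    data.filenames.add("part-0000");
    SummarySketch copy = roundTrip(data);
    System.out.println(copy.metrics + " " + copy.filenames);
  }
}
```

The round trip only demonstrates the same-version case that the class description guarantees; exchanging serialized data between different versions is exactly what the description warns against.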
  • Field Details

    • VERSION

      public static final int VERSION
      Supported version value: 1. If this is changed, the value of serialVersionUID will also change, to avoid deserialization problems.
    • NAME

      public static final String NAME
      Name to include in persisted data, so as to differentiate from any other manifests: "org.apache.hadoop.fs.s3a.commit.files.SuccessData/1".
  • Constructor Details

    • ManifestSuccessData

      public ManifestSuccessData()
  • Method Details

    • validate

      public ManifestSuccessData validate() throws IOException
      Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
      Validate the data: every field which must be non-empty must be set.
      Specified by:
      validate in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
      Returns:
      the validated instance.
      Throws:
      IOException - if the data is invalid
    • createSerializer

      public org.apache.hadoop.util.JsonSerialization<ManifestSuccessData> createSerializer()
      Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
      Get a (usually shared) JSON serializer.
      Specified by:
      createSerializer in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
      Returns:
      a serializer.
    • toBytes

      public byte[] toBytes() throws IOException
      Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
      Serialize to JSON and then to a byte array, after performing a preflight validation of the data to be saved.
      Specified by:
      toBytes in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
      Returns:
      the data in a persistable form.
      Throws:
      IOException - serialization problem or validation failure.
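The validate-then-serialize order described for toBytes can be sketched as follows. This is an illustration only, not the Hadoop code: `ValidatedRecordSketch` and its single `name` field are hypothetical, the JSON is hand-rolled rather than produced by a JSON library, and UTF-8 encoding is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical record illustrating validate-before-save; not the Hadoop class. */
final class ValidatedRecordSketch {
  String name = "";

  /** Fail fast if a required field is unset; return this to allow chaining. */
  ValidatedRecordSketch validate() throws IOException {
    if (name == null || name.isEmpty()) {
      throw new IOException("Field 'name' must be set");
    }
    return this;
  }

  /** Minimal JSON for one field; escapes only backslashes and double quotes. */
  String toJson() {
    String escaped = name.replace("\\", "\\\\").replace("\"", "\\\"");
    return "{\"name\":\"" + escaped + "\"}";
  }

  /** Validate first, so invalid data is never converted to bytes. */
  byte[] toBytes() throws IOException {
    validate();
    // Assumption: the persisted form is UTF-8 encoded JSON.
    return toJson().getBytes(StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    ValidatedRecordSketch record = new ValidatedRecordSketch();
    record.name = "demo";
    System.out.println(new String(record.toBytes(), StandardCharsets.UTF_8));
  }
}
```

Because validation runs inside toBytes, a caller cannot persist an incomplete record by accident; the same IOException type covers both validation and serialization failures, matching the Throws clause above.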
    • toJson

      public String toJson() throws IOException
      Serialize to a JSON string.
      Returns:
      the JSON string value.
      Throws:
      IOException - failure
    • save

      public void save(FileSystem fs, Path path, boolean overwrite) throws IOException
      Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
      Save to a Hadoop filesystem.
      Specified by:
      save in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
      Parameters:
      fs - filesystem
      path - path
      overwrite - should any existing file be overwritten
      Throws:
      IOException - IO exception
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • dumpMetrics

      public String dumpMetrics(String prefix, String middle, String suffix)
      Dump the metrics (if any) to a string. The metrics are sorted for ease of viewing.
      Parameters:
      prefix - prefix before every entry
      middle - string between key and value
      suffix - suffix to each entry
      Returns:
      the dumped string
    • dumpDiagnostics

      public String dumpDiagnostics(String prefix, String middle, String suffix)
      Dump the diagnostics (if any) to a string.
      Parameters:
      prefix - prefix before every entry
      middle - string between key and value
      suffix - suffix to each entry
      Returns:
      the dumped string
    • joinMap

      protected static String joinMap(Map<String,?> map, String prefix, String middle, String suffix)
      Join any map of string to value into a string, sorting the keys first.
      Parameters:
      map - map to join
      prefix - prefix before every entry
      middle - string between key and value
      suffix - suffix to each entry
      Returns:
      a string for reporting.
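The sorted join that joinMap describes, and that dumpMetrics and dumpDiagnostics appear to build on, can be sketched with the JDK alone. This is an illustrative reimplementation rather than the Hadoop source: the class name `JoinMapSketch` is hypothetical, and returning an empty string for a null map is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch of a sorted map join; not the Hadoop implementation. */
final class JoinMapSketch {

  /** Emit prefix + key + middle + value + suffix for each entry, in key order. */
  static String joinMap(Map<String, ?> map, String prefix, String middle, String suffix) {
    if (map == null) {
      return ""; // assumption: a missing map is treated as empty
    }
    // Copy and sort the keys so the output order is independent of the map type.
    List<String> keys = new ArrayList<>(map.keySet());
    Collections.sort(keys);
    StringBuilder sb = new StringBuilder();
    for (String key : keys) {
      sb.append(prefix).append(key).append(middle).append(map.get(key)).append(suffix);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, Long> metrics = new HashMap<>();
    metrics.put("task_count", 4L);
    metrics.put("bytes_written", 1024L);
    System.out.print(joinMap(metrics, "  ", " = ", "\n"));
  }
}
```

With a two-space prefix, " = " as the middle, and a newline suffix, the sample prints the `bytes_written` entry before the `task_count` entry, one per line, regardless of insertion order.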
    • load

      public static ManifestSuccessData load(FileSystem fs, Path path) throws IOException
      Load an instance from a file, then validate it.
      Parameters:
      fs - filesystem
      path - path
      Returns:
      the loaded instance
      Throws:
      IOException - IO failure
    • serializer

      public static org.apache.hadoop.util.JsonSerialization<ManifestSuccessData> serializer()
      Get a JSON serializer for this class.
      Returns:
      a serializer.
    • getName

      public String getName()
    • setName

      public void setName(String name)
    • getTimestamp

      public long getTimestamp()
      Returns:
      timestamp of creation.
    • setTimestamp

      public void setTimestamp(long timestamp)
    • getDate

      public String getDate()
      Returns:
      timestamp as date; no expectation of parseability.
    • setDate

      public void setDate(String date)
    • getHostname

      public String getHostname()
      Returns:
      host which created the file (implicitly: committed the work).
    • setHostname

      public void setHostname(String hostname)
    • getCommitter

      public String getCommitter()
      Returns:
      committer name.
    • setCommitter

      public void setCommitter(String committer)
    • getDescription

      public String getDescription()
      Returns:
      any description text.
    • setDescription

      public void setDescription(String description)
    • getMetrics

      public Map<String,Long> getMetrics()
      Returns:
      any metrics.
    • setMetrics

      public void setMetrics(TreeMap<String,Long> metrics)
    • getFilenames

      public List<String> getFilenames()
      Returns:
      a list of filenames in the commit.
    • getFilenamePaths

      public List<Path> getFilenamePaths()
      Get the list of filenames as paths.
      Returns:
      the paths.
    • setFilenamePaths

      public void setFilenamePaths(List<Path> paths)
      Set the list of filename paths.
    • setFilenames

      public void setFilenames(ArrayList<String> filenames)
    • getDiagnostics

      public Map<String,String> getDiagnostics()
    • setDiagnostics

      public void setDiagnostics(TreeMap<String,String> diagnostics)
    • putDiagnostic

      public void putDiagnostic(String key, String value)
      Add a diagnostics entry.
      Parameters:
      key - name
      value - value
    • getJobId

      public String getJobId()
      Returns:
      Job ID, if known.
    • setJobId

      public void setJobId(String jobId)
    • getJobIdSource

      public String getJobIdSource()
    • setJobIdSource

      public void setJobIdSource(String jobIdSource)
    • getIOStatistics

      public IOStatisticsSnapshot getIOStatistics()
      Description copied from interface: org.apache.hadoop.fs.statistics.IOStatisticsSource
      Return a statistics instance.

      It is not a requirement that the same instance is returned every time.

      If the object implementing this is Closeable, this method may return null if invoked on a closed object, even if it returns a valid instance when called earlier.

      Returns:
      an IOStatistics instance or null
    • setIOStatistics

      public void setIOStatistics(IOStatisticsSnapshot ioStatistics)
    • snapshotIOStatistics

      public void snapshotIOStatistics(IOStatistics iostats)
      Set the IOStatistics to a snapshot of the source.
      Parameters:
      iostats - statistics; may be null.
    • setSuccess

      public void setSuccess(boolean success)
      Set the success flag.
      Parameters:
      success - did the job succeed?
    • getSuccess

      public boolean getSuccess()
      Get the success flag.
      Returns:
      did the job succeed?
    • getState

      public String getState()
    • setState

      public void setState(String state)
    • getStage

      public String getStage()
    • recordJobFailure

      public void recordJobFailure(Throwable thrown)
      Note a failure by setting the success flag to false, then adding the exception to the diagnostics.
      Parameters:
      thrown - throwable
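The failure-recording behavior described for recordJobFailure can be sketched as below. The class `FailureRecordSketch` is hypothetical, and so are the diagnostic key names "exception" and "stacktrace"; the keys actually used by the Hadoop class are not stated in this page.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.TreeMap;

/** Hypothetical sketch of recording a job failure; not the Hadoop class. */
final class FailureRecordSketch {
  boolean success = true;
  final TreeMap<String, String> diagnostics = new TreeMap<>();

  /** Clear the success flag, then store the exception text and stack trace. */
  void recordJobFailure(Throwable thrown) {
    success = false;
    StringWriter trace = new StringWriter();
    thrown.printStackTrace(new PrintWriter(trace, true));
    // Assumption: these key names are illustrative only.
    diagnostics.put("exception", thrown.toString());
    diagnostics.put("stacktrace", trace.toString());
  }

  public static void main(String[] args) {
    FailureRecordSketch summary = new FailureRecordSketch();
    summary.recordJobFailure(new IllegalStateException("disk full"));
    System.out.println(summary.success + " " + summary.diagnostics.get("exception"));
  }
}
```

Storing the failure as plain strings keeps the diagnostics map serializable to JSON without any custom handling of exception types.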