Class ManifestSuccessData
java.lang.Object
org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.ManifestSuccessData
- All Implemented Interfaces:
Serializable,org.apache.hadoop.fs.statistics.IOStatisticsSource
@Public
@Unstable
public class ManifestSuccessData
extends org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
Summary data saved into a
_SUCCESS marker file.
This is a copy of the S3A committer success data format, with
the goal of being and remaining compatible with it.
This makes it easier for tests in downstream modules to
parse the success files from any of the committers.
This class should be considered public; it is based on the S3A
format, which has proven stable over time.
The JSON format SHOULD be considered public and evolving
with compatibility across versions.
All the Java serialization data is different and may change
across versions with no stability guarantees other than
"manifest summaries MAY be serialized between processes with
the exact same version of this binary on their classpaths."
That is sufficient for testing in Spark.
To aid with Java serialization, the maps and lists are
exclusively those which serialize well.
IOStatisticsSnapshot has a lot of complexity in marshalling
there; this class doesn't worry about concurrent access
so is simpler.
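As an illustration of the Java serialization point above, here is a minimal sketch. It is not the Hadoop code. A hypothetical `SummarySketch` class keeps its data in concrete `TreeMap` and `ArrayList` fields, both of which implement `Serializable`, and is round-tripped through `ObjectOutputStream` inside one JVM. All class, field, and method names here are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.TreeMap;

/** Hypothetical stand-in for a summary class; not ManifestSuccessData itself. */
final class SummarySketch implements Serializable {

  private static final long serialVersionUID = 1L;

  // Concrete collection types that are known to be Serializable.
  private final TreeMap<String, Long> metrics = new TreeMap<>();
  private final ArrayList<String> filenames = new ArrayList<>();

  TreeMap<String, Long> getMetrics() {
    return metrics;
  }

  ArrayList<String> getFilenames() {
    return filenames;
  }

  /** Serialize with Java serialization, then deserialize in the same JVM. */
  static SummarySketch roundTrip(SummarySketch source) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(source);
      }
      try (ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(bytes.toByteArray()))) {
        return (SummarySketch) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      // Wrapped so that callers of this sketch need no checked-exception handling.
      throw new IllegalStateException("round trip failed", e);
    }
  }
}
```

Because both ends of the round trip use the same class definition, this mirrors the only guarantee stated above: the exact same version of the binary on both sides.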
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type Method - Description
org.apache.hadoop.util.JsonSerialization<ManifestSuccessData> createSerializer() - Get a (usually shared) JSON serializer.
dumpDiagnostics(String prefix, String middle, String suffix) - Dump the diagnostics (if any) to a string.
dumpMetrics(String prefix, String middle, String suffix) - Dump the metrics (if any) to a string.
getDate()
getFilenamePaths() - Get the list of filenames as paths.
getIOStatistics() - Return a statistics instance.
getJobId()
getName()
getStage()
getState()
boolean getSuccess() - Get the success flag.
long getTimestamp()
protected static String joinMap - Join any map of string to value into a string, sorting the keys first.
static ManifestSuccessData load(FileSystem fs, Path path) - Load an instance from a file, then validate it.
void putDiagnostic(String key, String value) - Add a diagnostics entry.
void recordJobFailure(Throwable thrown) - Note a failure by setting success flag to false, then add the exception to the diagnostics.
void save(FileSystem fs, Path path, boolean overwrite) - Save to a hadoop filesystem.
static org.apache.hadoop.util.JsonSerialization<ManifestSuccessData> serializer() - Get a JSON serializer for this class.
void setCommitter(String committer)
void setDate
void setDescription(String description)
void setDiagnostics(TreeMap<String, String> diagnostics)
void setFilenamePaths(List<Path> paths) - Set the list of filename paths.
void setFilenames(ArrayList<String> filenames)
void setHostname(String hostname)
void setIOStatistics(IOStatisticsSnapshot ioStatistics)
void setJobId
void setJobIdSource(String jobIdSource)
void setMetrics(TreeMap<String, Long> metrics)
void setName
void setState
void setSuccess(boolean success) - Set the success flag.
void setTimestamp(long timestamp)
void snapshotIOStatistics(IOStatistics iostats) - Set the IOStatistics to a snapshot of the source.
byte[] toBytes() - Serialize to JSON and then to a byte array, after performing a preflight validation of the data to be saved.
toJson() - To JSON.
toString()
validate() - Validate the data: those fields which must be non empty, must be set.
Methods inherited from class
org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
marshallPath, unmarshallPath
-
Field Details
-
VERSION
public static final int VERSION
Supported version value: 1. If this is changed, the value of serialVersionUID will change, to avoid deserialization problems.
-
NAME
Name to include in persisted data, so as to differentiate from any other manifests: "org.apache.hadoop.fs.s3a.commit.files.SuccessData/1".
-
-
Constructor Details
-
ManifestSuccessData
public ManifestSuccessData()
-
-
Method Details
-
validate
Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
Validate the data: those fields which must be non empty, must be set.
- Specified by:
validate in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
- Returns:
- the validated instance.
- Throws:
IOException - if the data is invalid
-
createSerializer
Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
Get a (usually shared) JSON serializer.
- Specified by:
createSerializer in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
- Returns:
- a serializer. Call
-
toBytes
Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
Serialize to JSON and then to a byte array, after performing a preflight validation of the data to be saved.
- Specified by:
toBytes in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
- Returns:
- the data in a persistable form.
- Throws:
IOException - serialization problem or validation failure.
-
toJson
To JSON.
- Returns:
- JSON string value.
- Throws:
IOException - failure
-
save
Description copied from class: org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData
Save to a hadoop filesystem.
- Specified by:
save in class org.apache.hadoop.mapreduce.lib.output.committer.manifest.files.AbstractManifestData<ManifestSuccessData>
- Parameters:
fs - filesystem
path - path
overwrite - should any existing file be overwritten
- Throws:
IOException - IO exception
-
toString
-
dumpMetrics
Dump the metrics (if any) to a string. The metrics are sorted for ease of viewing.
- Parameters:
prefix - prefix before every entry
middle - string between key and value
suffix - suffix to each entry
- Returns:
- the dumped string
-
dumpDiagnostics
Dump the diagnostics (if any) to a string.
- Parameters:
prefix - prefix before every entry
middle - string between key and value
suffix - suffix to each entry
- Returns:
- the dumped string
-
joinMap
Join any map of string to value into a string, sorting the keys first.
- Parameters:
map - map to join
prefix - prefix before every entry
middle - string between key and value
suffix - suffix to each entry
- Returns:
- a string for reporting.
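The join described here can be sketched with the standard library alone. This is a minimal illustration of the technique (copy into a sorted map, then emit prefix, key, middle, value, suffix for each entry), not the Hadoop implementation. The class name `JoinMapSketch`, the generic signature, and the handling of a null map are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a sorted map join; not the Hadoop implementation. */
final class JoinMapSketch {

  private JoinMapSketch() {
  }

  /**
   * Join a map of string to value into one string, visiting keys in sorted order.
   * Returning "" for a null map is an assumption of this sketch.
   */
  static <T> String joinMap(Map<String, T> map,
      String prefix, String middle, String suffix) {
    if (map == null) {
      return "";
    }
    // Copying into a TreeMap orders the keys by natural String ordering.
    Map<String, T> sorted = new TreeMap<>(map);
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, T> entry : sorted.entrySet()) {
      sb.append(prefix)
          .append(entry.getKey())
          .append(middle)
          .append(entry.getValue())
          .append(suffix);
    }
    return sb.toString();
  }
}
```

With a map holding `b=2` and `a=1`, calling `JoinMapSketch.joinMap(map, "[", "=", "]")` yields `[a=1][b=2]`, whatever the insertion order was.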
-
load
Load an instance from a file, then validate it.
- Parameters:
fs - filesystem
path - path
- Returns:
- the loaded instance
- Throws:
IOException - IO failure
-
serializer
Get a JSON serializer for this class.
- Returns:
- a serializer.
-
getName
-
setName
-
getTimestamp
public long getTimestamp()
- Returns:
- timestamp of creation.
-
setTimestamp
public void setTimestamp(long timestamp)
-
getDate
- Returns:
- timestamp as date; no expectation of parseability.
-
setDate
-
getHostname
- Returns:
- host which created the file (implicitly: committed the work).
-
setHostname
-
getCommitter
- Returns:
- committer name.
-
setCommitter
-
getDescription
- Returns:
- any description text.
-
setDescription
-
getMetrics
- Returns:
- any metrics.
-
setMetrics
-
getFilenames
- Returns:
- a list of filenames in the commit.
-
getFilenamePaths
Get the list of filenames as paths.
- Returns:
- the paths.
-
setFilenamePaths
Set the list of filename paths.
-
setFilenames
-
getDiagnostics
-
setDiagnostics
-
putDiagnostic
Add a diagnostics entry.
- Parameters:
key - name
value - value
-
getJobId
- Returns:
- Job ID, if known.
-
setJobId
-
getJobIdSource
-
setJobIdSource
-
getIOStatistics
Description copied from interface: org.apache.hadoop.fs.statistics.IOStatisticsSource
Return a statistics instance.
It is not a requirement that the same instance is returned every time.
If the object implementing this is Closeable, this method may return null if invoked on a closed object, even if it returned a valid instance when called earlier.
- Returns:
- an IOStatistics instance or null
-
setIOStatistics
-
snapshotIOStatistics
Set the IOStatistics to a snapshot of the source.
- Parameters:
iostats - statistics; may be null.
-
setSuccess
public void setSuccess(boolean success)
Set the success flag.
- Parameters:
success - did the job succeed?
-
getSuccess
public boolean getSuccess()
Get the success flag.
- Returns:
- did the job succeed?
-
getState
-
setState
-
getStage
-
recordJobFailure
Note a failure by setting success flag to false, then add the exception to the diagnostics.
- Parameters:
thrown - throwable
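The two steps described here, clearing the success flag and then adding the exception to the diagnostics, can be sketched on a stand-in class. This is an illustration under assumptions, not the Hadoop code. The class name `FailureRecordSketch`, the initial `true` value of the flag, the diagnostics key `"exception"`, and the use of `Throwable.toString()` as the stored value are all hypothetical.

```java
import java.util.TreeMap;

/** Hypothetical stand-in showing the failure-recording pattern. */
final class FailureRecordSketch {

  // Assumption: the flag starts as true and is only cleared on failure.
  private boolean success = true;

  private final TreeMap<String, String> diagnostics = new TreeMap<>();

  boolean getSuccess() {
    return success;
  }

  TreeMap<String, String> getDiagnostics() {
    return diagnostics;
  }

  void putDiagnostic(String key, String value) {
    diagnostics.put(key, value);
  }

  /** Step 1: mark the job as failed. Step 2: record the exception text. */
  void recordJobFailure(Throwable thrown) {
    success = false;
    // "exception" is a made-up key for this sketch; the real keys may differ.
    putDiagnostic("exception", thrown.toString());
  }
}
```

A caller that catches an exception during job commit would pass it to `recordJobFailure` before saving the summary, so that readers of the saved data can see both that the job failed and why.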
-