Package org.apache.hadoop.fs.s3a.commit
Class CommitConstants
java.lang.Object
org.apache.hadoop.fs.s3a.commit.CommitConstants
Constants for working with committers.
-
Field Summary
Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final String | _SUCCESS | Marker file to create on success: "_SUCCESS". |
| static final String | BASE | Marker of the start of a directory tree for calculating the final path names: "__base". |
| static final String | COMMITTER_NAME_DIRECTORY | Option for FS_S3A_COMMITTER_NAME: directory output committer: "directory". |
| static final String | COMMITTER_NAME_FILE | Option for FS_S3A_COMMITTER_NAME: classic/file output committer: "file". |
| static final String | COMMITTER_NAME_MAGIC | Option for FS_S3A_COMMITTER_NAME: magic output committer: "magic". |
| static final String | COMMITTER_NAME_PARTITIONED | Option for FS_S3A_COMMITTER_NAME: partition output committer: "partitioned". |
| static final String | CONFLICT_MODE_APPEND | Conflict mode: "append". |
| static final String | CONFLICT_MODE_FAIL | Conflict mode: "fail". |
| static final String | CONFLICT_MODE_REPLACE | Conflict mode: "replace". |
| static final String | CREATE_SUCCESSFUL_JOB_OUTPUT_DIR_MARKER | Flag to trigger creation of a marker file on job completion. |
| static final int | DEFAULT_COMMITTER_THREADS | Default value for FS_S3A_COMMITTER_THREADS: 32. |
| static final String | DEFAULT_CONFLICT_MODE | Default conflict mode: "append". |
| static final boolean | DEFAULT_CREATE_SUCCESSFUL_JOB_DIR_MARKER | Default job marker option: true. |
| static final boolean | DEFAULT_FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS | Default configuration value for FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS. |
| static final boolean | DEFAULT_MAGIC_COMMITTER_ENABLED | Is the committer enabled by default: true. |
| static final boolean | DEFAULT_S3A_COMMITTER_GENERATE_UUID | Default value for FS_S3A_COMMITTER_GENERATE_UUID: false. |
| static final boolean | DEFAULT_S3A_COMMITTER_REQUIRE_UUID | Default value for FS_S3A_COMMITTER_REQUIRE_UUID: false. |
| static final boolean | DEFAULT_STAGING_COMMITTER_UNIQUE_FILENAMES | Default value for FS_S3A_COMMITTER_STAGING_UNIQUE_FILENAMES: true. |
| static final String | FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS | Should committers abort all pending uploads to the destination directory? |
| static final String | FS_S3A_COMMITTER_GENERATE_UUID | Generate a UUID in job setup rather than fall back to YARN Application attempt ID. |
| static final String | FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED | Should Magic committer cleanup all the staging dirs. |
| static final boolean | FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED_DEFAULT | Default value for FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED: true. |
| static final String | FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED | Should Magic committer track all the pending commits in memory? |
| static final boolean | FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED_DEFAULT | Default value for FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED: false. |
| static final String | FS_S3A_COMMITTER_NAME | Option to identify the S3A committer: "fs.s3a.committer.name". |
| static final String | FS_S3A_COMMITTER_REQUIRE_UUID | Require the spark UUID to be passed down: "fs.s3a.committer.require.uuid". |
| static final String | FS_S3A_COMMITTER_STAGING_ABORT_PENDING_UPLOADS | Deprecated. |
| static final String | FS_S3A_COMMITTER_STAGING_CONFLICT_MODE | Staging committer conflict resolution policy: "fs.s3a.committer.staging.conflict-mode". |
| static final String | FS_S3A_COMMITTER_STAGING_TMP_PATH | Path in the cluster filesystem for temporary data: "fs.s3a.committer.staging.tmp.path". |
| static final String | FS_S3A_COMMITTER_STAGING_UNIQUE_FILENAMES | Option for final files to have a uniqueness name through job attempt info, falling back to a new UUID if there is no job attempt information to use. |
| static final String | FS_S3A_COMMITTER_THREADS | Number of threads in committers for parallel operations on files (upload, commit, abort, delete...): "fs.s3a.committer.threads". |
| static final String | JOB_ID_PREFIX | |
| static final String | MAGIC | Path for "magic" writes: path and PENDING_SUFFIX files: "__magic". |
| static final String | MAGIC_COMMITTER_ENABLED | Flag to indicate whether support for the Magic committer is enabled in the filesystem. |
| static final String | MAGIC_COMMITTER_PENDING_OBJECT_ETAG_NAME | Etag name to be returned on non-committed S3 object: "pending". |
| static final String | MAGIC_COMMITTER_PREFIX | Flag to indicate whether support for the Magic committer is enabled in the filesystem. |
| static final String | MAGIC_PATH_PREFIX | |
| static final String | OPT_PREFIX | Prefix to use for config options: "fs.s3a.committer.". |
| static final String | OPT_SUMMARY_REPORT_DIR | Directory for saving job summary reports. |
| static final String | PARAM_TASK_ATTEMPT_ID | Task Attempt ID query header: "ta". |
| static final String | PENDING_SUFFIX | Suffix applied to pending commit metadata: ".pending". |
| static final String | PENDINGSET_SUFFIX | Suffix applied to multiple pending commit metadata: ".pendingset". |
| static final String | S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS | Experimental feature to collect thread level IO statistics. |
| static final boolean | S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS_DEFAULT | Default value for S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS. |
| static final String | S3A_COMMITTER_FACTORY | S3 Committer factory: "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory". |
| static final String | S3A_COMMITTER_FACTORY_KEY | Key to set for the S3A schema to use the specific committer. |
| static final String | STORE_CAPABILITY_MAGIC_COMMITTER | Flag to indicate that a store supports magic committers. |
| static final String | STORE_CAPABILITY_MAGIC_COMMITTER_OLD | Deprecated. |
| static final String | STREAM_CAPABILITY_MAGIC_OUTPUT | Flag to indicate whether a stream is a magic output stream; returned in StreamCapabilities. Value: "fs.s3a.capability.magic.output.stream". |
| static final String | STREAM_CAPABILITY_MAGIC_OUTPUT_OLD | Deprecated. |
| static final int | SUCCESS_MARKER_FILE_LIMIT | The limit to the number of committed objects tracked during job commits and saved to the _SUCCESS file. |
| static final String | TASK_ATTEMPT_ID | Extra Data key for task attempt in pendingset files. |
| static final String | TEMP_DATA | Temp data which is not auto-committed: "_temporary". |
| static final String | TEMPORARY | This is the "Pending" directory of the FileOutputCommitter; data written here is, in that algorithm, renamed into place. |
| static final String | X_HEADER_MAGIC_MARKER | Magic Marker header to declare final file length on magic uploads marker objects: "x-hadoop-s3a-magic-data-length". |
| static final String | XA_MAGIC_MARKER | XAttr name of magic marker, with "header." prefix: "header.x-hadoop-s3a-magic-data-length". |

-
Method Summary
-
Field Details
-
MAGIC
Path for "magic" writes: path and PENDING_SUFFIX files: "__magic".
-
JOB_ID_PREFIX
-
MAGIC_PATH_PREFIX
-
BASE
Marker of the start of a directory tree for calculating the final path names: "__base".
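The role of the two markers can be illustrated with a small sketch. This is not the Hadoop implementation: it assumes that everything before "__magic" is the destination directory, that the tree after "__base" is preserved, and that without "__base" only the filename is kept. The class name, method name, and the "job-1"/"task-1" path elements are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; names and mapping rules are assumptions. */
final class MagicPathSketch {
    static final String MAGIC = "__magic";
    static final String BASE = "__base";

    /** Maps a path under a "__magic" directory to its assumed final destination. */
    static String finalDestination(String magicPath) {
        List<String> elements = Arrays.asList(magicPath.split("/"));
        int magic = elements.indexOf(MAGIC);
        if (magic < 0) {
            return magicPath; // not a magic path: leave unchanged
        }
        // Everything before "__magic" is treated as the destination directory.
        List<String> result = new ArrayList<>(elements.subList(0, magic));
        int base = elements.lastIndexOf(BASE);
        if (base > magic) {
            // Keep the directory tree that follows the "__base" marker.
            result.addAll(elements.subList(base + 1, elements.size()));
        } else {
            // No marker: only the filename is kept.
            result.add(elements.get(elements.size() - 1));
        }
        return String.join("/", result);
    }

    public static void main(String[] args) {
        // prints "dest/year=2024/part-0000"
        System.out.println(finalDestination("dest/__magic/job-1/task-1/__base/year=2024/part-0000"));
        // prints "dest/part-0000"
        System.out.println(finalDestination("dest/__magic/job-1/task-1/part-0000"));
    }
}
```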
-
PENDING_SUFFIX
Suffix applied to pending commit metadata: ".pending".
-
PENDINGSET_SUFFIX
Suffix applied to multiple pending commit metadata: ".pendingset".
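As a rough illustration of how the two suffixes distinguish metadata files, the following sketch classifies a filename by its suffix. The class, enum, and method names are hypothetical; only the two suffix values come from this page.

```java
/** Illustrative sketch only; not part of the Hadoop API. */
final class PendingFileSketch {
    static final String PENDING_SUFFIX = ".pending";
    static final String PENDINGSET_SUFFIX = ".pendingset";

    enum Kind { PENDING, PENDINGSET, OTHER }

    /** Classifies a filename by the commit metadata suffix it carries, if any. */
    static Kind classify(String filename) {
        if (filename.endsWith(PENDINGSET_SUFFIX)) {
            return Kind.PENDINGSET; // metadata for multiple pending commits
        }
        if (filename.endsWith(PENDING_SUFFIX)) {
            return Kind.PENDING; // metadata for a single pending commit
        }
        return Kind.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(classify("part-0000.pending"));
        System.out.println(classify("task_0001.pendingset"));
        System.out.println(classify("part-0000.parquet"));
    }
}
```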
-
MAGIC_COMMITTER_PENDING_OBJECT_ETAG_NAME
Etag name to be returned on non-committed S3 object: "pending".
-
OPT_PREFIX
Prefix to use for config options: "fs.s3a.committer.".
-
MAGIC_COMMITTER_PREFIX
Flag to indicate whether support for the Magic committer is enabled in the filesystem. Value: "fs.s3a.committer.magic".
-
MAGIC_COMMITTER_ENABLED
Flag to indicate whether support for the Magic committer is enabled in the filesystem. Value: "fs.s3a.committer.magic.enabled".
-
STREAM_CAPABILITY_MAGIC_OUTPUT
Flag to indicate whether a stream is a magic output stream; returned in StreamCapabilities. Value: "fs.s3a.capability.magic.output.stream".
-
STORE_CAPABILITY_MAGIC_COMMITTER
Flag to indicate that a store supports magic committers; returned in PathCapabilities. Value: "fs.s3a.capability.magic.committer".
-
STREAM_CAPABILITY_MAGIC_OUTPUT_OLD
Deprecated.
Flag to indicate whether a stream is a magic output stream; returned in StreamCapabilities. Value: "s3a:magic.output.stream".
-
STORE_CAPABILITY_MAGIC_COMMITTER_OLD
Deprecated.
Flag to indicate that a store supports magic committers; returned in PathCapabilities. Value: "s3a:magic.committer".
-
DEFAULT_MAGIC_COMMITTER_ENABLED
public static final boolean DEFAULT_MAGIC_COMMITTER_ENABLED
Is the committer enabled by default: true.
-
TEMPORARY
This is the "Pending" directory of the FileOutputCommitter; data written here is, in that algorithm, renamed into place. Value: "_temporary".
-
TEMP_DATA
Temp data which is not auto-committed: "_temporary".
-
CREATE_SUCCESSFUL_JOB_OUTPUT_DIR_MARKER
Flag to trigger creation of a marker file on job completion.
-
_SUCCESS
Marker file to create on success: "_SUCCESS".
-
DEFAULT_CREATE_SUCCESSFUL_JOB_DIR_MARKER
public static final boolean DEFAULT_CREATE_SUCCESSFUL_JOB_DIR_MARKER
Default job marker option: true.
-
S3A_COMMITTER_FACTORY_KEY
Key to set for the S3A schema to use the specific committer.
-
S3A_COMMITTER_FACTORY
S3 Committer factory: "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory". This uses the value of FS_S3A_COMMITTER_NAME to choose the final committer.
-
FS_S3A_COMMITTER_NAME
Option to identify the S3A committer: "fs.s3a.committer.name".
-
COMMITTER_NAME_FILE
Option for FS_S3A_COMMITTER_NAME: classic/file output committer: "file".
-
COMMITTER_NAME_MAGIC
Option for FS_S3A_COMMITTER_NAME: magic output committer: "magic".
-
COMMITTER_NAME_DIRECTORY
Option for FS_S3A_COMMITTER_NAME: directory output committer: "directory".
-
COMMITTER_NAME_PARTITIONED
Option for FS_S3A_COMMITTER_NAME: partition output committer: "partitioned".
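A minimal sketch of reading the committer name and checking it against the four documented options. A plain Map stands in for a Hadoop Configuration so the example compiles on its own; the class name, method name, and the caller-supplied fallback are hypothetical, and rejecting unknown names is an assumption of this sketch rather than documented behaviour.

```java
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; a real job would set this key on a Hadoop Configuration. */
final class CommitterNameSketch {
    static final String FS_S3A_COMMITTER_NAME = "fs.s3a.committer.name";

    // The four option values listed on this page.
    static final Set<String> COMMITTER_NAMES =
            Set.of("file", "magic", "directory", "partitioned");

    /** Returns the configured committer name, or the fallback if the key is unset. */
    static String committerName(Map<String, String> conf, String fallback) {
        String name = conf.getOrDefault(FS_S3A_COMMITTER_NAME, fallback);
        if (!COMMITTER_NAMES.contains(name)) {
            // Assumption of this sketch: unknown names are rejected.
            throw new IllegalArgumentException("Unknown committer: " + name);
        }
        return name;
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(FS_S3A_COMMITTER_NAME, "magic");
        System.out.println(committerName(conf, "file")); // prints "magic"
    }
}
```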
-
FS_S3A_COMMITTER_STAGING_UNIQUE_FILENAMES
Option to give final files unique names derived from job attempt information, falling back to a new UUID if there is no job attempt information to use: "fs.s3a.committer.staging.unique-filenames". When writing data with the "append" conflict option, this guarantees that new data will not overwrite any existing data.
-
DEFAULT_STAGING_COMMITTER_UNIQUE_FILENAMES
public static final boolean DEFAULT_STAGING_COMMITTER_UNIQUE_FILENAMES
Default value for FS_S3A_COMMITTER_STAGING_UNIQUE_FILENAMES: true.
-
FS_S3A_COMMITTER_STAGING_CONFLICT_MODE
Staging committer conflict resolution policy: "fs.s3a.committer.staging.conflict-mode". Supported: fail, append, replace.
-
CONFLICT_MODE_FAIL
Conflict mode: "fail".
-
CONFLICT_MODE_APPEND
Conflict mode: "append".
-
CONFLICT_MODE_REPLACE
Conflict mode: "replace".
-
DEFAULT_CONFLICT_MODE
Default conflict mode: "append".
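The sketch below shows one way the three conflict modes could be resolved from a configuration string, using "append" when nothing is set. The action names describe the commonly understood meaning of each mode (fail the job, add to existing output, replace existing output) and are an assumption of this sketch; the class, enum, and method names are hypothetical.

```java
import java.util.Locale;

/** Illustrative sketch only; the per-mode actions are assumptions, not documented here. */
final class ConflictModeSketch {
    static final String DEFAULT_CONFLICT_MODE = "append";

    enum Action { FAIL_JOB, ADD_TO_EXISTING, REPLACE_EXISTING }

    /** Resolves what to do when the destination already holds output. */
    static Action onExistingOutput(String mode) {
        String resolved = (mode == null ? DEFAULT_CONFLICT_MODE : mode)
                .trim().toLowerCase(Locale.ROOT);
        switch (resolved) {
            case "fail":
                return Action.FAIL_JOB;
            case "append":
                return Action.ADD_TO_EXISTING;
            case "replace":
                return Action.REPLACE_EXISTING;
            default:
                throw new IllegalArgumentException("Unsupported conflict mode: " + mode);
        }
    }

    public static void main(String[] args) {
        System.out.println(onExistingOutput(null));      // default mode applies
        System.out.println(onExistingOutput("replace"));
    }
}
```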
-
FS_S3A_COMMITTER_THREADS
Number of threads in committers for parallel operations on files (upload, commit, abort, delete...): "fs.s3a.committer.threads". Two thread pools of this size are created: one for the outer task-level parallelism, and one for parallel execution within tasks (POSTs to commit individual uploads). If the value is negative, its sign is inverted and the result is multiplied by the number of cores in the CPU.
-
DEFAULT_COMMITTER_THREADS
public static final int DEFAULT_COMMITTER_THREADS
Default value for FS_S3A_COMMITTER_THREADS: 32.
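The rule for negative thread counts can be written out as a short sketch. The class and method names are hypothetical, and the core count is passed in explicitly so the arithmetic is easy to check; a caller might supply Runtime.getRuntime().availableProcessors().

```java
/** Illustrative sketch of the thread-count rule described for FS_S3A_COMMITTER_THREADS. */
final class CommitterThreadsSketch {
    static final int DEFAULT_COMMITTER_THREADS = 32;

    /**
     * Returns the pool size for a configured value: non-negative values are
     * used as-is, negative values are negated and multiplied by the core count.
     */
    static int effectiveThreads(int configured, int cores) {
        return configured < 0 ? -configured * cores : configured;
    }

    public static void main(String[] args) {
        System.out.println(effectiveThreads(DEFAULT_COMMITTER_THREADS, 4)); // prints 32
        System.out.println(effectiveThreads(-2, 4));                        // prints 8
    }
}
```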
-
FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED
Should Magic committer track all the pending commits in memory?
-
FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED_DEFAULT
public static final boolean FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED_DEFAULT
Default value for FS_S3A_COMMITTER_MAGIC_TRACK_COMMITS_IN_MEMORY_ENABLED: false.
-
FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED
Should the Magic committer clean up all the staging dirs?
-
FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED_DEFAULT
public static final boolean FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED_DEFAULT
Default value for FS_S3A_COMMITTER_MAGIC_CLEANUP_ENABLED: true.
-
FS_S3A_COMMITTER_STAGING_TMP_PATH
Path in the cluster filesystem for temporary data: "fs.s3a.committer.staging.tmp.path". This is for HDFS, not the local filesystem. It is only for the summary data of each file, not the actual data being committed.
-
FS_S3A_COMMITTER_STAGING_ABORT_PENDING_UPLOADS
Deprecated.
Should committers abort all pending uploads to the destination directory? Deprecated: switch to FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS.
-
FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS
Should committers abort all pending uploads to the destination directory? Value: "fs.s3a.committer.abort.pending.uploads".
Change this if more than one committer is writing to the same destination tree simultaneously; otherwise the first job to complete will cancel all outstanding uploads from the others. If disabled, configure the bucket lifecycle to remove uploads after a time period, and/or set up a workflow to explicitly delete entries. Otherwise there is a risk that uncommitted uploads may run up bills.
-
DEFAULT_FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS
public static final boolean DEFAULT_FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS
Default configuration value for FS_S3A_COMMITTER_ABORT_PENDING_UPLOADS. It is disabled by default to support concurrent writes to the same parent directory but to different partitions or subdirectories. Value: false.
-
SUCCESS_MARKER_FILE_LIMIT
public static final int SUCCESS_MARKER_FILE_LIMIT
The limit to the number of committed objects tracked during job commits and saved to the _SUCCESS file.
-
TASK_ATTEMPT_ID
Extra Data key for task attempt in pendingset files.
-
FS_S3A_COMMITTER_REQUIRE_UUID
Require the Spark UUID to be passed down: "fs.s3a.committer.require.uuid". This is to verify that SPARK-33230 has been applied to Spark, and that InternalCommitterConstants.SPARK_WRITE_UUID is set.
MUST ONLY BE SET WITH SPARK JOBS.
-
DEFAULT_S3A_COMMITTER_REQUIRE_UUID
public static final boolean DEFAULT_S3A_COMMITTER_REQUIRE_UUID
Default value for FS_S3A_COMMITTER_REQUIRE_UUID: false.
-
FS_S3A_COMMITTER_GENERATE_UUID
Generate a UUID in job setup rather than fall back to YARN Application attempt ID.
MUST ONLY BE SET WITH SPARK JOBS.
-
DEFAULT_S3A_COMMITTER_GENERATE_UUID
public static final boolean DEFAULT_S3A_COMMITTER_GENERATE_UUID
Default value for FS_S3A_COMMITTER_GENERATE_UUID: false.
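How the two UUID options might interact can be sketched as a single decision function. This is an assumed ordering, not the committer's actual code: use the UUID passed down by Spark if present, fail if one is required but missing, generate a fresh UUID if asked to, and otherwise fall back to the YARN application attempt ID. All names in the sketch are hypothetical.

```java
import java.util.UUID;

/** Illustrative sketch only; the decision order is an assumption. */
final class JobUuidSketch {

    /**
     * Chooses a job identifier.
     *
     * @param sparkUuid     UUID passed down by Spark, or null/empty if absent
     * @param requireUuid   value of the "require UUID" option
     * @param generateUuid  value of the "generate UUID" option
     * @param yarnAttemptId fallback YARN application attempt ID
     */
    static String chooseJobId(String sparkUuid, boolean requireUuid,
                              boolean generateUuid, String yarnAttemptId) {
        if (sparkUuid != null && !sparkUuid.isEmpty()) {
            return sparkUuid;
        }
        if (requireUuid) {
            throw new IllegalStateException("A job UUID is required but none was passed down");
        }
        if (generateUuid) {
            return UUID.randomUUID().toString();
        }
        return yarnAttemptId;
    }

    public static void main(String[] args) {
        // With both options at their documented default (false), the attempt ID is used.
        System.out.println(chooseJobId(null, false, false, "appattempt_0001"));
    }
}
```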
-
X_HEADER_MAGIC_MARKER
Magic Marker header to declare final file length on magic uploads marker objects: "x-hadoop-s3a-magic-data-length".
-
XA_MAGIC_MARKER
XAttr name of magic marker, with "header." prefix: "header.x-hadoop-s3a-magic-data-length".
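The relationship between the two constants, and one possible way to read the declared length, can be sketched as follows. A Map of byte arrays stands in for the extended attributes of a marker object; that representation, the decimal text encoding of the length, and the class and method names are assumptions of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Illustrative sketch only; the attribute encoding is an assumption. */
final class MagicMarkerSketch {
    static final String X_HEADER_MAGIC_MARKER = "x-hadoop-s3a-magic-data-length";
    // The XAttr name is the header name with a "header." prefix.
    static final String XA_MAGIC_MARKER = "header." + X_HEADER_MAGIC_MARKER;

    /** Returns the declared final length, or -1 if the marker attribute is absent. */
    static long declaredLength(Map<String, byte[]> xattrs) {
        byte[] raw = xattrs.get(XA_MAGIC_MARKER);
        if (raw == null) {
            return -1L;
        }
        return Long.parseLong(new String(raw, StandardCharsets.UTF_8).trim());
    }

    public static void main(String[] args) {
        Map<String, byte[]> xattrs =
                Map.of(XA_MAGIC_MARKER, "1024".getBytes(StandardCharsets.UTF_8));
        System.out.println(XA_MAGIC_MARKER);
        System.out.println(declaredLength(xattrs)); // prints 1024
    }
}
```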
-
PARAM_TASK_ATTEMPT_ID
Task Attempt ID query header: "ta".
-
OPT_SUMMARY_REPORT_DIR
Directory for saving job summary reports. These are the _SUCCESS files, but are saved even on job failures. Value: "fs.s3a.committer.summary.report.directory".
-
S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS
Experimental feature to collect thread level IO statistics. When set, the committers will reset the statistics in task setup and propagate them to the job committer. The job committer will include those and its own statistics. Do not use if the execution engine is collecting statistics, as the multiple reset() operations will result in incomplete statistics. Value: "fs.s3a.committer.experimental.collect.iostatistics".
-
S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS_DEFAULT
public static final boolean S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS_DEFAULT
Default value for S3A_COMMITTER_EXPERIMENTAL_COLLECT_IOSTATISTICS. Value: false.
-