Class ContainerRetryContext

java.lang.Object
org.apache.hadoop.yarn.api.records.ContainerRetryContext

@Public @Unstable public abstract class ContainerRetryContext extends Object
ContainerRetryContext specifies how a container is retried after it fails to run.

It provides details such as:

  • ContainerRetryPolicy:
    • NEVER_RETRY (default value): do not retry, regardless of the error code with which the container fails.
    • RETRY_ON_ALL_ERRORS: retry when the container fails to run, regardless of the error code.
    • RETRY_ON_SPECIFIC_ERROR_CODES: retry when the container fails to run only if the error code is one of errorCodes; otherwise do not retry.
    Note: if the error code is 137 (SIGKILL) or 143 (SIGTERM), the container is not retried, because it was usually killed on purpose.
  • maxRetries specifies the maximum number of retries when a retry is needed. A value of -1 means retry forever.
  • retryInterval specifies the delay, in milliseconds, before the container is relaunched.
  • failuresValidityInterval: the default value is -1. When failuresValidityInterval (in milliseconds) is set to a value greater than 0, failures that happened outside the failuresValidityInterval are not included in the failure count. If the failure count reaches maxRetries, the container is failed.
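
The rules above can be sketched as a single decision function. This is a minimal, self-contained illustration and not the NodeManager implementation: the class RetryDecisionSketch, the nested Policy enum, and the shouldRetry helper with its retriesSoFar parameter are hypothetical names that only mirror the documented policy values. It also assumes that the 137/143 exclusion applies under every policy, which the description above does not state explicitly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RetryDecisionSketch {
    // Hypothetical stand-in that mirrors the documented policy names;
    // this is NOT org.apache.hadoop.yarn.api.records.ContainerRetryPolicy.
    enum Policy { NEVER_RETRY, RETRY_ON_ALL_ERRORS, RETRY_ON_SPECIFIC_ERROR_CODES }

    static final int RETRY_FOREVER = -1;  // documented meaning of maxRetries == -1
    static final int SIGKILL_EXIT = 137;  // 128 + 9
    static final int SIGTERM_EXIT = 143;  // 128 + 15

    // Returns true if, under the documented rules, another launch attempt is allowed.
    // retriesSoFar is a hypothetical counter of retries already performed.
    static boolean shouldRetry(Policy policy, Set<Integer> errorCodes,
                               int maxRetries, int retriesSoFar, int exitCode) {
        // Assumption: containers killed on purpose are never retried, whatever the policy.
        if (exitCode == SIGKILL_EXIT || exitCode == SIGTERM_EXIT) {
            return false;
        }
        boolean policyAllows;
        switch (policy) {
            case RETRY_ON_ALL_ERRORS:
                policyAllows = true;
                break;
            case RETRY_ON_SPECIFIC_ERROR_CODES:
                policyAllows = errorCodes != null && errorCodes.contains(exitCode);
                break;
            default: // NEVER_RETRY
                policyAllows = false;
        }
        if (!policyAllows) {
            return false;
        }
        // -1 means retry forever; otherwise stop once maxRetries retries were used.
        return maxRetries == RETRY_FOREVER || retriesSoFar < maxRetries;
    }

    public static void main(String[] args) {
        Set<Integer> codes = new HashSet<>(Arrays.asList(1, 2));
        System.out.println(shouldRetry(Policy.RETRY_ON_SPECIFIC_ERROR_CODES, codes, 3, 0, 2));
        System.out.println(shouldRetry(Policy.RETRY_ON_ALL_ERRORS, null, 3, 0, 137));
    }
}
```

Keeping the policy check separate from the retry-budget check makes it easy to see which of the two rules rejected a retry.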
  • Field Details

  • Constructor Details

    • ContainerRetryContext

      public ContainerRetryContext()
  • Method Details

    • newInstance

      @Private @Unstable public static ContainerRetryContext newInstance(ContainerRetryPolicy retryPolicy, Set<Integer> errorCodes, int maxRetries, int retryInterval, long failuresValidityInterval)
    • newInstance

      @Private @Unstable public static ContainerRetryContext newInstance(ContainerRetryPolicy retryPolicy, Set<Integer> errorCodes, int maxRetries, int retryInterval)
    • getRetryPolicy

      public abstract ContainerRetryPolicy getRetryPolicy()
    • setRetryPolicy

      public abstract void setRetryPolicy(ContainerRetryPolicy retryPolicy)
    • getErrorCodes

      public abstract Set<Integer> getErrorCodes()
    • setErrorCodes

      public abstract void setErrorCodes(Set<Integer> errorCodes)
    • getMaxRetries

      public abstract int getMaxRetries()
    • setMaxRetries

      public abstract void setMaxRetries(int maxRetries)
    • getRetryInterval

      public abstract int getRetryInterval()
    • setRetryInterval

      public abstract void setRetryInterval(int retryInterval)
    • getFailuresValidityInterval

      public abstract long getFailuresValidityInterval()
    • setFailuresValidityInterval

      public abstract void setFailuresValidityInterval(long failuresValidityInterval)
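
The effect of failuresValidityInterval on the failure count can be illustrated with a small sliding-window counter. This is only a sketch under stated assumptions, not how YARN tracks failures: FailureWindowSketch, recordFailure, and failureCount are hypothetical names, timestamps are assumed to arrive in non-decreasing order, and a failure whose age equals the interval exactly is assumed to still count.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class FailureWindowSketch {
    // Window length in milliseconds; a value <= 0 (such as the default -1)
    // means every recorded failure counts.
    private final long failuresValidityInterval;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    FailureWindowSketch(long failuresValidityInterval) {
        this.failuresValidityInterval = failuresValidityInterval;
    }

    // Records a failure observed at the given time (milliseconds).
    void recordFailure(long nowMillis) {
        failureTimes.addLast(nowMillis);
    }

    // Returns the number of failures that still count at the given time.
    int failureCount(long nowMillis) {
        if (failuresValidityInterval > 0) {
            // Drop failures that fall outside the validity window.
            while (!failureTimes.isEmpty()
                    && nowMillis - failureTimes.peekFirst() > failuresValidityInterval) {
                failureTimes.removeFirst();
            }
        }
        return failureTimes.size();
    }

    public static void main(String[] args) {
        FailureWindowSketch window = new FailureWindowSketch(1000L);
        window.recordFailure(0L);
        window.recordFailure(500L);
        window.recordFailure(1600L);
        System.out.println(window.failureCount(1600L));
    }
}
```

Under this model, a container that fails only occasionally never exhausts maxRetries, because old failures expire from the count before new ones accumulate.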