A formal notation such as The Z Notation would be the strictest way to define Hadoop FileSystem behavior, and could even be used to prove some axioms.
However, it has a number of practical flaws:
Such notations are not as widely used as they should be, so the broader software development community is not going to have practical experience of it.
It’s very hard to work with without dropping into tools such as LaTeX and add-on libraries.
Such notations are difficult to understand, even for experts.
Given that the target audience of this specification is FileSystem developers, formal notations are not appropriate. Instead, broad comprehensibility, ease of maintenance, and ease of deriving tests take priority over mathematically-pure formal notation.
This document does use a subset of the notation in the Z syntax, but in an ASCII form and the use of Python list notation for manipulating lists and sets.
∩ : ^: Set Intersection
∃ : exists Exists predicate
:= : = :
`` : # : Python-style comments
happens-before : happens-before : Lamport’s ordering relationship as defined in Time, Clocks and the Ordering of Events in a Distributed System
The python data structures are used as the basis for this syntax as it is both plain ASCII and well-known.
Sets are an extension of the List notation, adding the restrictions that there can be no duplicate entries in the set, and there is no defined order.
Maps resemble Python dictionaries; {“key”:value, “key2”,value2}
In classic specification languages, the preconditions define the predicates that MUST be satisfied else some failure condition is raised.
For Hadoop, we need to be able to specify what failure condition results if a specification is not met (usually what exception is to be raised).
The notation raise <exception-name> is used to indicate that an exception is to be raised.
It can be used in the if-then-else sequence to define an action if a precondition is not met.
Example:
if not exists(FS, Path) : raise IOException
If implementations may raise any one of a set of exceptions, this is denoted by providing a set of exceptions:
if not exists(FS, Path) : raise {FileNotFoundException, IOException}
If a set of exceptions is provided, the earlier elements of the set are preferred to the later entries, on the basis that they aid diagnosis of problems.
We also need to distinguish predicates that MUST be satisfied, along with those that SHOULD be met. For this reason a function specification MAY include a section in the preconditions marked ‘Should:’ All predicates declared in this section SHOULD be met, and if there is an entry in that section which specifies a stricter outcome, it SHOULD BE preferred. Here is an example of a should-precondition:
Should:
if not exists(FS, Path) : raise FileNotFoundException
There are further conditions used in precondition and postcondition declarations.
This condition declares that a subclass implements the named method -some subclasses of the verious FileSystem classes do not, and instead raise UnsupportedOperation
As an example, one precondition of FSDataInputStream.seek is that the implementation must support Seekable.seek :
supported(FDIS, Seekable.seek) else raise UnsupportedOperation