org.apache.hadoop.contrib.index.mapred
Class IntermediateForm

java.lang.Object
  extended by org.apache.hadoop.contrib.index.mapred.IntermediateForm
All Implemented Interfaces:
Writable

public class IntermediateForm
extends Object
implements Writable

An intermediate form for one or more parsed Lucene documents and/or delete terms. The intermediate form is stored in the Lucene file format, using RAM directory files. Note: if either process(*) method is ever called, closeWriter() should be called afterwards. Otherwise, there is no need to call closeWriter().


Constructor Summary
IntermediateForm()
          Constructor
 
Method Summary
 void closeWriter()
          Close the Lucene index writer associated with the intermediate form, if created.
 void configure(IndexUpdateConfiguration iconf)
          Configure using an index update configuration.
 Iterator<org.apache.lucene.index.Term> deleteTermIterator()
          Get an iterator for the delete terms in the intermediate form.
 org.apache.lucene.store.Directory getDirectory()
          Get the RAM directory of the intermediate form.
 void process(DocumentAndOp doc, org.apache.lucene.analysis.Analyzer analyzer)
          Used by the index update mapper to process a document operation into the current intermediate form.
 void process(IntermediateForm form)
          Used by the index update combiner to process an intermediate form into the current intermediate form.
 void readFields(DataInput in)
          Deserialize the fields of this object from in.
 String toString()
           
 void write(DataOutput out)
          Serialize the fields of this object to out.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

IntermediateForm

public IntermediateForm()
                 throws IOException
Constructor

Throws:
IOException
Method Detail

configure

public void configure(IndexUpdateConfiguration iconf)
Configure using an index update configuration.

Parameters:
iconf - the index update configuration

getDirectory

public org.apache.lucene.store.Directory getDirectory()
Get the RAM directory of the intermediate form.

Returns:
the RAM directory

deleteTermIterator

public Iterator<org.apache.lucene.index.Term> deleteTermIterator()
Get an iterator for the delete terms in the intermediate form.

Returns:
an iterator for the delete terms

process

public void process(DocumentAndOp doc,
                    org.apache.lucene.analysis.Analyzer analyzer)
             throws IOException
Used by the index update mapper to process a document operation into the current intermediate form.

Parameters:
doc - input document operation
analyzer - the analyzer
Throws:
IOException

process

public void process(IntermediateForm form)
             throws IOException
Used by the index update combiner to process an intermediate form into the current intermediate form. More specifically, each input intermediate form contains a single-document RAM index and/or a single delete term.

Parameters:
form - the input intermediate form
Throws:
IOException

closeWriter

public void closeWriter()
                 throws IOException
Close the Lucene index writer associated with the intermediate form, if one was created. This method does not close the RAM directory; in fact, there is no need to close a RAM directory.

Throws:
IOException
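
The calling pattern implied by the class note (call closeWriter() once process(*) has been used, and treat it as harmless otherwise) can be sketched with try/finally. The sketch below is illustrative only: StandInForm is a hypothetical stand-in written for this example, not the real IntermediateForm, and it assumes that the writer is opened on the first process call, which the "if created" wording suggests but does not state outright.

```java
import java.util.List;

// Illustrative sketch only. StandInForm is a hypothetical stand-in and
// does not use Hadoop or Lucene; it models just the writer lifecycle.
class FormLifecycleSketch {

    static class StandInForm {
        private boolean writerOpen = false;
        private int processed = 0;

        // Assumption: the writer is opened lazily on the first call.
        void process(String documentOp) {
            writerOpen = true;
            processed++;
        }

        // Safe to call even if process was never called.
        void closeWriter() {
            writerOpen = false;
        }

        boolean isWriterOpen() {
            return writerOpen;
        }

        int processedCount() {
            return processed;
        }
    }

    // Processes every operation and always closes the writer afterwards,
    // even if one of the process calls throws.
    static StandInForm buildForm(List<String> documentOps) {
        StandInForm form = new StandInForm();
        try {
            for (String op : documentOps) {
                form.process(op);
            }
        } finally {
            form.closeWriter();
        }
        return form;
    }
}
```

With the real class, the same shape would wrap the process(DocumentAndOp, Analyzer) or process(IntermediateForm) calls, both of which declare IOException.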

toString

public String toString()
Overrides:
toString in class Object

write

public void write(DataOutput out)
           throws IOException
Description copied from interface: Writable
Serialize the fields of this object to out.

Specified by:
write in interface Writable
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.

For efficiency, implementations should attempt to re-use storage in the existing object where possible.

Specified by:
readFields in interface Writable
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException
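
The write/readFields pair follows the usual Writable round-trip contract: whatever write emits, readFields must consume in the same order. A minimal sketch of that contract using only java.io is shown below. DeleteTermBatch is a hypothetical stand-in, not the real IntermediateForm, and its wire format (a count followed by UTF strings) is an assumption made for the example; the actual serialized layout of IntermediateForm is not described on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the real IntermediateForm. It mirrors only
// the write(DataOutput) / readFields(DataInput) method pair.
class DeleteTermBatch {
    // Each entry is a "field:text" string; the real class exposes
    // Lucene Term objects through deleteTermIterator().
    final List<String> terms = new ArrayList<>();

    // Serialize: a count, then each term as a UTF string (assumed format).
    void write(DataOutput out) throws IOException {
        out.writeInt(terms.size());
        for (String term : terms) {
            out.writeUTF(term);
        }
    }

    // Deserialize into this object, reusing the existing list storage.
    void readFields(DataInput in) throws IOException {
        terms.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            terms.add(in.readUTF());
        }
    }

    // Writes the source to a byte array and reads it back into a new object.
    static DeleteTermBatch roundTrip(DeleteTermBatch source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.write(new DataOutputStream(bytes));
            DeleteTermBatch copy = new DeleteTermBatch();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Clearing and refilling the existing list in readFields reflects the advice above about re-using storage in the existing object.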


Copyright © 2009 The Apache Software Foundation