UpdateIndex (Hadoop 1.0.4 API)

org.apache.hadoop.contrib.index.main
Class UpdateIndex

java.lang.Object
  org.apache.hadoop.contrib.index.main.UpdateIndex

public class UpdateIndex
extends Object

A distributed "index" is partitioned into "shards". Each shard corresponds to a Lucene instance. This class contains the main() method, which uses a Map/Reduce job to analyze documents and update Lucene instances in parallel.

The main() method in UpdateIndex requires the following information for updating the shards:

- Input formatter. This specifies how to format the input documents.
- Analysis. This defines the analyzer to use on the input. The analyzer determines whether a document is being inserted, updated, or deleted. For inserts or updates, the analyzer also converts each input document into a Lucene document.
- Input paths. This provides the location(s) of updated documents, e.g., HDFS files or directories, or HBase tables.
- Shard paths, or index path with the number of shards. Either specify the path for each shard, or specify an index path, in which case the shards are the sub-directories of the index directory.
- Output path. When the update to a shard is done, a message is put here.
- Number of map tasks.

All of the information can be specified in a configuration file. All but the first two can also be specified as command line options. Check out conf/index-config.xml.template for other configurable parameters.

Note: Because of the parallel nature of Map/Reduce, the behaviour of multiple inserts, deletes or updates to the same document is undefined.
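
The "index path with the number of shards" option described above can be pictured with a small sketch. It is not taken from the UpdateIndex source: the class ShardPathSketch, the helper shardPaths, the example path, and the zero-padded sub-directory names are all assumptions made for illustration. The real class may name its shard directories differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: shows how one index path plus a shard count
// can be expanded into one sub-directory path per shard.
class ShardPathSketch {

    // Hypothetical helper. The "%05d" naming scheme is an assumption,
    // not a documented guarantee of UpdateIndex.
    static List<String> shardPaths(String indexPath, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // Make sure the parent path ends with exactly one separator.
        String parent = indexPath.endsWith("/") ? indexPath : indexPath + "/";
        List<String> paths = new ArrayList<String>();
        for (int i = 0; i < numShards; i++) {
            paths.add(parent + String.format(Locale.ROOT, "%05d", i));
        }
        return paths;
    }

    public static void main(String[] args) {
        // Example index location; any HDFS-style path would do.
        for (String path : shardPaths("/user/example/index", 3)) {
            System.out.println(path);
        }
    }
}
```

Each returned path stands for one shard, that is, one Lucene instance that the job can update independently of the others.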

Field Summary
`static org.apache.commons.logging.Log`	`LOG`

Constructor Summary
`UpdateIndex()`

Method Summary
`static void`	`main(String[] argv)` The main() method

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

Constructor Detail

UpdateIndex

public UpdateIndex()

Method Detail

main

public static void main(String[] argv)

The main() method

Parameters: `argv` - the command line arguments
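
As a rough sketch of how an argument array for main() might be assembled, the helper below builds one from the pieces of information the class description lists. The option names used here (-conf, -inputPaths, -outputPath, -indexPath, -numShards, -numMapTasks) are assumptions and should be checked against the usage message printed by the installed version. The class UpdateIndexArgsSketch, the helper buildArgs, and all paths are hypothetical. The actual call to UpdateIndex.main is left as a comment because it needs the Hadoop index contrib jar on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: assembles a command line for UpdateIndex.main(String[]).
class UpdateIndexArgsSketch {

    // Hypothetical helper. Every option name below is an assumption;
    // verify it against the tool's own usage output before relying on it.
    static String[] buildArgs(String confFile, List<String> inputPaths,
                              String outputPath, String indexPath,
                              int numShards, int numMapTasks) {
        // Multiple input locations are joined into one comma-separated value.
        StringBuilder joined = new StringBuilder();
        for (String p : inputPaths) {
            if (joined.length() > 0) {
                joined.append(',');
            }
            joined.append(p);
        }
        List<String> argv = new ArrayList<String>();
        argv.addAll(Arrays.asList("-conf", confFile));
        argv.addAll(Arrays.asList("-inputPaths", joined.toString()));
        argv.addAll(Arrays.asList("-outputPath", outputPath));
        argv.addAll(Arrays.asList("-indexPath", indexPath));
        argv.addAll(Arrays.asList("-numShards", String.valueOf(numShards)));
        argv.addAll(Arrays.asList("-numMapTasks", String.valueOf(numMapTasks)));
        return argv.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] argv = buildArgs("conf/index-config.xml",
                Arrays.asList("/data/docs/a", "/data/docs/b"),
                "/data/update-output", "/data/index", 3, 5);
        System.out.println(Arrays.toString(argv));
        // With the contrib jar available, the array would be passed on:
        // org.apache.hadoop.contrib.index.main.UpdateIndex.main(argv);
    }
}
```

The input formatter and the analysis class are not part of the array because, per the class description, they can only be set in the configuration file.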