StreamXmlRecordReader (Hadoop 1.0.4 API)

org.apache.hadoop.streaming
Class StreamXmlRecordReader

java.lang.Object
  org.apache.hadoop.streaming.StreamBaseRecordReader
      org.apache.hadoop.streaming.StreamXmlRecordReader

All Implemented Interfaces: RecordReader<Text,Text>

public class StreamXmlRecordReader
extends StreamBaseRecordReader

A way to interpret XML fragments as Mapper input records. Values are XML subtrees delimited by configurable tags. Keys could be the value of a certain attribute in the XML subtree, but this is left to the stream processor application.

The name-value properties that StreamXmlRecordReader understands are:

String begin (chars marking beginning of record)
String end (chars marking end of record)
int maxrec (maximum record size)
int lookahead (maximum lookahead to sync CDATA)
boolean slowmatch
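To make the role of the begin and end properties concrete, here is a minimal sketch of marker-based record splitting on an in-memory string. It is an illustration only, not the class's actual implementation: `XmlRecordSketch` and `splitRecords` are hypothetical names, and the sketch ignores maxrec, lookahead, slowmatch, CDATA handling, and split boundaries.

```java
import java.util.ArrayList;
import java.util.List;

public class XmlRecordSketch {

    /**
     * Hypothetical helper, not part of Hadoop. Returns every substring of
     * input that starts at an occurrence of begin and runs through the next
     * occurrence of end after it, mirroring the idea that each value is an
     * XML subtree delimited by configurable markers.
     */
    public static List<String> splitRecords(String input, String begin, String end) {
        List<String> records = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = input.indexOf(begin, pos);
            if (start < 0) {
                break; // no further record starts
            }
            int stop = input.indexOf(end, start + begin.length());
            if (stop < 0) {
                break; // unterminated record: this sketch simply drops it
            }
            stop += end.length(); // include the end marker in the record
            records.add(input.substring(start, stop));
            pos = stop; // resume scanning after the record just emitted
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<doc><page id=\"1\">a</page>\n<page id=\"2\">b</page></doc>";
        // begin is "<page" rather than "<page>" so that tags with attributes match
        for (String record : splitRecords(xml, "<page", "</page>")) {
            System.out.println(record);
        }
    }
}
```

Using a begin marker without the closing angle bracket is a choice made for this example so that elements carrying attributes are still matched. Extracting a key from such an attribute would, as noted above, be up to the stream processor application.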

Field Summary

Fields inherited from class org.apache.hadoop.streaming.StreamBaseRecordReader
`LOG`

Constructor Summary
`StreamXmlRecordReader(FSDataInputStream in, FileSplit split, Reporter reporter, JobConf job, FileSystem fs)`

Method Summary
`void`	`init()`
`boolean`	`next(Text key, Text value)` Read a record.
`void`	`seekNextRecordBoundary()` Implementation should seek forward `in_` to the first byte of the next record.

Methods inherited from class org.apache.hadoop.streaming.StreamBaseRecordReader
`close, createKey, createValue, getPos, getProgress`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail