org.apache.pig.impl.builtin
Class RandomSampleLoader
java.lang.Object
org.apache.pig.builtin.BinStorage
org.apache.pig.impl.builtin.RandomSampleLoader
- All Implemented Interfaces:
- LoadFunc, ReversibleLoadStoreFunc, StoreFunc
public class RandomSampleLoader
- extends BinStorage
defaultNumSamples
public static int defaultNumSamples
RandomSampleLoader
public RandomSampleLoader()
bindTo
public void bindTo(String fileName,
BufferedPositionedInputStream is,
long offset,
long end)
throws IOException
- Description copied from interface:
LoadFunc
- Specifies the portion of an InputStream from which to read tuples. Because
the starting and ending offsets may not fall on record boundaries, the
implementor must work out the actual starting and ending offsets in such a
way that an arbitrarily sliced file is still processed in its entirety.
A common way of handling slices in the middle of records is to start at
the given offset and, if the offset is not zero, skip to the end of the
first record (which may be a partial record) before reading tuples.
Reading continues until a tuple has been read that ends at an offset past
the ending offset.
The load function should not do any buffering on the input stream. Buffering will
cause the offsets returned by is.getPos() to be unreliable.
- Specified by:
bindTo
in interface LoadFunc
- Overrides:
bindTo
in class BinStorage
- Parameters:
fileName
- the name of the file to be read.
is
- the stream representing the file to be processed, and which can also provide its position.
offset
- the offset to start reading tuples.
end
- the ending offset for reading.
- Throws:
IOException
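The slice-handling convention described for bindTo can be illustrated without any Pig classes. The following is a minimal sketch, assuming newline-terminated records held in a byte array; the class SliceReadSketch and its methods readSlice and readAll are hypothetical names introduced for illustration and are not part of Pig. It skips the (possibly partial) first record when the offset is not zero, then keeps reading until a record ends past the ending offset, so that each record is read by exactly one slice.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the Pig API.
public class SliceReadSketch {

    /** Reads the newline-terminated records that belong to the slice (offset, end). */
    static List<String> readSlice(byte[] data, long offset, long end) {
        List<String> records = new ArrayList<>();
        int pos = (int) offset;
        if (offset != 0) {
            // Skip to the end of the first (possibly partial) record.
            // The slice before this one is responsible for reading it.
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the record terminator
        }
        // Keep reading while the next record starts at or before the ending
        // offset; the last record read may therefore end past 'end'.
        while (pos <= end && pos < data.length) {
            int start = pos;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            records.add(new String(data, start, pos - start, StandardCharsets.UTF_8));
            pos++; // step past the record terminator
        }
        return records;
    }

    /** Cuts the data into fixed-size slices and concatenates what each slice reads. */
    static List<String> readAll(byte[] data, int sliceSize) {
        List<String> all = new ArrayList<>();
        for (long off = 0; off < data.length; off += sliceSize) {
            long end = Math.min(off + sliceSize, data.length);
            all.addAll(readSlice(data, off, end));
        }
        return all;
    }
}
```

Because a slice with a non-zero offset always discards everything up to the first record terminator, the preceding slice has to read the record that starts exactly at its ending offset; that is why the loop condition compares with `<=` rather than `<`.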
getNext
public Tuple getNext()
throws IOException
- Description copied from interface:
LoadFunc
- Retrieves the next tuple to be processed.
- Specified by:
getNext
in interface LoadFunc
- Overrides:
getNext
in class BinStorage
- Returns:
- the next tuple to be processed or null if there are no more tuples
to be processed.
- Throws:
IOException
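The getNext contract (return the next tuple, or null when there are no more) suggests a simple drain loop on the caller's side. The sketch below is only an illustration of that loop: TupleSource is a hypothetical stand-in for the getNext() part of LoadFunc, and DrainSketch, drain and fromList are likewise assumed names, not Pig classes or methods.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not part of the Pig API.
public class DrainSketch {

    /** Stand-in for the getNext() contract: returns null when input is exhausted. */
    interface TupleSource<T> {
        T getNext() throws IOException;
    }

    /** Calls getNext() repeatedly until it returns null and collects the results. */
    static <T> List<T> drain(TupleSource<T> source) throws IOException {
        List<T> out = new ArrayList<>();
        T tuple;
        while ((tuple = source.getNext()) != null) {
            out.add(tuple);
        }
        return out;
    }

    /** Builds a source over an in-memory list, for demonstration only. */
    static <T> TupleSource<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }
}
```

A consequence of using null as the end marker is that a source following this contract cannot return null as a legitimate tuple.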
bindTo
public void bindTo(OutputStream os)
throws IOException
- Description copied from interface:
StoreFunc
- Specifies the OutputStream to write to. This will be called before
store(Tuple) is invoked.
- Specified by:
bindTo
in interface StoreFunc
- Overrides:
bindTo
in class BinStorage
- Parameters:
os
- The stream to write tuples to.
- Throws:
IOException
Copyright © ${year} The Apache Software Foundation