org.apache.pig
Interface Slicer

All Known Implementing Classes:
PigSlicer

public interface Slicer

Produces independent slices of data from a given location to be processed in parallel by Pig.

If a class implementing this interface is specified as the LoadFunc in a Pig script, Pig uses it to create the slices for that load statement.


Method Summary
 Slice[] slice(DataStorage store, String location)
          Creates slices of data from store at location.
 void validate(DataStorage store, String location)
          Checks that location can be parsed by this Slicer and, if the Slicer uses the DataStorage, that location is readable from it.
 

Method Detail

validate

void validate(DataStorage store,
              String location)
              throws IOException
Checks that location can be parsed by this Slicer and, if the Slicer uses the DataStorage, that location is readable from it. If either check fails, an IOException is thrown with a message explaining why.

This does not ensure that all the data in location is valid. It is only a preflight check that the Slicer has a reasonable chance of working, run before the actual Slices are created and sent off for processing.

Throws:
IOException - if location cannot be parsed by this Slicer or is not readable from store
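
The preflight contract described above can be sketched roughly as follows. This is an illustration only, not Pig's own code: the DataStorage stand-in and the RangeValidator class are hypothetical, declared here so the example compiles without Pig on the classpath, and the location format (a positive integer) is an assumption made for the example.

```java
import java.io.IOException;

// Hypothetical stand-in for Pig's DataStorage type, declared only so this
// sketch compiles on its own.
interface DataStorage {}

// Hypothetical class: treats location as a positive integer count.
class RangeValidator {
    // Mirrors the shape of Slicer.validate: a cheap check that throws an
    // IOException carrying the reason when location cannot be used.
    public void validate(DataStorage store, String location) throws IOException {
        if (location == null || location.trim().isEmpty()) {
            throw new IOException("location is empty; expected a positive integer");
        }
        try {
            if (Integer.parseInt(location.trim()) <= 0) {
                throw new IOException("location must be positive: " + location);
            }
        } catch (NumberFormatException e) {
            throw new IOException("location is not an integer: " + location, e);
        }
        // store is not used by this sketch, so there is no readability check.
    }
}
```

Because this sketch never reads from store, only the parse check applies. A Slicer that does read from the DataStorage would presumably also confirm here that location is reachable.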

slice

Slice[] slice(DataStorage store,
              String location)
              throws IOException
Creates slices of data from store at location.

Returns:
the Slices to be serialized and sent out to nodes for processing.
Throws:
IOException
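
As a minimal sketch of what a slice implementation might look like, the example below splits an integer range into fixed-size, independent chunks. Everything in it is hypothetical: DataStorage and Slice are empty stand-ins for the Pig types (the real Slice interface declares methods not shown here), and RangeSlice, RangeSlicer, and SLICE_SIZE are names invented for the illustration. It assumes location is a non-negative integer giving the size of the range.

```java
import java.io.IOException;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for Pig's DataStorage and Slice types.
interface DataStorage {}
interface Slice extends Serializable {}

// Hypothetical slice covering the integers [start, start + length).
class RangeSlice implements Slice {
    final int start;
    final int length;

    RangeSlice(int start, int length) {
        this.start = start;
        this.length = length;
    }
}

// Hypothetical slicer: interprets location as the size of the range to split.
class RangeSlicer {
    static final int SLICE_SIZE = 4; // assumed chunk size, for illustration

    // Mirrors the shape of Slicer.slice: returns independent, serializable
    // pieces that could each be processed on a different node.
    public Slice[] slice(DataStorage store, String location) throws IOException {
        int total;
        try {
            total = Integer.parseInt(location.trim());
        } catch (NumberFormatException e) {
            throw new IOException("location is not an integer: " + location, e);
        }
        List<Slice> slices = new ArrayList<>();
        for (int start = 0; start < total; start += SLICE_SIZE) {
            // The last slice may be shorter than SLICE_SIZE.
            slices.add(new RangeSlice(start, Math.min(SLICE_SIZE, total - start)));
        }
        return slices.toArray(new Slice[0]);
    }
}
```

With SLICE_SIZE set to 4, a location of "10" yields three slices covering [0, 4), [4, 8), and [8, 10). The slices do not overlap and carry only their own bounds, which is what allows them to be serialized and processed independently.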


Copyright © ${year} The Apache Software Foundation