org.apache.solr.handler.dataimport
Class FileListEntityProcessor
java.lang.Object
org.apache.solr.handler.dataimport.EntityProcessor
org.apache.solr.handler.dataimport.EntityProcessorBase
org.apache.solr.handler.dataimport.FileListEntityProcessor
public class FileListEntityProcessor
- extends EntityProcessorBase
An EntityProcessor that streams the names of files found under a given base
directory and returns rows containing file information for the files that match.
It supports querying a given base directory by:
- matching file names against a regular expression
- excluding files whose names match a regular expression
- last modification date (newer or older than a given date or time)
- size (bigger or smaller than a size given in bytes)
- optionally recursing into sub-directories
Its output can be used along with FileDataSource to read from files in file
systems.
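The matching criteria listed above can be illustrated with a small standalone sketch. This is not the Solr implementation: the class `FileListSketch`, its field names, the row keys, and the choice of `Matcher.find()` with exclusive bounds are all assumptions made for illustration, and it only uses `java.io.File` and `java.util.regex`.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; this is NOT the Solr class.
class FileListSketch {
    Pattern include = Pattern.compile(".*"); // file names to keep
    Pattern exclude = null;                  // file names to drop; null disables
    long newerThan = Long.MIN_VALUE;         // epoch millis, exclusive lower bound
    long olderThan = Long.MAX_VALUE;         // epoch millis, exclusive upper bound
    long biggerThan = -1L;                   // bytes, exclusive lower bound
    long smallerThan = Long.MAX_VALUE;       // bytes, exclusive upper bound
    boolean recursive = false;               // descend into sub-directories?

    // Pure predicate: does a file with these properties pass every filter?
    boolean accepts(String name, long sizeBytes, long lastModifiedMillis) {
        if (!include.matcher(name).find()) return false;
        if (exclude != null && exclude.matcher(name).find()) return false;
        if (lastModifiedMillis <= newerThan || lastModifiedMillis >= olderThan) return false;
        return sizeBytes > biggerThan && sizeBytes < smallerThan;
    }

    // Walks baseDir and returns one row per accepted file.
    List<Map<String, Object>> list(File baseDir) {
        List<Map<String, Object>> rows = new ArrayList<>();
        collect(baseDir, rows);
        return rows;
    }

    private void collect(File dir, List<Map<String, Object>> rows) {
        File[] children = dir.listFiles();
        if (children == null) return; // not a directory, or not readable
        for (File f : children) {
            if (f.isDirectory()) {
                if (recursive) collect(f, rows);
            } else if (accepts(f.getName(), f.length(), f.lastModified())) {
                // Row keys here are made up for the sketch.
                Map<String, Object> row = new LinkedHashMap<>();
                row.put("name", f.getName());
                row.put("path", f.getAbsolutePath());
                row.put("size", f.length());
                row.put("lastModified", f.lastModified());
                rows.add(row);
            }
        }
    }
}
```

In this sketch, setting `recursive = true` and calling `list(new File(someDir))` would return a row for every accepted file below `someDir`; each row's path could then be handed to whatever reads the file contents.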
Refer to http://wiki.apache.org/solr/DataImportHandler
for more details.
This API is experimental and may change in the future.
- Since:
- solr 1.3
- Version:
- $Id: FileListEntityProcessor.java 681182 2008-07-30 19:35:58Z shalin $
Fields inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
cachePk, cacheVariableName, cacheWithWhereClause, context, dataSourceRowCache, entityName, query, resolver, rowcache, rowIterator, simpleCache, SKIP_DOC, TRANSFORM_ROW, TRANSFORMER, transformers
Method Summary
void init(Context context)
This method is called when it starts processing an entity.
Map<String,Object> nextRow()
For a simple implementation, this is the only method that the sub-class
should implement.
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessorBase
applyTransformer, cacheInit, clearSession, destroy, getAllNonCachedRows, getFromRowCache, getFromRowCacheTransformed, getIdCacheData, getNext, getSessionAttribute, getSimpleCacheData, nextDeletedRowKey, nextModifiedParentRowKey, nextModifiedRowKey, setSessionAttribute
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PLACE_HOLDER_PATTERN
public static final Pattern PLACE_HOLDER_PATTERN
DIR
public static final String DIR
- See Also:
- Constant Field Values
FILE
public static final String FILE
- See Also:
- Constant Field Values
ABSOLUTE_FILE
public static final String ABSOLUTE_FILE
- See Also:
- Constant Field Values
SIZE
public static final String SIZE
- See Also:
- Constant Field Values
LAST_MODIFIED
public static final String LAST_MODIFIED
- See Also:
- Constant Field Values
FILE_NAME
public static final String FILE_NAME
- See Also:
- Constant Field Values
BASE_DIR
public static final String BASE_DIR
- See Also:
- Constant Field Values
EXCLUDES
public static final String EXCLUDES
- See Also:
- Constant Field Values
NEWER_THAN
public static final String NEWER_THAN
- See Also:
- Constant Field Values
OLDER_THAN
public static final String OLDER_THAN
- See Also:
- Constant Field Values
BIGGER_THAN
public static final String BIGGER_THAN
- See Also:
- Constant Field Values
SMALLER_THAN
public static final String SMALLER_THAN
- See Also:
- Constant Field Values
RECURSIVE
public static final String RECURSIVE
- See Also:
- Constant Field Values
FileListEntityProcessor
public FileListEntityProcessor()
init
public void init(Context context)
- Description copied from class:
EntityProcessor
- This method is called when processing of an entity starts. When control comes
back to the entity it is called again, so the entity can reset any state at that point.
For the root-most entity this is called only once per ingestion. For sub-entities, it
is called once for each row from the parent entity.
- Overrides:
init
in class EntityProcessorBase
- Parameters:
context
- The current context
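The calling pattern described for init can be sketched with a toy driver loop. Everything here is hypothetical (`InitLifecycleSketch`, `CountingEntity`, and `run` are invented names, and the real method takes a Context argument that is omitted); the sketch only shows that a root entity is initialized once per run, while a sub-entity is re-initialized for each parent row.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical driver loop; it does not use any Solr classes.
class InitLifecycleSketch {
    static class CountingEntity {
        int initCalls = 0;
        private final List<String> rows;
        private Iterator<String> it;

        CountingEntity(List<String> rows) { this.rows = rows; }

        // Resets iteration state, as an entity may do each time it is (re)entered.
        void init() {
            initCalls++;
            it = rows.iterator();
        }

        // Returns the next row, or null when the rows are exhausted.
        String nextRow() {
            return it.hasNext() ? it.next() : null;
        }
    }

    // Initializes the root once, and the child once per root row.
    // Returns the total number of child rows produced.
    static int run(CountingEntity root, CountingEntity child) {
        int produced = 0;
        root.init();
        while (root.nextRow() != null) {
            child.init();
            while (child.nextRow() != null) {
                produced++;
            }
        }
        return produced;
    }
}
```

Because the child resets its iterator in `init()`, it yields its full set of rows again for every parent row, which is why state that must not survive between parent rows belongs in that method.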
nextRow
public Map<String,Object> nextRow()
- Description copied from class:
EntityProcessorBase
- For a simple implementation, this is the only method that the sub-class
should implement. It is intended to stream rows one by one. Return null
to signal the end of rows.
- Overrides:
nextRow
in class EntityProcessorBase
- Returns:
- a row where the key is the name of the field and value can be any
Object or a Collection of objects. Return null to signal end of
rows
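The row-streaming contract described for nextRow (one Map per call, null at the end) can be sketched as follows. `RowStreamSketch` and `drain` are hypothetical names used only for this illustration, and the rows are supplied from an in-memory list rather than from the file system.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical row source that follows the "return null at end of rows" convention.
class RowStreamSketch {
    private final Iterator<Map<String, Object>> rows;

    RowStreamSketch(List<Map<String, Object>> source) {
        this.rows = source.iterator();
    }

    // Returns the next row (field name -> value or Collection of values),
    // or null once every row has been handed out.
    Map<String, Object> nextRow() {
        return rows.hasNext() ? rows.next() : null;
    }

    // Typical consumer loop: keep pulling until null is returned.
    static int drain(RowStreamSketch processor) {
        int count = 0;
        while (processor.nextRow() != null) {
            count++;
        }
        return count;
    }
}
```

A consumer written this way never needs to know the number of rows in advance, which is what allows rows to be streamed one at a time.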
Copyright © 2008 Apache Software Foundation. All Rights Reserved.