Apache UIMA (Unstructured Information Management Architecture) v2.2.2 Hotfix 1
--------------------------------------------------------------------------------

Apache UIMA v2.2.2 Hotfix 1 is a hotfix for Apache UIMA release v2.2.2.
The hotfix fixes two memory issues:

  - UIMA-1067: Remove char heap/ref heap in StringHeap of the CAS
  - UIMA-1068: Use of the JCas cache should be configurable

To apply the hotfix, replace the uima-core.jar in your Apache UIMA v2.2.2
installation with the one provided.

The documentation below explains how to disable the JCas cache by using
UIMA performance tuning options.


Performance Tuning Options
--------------------------

A small number of performance tuning options are available to influence
the runtime behavior of UIMA applications. Performance tuning options
must be set programmatically when an analysis engine is created. You
create a Java Properties object with the relevant options and pass it to
the UIMA framework on the call that creates the analysis engine. Below is
an example.

  XMLParser parser = UIMAFramework.getXMLParser();
  ResourceSpecifier spec = parser.parseResourceSpecifier(
      new XMLInputSource(descriptorFile));

  // Create a new properties object to hold the settings.
  Properties performanceTuningSettings = new Properties();

  // Set the initial CAS heap size.
  performanceTuningSettings.setProperty(
      UIMAFramework.CAS_INITIAL_HEAP_SIZE, "1000000");

  // Disable JCas cache.
  performanceTuningSettings.setProperty(
      UIMAFramework.JCAS_CACHE_ENABLED, "false");

  // Create a wrapper properties object that can
  // be passed to the framework.
  Properties additionalParams = new Properties();

  // Set the performance tuning properties as value to
  // the appropriate parameter.
  additionalParams.put(
      Resource.PARAM_PERFORMANCE_TUNING_SETTINGS,
      performanceTuningSettings);

  // Create the analysis engine with the parameters.
  // The second, unused argument here is a custom
  // resource manager.
  this.ae = UIMAFramework.produceAnalysisEngine(
      spec, null, additionalParams);

The following options are supported:

  * UIMAFramework.JCAS_CACHE_ENABLED: enables or disables the JCas cache
    (true/false). The JCas cache is an internal data structure that
    caches any JCas object created by the CAS. This may result in better
    performance for applications that make extensive use of the JCas,
    but it also incurs a steep memory overhead. If you are processing
    large documents and have memory issues, you should disable the
    cache. In general, run a few experiments to see which setting works
    better for your application. The JCas cache is enabled by default.

  * UIMAFramework.CAS_INITIAL_HEAP_SIZE: sets the initial CAS heap size
    in number of cells (integer valued). The CAS uses 32-bit integer
    cells, so four times the initial size is the approximate minimum
    size of the CAS in bytes. This is another space/time trade-off, as
    growing the CAS heap is relatively expensive. On the other hand,
    setting the initial size too high wastes memory. Unless you know you
    are processing very small or very large documents, you should
    probably leave this option unchanged.

  * UIMAFramework.PROCESS_TRACE_ENABLED: enables the process trace
    mechanism (true/false). When enabled, UIMA tracks the time spent in
    individual components of an aggregate AE or CPE. For more
    information, see the API documentation of
    org.apache.uima.util.ProcessTrace.

  * UIMAFramework.SOCKET_KEEPALIVE_ENABLED: enables socket KeepAlive
    (true/false). This setting is currently only supported by Vinci
    clients. Defaults to true.


Disclaimer
----------

Apache UIMA is an effort undergoing incubation at The Apache Software
Foundation (ASF). Incubation is required of all newly accepted projects
until a further review indicates that the infrastructure,
communications, and decision making process have stabilized in a manner
consistent with other successful ASF projects. While incubation status
is not necessarily a reflection of the completeness or stability of the
code, it does indicate that the project has yet to be fully endorsed by
the ASF.
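
Note on sizing CAS_INITIAL_HEAP_SIZE: the rule of thumb given under
Performance Tuning Options (four bytes per 32-bit cell) can be sketched
as a small helper. This is only an illustrative sketch; the class and
method names (CasHeapEstimate, approxMinBytes, cellsForBudget,
toPropertyValue) are hypothetical and not part of the UIMA API. The
helper does not call UIMA; it only does the arithmetic and produces the
string value that would be passed for the heap size option.

```java
// Hypothetical helper, not part of Apache UIMA.
class CasHeapEstimate {

    // The CAS uses 32-bit integer cells, i.e. 4 bytes per cell.
    static final int BYTES_PER_CELL = 4;

    // Approximate minimum CAS size in bytes for an initial heap
    // size given in cells.
    static long approxMinBytes(int cells) {
        if (cells < 0) {
            throw new IllegalArgumentException("cells must be >= 0");
        }
        return (long) cells * BYTES_PER_CELL;
    }

    // Inverse direction: how many cells fit into a byte budget,
    // capped at Integer.MAX_VALUE.
    static int cellsForBudget(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be >= 0");
        }
        return (int) Math.min(bytes / BYTES_PER_CELL, Integer.MAX_VALUE);
    }

    // Tuning options are set as string properties, so the cell
    // count has to be rendered as a decimal string.
    static String toPropertyValue(int cells) {
        return Integer.toString(cells);
    }

    public static void main(String[] args) {
        int cells = 1_000_000; // the value used in the example above
        System.out.println(cells + " cells -> about "
            + approxMinBytes(cells) + " bytes");
        System.out.println("property value: " + toPropertyValue(cells));
    }
}
```

With the value from the example (1000000 cells), the approximate
minimum CAS size comes to 4000000 bytes, or roughly 4 MB.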