OpenJPA utilizes several configurable caches to maximize performance. This chapter explores OpenJPA's data cache, query cache, and query compilation cache.
The OpenJPA data cache is an optional cache of persistent object data that operates at the EntityManagerFactory level. This cache is designed to significantly increase performance while remaining in full compliance with the JPA standard. This means that turning on the caching option can transparently increase the performance of your application, with no changes to your code.
OpenJPA's data cache is not related to the EntityManager cache dictated by the JPA specification. The JPA specification mandates behavior for the EntityManager cache aimed at guaranteeing transaction isolation when operating on persistent objects.
OpenJPA's data cache is designed to provide significant performance increases over cacheless operation, while guaranteeing that behavior will be identical in both cache-enabled and cacheless operation.
There are five ways to access data via the OpenJPA APIs: standard relation traversal, large result set relation traversal, queries, looking up an object by id, and iteration over an Extent. OpenJPA's cache plugin accelerates three of these mechanisms. It does not provide any caching of large result set relations or Extent iterators. If you find yourself in need of higher-performance Extent iteration, see Example 10.15, “Query Replaces Extent”.
Table 10.1. Data access methods
Access method | Uses cache |
---|---|
Standard relation traversal | Yes |
Large result set relation traversal | No |
Query | Yes |
Lookups by object id | Yes |
Iteration over an Extent | No |
When enabled, the cache is checked before making a trip to the datastore. Data is stored in the cache when objects are committed and when persistent objects are loaded from the datastore.
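The lookup order described above can be pictured as a read-through map. The following is a simplified sketch, not OpenJPA's implementation: the class name ReadThroughCacheSketch and the function standing in for a datastore trip are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration only; not part of the OpenJPA API.
final class ReadThroughCacheSketch<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Function<K, V> datastore; // stands in for a datastore trip
    private int datastoreTrips = 0;

    ReadThroughCacheSketch(Function<K, V> datastore) {
        this.datastore = datastore;
    }

    // Check the cache first; on a miss, load from the datastore and cache the result.
    V find(K id) {
        V cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        datastoreTrips++;
        V loaded = datastore.apply(id);
        if (loaded != null) {
            cache.put(id, loaded);
        }
        return loaded;
    }

    // Committed data is stored in the cache so later reads can skip the datastore.
    void commit(K id, V value) {
        cache.put(id, value);
    }

    int datastoreTrips() {
        return datastoreTrips;
    }
}
```

In this sketch a second find for the same id, or a find that follows a commit of that id, is answered from the map without another datastore trip.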
OpenJPA's data cache can operate in both single-JVM and multi-JVM environments. Multi-JVM caching is achieved through the use of the distributed event notification framework described in Section 2, “Remote Event Notification Framework”, or through custom integrations with a third-party distributed cache.
The single-JVM mode of operation maintains and shares a data cache across all EntityManager instances obtained from a particular EntityManagerFactory. This is not appropriate for use in a distributed environment, as caches in different JVMs or created from different EntityManagerFactory objects will not be synchronized.
To enable the basic single-factory cache, set the openjpa.DataCache property to true, and set the openjpa.RemoteCommitProvider property to sjvm:
Example 10.1. Single-JVM Data Cache
<property name="openjpa.DataCache" value="true"/>
<property name="openjpa.RemoteCommitProvider" value="sjvm"/>
To configure the data cache to remain up-to-date in a distributed environment, set the openjpa.RemoteCommitProvider property appropriately, or integrate OpenJPA with a third-party caching solution. Remote commit providers are described in Section 2, “Remote Event Notification Framework”.
OpenJPA's default implementation maintains a map of object ids to cache data. By default, 1000 elements are kept in cache. When the cache overflows, random entries are evicted. The maximum cache size can be adjusted by setting the CacheSize property in your plugin string; see below for an example. Objects that are pinned into the cache are not counted when determining whether the cache exceeds its maximum size.
Expired objects are moved to a soft reference map, so they may stick around for a little while longer. You can control the number of soft references OpenJPA keeps with the SoftReferenceSize property. Soft references are unlimited by default. Set this property to 0 to disable soft references completely.
Example 10.2. Data Cache Size
<property name="openjpa.DataCache" value="true(CacheSize=5000, SoftReferenceSize=0)"/>
You can specify a cache timeout value for a class by setting the timeout metadata extension to the amount of time in milliseconds a class's data is valid. Use a value of -1 for no expiration. This is the default value.
Example 10.3. Data Cache Timeout
Timeout Employee objects after 10 seconds.
@Entity
@DataCache(timeout=10000)
public class Employee {
    ...
}
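The timeout rule can be expressed as a small validity check. This is a sketch under stated assumptions rather than OpenJPA code: the class TimeoutSketch is hypothetical, and treating data as expired at exactly the timeout boundary is an assumption of the sketch.

```java
// Hypothetical illustration only; not part of the OpenJPA API.
final class TimeoutSketch {
    // A timeout of -1 means the cached data never expires (the default).
    static final long NO_EXPIRATION = -1L;

    // Returns true if data cached at cachedAtMillis is still valid at nowMillis.
    static boolean isValid(long cachedAtMillis, long timeoutMillis, long nowMillis) {
        if (timeoutMillis == NO_EXPIRATION) {
            return true;
        }
        // Assumption: data counts as expired once the full timeout has elapsed.
        return nowMillis - cachedAtMillis < timeoutMillis;
    }
}
```

With timeout=10000 as in the example above, data cached at time 0 is treated as valid at 9999 ms and as expired at 10000 ms.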
Entities may be explicitly excluded from the cache by providing a list of fully qualified class names in the ExcludedTypes argument. The entities provided via ExcludedTypes will not be cached regardless of the @DataCache annotation.
Example 10.4. Excluding entities
Exclude entities foo.bar.Person and foo.bar.Employee from the cache.
<property name="openjpa.DataCache" value="true(ExcludedTypes=foo.bar.Person;foo.bar.Employee)"/>
Entities may be explicitly included in the cache by providing a list of fully qualified class names in the Types argument. Entities listed in ExcludedTypes will still not be cached, regardless of the @DataCache annotation. Any entities which are not included in the Types list will not be cached.
Example 10.5. Including entities
Include only entity foo.bar.FullTimeEmployee in the cache.
<property name="openjpa.DataCache" value="true(Types=foo.bar.FullTimeEmployee)"/>
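The interaction between Types and ExcludedTypes can be summarized as a decision function. This is a minimal sketch, assuming that an empty Types set means no Types list was configured; the class CacheTypeFilterSketch is hypothetical, and the sketch deliberately ignores the @DataCache annotation.

```java
import java.util.Set;

// Hypothetical illustration only; not part of the OpenJPA API.
final class CacheTypeFilterSketch {
    // types: classes listed in Types (empty means no Types list configured).
    // excludedTypes: classes listed in ExcludedTypes.
    static boolean isCacheable(String className, Set<String> types, Set<String> excludedTypes) {
        if (excludedTypes.contains(className)) {
            return false; // excluded entities are never cached
        }
        if (!types.isEmpty()) {
            return types.contains(className); // only listed entities are cached
        }
        return true; // no lists apply; annotation handling is out of scope here
    }
}
```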
See the org.apache.openjpa.persistence.DataCache Javadoc for more information on the DataCache annotation.
A cache can specify that it should be cleared at certain times rather than using data timeouts. The EvictionSchedule property of OpenJPA's cache implementation accepts a cron-style eviction schedule. The format of this property is a whitespace-separated list of five tokens, where the * symbol (asterisk) indicates match all. The tokens are, in order:
Minute
Hour of Day
Day of Month
Month
Day of Week
For example, the following openjpa.DataCache setting schedules the default cache to evict values from the cache at 15 and 45 minutes past 3 PM on Sunday.
true(EvictionSchedule='15,45 15 * * 1')
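To make the token format concrete, the following sketch checks whether a five-token schedule matches a given time. It is an illustration, not OpenJPA's scheduler: the class EvictionScheduleSketch is hypothetical, only * and comma-separated numbers are handled, and month numbering starting at 1 for January is an assumption. Day of week 1 is taken to mean Sunday, consistent with the example above.

```java
import java.util.Calendar;

// Hypothetical illustration only; not part of the OpenJPA API.
final class EvictionScheduleSketch {
    // Returns true if the five-token schedule matches the given time.
    static boolean matches(String schedule, Calendar time) {
        String[] tokens = schedule.trim().split("\\s+");
        if (tokens.length != 5) {
            throw new IllegalArgumentException("expected 5 tokens: " + schedule);
        }
        return tokenMatches(tokens[0], time.get(Calendar.MINUTE))
            && tokenMatches(tokens[1], time.get(Calendar.HOUR_OF_DAY))
            && tokenMatches(tokens[2], time.get(Calendar.DAY_OF_MONTH))
            // Assumption: months are numbered from 1; Calendar.MONTH is 0-based.
            && tokenMatches(tokens[3], time.get(Calendar.MONTH) + 1)
            // Calendar.DAY_OF_WEEK uses 1 for Sunday.
            && tokenMatches(tokens[4], time.get(Calendar.DAY_OF_WEEK));
    }

    // A token is either "*" (match all) or a comma-separated list of numbers.
    private static boolean tokenMatches(String token, int value) {
        if ("*".equals(token)) {
            return true;
        }
        for (String part : token.split(",")) {
            if (Integer.parseInt(part.trim()) == value) {
                return true;
            }
        }
        return false;
    }
}
```

Under these assumptions, the schedule 15,45 15 * * 1 matches 3:15 PM and 3:45 PM on a Sunday and no other time.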
The org.apache.openjpa.datacache package defines OpenJPA's data caching framework. While you may use this framework directly (see its Javadoc for details), its APIs are meant primarily for service providers. In fact, Section 1.4, “Cache Extension” below has tips on how to use this package to extend OpenJPA's caching service yourself.
Rather than use the low-level org.apache.openjpa.datacache package APIs, JPA users should typically access the data cache through OpenJPA's high-level org.apache.openjpa.persistence.StoreCache facade. This facade has methods to pin and unpin records, evict data from the cache, and more.
public StoreCache getStoreCache();
You obtain the StoreCache through the OpenJPAEntityManagerFactory.getStoreCache method.
Example 10.6. Accessing the StoreCache
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
StoreCache cache = oemf.getStoreCache();
...
public void evict(Class cls, Object oid);
public void evictAll();
public void evictAll(Class cls, Object... oids);
public void evictAll(Class cls, Collection oids);
The evict methods tell the cache to release data. Each method takes an entity class and one or more identity values, and releases the cached data for the corresponding persistent instances. The evictAll method with no arguments clears the cache. Eviction is useful when the datastore is changed by a separate process outside OpenJPA's control. In this scenario, you typically have to manually evict the data from the data cache; otherwise the OpenJPA runtime, oblivious to the changes, will maintain its stale copy.
public void pin(Class cls, Object oid);
public void pinAll(Class cls, Object... oids);
public void pinAll(Class cls, Collection oids);
public void unpin(Class cls, Object oid);
public void unpinAll(Class cls, Object... oids);
public void unpinAll(Class cls, Collection oids);
Most caches are of limited size. Pinning an identity to the cache ensures that the cache will not kick the data for the corresponding instance out of the cache, unless you manually evict it. Note that even after manual eviction, the data will get pinned again the next time it is fetched from the store. You can only remove a pin and make the data once again available for normal cache overflow eviction through the unpin methods. Use pinning when you want a guarantee that a certain object will always be available from cache, rather than requiring a datastore trip.
Example 10.7. StoreCache Usage
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
StoreCache cache = oemf.getStoreCache();
cache.pin(Magazine.class, popularMag.getId());
cache.evict(Magazine.class, changedMag.getId());
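The pinning semantics described above (pinned data survives overflow, is not counted against the maximum size, and is re-pinned after a manual eviction) can be modeled with a small bounded map. This is a sketch under stated assumptions, not OpenJPA's cache: the class PinnedCacheSketch is hypothetical, and overflow removes an arbitrary unpinned entry to stand in for random eviction.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the OpenJPA API.
final class PinnedCacheSketch<K, V> {
    private final int maxSize;
    private final Map<K, V> unpinned = new HashMap<>();
    private final Map<K, V> pinned = new HashMap<>();
    private final Set<K> pins = new HashSet<>();

    PinnedCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void put(K id, V data) {
        if (pins.contains(id)) {
            pinned.put(id, data); // pinned data does not count toward maxSize
            return;
        }
        unpinned.put(id, data);
        // On overflow, drop arbitrary unpinned entries other than the new one.
        Iterator<K> it = unpinned.keySet().iterator();
        while (unpinned.size() > maxSize && it.hasNext()) {
            if (!it.next().equals(id)) {
                it.remove();
            }
        }
    }

    V get(K id) {
        V data = pinned.get(id);
        return data != null ? data : unpinned.get(id);
    }

    void pin(K id) {
        pins.add(id);
        V data = unpinned.remove(id);
        if (data != null) {
            pinned.put(id, data);
        }
    }

    // Unpinning makes the data subject to normal overflow eviction again.
    void unpin(K id) {
        pins.remove(id);
        V data = pinned.remove(id);
        if (data != null) {
            put(id, data);
        }
    }

    // Manual eviction drops the data but keeps the pin, so the next put re-pins it.
    void evict(K id) {
        pinned.remove(id);
        unpinned.remove(id);
    }
}
```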
See the StoreCache Javadoc for information on additional functionality it provides. Also, Chapter 9, Runtime Extensions discusses OpenJPA's other extensions to the standard set of JPA runtime interfaces.
The examples above include calls to evict to manually remove data from the data cache. Rather than evicting objects from the data cache directly, you can also configure OpenJPA to automatically evict objects from the data cache when you use the OpenJPAEntityManager's eviction APIs.
Example 10.8. Automatic Data Cache Eviction
<property name="openjpa.BrokerImpl" value="EvictFromDataCache=true"/>
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManager oem = OpenJPAPersistence.cast(em);
oem.evict(changedMag); // will evict from data cache also
In addition to the data cache, the org.apache.openjpa.datacache package defines service provider interfaces for a query cache. The query cache is enabled by default when the data cache is enabled. The query cache stores the object ids returned by query executions. When you run a query, OpenJPA assembles a key based on the query properties and the parameters used at execution time, and checks for a cached query result. If one is found, the object ids in the cached result are looked up, and the resultant persistence-capable objects are returned. Otherwise, the query is executed against the database, and the object ids loaded by the query are put into the cache. The object id list is not cached until the list returned at query execution time is fully traversed.
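The key-and-ids mechanism can be sketched as a map from a query key to a list of object ids. This is a simplified illustration, not OpenJPA's query cache: the class QueryCacheSketch is hypothetical, the key is built only from the query text and parameter values, and ids are cached immediately rather than after the result list has been fully traversed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration only; not part of the OpenJPA API.
final class QueryCacheSketch {
    private final Map<List<Object>, List<Object>> idsByKey = new HashMap<>();
    int executions = 0; // number of times the database was queried

    // Builds a key from the query text and the parameters used at execution time.
    static List<Object> key(String queryText, Object... params) {
        List<Object> key = new ArrayList<>();
        key.add(queryText);
        key.addAll(Arrays.asList(params));
        return key;
    }

    // Returns cached ids for the key, or runs the query and caches the ids it returns.
    List<Object> resultIds(List<Object> key, Supplier<List<Object>> database) {
        List<Object> cached = idsByKey.get(key);
        if (cached != null) {
            return cached;
        }
        executions++;
        List<Object> ids = new ArrayList<>(database.get());
        idsByKey.put(key, ids);
        return ids;
    }
}
```

Because the parameter values are part of the key, running the same query text with different parameters produces a separate cache entry.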
OpenJPA exposes a high-level interface to the query cache through the org.apache.openjpa.persistence.QueryResultCache class. You can access this class through the OpenJPAEntityManagerFactory.
Example 10.9. Accessing the QueryResultCache
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
QueryResultCache qcache = oemf.getQueryResultCache();
The default query cache implementation caches 100 query executions in a least-recently-used cache. This can be changed by setting the cache size in the CacheSize plugin property. Like the data cache, the query cache also has a backing soft reference map. The SoftReferenceSize property controls the size of this map. It is disabled by default.
Example 10.10. Query Cache Size
<property name="openjpa.QueryCache" value="CacheSize=1000, SoftReferenceSize=100"/>
To disable the query cache completely, set the openjpa.QueryCache property to false:
Example 10.11. Disabling the Query Cache
<property name="openjpa.QueryCache" value="false"/>
There are certain situations in which the query cache is bypassed:
Caching is not used for in-memory queries (queries in which the candidates are a collection instead of a class or Extent).
Caching is not used in transactions that have IgnoreChanges set to false and in which modifications to classes in the query's access path have occurred. If none of the classes in the access path have been touched, then cached results are still valid and are used.
Caching is not used in pessimistic transactions, since OpenJPA must go to the database to lock the appropriate rows.
Caching is not used when the data cache does not have any cached data for an id in a query result.
Queries that use persistence-capable objects as parameters are only cached if the parameter is directly compared to a field, as in:
select e from Employee e where e.company.address = :addr
If you extract field values from the parameter in your query string, or if the parameter is used in collection element comparisons, the query is not cached.
Queries that result in projections of custom field types or BigDecimal or BigInteger fields are not cached.
Cached results are removed from the cache when instances of classes in a cached query's access path are touched. That is, if a query accesses data in class A, and instances of class A are modified, deleted, or inserted, then the cached query result is dropped from the cache.
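Access-path invalidation can be sketched by recording, for each cached query, the set of classes it touches. This is an illustration only: the class AccessPathInvalidationSketch and its method names are hypothetical, and a plain string stands in for the query key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of the OpenJPA API.
final class AccessPathInvalidationSketch {
    // Cached query key -> classes in that query's access path.
    private final Map<String, Set<Class<?>>> accessPaths = new HashMap<>();

    void cache(String queryKey, Class<?>... accessPath) {
        Set<Class<?>> path = new HashSet<>(Arrays.asList(accessPath));
        accessPaths.put(queryKey, path);
    }

    boolean isCached(String queryKey) {
        return accessPaths.containsKey(queryKey);
    }

    // Called when instances of cls are modified, deleted, or inserted:
    // every cached query whose access path includes cls is dropped.
    void classChanged(Class<?> cls) {
        accessPaths.values().removeIf(path -> path.contains(cls));
    }
}
```

Queries whose access paths do not include the changed class keep their cached results.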
It is possible to tell the query cache that a class has been altered. This is only necessary when the changes occur via direct modification of the database outside of OpenJPA's control. You can also evict individual queries, or clear the entire cache.
public void evict(Query q);
public void evictAll(Class cls);
public void evictAll();
For JPA queries with parameters, set the desired parameter values into the Query instance before calling the above methods.
Example 10.12. Evicting Queries
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
QueryResultCache qcache = oemf.getQueryResultCache();

// evict all queries that can be affected by changes to Magazines
qcache.evictAll(Magazine.class);

// evict an individual query with parameters
EntityManager em = emf.createEntityManager();
Query q = em.createQuery(...)
    .setParameter(0, paramVal0)
    .setParameter(1, paramVal1);
qcache.evict(q);
When using one of OpenJPA's distributed cache implementations, it is necessary to perform this in every JVM - the change notification is not propagated automatically. When using a third-party coherent caching solution, it is not necessary to do this in every JVM (although it won't hurt to do so), as the cache results are stored directly in the coherent cache.
Queries can also be pinned and unpinned through the QueryResultCache. The semantics of these operations are the same as pinning and unpinning data from the data cache.
public void pin(Query q);
public void unpin(Query q);
For JPA queries with parameters, set the desired parameter values into the Query instance before calling the above methods.
The following example shows these APIs in action.
Example 10.13. Pinning and Unpinning Query Results
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManagerFactory oemf = OpenJPAPersistence.cast(emf);
QueryResultCache qcache = oemf.getQueryResultCache();
EntityManager em = emf.createEntityManager();

Query pinQuery = em.createQuery(...)
    .setParameter(0, paramVal0)
    .setParameter(1, paramVal1);
qcache.pin(pinQuery);

Query unpinQuery = em.createQuery(...)
    .setParameter(0, paramVal0)
    .setParameter(1, paramVal1);
qcache.unpin(unpinQuery);
Pinning data into the cache instructs the cache to not expire the pinned results when cache flushing occurs. However, pinned results will be removed from the cache if an event occurs that invalidates the results.
You can disable caching on a per-EntityManager or per-Query basis:
Example 10.14. Disabling and Enabling Query Caching
import org.apache.openjpa.persistence.*;
...
// temporarily disable query caching for all queries created from em
OpenJPAEntityManager oem = OpenJPAPersistence.cast(em);
oem.getFetchPlan().setQueryResultCache(false);

// re-enable caching for a particular query
OpenJPAQuery oq = oem.createQuery(...);
oq.getFetchPlan().setQueryResultCache(true);
The provided data cache classes can be easily extended to add additional functionality. If you are adding new behavior, you should extend org.apache.openjpa.datacache.DataCacheImpl. To use your own storage mechanism, extend org.apache.openjpa.datacache.AbstractDataCache, or implement org.apache.openjpa.datacache.DataCache directly. If you want to implement a distributed cache that uses an unsupported method for communications, create an implementation of org.apache.openjpa.event.RemoteCommitProvider. This process is described in greater detail in Section 2.2, “Customization”.
The query cache is just as easy to extend. Add functionality by extending the default org.apache.openjpa.datacache.QueryCacheImpl. Implement your own storage mechanism for query results by extending org.apache.openjpa.datacache.AbstractQueryCache or implementing the org.apache.openjpa.datacache.QueryCache interface directly.
The default cache implementations do not automatically refresh objects in other EntityManagers when the cache is updated or invalidated. This behavior would not be compliant with the JPA specification.
Invoking OpenJPAEntityManager.evict does not result in the corresponding data being dropped from the data cache, unless you have set the proper configuration options as explained above (see Example 10.8, “Automatic Data Cache Eviction”). Other methods related to the EntityManager cache also do not affect the data cache.
The data cache assumes that it is up-to-date with respect to the datastore, so it is effectively an in-memory extension of the database. To manipulate the data cache, you should generally use the data cache facades presented in this chapter.
You must specify an org.apache.openjpa.event.RemoteCommitProvider (via the openjpa.RemoteCommitProvider property) in order to use the data cache, even when using the cache in a single-JVM mode. When using it in a single-JVM context, set this property to sjvm.
When using datastore (pessimistic) transactions in concert with the distributed caching implementations, it is possible to read stale data when reading data outside a transaction.
For example, if you have two JVMs (JVM A and JVM B) both communicating with each other, and JVM A obtains a data store lock on a particular object's underlying data, it is possible for JVM B to load the data from the cache without going to the datastore, and therefore load data that should be locked. This will only happen if JVM B attempts to read data that is already in its cache during the period between when JVM A locked the data and JVM B received and processed the invalidation notification.
This problem is impossible to solve without putting together a two-phase commit system for cache notifications, which would add significant overhead to the caching implementation. As a result, we recommend that people use optimistic locking when using data caching. If you do not, then understand that some of your non-transactional data may not be consistent with the datastore.
Note that when loading objects in a transaction, the appropriate datastore transactions will be obtained. So, transactional code will maintain its integrity.
Extents are not cached. So, if you plan on iterating over a list of all the objects in an Extent on a regular basis, you will only benefit from caching if you do so with a Query instead:
Example 10.15. Query Replaces Extent
import org.apache.openjpa.persistence.*;
...
OpenJPAEntityManager oem = OpenJPAPersistence.cast(em);
Extent extent = oem.getExtent(Magazine.class, false);

// This iterator does not benefit from caching...
Iterator uncachedIterator = extent.iterator();

// ... but this one does.
OpenJPAQuery extentQuery = oem.createQuery(...);
extentQuery.setSubclasses(false);
Iterator cachedIterator = extentQuery.getResultList().iterator();