Eager fetching is the ability to efficiently load subclass data and related
objects along with the base instances being queried. Typically, OpenJPA has to
make a trip to the database whenever a relation is loaded, or when you first
access data that is mapped to a table other than the least-derived superclass
table. If you perform a query that returns 100 Person objects, and then you have
to retrieve the Address for each person, OpenJPA may make as many as 101 queries
(the initial query, plus one for the address of each person returned). Or if some
of the Person instances turn out to be Employees, where Employee has additional
data in its own joined
table, OpenJPA once again might need to make extra database trips to access the
additional employee data. With eager fetching, OpenJPA can reduce these cases to
a single query.
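The arithmetic behind this example can be sketched in a few lines of Java. The class and method names below are hypothetical, and nothing here calls OpenJPA; the sketch only counts the select statements implied by the description, assuming the worst case of one lazy select per related object.

```java
// Hypothetical sketch: counts the selects implied by the Person/Address example.
// It does not use OpenJPA; it only models the arithmetic described in the text.
final class LazyVersusEagerCount {

    // Without eager fetching: one select for the query itself, plus (in the
    // worst case) one select per result to load its related object.
    static int selectsWithoutEagerFetching(int resultCount) {
        return 1 + resultCount;
    }

    // With eager fetching of a to-one relation, the related data is selected
    // together with the results, so the result count no longer matters.
    static int selectsWithEagerJoin(int resultCount) {
        return 1;
    }

    public static void main(String[] args) {
        int persons = 100;
        System.out.println("without eager fetching: " + selectsWithoutEagerFetching(persons));
        System.out.println("with eager join:        " + selectsWithEagerJoin(persons));
    }
}
```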
Eager fetching only affects relations in the active fetch groups, and is limited by the declared maximum fetch depth and field recursion depth (see Section 6, “Fetch Groups”). In other words, relations that would not normally be loaded immediately when retrieving an object or accessing a field are not affected by eager fetching. In our example above, the address of each person would only be eagerly fetched if the query were configured to include the address field or its fetch group, or if the address were in the default fetch group. This allows you to control exactly which fields are eagerly fetched in different situations. Similarly, queries that exclude subclasses aren't affected by eager subclass fetching, described below.
Eager fetching has three modes:
none: No eager fetching is performed. Related objects are always loaded in an
independent select statement. No joined subclass data is loaded unless it is in
the table(s) for the base type being queried. Unjoined subclass data is loaded
using separate select statements rather than a SQL UNION operation.
join: In this mode, OpenJPA joins to to-one relations in the configured fetch
groups. If OpenJPA is loading data for a single instance, then OpenJPA will also
join to any collection field in the configured fetch groups. When loading data
for multiple instances, though (such as when executing a Query), OpenJPA will not
join to collections by default. Instead, OpenJPA defaults to parallel mode for
collections, as described below. You can force OpenJPA to use a join rather than
parallel mode for a collection field using the metadata extension described in
Section 9.2.1, “Eager Fetch Mode”.
Under join mode, OpenJPA uses a left outer join (or inner join, if the relation's
field metadata declares the relation non-nullable) to select the related data
along with the data for the target objects. This process works recursively for
to-one joins, so that if Person has an Address, and Address has a
TelephoneNumber, and the fetch groups are configured correctly, OpenJPA might
issue a single select that joins across the tables for all three classes. To-many
joins cannot recursively spawn other to-many joins, but they can spawn recursive
to-one joins.
Under the join subclass fetch mode, subclass data in joined tables is selected by
outer joining to all possible subclass tables of the type being queried. As
you'll see below, subclass data fetching is configured separately from relation
fetching, and can be disabled for specific classes.
Some databases may not support outer joins. Also, OpenJPA cannot use outer joins
if you have set the DBDictionary's JoinSyntax to traditional. See Section 6,
“Setting the SQL Join Syntax”.
parallel: Under this mode, OpenJPA selects to-one relations and joined
collections as outlined in the join mode description above. Unjoined collection
fields, however, are eagerly fetched using a separate select statement for each
collection, executed in parallel with the select statement for the target
objects. The parallel selects use the WHERE conditions from the primary select,
but add their own joins to reach the related data. Thus, if you perform a query
that returns 100 Company objects, where each company has a list of Employee
objects and a list of Department objects, OpenJPA will make 3 queries. The first
will select the company objects, the second will select the employees for those
companies, and the third will select the departments for the same companies. Just
as for joins, this process can be recursively applied to the objects in the
relations being eagerly fetched. Continuing our example, if the Employee class
had a list of Projects in one of the fetch groups being loaded, OpenJPA would
execute a single additional select in parallel to load the projects of all
employees of the matching companies.
Using an additional select to load each collection avoids transferring more data than necessary from the database to the application. If eager joins were used instead of parallel select statements, each collection added to the configured fetch groups would cause the amount of data being transferred to rise dangerously, to the point that you could easily overwhelm the network.
Polymorphic to-one relations to table-per-class mappings use parallel eager fetching because proper joins are impossible. You can force other to-one relations to use parallel rather than join mode eager fetching using the metadata extension described in Section 9.2.1, “Eager Fetch Mode”.
Parallel subclass fetch mode only applies to queries on joined inheritance hierarchies. Rather than outer-joining to subclass tables, OpenJPA will issue the query separately for each subclass. In all other situations, parallel subclass fetch mode acts just like join mode with regard to vertically-mapped subclasses.
When OpenJPA knows that it is selecting for a single object only, it never uses
parallel mode, because the additional selects can be made lazily just as
efficiently. This mode only increases efficiency over join mode when multiple
objects with eager relations are being loaded, or when multiple selects might be
faster than joining to all possible subclasses.
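The Company example under parallel mode can be modeled with a rough estimate. This is a sketch under stated assumptions, not OpenJPA code: the class, its methods, and the column counts are all hypothetical, key columns are ignored, and the join figure follows from how SQL joins work in general (each owner row is repeated once per collection element) rather than from the exact statements OpenJPA generates.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical model of the Company example; it does not call OpenJPA.
final class ParallelFetchEstimate {

    // Parallel mode: one select for the target objects plus one select per
    // eagerly fetched collection field, however many objects match.
    static int parallelSelects(int eagerCollectionFields) {
        return 1 + eagerCollectionFields;
    }

    // Rows returned when one collection is outer-joined to its owners: each
    // owner appears once per element, and at least once if it has no elements.
    static long joinedRows(List<Integer> collectionSizes) {
        long rows = 0;
        for (int size : collectionSizes) {
            rows += Math.max(1, size);
        }
        return rows;
    }

    // Column values transferred by the join: every row carries both the
    // owner columns and the element columns.
    static long joinedValues(List<Integer> sizes, int ownerCols, int elementCols) {
        return joinedRows(sizes) * (ownerCols + elementCols);
    }

    // Column values transferred by two separate selects: owner columns once
    // per owner, element columns once per element (key columns ignored).
    static long parallelValues(List<Integer> sizes, int ownerCols, int elementCols) {
        long elements = 0;
        for (int size : sizes) {
            elements += size;
        }
        return (long) sizes.size() * ownerCols + elements * elementCols;
    }

    public static void main(String[] args) {
        // Assumed figures: 100 companies with 50 employees each,
        // 10 company columns and 5 employee columns.
        List<Integer> employeesPerCompany = Collections.nCopies(100, 50);
        System.out.println("selects (employees, departments): " + parallelSelects(2));
        System.out.println("values via join:     " + joinedValues(employeesPerCompany, 10, 5));
        System.out.println("values via parallel: " + parallelValues(employeesPerCompany, 10, 5));
    }
}
```

With two eager collection fields the model gives the 3 selects described above, and adding the Projects list of Employee adds one more. The value counts only illustrate the trend; real figures depend on the schema and on the rows that actually match.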
You can control OpenJPA's default eager fetch mode through the
openjpa.jdbc.EagerFetchMode and openjpa.jdbc.SubclassFetchMode configuration
properties. Set each of these properties to one of the mode names described in
the previous section: none, join, or parallel. If left unset, the eager fetch
mode defaults to parallel and the subclass fetch mode defaults to join. These are
generally the most robust and performant strategies.
You can easily override the default fetch modes at runtime for any lookup or query through OpenJPA's fetch configuration APIs. See Chapter 9, Runtime Extensions for details.
Example 5.22. Setting the Default Eager Fetch Mode
<property name="openjpa.jdbc.EagerFetchMode" value="parallel"/>
<property name="openjpa.jdbc.SubclassFetchMode" value="join"/>
Example 5.23. Setting the Eager Fetch Mode at Runtime
import java.util.List;
import javax.persistence.Query;
import org.apache.openjpa.persistence.*;
import org.apache.openjpa.persistence.jdbc.*;

...

// em is an existing EntityManager
Query q = em.createQuery("select p from Person p where p.address.state = 'TX'");

// obtain the OpenJPA-specific query and its JDBC fetch plan
OpenJPAQuery kq = OpenJPAPersistence.cast(q);
JDBCFetchPlan fetch = (JDBCFetchPlan) kq.getFetchPlan();

// override the default fetch modes for this query only
fetch.setEagerFetchMode(JDBCFetchPlan.EAGER_PARALLEL);
fetch.setSubclassFetchMode(JDBCFetchPlan.EAGER_JOIN);
List results = q.getResultList();
You can specify a default subclass fetch mode for an individual class with the
metadata extension described in Section 9.1.1, “Subclass Fetch Mode”. Note,
however, that you cannot "upgrade" the runtime fetch mode with your class
setting. If the runtime fetch mode is none, no eager subclass data fetching will
take place, regardless of your metadata setting. This applies to the eager fetch
mode metadata extension as well (see Section 9.2.1, “Eager Fetch Mode”). You can
use this extension to disable eager fetching on a field or to declare that a
collection would rather use joins than parallel selects or vice versa. But an
extension value of join won't cause any eager joining if the fetch
configuration's setting is none.
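One way to picture the "no upgrade" rule is as taking the weaker of the two settings. The sketch below is an illustration only: the enum and method are hypothetical and are not OpenJPA's API. The text above guarantees only the none case; ordering join below parallel is an assumption made for this sketch.

```java
// Hypothetical sketch of the "no upgrade" rule; not part of OpenJPA's API.
final class EffectiveFetchMode {

    // Declared from weakest to strongest so that ordinal() can be compared.
    // Assumption: join ranks below parallel. The text only guarantees that
    // a runtime setting of none always wins.
    enum Mode { NONE, JOIN, PARALLEL }

    // The metadata setting can weaken the runtime mode but never strengthen
    // it, so the effective mode is the weaker of the two.
    static Mode effective(Mode runtime, Mode metadata) {
        return runtime.ordinal() <= metadata.ordinal() ? runtime : metadata;
    }

    public static void main(String[] args) {
        for (Mode runtime : Mode.values()) {
            for (Mode metadata : Mode.values()) {
                System.out.println("runtime=" + runtime + ", metadata=" + metadata
                        + " -> " + effective(runtime, metadata));
            }
        }
    }
}
```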
There are several important points that you should consider when using eager fetching:
When you are using parallel eager fetch mode and you have large result sets
enabled (see Section 9, “Large Result Sets”) or you place a range on a query,
OpenJPA performs the needed parallel selects on one page of results at a time.
For example, suppose your FetchBatchSize is set to 20, and you perform a large
result set query on a class that has collection fields in the configured fetch
groups. OpenJPA will immediately cache the first 20 results of the query using
join mode eager fetching only. Then, it will issue the extra selects needed to
eager fetch your collection fields according to parallel mode. Each select will
use a SQL IN clause (or multiple OR clauses if your class has a compound primary
key) to limit the selected collection elements to those owned by the 20 cached
results.
Once you iterate past the first 20 results, OpenJPA will cache the next 20 and again issue any needed extra selects for collection fields, and so on. This pattern ensures that you get the benefits of eager fetching without bringing more data into memory than anticipated.
Once OpenJPA eager-joins into a class, it cannot issue any further eager to-many joins or parallel selects from that class in the same query. To-one joins, however, can recurse to any level.
Using a to-many join makes it impossible to determine the number of instances the result set contains without traversing the entire set. This is because each result object might be represented by multiple rows. Thus, queries with a range specification or queries configured for lazy result set traversal automatically turn off eager to-many joining.
OpenJPA cannot eagerly join to polymorphic relations to non-leaf classes in a table-per-class inheritance hierarchy. You can work around this restriction using the mapping extensions described in Section 9.2.2, “Nonpolymorphic”.
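The page-at-a-time behavior described in the first point above can be sketched as follows. This is a minimal illustration and not OpenJPA's implementation: the class, the EMPLOYEE table, and the COMPANY_ID column are hypothetical, and a single-column primary key is assumed so that an IN clause applies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of page-at-a-time parallel selects; not OpenJPA code.
final class PagedParallelSelect {

    // Splits the owner ids into pages of at most batchSize, mirroring the
    // way results are cached one FetchBatchSize page at a time.
    static List<List<Long>> pages(List<Long> ownerIds, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Long>> result = new ArrayList<>();
        for (int from = 0; from < ownerIds.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ownerIds.size());
            result.add(new ArrayList<>(ownerIds.subList(from, to)));
        }
        return result;
    }

    // Builds the extra select for one page, with one placeholder per cached
    // owner. Table and column names are made up for illustration.
    static String collectionSelect(int ownerCount) {
        if (ownerCount < 1) {
            throw new IllegalArgumentException("a page must contain at least one owner");
        }
        String placeholders = String.join(", ", Collections.nCopies(ownerCount, "?"));
        return "SELECT * FROM EMPLOYEE WHERE COMPANY_ID IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 50; id++) {
            ids.add(id);
        }
        // One extra select per page, issued only when that page is reached.
        for (List<Long> page : pages(ids, 20)) {
            System.out.println(page.size() + " owners: " + collectionSelect(page.size()));
        }
    }
}
```

Each page produces its own select, so the elements of later pages are never loaded unless the application iterates that far.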