8. Eager Fetching

8.1. Configuring Eager Fetching
8.2. Eager Fetching Considerations and Limitations

Eager fetching is the ability to efficiently load subclass data and related objects along with the base instances being queried. Typically, OpenJPA has to make a trip to the database whenever a relation is loaded, or when you first access data that is mapped to a table other than the least-derived superclass table. If you perform a query that returns 100 Person objects, and then you have to retrieve the Address for each person, OpenJPA may make as many as 101 queries (the initial query, plus one for the address of each person returned). Or if some of the Person instances turn out to be Employees, where Employee has additional data in its own joined table, OpenJPA once again might need to make extra database trips to access the additional employee data. With eager fetching, OpenJPA can reduce these cases to a single query.
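The select counts in the Person and Address example can be reproduced with a small stand-alone model. The sketch below is not OpenJPA code: FakeDb and every method name in it are hypothetical, and the "database" is only a map that counts how many simulated selects are issued.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only: none of these names are OpenJPA APIs.
final class NPlusOneSketch {

    // Stand-in for the database that counts every simulated select.
    static final class FakeDb {
        int selects = 0;
        final Map<Integer, String> addressByPersonId = new LinkedHashMap<>();

        // Initial query: returns the matching persons.
        List<Integer> selectPersonIds() {
            selects++;
            return new ArrayList<>(addressByPersonId.keySet());
        }

        // Lazy relation load: one select per person.
        String selectAddressOf(int personId) {
            selects++;
            return addressByPersonId.get(personId);
        }

        // Eager join: persons and their addresses in a single select.
        Map<Integer, String> selectPersonsJoinAddress() {
            selects++;
            return new LinkedHashMap<>(addressByPersonId);
        }
    }

    static FakeDb dbWithPersons(int count) {
        FakeDb db = new FakeDb();
        for (int id = 1; id <= count; id++) {
            db.addressByPersonId.put(id, "address-" + id);
        }
        return db;
    }

    // Loads every person, then touches each address lazily.
    static int selectsWithLazyLoading(FakeDb db) {
        for (int id : db.selectPersonIds()) {
            db.selectAddressOf(id);
        }
        return db.selects;
    }

    // Loads every person together with its address.
    static int selectsWithEagerJoin(FakeDb db) {
        db.selectPersonsJoinAddress();
        return db.selects;
    }

    public static void main(String[] args) {
        System.out.println(selectsWithLazyLoading(dbWithPersons(100))); // prints 101
        System.out.println(selectsWithEagerJoin(dbWithPersons(100)));   // prints 1
    }
}
```

The model only counts round trips. It says nothing about how much data each select carries, which is the trade-off that separates the join and parallel modes.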

Eager fetching only affects relations in the active fetch groups, and is limited by the declared maximum fetch depth and field recursion depth (see Section 7, “ Fetch Groups ”). In other words, relations that would not normally be loaded immediately when retrieving an object or accessing a field are not affected by eager fetching. In our example above, the address of each person would only be eagerly fetched if the query were configured to include the address field or its fetch group, or if the address were in the default fetch group. This allows you to control exactly which fields are eagerly fetched in different situations. Similarly, queries that exclude subclasses aren't affected by eager subclass fetching, described below.

Eager fetching has three modes:

none: No eager fetching is performed. Related objects are always loaded in an independent select statement. No joined subclass data is loaded unless it is in the table(s) for the base type being queried. Unjoined subclass data is loaded using separate select statements rather than a SQL UNION operation.
join: In this mode, OpenJPA joins to to-one relations in the configured fetch groups. If OpenJPA is loading data for a single instance, it will also join to any collection field in the configured fetch groups. When loading data for multiple instances, however (such as when executing a Query), OpenJPA will not join to collections by default. Instead, it defaults to parallel mode for collections, as described below. You can force OpenJPA to use a join rather than parallel mode for a collection field using the metadata extension described in Section 13.2.1, “ Eager Fetch Mode ”.
Under join mode, OpenJPA uses a left outer join (or inner join, if the relation's field metadata declares the relation non-nullable) to select the related data along with the data for the target objects. This process works recursively for to-one joins, so that if Person has an Address, and Address has a TelephoneNumber, and the fetch groups are configured correctly, OpenJPA might issue a single select that joins across the tables for all three classes. To-many joins cannot recursively spawn other to-many joins, but they can spawn recursive to-one joins.
Under the join subclass fetch mode, subclass data in joined tables is selected by outer joining to all possible subclass tables of the type being queried. As you'll see below, subclass data fetching is configured separately from relation fetching, and can be disabled for specific classes.
Note
Some databases may not support outer joins. Also, OpenJPA cannot use outer joins if you have set the DBDictionary's JoinSyntax to traditional. See Section 6, “ Setting the SQL Join Syntax ”.
parallel: Under this mode, OpenJPA selects to-one relations and joined collections as outlined in the join mode description above. Unjoined collection fields, however, are eagerly fetched using a separate select statement for each collection, executed in parallel with the select statement for the target objects. The parallel selects use the WHERE conditions from the primary select, but add their own joins to reach the related data. Thus, if you perform a query that returns 100 Company objects, where each company has a list of Employee objects and Department objects, OpenJPA will make 3 queries. The first will select the company objects, the second will select the employees for those companies, and the third will select the departments for the same companies. Just as for joins, this process can be recursively applied to the objects in the relations being eagerly fetched. Continuing our example, if the Employee class had a list of Projects in one of the fetch groups being loaded, OpenJPA would execute a single additional select in parallel to load the projects of all employees of the matching companies.
Using an additional select to load each collection avoids transferring more data than necessary from the database to the application. If eager joins were used instead of parallel select statements, every collection added to the configured fetch groups would multiply the number of rows returned, because each owning row is repeated for every combination of collection elements. The volume of transferred data could quickly grow large enough to overwhelm the network.
Polymorphic to-one relations to table-per-class mappings use parallel eager fetching because proper joins are impossible. You can force other to-one relations to use parallel rather than join mode eager fetching using the metadata extension described in Section 13.2.1, “ Eager Fetch Mode ”.
Parallel subclass fetch mode only applies to queries on joined inheritance hierarchies. Rather than outer-joining to subclass tables, OpenJPA will issue the query separately for each subclass. In all other situations, parallel subclass fetch mode acts just like join mode with regard to vertically-mapped subclasses.
When OpenJPA knows that it is selecting for a single object only, it never uses parallel mode, because the additional selects can be made lazily just as efficiently. This mode only increases efficiency over join mode when multiple objects with eager relations are being loaded, or when multiple selects might be faster than joining to all possible subclasses.
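The trade-off between joining collections and selecting them in parallel can be put into rough numbers. The sketch below is a back-of-the-envelope model in general SQL terms, not a description of the SQL OpenJPA generates: the class and method names are hypothetical, and it assumes every company owns the same number of employees and departments.

```java
// Illustrative arithmetic only: none of these names are OpenJPA APIs.
final class CollectionFetchSketch {

    // Rows returned if both collections were outer-joined into one select.
    // Each company row repeats once per (employee, department) combination;
    // an empty collection still yields one row because of the outer join.
    static long joinedRows(long companies, long employeesEach, long departmentsEach) {
        return companies * Math.max(1, employeesEach) * Math.max(1, departmentsEach);
    }

    // Rows returned across all selects when each collection gets its own select.
    static long parallelRows(long companies, long employeesEach, long departmentsEach) {
        return companies + companies * employeesEach + companies * departmentsEach;
    }

    // Selects issued in parallel mode: the primary select plus one select
    // per eagerly fetched, unjoined collection field.
    static int parallelSelects(int collectionFields) {
        return 1 + collectionFields;
    }

    public static void main(String[] args) {
        // 100 companies, each with 10 employees and 5 departments.
        System.out.println(joinedRows(100, 10, 5));   // prints 5000
        System.out.println(parallelRows(100, 10, 5)); // prints 1600
        System.out.println(parallelSelects(2));       // prints 3
    }
}
```

With the Company example above, two collection fields give 3 selects, and adding the Projects collection of Employee gives parallelSelects(3), or 4. The joined variant returns fewer statements but far more rows, and every one of those rows also carries the columns of all three tables.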

8.1. Configuring Eager Fetching

You can control OpenJPA's default eager fetch mode through the openjpa.jdbc.EagerFetchMode and openjpa.jdbc.SubclassFetchMode configuration properties. Set each of these properties to one of the mode names described in the previous section: none, join, or parallel. If left unset, the eager fetch mode defaults to parallel and the subclass fetch mode defaults to join. These are generally the most robust and performant strategies.

You can easily override the default fetch modes at runtime for any lookup or query through OpenJPA's fetch configuration APIs. See Chapter 9, Runtime Extensions for details.

Example 5.22. Setting the Default Eager Fetch Mode

<property name="openjpa.jdbc.EagerFetchMode" value="parallel"/>
<property name="openjpa.jdbc.SubclassFetchMode" value="join"/>

Example 5.23. Setting the Eager Fetch Mode at Runtime

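import java.util.List;
import javax.persistence.Query; // or jakarta.persistence.Query, depending on the OpenJPA version
import org.apache.openjpa.persistence.*;
import org.apache.openjpa.persistence.jdbc.*;

...

// em is an open EntityManager.
Query q = em.createQuery("select p from Person p where p.address.state = 'TX'");

// Unwrap the OpenJPA query to reach its JDBC-specific fetch plan.
OpenJPAQuery kq = OpenJPAPersistence.cast(q);
JDBCFetchPlan fetch = (JDBCFetchPlan) kq.getFetchPlan();

// Override the configured default fetch modes for this query.
fetch.setEagerFetchMode(FetchMode.PARALLEL);
fetch.setSubclassFetchMode(FetchMode.JOIN);
List results = q.getResultList();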

You can specify a default subclass fetch mode for an individual class with the metadata extension described in Section 13.1.1, “ Subclass Fetch Mode ”. Note, however, that you cannot "upgrade" the runtime fetch mode with your class setting. If the runtime fetch mode is none, no eager subclass data fetching will take place, regardless of your metadata setting.

This applies to the eager fetch mode metadata extension as well (see Section 13.2.1, “ Eager Fetch Mode ”). You can use this extension to disable eager fetching on a field, or to declare that a collection should use joins rather than parallel selects or vice versa. But an extension value of join won't cause any eager joining if the fetch configuration's setting is none.
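One way to picture the "no upgrade" rule is to order the modes from weakest to strongest and let the weaker of the runtime setting and the metadata setting win. The sketch below is a model under that assumption, not OpenJPA's implementation. The text above only states the outcome when the runtime mode is none. The ordering none, join, parallel and the result for a runtime mode of join combined with a metadata value of parallel are assumptions. The Mode enum is a hypothetical stand-in, not OpenJPA's FetchMode.

```java
// Illustrative model only: none of these names are OpenJPA APIs.
final class EffectiveModeSketch {

    // Hypothetical stand-in for the three modes, declared weakest first
    // so that enum ordering can be used for comparison (an assumption).
    enum Mode { NONE, JOIN, PARALLEL }

    // Returns the mode that would apply to a class or field.
    // metadataMode is null when no metadata extension is declared.
    static Mode effective(Mode runtimeMode, Mode metadataMode) {
        if (metadataMode == null) {
            return runtimeMode;
        }
        // The metadata can lower the mode but never raise it above the runtime setting.
        return metadataMode.compareTo(runtimeMode) < 0 ? metadataMode : runtimeMode;
    }

    public static void main(String[] args) {
        System.out.println(effective(Mode.NONE, Mode.JOIN));     // prints NONE
        System.out.println(effective(Mode.PARALLEL, Mode.JOIN)); // prints JOIN
        System.out.println(effective(Mode.PARALLEL, Mode.NONE)); // prints NONE
        System.out.println(effective(Mode.JOIN, null));          // prints JOIN
    }
}
```

The first and third calls correspond to cases the text describes directly: a runtime mode of none disables eager fetching regardless of metadata, and metadata can disable eager fetching for a single field.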

8.2. Eager Fetching Considerations and Limitations

There are several important points that you should consider when using eager fetching:

When you are using parallel eager fetch mode and you have large result sets enabled (see Section 10, “ Large Result Sets ”) or you place a range on a query, OpenJPA performs the needed parallel selects on one page of results at a time. For example, suppose your FetchBatchSize is set to 20, and you perform a large result set query on a class that has collection fields in the configured fetch groups. OpenJPA will immediately cache the first 20 results of the query using join mode eager fetching only. Then, it will issue the extra selects needed to eager fetch your collection fields according to parallel mode. Each select will use a SQL IN clause (or multiple OR clauses if your class has a compound primary key) to limit the selected collection elements to those owned by the 20 cached results.
Once you iterate past the first 20 results, OpenJPA will cache the next 20 and again issue any needed extra selects for collection fields, and so on. This pattern ensures that you get the benefits of eager fetching without bringing more data into memory than anticipated.
Once OpenJPA eager-joins into a class, it cannot issue any further eager to-many joins or parallel selects from that class in the same query. To-one joins, however, can recurse to any level.
Using a to-many join makes it impossible to determine the number of instances the result set contains without traversing the entire set. This is because each result object might be represented by multiple rows. Thus, queries with a range specification or queries configured for lazy result set traversal automatically turn off eager to-many joining.
OpenJPA cannot eagerly join to polymorphic relations to non-leaf classes in a table-per-class inheritance hierarchy. You can work around this restriction using the mapping extensions described in Section 13.2.2, “ Nonpolymorphic ”.
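The per-page filtering described in the first point above can be sketched as plain string building. This is only an illustration of the shape of such a filter, assuming parameterized SQL: the class, the method names, and the column names are hypothetical, and the SQL that OpenJPA actually generates may differ.

```java
import java.util.Collections;
import java.util.List;
import java.util.StringJoiner;

// Illustrative model only: none of these names are OpenJPA APIs.
final class PageFilterSketch {

    // Single-column key: limits a collection select to the owners on one page,
    // for example "COMPANY_ID IN (?, ?, ?)".
    static String inClause(String keyColumn, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        return keyColumn + " IN (" + String.join(", ", Collections.nCopies(pageSize, "?")) + ")";
    }

    // Compound key: one parenthesized AND group per owner, combined with OR.
    static String orClauses(List<String> keyColumns, int pageSize) {
        if (pageSize < 1 || keyColumns.isEmpty()) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        StringJoiner perOwner = new StringJoiner(" AND ", "(", ")");
        for (String column : keyColumns) {
            perOwner.add(column + " = ?");
        }
        return String.join(" OR ", Collections.nCopies(pageSize, perOwner.toString()));
    }

    // Number of pages, and therefore rounds of extra selects, needed to
    // iterate a full result list at the given batch size.
    static int pages(int totalResults, int fetchBatchSize) {
        return (totalResults + fetchBatchSize - 1) / fetchBatchSize;
    }

    public static void main(String[] args) {
        System.out.println(inClause("COMPANY_ID", 3));
        System.out.println(orClauses(java.util.Arrays.asList("REGION", "CODE"), 2));
        System.out.println(pages(45, 20)); // prints 3
    }
}
```

With a FetchBatchSize of 20, each round of extra selects would bind at most 20 owner keys, which keeps the filter size bounded no matter how large the overall result is.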
