A developer's guide to EJB transaction management
By Kevin Boone, Web-Tomorrow
Overview
This article describes the principles of transaction management in an EJB from the perspective of a Java developer; that is, it explains what the developer has to do to take advantage of the transaction management facilities offered by the EJB1.1 specification.
Basic principles
It is important to understand that the EJB transaction system is designed to support the EJB paradigm. What I mean by this is that the designers had very clear ideas about how EJBs should be used. The specification may allow EJBs to be used in other ways, but the developer is out on a limb in these cases. In particular, the following basic assumptions are made:
- An entity EJB is a representation of business data. Its state is represented completely by the values of its persistent instance variables, that is, those that are synchronized with the persistent store (database). If a transactional operation in such an EJB fails, then the system considers it to have been successfully rolled back if its underlying business data is put back to its pre-transactional state. Instance variables that do not represent data in the persistent store cannot be restored. If the EJB has container-managed persistence, then all the synchronization of the variables with the persistent store will be automatic, even after a rollback. This takes some work away from the developer, but it does imply that the EJB must be no more than an object representing a database row.
- Because of the assumption above, transaction management in entity EJBs should never be under programmer control. The EJB1.1 specification prohibits an entity EJB from managing its own transactions; the container must do everything. Therefore, if your transaction management requirements are beyond those offered by the container, then you can't use entity EJBs at all. Of course, the EJB paradigm says that an entity EJB is simply a model of business data and, as such, will never need a more complicated transaction mechanism than that offered by the container.
- Where entity EJBs are involved in a transaction, the entity EJBs' methods will normally be called by a method in a session EJB.
Transaction management in EJBs is complicated by the fact that different principles apply to different (session and entity) EJBs, and the system works differently depending on which combinations of container-managed persistence and container-managed transactions the EJB is designed for. Transactions can be managed by the container; in this case the developer's role is to specify the correct `declarative' transaction attributes in the deployment descriptor. Only very limited programmatic control of transactions is allowed. Entity EJBs must use this mechanism. Alternatively, a session EJB can opt to manage its own transactions programmatically. Because a session EJB is not just a model of business data, the container can only offer limited support for transaction management, as will be discussed. This means that if the session EJB has a state that will change during a transaction (e.g., instance variables will change value), then the developer has to provide support for this, even in container-managed transactions. The EJB API provides the SessionSynchronization interface for this, as will be described.
What is a transaction?
A transaction is a sequence of operations that must all complete successfully, or leave the system in the state it had before the transaction started. The textbook example of a transaction, and the one I will use throughout this article, is the case of transferring money between two bank accounts.
Suppose we have an entity EJB that models a bank account (let's call it Account). It has a method updateBalance() to adjust the account balance, and a getBalance() to retrieve the current balance. Suppose we also have a session EJB (let's call it AccountManager), whose job is to handle the transfer of money between accounts. When a transfer occurs, the session EJB gets references to the two affected EJBs (let's call them account1 and account2). It calls account1.updateBalance(-XXX) to remove XXX units of currency from one account, and account2.updateBalance(XXX) to add XXX to the other. If both operations complete, the process is a success. If the first completes, but the second fails, then we have a situation where a sum of currency has disappeared from the system. This is not good. There are many reasons why the operation may fail. For example, if the account details are stored in a relational database -- and they probably will be -- then any loss of availability of the database server (crash, power failure, network failure...) could cause the failure of the update operation.
The removal of money from account1 and its addition to account2 constitute a transaction. If the first step fails, we need to reverse whatever effect it had, and not continue with the second step. If step two fails, we want to reverse all effects. In all cases, we want the system to end up with either a completed transaction, or a return to the pre-transaction state. This reversal of the effects of a partially-completed transaction is called a rollback.
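The all-or-nothing behaviour can be sketched in plain Java, without a container or a database. The classes below are illustrative stand-ins, not EJBs: the names InMemoryAccount and TransferSketch, and the setAvailable() switch that simulates the loss of the database server, are assumptions made only for this sketch.

```java
// Plain-Java stand-in for the Account entity EJB (hypothetical, not a real EJB).
class InMemoryAccount {
    private long balance;
    private boolean available = true;

    InMemoryAccount(long openingBalance) {
        this.balance = openingBalance;
    }

    long getBalance() {
        return balance;
    }

    // Simulates the database server becoming reachable or unreachable.
    void setAvailable(boolean available) {
        this.available = available;
    }

    // Adjusts the balance, or fails without changing anything.
    void updateBalance(long delta) {
        if (!available) {
            throw new IllegalStateException("account store unavailable");
        }
        balance += delta;
    }
}

// Plain-Java stand-in for the AccountManager session EJB (hypothetical).
final class TransferSketch {
    // Returns true if both steps completed, false if the transfer was rolled back.
    static boolean transfer(InMemoryAccount account1, InMemoryAccount account2, long amount) {
        boolean debited = false;
        try {
            account1.updateBalance(-amount); // step one
            debited = true;
            account2.updateBalance(amount);  // step two
            return true;
        } catch (IllegalStateException failure) {
            if (debited) {
                // Step two failed after step one succeeded: reverse step one.
                account1.updateBalance(amount);
            }
            return false;
        }
    }
}
```

In a real EJB the container and the database do this reversal; the sketch only shows what "return to the pre-transaction state" means for the two balances.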
Transactions are not specifically associated with database management; any operation can be designated as transactional. However, the built-in support for transaction management in the EJB specification is mostly relevant to database access. It provides very little support for other kinds of transaction.
The technicalities of database transaction management, especially for global transactions (see below), are very complicated. Happily for the EJB developer, the EJB container, the JDBC drivers, and the database engine between them do most of the work. The EJB developer mostly needs to know how to hook into this process. The developer can, if desired, leave all aspects of transaction management to the EJB container. Provided the transactions comprise only database updates, this will work most of the time. Alternatively, the developer can exercise some finer control over the transaction process. The amount of control allowed is very limited for entity EJBs, but much greater for session EJBs.
Local and global transactions
All modern database engines can handle transactional access to their own data stores. A local transaction is exactly that: it is confined to a single process executing against a single database server. However, in more sophisticated systems a transaction may encompass multiple database servers, perhaps from different vendors and with different protocols. A transaction that can handle multiple databases is called a global (or `distributed') transaction; if the servers are different (in vendor, or protocol) this is called a heterogeneous global transaction. (There is a subtle technical difference between a distributed transaction and a global transaction, but that won't concern us here.)
A single database server cannot handle global transactions; neither is there direct support in the JDBC specification, because it operates at the level of an individual database connection. Generally we need a transaction manager to control global transactions. The Java Transaction API (JTA) provides access to the services offered by a transaction manager. If EJBs require control of global transactions, they can get access to JTA via the container, as will be explained.
Isolation levels
Transactions are supposed to be `atomic', that is, everything succeeds or everything fails together. However, database engines generally allow many connections to read and update data simultaneously. Thus it is very possible for one connection to read or update data that another connection is in the process of reading or updating. Where this simultaneous access is permitted, the following types of error can occur.
Dirty reads A dirty read occurs when one connection reads data that has been changed, but not yet committed, by a transaction on a different connection, and that transaction is later rolled back. For example, suppose the account1.updateBalance() method has been called to subtract a sum from account1, and then -- as part of a different process -- account1.getBalance() is called. This process will read the depleted balance in account1. Then suppose the operation account2.updateBalance() was called, but failed, and the whole transaction rolled back. The value obtained by the call to account1.getBalance() is now incorrect, because it read a value that was changed by a later rollback.
Non-repeatable reads A non-repeatable read occurs when a single transaction reads the same row of data twice, but another connection changes it between the two reads. This will not occur in our simple example, as the transaction consists only of a single read and a single write. It may occur, however, in the following example. We wish periodically to find all the bank accounts that are overdrawn, and replenish them with funds taken from in-credit accounts held by the same customer. The first operation will be to search all accounts for overdrawn ones. Then for each account in this set we search all accounts for those in credit and held by the same customer. If another connection updates an account balance between these two searches, we could possibly end up with the same account in both sets; that is, in credit and overdrawn.
This type of error can only occur in a transaction that comprises multiple reads on the same object, which is uncommon.
Phantom reads A phantom read occurs when a single transaction performs a search that can retrieve a variable number of objects, then another connection inserts an object that would have matched the search criteria, then the first transaction performs the same search again. On the second search an extra object will be retrieved. This situation is even less common than the previous one (non-repeatable read), because this one requires the transaction to carry out a search operation, not merely a read.
A database engine will provide isolation of transactions, that is, transactions are prevented from interacting with one another and giving rise to the errors described above. However, complete isolation of transactions, although offering the greatest assurance of data integrity, is potentially a performance bottleneck and may not be required. An EJB that is starting a transaction may therefore, in some circumstances, request a particular level of isolation, as will be discussed below.
Bear in mind that isolation is only an issue where simultaneous access may occur to the same database table. Therefore it does not affect what happens within a transaction, but only simultaneous transactions on the same database. Therefore, when using global transactions, isolation level has to be set individually for all connections that are part of the global transaction.
Declarative transactions
Declarative transactions are specified in the XML deployment descriptor; no specific coding is required. According to the EJB1.1 specification, entity EJBs can only use declarative transaction management. Declarative transactions are managed exclusively by the container in entity EJBs, and by the container with support from the developer in session EJBs. The EJB specification defines a number of transaction attributes, which control the way transaction management is done. These can be applied to the whole EJB, or to individual methods.
When the container detects that a particular EJB method has an attribute that specifies that a transaction is required, then it will create one (or use an existing one, where this is allowed). With a new transaction, the container will prepare the transaction just before the EJB's method is entered. It will commit the transaction just after the EJB method exits. Transaction attributes specify when to create a new transaction and when to use an existing one, as described below.
You will notice that none of the transaction attributes allow a method to be associated with more than one transaction; at best an existing transaction is suspended while a new one is in force. In other words, nested transactions are specifically disallowed. For session beans, the use of bean-managed, programmatic transactions provides greater flexibility but at the expense, of course, of more developer effort.
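As an illustration of what such a declaration may look like, here is a fragment of an ejb-jar.xml deployment descriptor. It is a sketch only: the method name transfer is a hypothetical name for the balance-transfer method of AccountManager, and the rest of the descriptor (the bean declarations themselves) is omitted.

```xml
<assembly-descriptor>
  <!-- Apply one attribute to a single method of one bean -->
  <container-transaction>
    <method>
      <ejb-name>AccountManager</ejb-name>
      <method-name>transfer</method-name> <!-- hypothetical method name -->
    </method>
    <trans-attribute>Required</trans-attribute>
  </container-transaction>
  <!-- Apply one attribute to every method of a bean -->
  <container-transaction>
    <method>
      <ejb-name>Account</ejb-name>
      <method-name>*</method-name>
    </method>
    <trans-attribute>Mandatory</trans-attribute>
  </container-transaction>
</assembly-descriptor>
```

The choice of Required and Mandatory here is only an example; the attributes themselves are described in the sections that follow.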
Transaction attributes
Required
`Required' is probably the best choice (at least initially) for an EJB method that will need to be transactional. In this case, if the method's caller is already part of a transaction, then the EJB method does not create a new transaction, but continues in the same transaction as its caller. If the caller is not in a transaction, then a new transaction is created for the EJB method. If something happens in the EJB that means that a rollback is required, then the extent of the rollback will include everything done in the EJB method, whatever the condition of the caller. If the caller was in a transaction, then everything done by the caller will be rolled back as well. Thus the `required' attribute ensures that any work done by the EJB will be rolled back if necessary, and if the caller requires a rollback that too will be rolled back.
RequiresNew
`RequiresNew' will be appropriate if you want to ensure that the EJB method is rolled back if necessary, but you don't want the rollback to propagate back to the caller. This attribute results in the creation of a new transaction for the method, regardless of the transactional state of the caller. If the caller was operating in a transaction, then its transaction is suspended until the EJB method completes. Because a new transaction is always created, there may be a slight performance penalty if this attribute is over-used.
Mandatory
With the `mandatory' attribute, the EJB method will not even start unless its caller is in a transaction. It will throw a TransactionRequiredException instead. If the method does start, then it will become part of the transaction of the caller. So if the EJB method signals a failure, the caller will be rolled back as well as the EJB.
Supports
With this attribute, the EJB method does not care about the transactional context of its caller. If the caller is part of a transaction, then the EJB method will be part of the same transaction. If the EJB method fails, the transaction will roll back. If the caller is not part of a transaction, then the EJB method will still operate, but a failure will not cause anything to roll back. `Supports' is probably the attribute that leads to the fastest method call (as there is no transactional overhead), but it can lead to unpredictable results. If you want a method to be isolated from transactions, that is, to have no effect on the transaction of its caller, then use `NotSupported' instead.
NotSupported
With the `NotSupported' attribute, the EJB method will never take part in a transaction. If the caller is part of a transaction, then the caller's transaction is suspended. If the EJB method fails, there will be no effect on the caller's transaction, and no rollback will occur. Use this attribute if you want to ensure that the EJB method will not cause a rollback in its caller. This is appropriate if, for example, the method does something non-essential, such as logging a message. It would not be helpful if the failure of this operation caused a transaction rollback.
Never
The `Never' attribute will ensure that the EJB method is never called by a transactional caller. Any attempt to do so will result in a RemoteException being thrown. This attribute is probably less useful than `NotSupported', in that `NotSupported' will ensure that the caller's transaction is never affected by the EJB method (just as `Never' does), but will allow a call from a transactional caller if necessary.
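The behaviour of the six attributes can be summarized as a small decision function. This is only a model of the rules described above, written in plain Java; the type names TxAttribute, TxOutcome and TxAttributeRules are invented for this sketch and are not part of the EJB API.

```java
// The six transaction attributes described in this section.
enum TxAttribute { REQUIRED, REQUIRES_NEW, MANDATORY, SUPPORTS, NOT_SUPPORTED, NEVER }

// What happens to the call, expressed in terms of the caller's transaction.
enum TxOutcome {
    JOIN_CALLER,             // run inside the caller's transaction
    START_NEW,               // caller has none: create one for this method
    SUSPEND_AND_START_NEW,   // suspend the caller's, create a new one
    SUSPEND_AND_RUN_WITHOUT, // suspend the caller's, run with no transaction
    RUN_WITHOUT,             // run with no transaction
    REJECT                   // the call fails with an exception
}

final class TxAttributeRules {
    static TxOutcome decide(TxAttribute attribute, boolean callerInTransaction) {
        switch (attribute) {
            case REQUIRED:
                return callerInTransaction ? TxOutcome.JOIN_CALLER : TxOutcome.START_NEW;
            case REQUIRES_NEW:
                return callerInTransaction ? TxOutcome.SUSPEND_AND_START_NEW : TxOutcome.START_NEW;
            case MANDATORY:
                return callerInTransaction ? TxOutcome.JOIN_CALLER : TxOutcome.REJECT;
            case SUPPORTS:
                return callerInTransaction ? TxOutcome.JOIN_CALLER : TxOutcome.RUN_WITHOUT;
            case NOT_SUPPORTED:
                return callerInTransaction ? TxOutcome.SUSPEND_AND_RUN_WITHOUT : TxOutcome.RUN_WITHOUT;
            case NEVER:
                return callerInTransaction ? TxOutcome.REJECT : TxOutcome.RUN_WITHOUT;
            default:
                throw new IllegalArgumentException("unknown attribute: " + attribute);
        }
    }
}
```

For example, TxAttributeRules.decide(TxAttribute.MANDATORY, false) yields REJECT, which corresponds to the TransactionRequiredException case.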
Choice of transaction attributes
The J2EE developer's guide recommends that the `Required' attribute be used on any method that will take part in a transaction, unless you have a good reason to do something else. In the `bank account' example, the method in AccountManager that controls the transfer of funds between two accounts should have this attribute. If this method is called as part of a broader transaction, then the balance transfer will be part of that broader transaction. If not, it will get its own transaction. In either case, we will get a transaction to handle the balance transfer. The method Account.updateBalance would also work with `Required', but may require `Mandatory'. In the latter case we are stating that a balance update must always be part of a wider transaction (because the money going in to this account must be coming from somewhere else). A more detailed analysis of the application would be required to determine whether `Required' or `Mandatory' is needed. It would certainly be inappropriate to use `RequiresNew', because we don't want this balance update to be its own transaction; we want it to be part of a wider transaction.
Rollback
When to roll back
In a transactional method, either the entire method completes, or all the work done in the method has to be undone. Specifically, all relevant database updates have to be rolled back, that is, returned to the pre-transaction state. With declarative transactions, all rollbacks must be managed by the container. How does the EJB container know when to initiate a rollback? There are two main determinants.
- A method throws a `system' exception, such as EJBException. If this exception is not caught within the method, then the method will terminate and the container will begin a rollback. However, an application-defined exception will not bring about the same effect. In this case you will need to call the setRollbackOnly() method in the EJB's context before throwing the exception, as described below.
- At any point the method can call setRollbackOnly(), which will indicate that a rollback is to be performed when the method exits.
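The second case, combined with an application-defined exception, might look like the sketch below. To keep it compilable without the EJB libraries, RollbackContext is a hypothetical stand-in for the two rollback-related methods of javax.ejb.EJBContext; InsufficientFundsException and WithdrawalSketch are likewise invented for the example.

```java
// Hypothetical stand-in for the rollback methods of javax.ejb.EJBContext.
interface RollbackContext {
    void setRollbackOnly();
    boolean getRollbackOnly();
}

// Application-defined exception: throwing it does not, by itself, cause a rollback.
class InsufficientFundsException extends Exception {
    InsufficientFundsException(String message) {
        super(message);
    }
}

// Stand-in for a bean method running in a container-managed transaction.
class WithdrawalSketch {
    private final RollbackContext context;
    private long balance;

    WithdrawalSketch(RollbackContext context, long openingBalance) {
        this.context = context;
        this.balance = openingBalance;
    }

    long getBalance() {
        return balance;
    }

    void withdraw(long amount) throws InsufficientFundsException {
        if (amount > balance) {
            // Mark the transaction for rollback first, then report the problem.
            context.setRollbackOnly();
            throw new InsufficientFundsException("cannot withdraw " + amount);
        }
        balance -= amount;
    }
}
```

In a real bean the context would be the SessionContext or EntityContext supplied by the container, and the rollback itself would happen when the method exits.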
What gets rolled back
If a transactional method fails, in an entity bean or a container-managed session bean, the container will automatically roll back all database updates made via JDBC. In an entity bean, it will also call ejbLoad() to allow the bean to reload its persistent instance variables from the database. With an entity bean that has container-managed persistence as well as container-managed transactions (and it is a specification issue that all entity beans must use container-managed transactions) this implies that the developer of an entity bean need not do anything in particular to handle transaction rollback.
How to roll back things that don't get rolled back automatically
With session beans, the container cannot restore values of the EJB's instance variables, as it has no way of knowing which of these variables are associated with the transaction. Therefore the bean developer must provide code for this. Note, however, that container-managed transactions, if specified, will still roll back database access. The container can provide an indication to the session EJB of the progress of a transaction, so that it can take steps internally to maintain its instance variables. If a session EJB implements javax.ejb.SessionSynchronization, then the methods it specifies will be called by the container whenever the session EJB is involved in a transaction. These methods are described below.
Method | Purpose |
afterBegin() | This indicates that a transaction involving one of the EJB's methods has just started. The method should put the EJB's data into such a state that it can be restored if a rollback is signalled later. For example, it could store `backup' versions of its instance variables for later retrieval. |
beforeCompletion() | This signals that the container is about to commit a transaction. It would, of course, have called afterBegin() previously, to signal that a transaction had started. In this method, the EJB can, if it wishes, call setRollbackOnly() to prevent the transaction being committed. |
afterCompletion(boolean committed) | This signals that a transaction has been completed. It will either have been committed, or rolled back. Typically, if the committed flag is set, then the method does not do anything. If it is unset, then the method must restore the EJB to its pre-transactional state; it could do this by restoring the values of properties that were saved in the afterBegin() method. |
Note that container-managed EJBs must not attempt to manage transactions beyond the use of the facilities described above.
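The save-and-restore pattern for the three callbacks can be sketched as follows. So that the example compiles without the EJB libraries, SessionSynchronizationLike is a local stand-in that mirrors the three methods of javax.ejb.SessionSynchronization (the real methods also declare EJBException and RemoteException); the bean and its transfersMade counter are hypothetical.

```java
// Local stand-in mirroring the callbacks of javax.ejb.SessionSynchronization.
interface SessionSynchronizationLike {
    void afterBegin();
    void beforeCompletion();
    void afterCompletion(boolean committed);
}

// Hypothetical stateful session bean whose instance state changes inside transactions.
class TransferCounterBean implements SessionSynchronizationLike {
    private int transfersMade;      // state the container cannot restore for us
    private int savedTransfersMade; // backup copy taken when the transaction starts

    // Business method: changes instance state as part of a transaction.
    void recordTransfer() {
        transfersMade++;
    }

    int getTransfersMade() {
        return transfersMade;
    }

    public void afterBegin() {
        // A transaction has started: keep a copy of the current state.
        savedTransfersMade = transfersMade;
    }

    public void beforeCompletion() {
        // Last chance to veto the commit; a real bean could call setRollbackOnly() here.
    }

    public void afterCompletion(boolean committed) {
        if (!committed) {
            // Rolled back: return to the pre-transactional state.
            transfersMade = savedTransfersMade;
        }
    }
}
```

A container would invoke the callbacks itself; calling them by hand, as a test would, is only a way of exercising the restore logic.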
Programmatic transactions
If bean-managed, programmatic transactions are specified for a session bean, then the bean developer controls all aspects of transaction management. However, the situation is not as complicated as it may appear, as the developer can take advantage of the transaction management systems of JDBC, and of the Java Transaction API (JTA).
Using JDBC transactions
Support for transactions is built into JDBC, so coding bean-managed transactions for a single database is straightforward. Having obtained a JDBC java.sql.Connection object, the method can carry out a database transaction like this:
connection.setAutoCommit(false); // disable query-by-query commit
try
{
    // subtract currency from account1
    // add to account2
    connection.commit(); // success: commit all
}
catch (Exception e)
{
    connection.rollback(); // failure: roll back
}
(Note that the rollback() method can itself fail, and throw an exception, which ought to be handled.)
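One possible way to handle a failing rollback is sketched below. The helper JdbcTransactionSketch and the callback type SqlWork are hypothetical names introduced for the example; only the java.sql.Connection calls are part of JDBC. If rollback() fails, the sketch attaches that second failure to the original exception rather than losing either of them.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical callback type: the database work to run inside the transaction.
interface SqlWork {
    void execute(Connection connection) throws SQLException;
}

final class JdbcTransactionSketch {
    // Runs the work in one JDBC transaction: commit on success, roll back on failure.
    static void run(Connection connection, SqlWork work) throws SQLException {
        connection.setAutoCommit(false); // disable query-by-query commit
        try {
            work.execute(connection);
            connection.commit();         // success: commit all
        } catch (SQLException | RuntimeException failure) {
            try {
                connection.rollback();   // failure: roll back
            } catch (SQLException rollbackFailure) {
                // Keep the original failure and record that the rollback failed too.
                failure.addSuppressed(rollbackFailure);
            }
            throw failure;
        }
    }
}
```

A caller would pass the two balance updates as the work, for example as a lambda that executes the two UPDATE statements on the supplied connection.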
If the transaction involves more than just database operations, then the catch block has to reverse everything done in the preceding try. This may include restoring the values of instance variables modified during the transaction. Therefore before the transaction starts, the method should prepare itself to be able to restore after a rollback. In addition, the transaction may consist of calls to methods in other EJBs; the developer of this EJB must decide whether a failure in such a call should result in a rollback in this method, or not.
Remember that stateless session EJBs do not have instance variables that are associated with a particular client. This means that any method that begins a transaction must complete that transaction (either by a commit or a rollback). Transactions cannot span methods. This restriction does not apply to stateful session EJBs; in that case a transaction may encompass a number of different methods. However, a JDBC transaction is automatically committed when its connection is closed; this means that when a transaction spans multiple methods the connection must remain open between method calls. JTA transactions don't have this limitation.
JDBC transactions are simple to use, but are confined to a single database (technically, to a datasource). JTA transactions, on the other hand, can be global.
Using JTA transactions
The EJB container provides each session EJB instance with an instance of javax.transaction.UserTransaction, which provides access to the Java Transaction Service (JTS). JTA transactions can be global, that is, span multiple databases from multiple vendors, and are associated with a thread, not with a connection to a datasource. The method in AccountManager that does a balance transfer using a JTA transaction may look something like this:
UserTransaction userTransaction = sessionContext.getUserTransaction();
try
{
    userTransaction.begin();
    Connection c1 = ...; // get connection to first datasource
    // Subtract currency from account1
    c1.close();
    Connection c2 = ...; // get connection to second datasource
    // Add currency to account2
    c2.close();
    userTransaction.commit();
}
catch (Exception e)
{
    userTransaction.rollback();
}
Of course we would only need to do this in the `bank account' example if the accounts were held on different databases.
Note that, in a JTA transaction, closing a database connection does not commit the work done on it, and that the transaction may span multiple methods.
Isolation levels in EJB transactions
As discussed earlier, a database engine will provide isolation of transactions, that is, transactions are prevented from interacting with one another and giving rise to the errors described earlier. However, complete isolation of transactions, although offering the greatest assurance of data integrity, is potentially a performance bottleneck and may not be required. An EJB that is starting a transaction may therefore, in some circumstances, request a particular level of isolation. The levels that are available are shown below.
Isolation level | Dirty reads may occur | Non-repeatable reads may occur | Phantom reads may occur |
TRANSACTION_READ_UNCOMMITTED (no isolation) | Yes | Yes | Yes |
TRANSACTION_READ_COMMITTED (partial isolation) | No | Yes | Yes |
TRANSACTION_REPEATABLE_READ (partial isolation) | No | No | Yes |
TRANSACTION_SERIALIZABLE (full isolation) | No | No | No |
Note that not all database engines will allow all these isolation levels. Oracle, for example, does not support TRANSACTION_READ_UNCOMMITTED. In practice, TRANSACTION_READ_COMMITTED is suitable for most applications, and is the default for most databases.
With entity EJBs that use container-managed persistence, the developer has no control over the isolation level provided for transactions. This is because the synchronization of the data with the persistent store is handled entirely by the container. The container will generally use the database engine's default isolation level -- usually TRANSACTION_READ_COMMITTED. Thus the most common cause of error -- dirty read -- is prevented, but other errors may still arise. In all other cases, the EJB may set the isolation level.
Isolation level cannot be set declaratively; it operates at the connection level. Having obtained a java.sql.Connection object, we can call its setTransactionIsolation() method to set the isolation level (provided that the database engine supports it).
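A cautious way to request a level is to ask the driver first whether it is supported. The helper below is a sketch; the class name IsolationSketch and the method requestIsolation() are invented for the example, while getMetaData(), supportsTransactionIsolationLevel() and setTransactionIsolation() are standard JDBC calls.

```java
import java.sql.Connection;
import java.sql.SQLException;

final class IsolationSketch {
    // Applies the requested isolation level if the driver reports support for it.
    // Returns true if the level was applied, false if the connection was left as it was.
    static boolean requestIsolation(Connection connection, int level) throws SQLException {
        if (!connection.getMetaData().supportsTransactionIsolationLevel(level)) {
            return false; // e.g. an engine without TRANSACTION_READ_UNCOMMITTED
        }
        connection.setTransactionIsolation(level);
        return true;
    }
}
```

It would typically be called before the transaction's first statement, for example requestIsolation(connection, Connection.TRANSACTION_READ_COMMITTED).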
Summary of transaction management choices
Type of bean | Transaction management choices |
Entity EJBs with CMP | Must use container-managed transactions; development effort is limited to specifying the correct transaction attributes in the deployment descriptor. In a rollback, the bean restores its pre-transactional state in the ejbLoad() method, which the container calls after a rollback. Since the developer will already have provided an ejbLoad() method, no extra programming is required. The state may involve instance variables other than those synchronized with the persistent store if they can be recovered in ejbLoad(); but the container does not tell the EJB when a transaction is about to begin, so it is difficult to prepare the EJB for recovery. In practice, therefore, only database-synced variables are rolled back. |
Entity EJBs with BMP | Must use container-managed transactions; the persistent instance variables are reloaded by the container after a rollback, and the developer does not (and cannot) programmatically restore any instance variables. The developer need only specify the appropriate transaction attributes in the deployment descriptor. |
Stateful or stateless session EJBs | |