Merlin writes “The ability to understand and transactions and use them from within your application is what makes you a truly great database developer. This article explains how to incorporate transactions into your database development strategies. A follow up artical will give specific example and suggestions about how to deal with transatcions while working with ZeosDBO.
Locking PostgreSQL/Transactions & Multi-Version Concurrency Control (MVCC) (Part 1)
This article is primarily geared towards PostgreSQL developers but the concepts can be applied to any database that supports transactions.
Part 1 will describe the overall concepts involved and lay the foundation for transaction based development. Part 2 give concrete real world examples on how to apply these concepts in ZeosDBO.
What are the reasons for moving to a transactional model?
The non transactional model is commonly known as pessimistic locking. Sometimes it is referred to the ‘library book model’. Records inside tables are like library books: you borrow records from the server, and until you return it, nobody else can use it.
This model guarantees safety but is problematic for users: sometimes the book is taken out but never returned for extended periods of time. This has a direct cost of employee downtime and an indirect cost of administrator overhead who must deal with the locks.
There is additional overhead in development because all application code must be aware of the lock status of the record. Inconsistent handling of locked records is a major issue in application logic which is not easily resolved, and in-process locking errors are very difficult to develop and test for.
Sometimes, complex data interactions can’t be represented by a single lock. This requires ‘super locks’ which turn off entire processes or lock down records with application level code. Most of these conditions can be avoided while using a transactional data model with little or no extra coding while retaining the high level of data integrity that users expect.
What is MVCC?
MVCC is a system that facilities locking in a transaction system. MVCC systems offer very good performance in high concurrency situations. Oracle and PostgreSQL server are MVCC systems, SQL Server and Access are not. Non MVCC systems rely on physical locks on the record which have performance problems.
What is a transaction?
A transaction is a group of commands which are treated by the database server as a single logical unit. Even though the statements inside transaction are spread out over time, to other users of the database they happen in an instant. If there is any kind of error inside of a transaction, the entire unit of work is discarded by the server. Also, until the transaction is committed to the server, it can be ‘rolled back’ at any time. Virtually all modern database systems support transactions at some level. In fact, database performance is often measured in Transactions/Minute.
In the above diagram, the transaction begins with a begin statement. After that a number of statements execute which do some work on the database, such as posting or deleting. At some point, the transaction must be rolled back or committed to the server.
The simplest transaction is a single statement.
Do transactions lock records?
Yes; any record that is written to during a transaction is locked for the duration of the transaction. When the transaction is committed or rolled back, all the locks for the entire transaction are released.
If transaction A is trying to acquire a lock held by transaction B, transaction A will usually wait for a short period of time before aborting and returning an error message. Well designed database systems usually have small (in time duration) transactions, which reduces the likelihood of a transaction abort.
What about maintenance screens?
For developers used to the library book locking model, the first impulse will be to wrap an entire maintenance screen (a dialog that edits and posts a single record) operation inside a transaction. At the start of the operation (when entering add or change mode), the record is ‘dirtied’ by simulating a write operation on it. After that, until the record is aborted/written, a lock is held which effectively prevents other users from viewing the record inside the maintenance screen at one time:
Unfortunately, this is widely considered to be a problematic model even though this would provide a similar work flow to the library book model. In practice, ‘real’ concurrent updates to a single record are extremely rare in a production setting (as opposed to two users wanting to view the record). The fact that one user has to wait for a lock to resolve does not necessarily fix the root of the problem…that two users are potentially trying to make a conflicting change. Users often become annoyed by the locking system because they are forced to deal with it…the locking system cries wolf in non-error situations which in actuality increases the chance of a real error slipping through.
A more dangerous situation arises when one user is in a maintenance screen and another is in a data process, such as a posting report. In this scenario, the posting report can be canceled (or worse leave the data in an invalid state) because of the inability to acquire a lock on the record. These situations are hard to predict and deal with in practice. Even worse, errors from them may not become apparent until much later.
The transaction friendly alternative is to maintain an audit system. After every write (commit) from a maintenance program that involved a data changed, an audit log can be checked to see if the record was modified by another user during the editing process. If the log indicates that the record was changed, an error message to the user in the form of “user x modified the record at time y…please confirm changes”. This check could be provided in a general way without having to be re-implemented for every maintenance screen. In special cases a sophisticated check could be made to analyze the data change and suppress the message if it is allowed.
Maintenance screens are always available to users who simply want to view the record.
Maintenance screens will not block non-maintenance processes unless specially designed that way.
User collisions are much rarer, because they only occur when multiple uses make conflicting changes to the same record. (Instead of two users simply wanting to view the same record).
Posting processes and other non interactive processes do not have to deal with record locks
Both the software development process and the user process are generally simpler.
Lower administrative overhead
Lower chance of inconsistent data
A descriptive error will provide users with a more accurate sense of warning than a ‘locked’ message…due to the fact the implied error is much more likely a real error.
What are some of the other advantages of transactions?
Applications designed for transactions have a far less chance for data errors because interaction with the database is done in a standardized and logical way. The developer does not have to be concerned with ‘what if’ scenarios that arise as a result of pessimistic locking.
Transactions allow complex data interactions to be rolled back without complex programming to undo half written work. They also catch error conditions in a more normal way by blocking further statements until transaction completion. “