Friday, August 14, 2009

ACID Properties

Atomicity

First, a transaction needs to be atomic (or all-or-nothing), meaning that it executes
completely or not at all. There must not be any possibility that only part
of a transaction program is executed.
For example, suppose we have a transaction program that moves $100 from
account A to account B. It takes $100 out of account A and adds it to account
B. When this runs as a transaction, it has to be atomic—either both or neither
of the updates execute. It must not be possible for it to execute one of the
updates and not the other.
The TP system guarantees atomicity through database mechanisms that
track the execution of the transaction. If the transaction program should fail
for some reason before it completes its work, the TP system will undo the
effects of any updates that the transaction program has already done. Only if it
gets to the very end and performs all of its updates will the TP system allow
the updates to become a permanent part of the database.
By using the atomicity property, we can write a transaction program that
emulates an atomic business transaction, such as a bank account withdrawal,
a flight reservation, or a sale of stock shares. Each of these business actions
requires updating multiple data items. By implementing the business action
by a transaction, we ensure that either all of the updates are performed or
none are. Furthermore, atomicity ensures the database is returned to a known
state following a failure, reducing the requirement for manual intervention
during restart.
The successful completion of a transaction is called commit. The failure of
a transaction is called abort.
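
The $100 transfer described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `account` table, the `transfer` function, and the `fail_midway` switch are hypothetical names introduced here, and SQLite stands in for the TP system.

```python
import sqlite3

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from `src` to `dst` as one all-or-nothing transaction."""
    try:
        conn.execute("BEGIN")
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if fail_midway:
            # Hypothetical failure injected between the two updates.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute("COMMIT")    # both updates become permanent together
        return True
    except RuntimeError:
        conn.execute("ROLLBACK")  # abort: undo the update already performed
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN / COMMIT / ROLLBACK statements above delimit the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 200)])

aborted = transfer(conn, "A", "B", 100, fail_midway=True)
after_abort = balances(conn)      # unchanged: the debit of A was undone
committed = transfer(conn, "A", "B", 100)
after_commit = balances(conn)     # both updates applied
```

After the aborted attempt the balances are exactly what they were before it started. Only the second call, which reaches the end, changes both accounts.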

Consistency

A second property of transactions is consistency—a transaction program
should maintain the consistency of the database. That is, if you execute the
transaction all by itself on a database that's initially consistent, then when the
transaction finishes executing the database is again consistent.

By consistent, we mean "internally consistent." In database terms, this
means that the database satisfies all of its integrity constraints. There are
many possible kinds of integrity constraints, such as the following:
• All primary key values are unique (e.g., no two employee records have
the same employee number).
• The database has referential integrity, meaning that records only reference
objects that exist (e.g., the Part record and Customer record that are
referenced by an Order record really exist).
• Certain predicates hold (e.g., the sum of expenses in each department is
less than or equal to the department's budget).
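
The three kinds of constraint listed above can be declared in SQL and left to the database to check. The following is a minimal sketch using Python's sqlite3 module. The table and column names are assumptions made for illustration, and the budget predicate is simplified to a single `expenses` column per department rather than a sum over expense records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE department (
    id INTEGER PRIMARY KEY,
    budget INTEGER NOT NULL,
    expenses INTEGER NOT NULL DEFAULT 0,
    CHECK (expenses <= budget)                 -- a predicate that must hold
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,                    -- primary key values are unique
    customer_id INTEGER NOT NULL REFERENCES customer(id)  -- referential integrity
);
INSERT INTO customer VALUES (1, 'Ada');
INSERT INTO department VALUES (10, 1000, 400);
""")

def violates(sql):
    """Run one statement as a transaction; report whether a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

dup_key = violates("INSERT INTO customer VALUES (1, 'Bob')")      # duplicate key
dangling = violates("INSERT INTO orders VALUES (100, 99)")        # no customer 99
over_budget = violates("UPDATE department SET expenses = 1200 WHERE id = 10")
valid = violates("INSERT INTO orders VALUES (101, 1)")            # satisfies all constraints
```

Declared constraints cover only part of what "consistent" means for an application. Rules the database is not told about remain the job of the transaction program.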
Ensuring that transactions maintain the consistency of the database is good
programming practice. However, unlike atomicity, isolation, and durability,
consistency is a responsibility shared between transaction programs and the
TP system that executes those programs. That is, a TP system ensures that a
set of transactions is atomic, isolated, and durable, whether or not they are
programmed to preserve consistency. Thus, strictly speaking, the ACID test
for transaction systems is a bit too strong, because the TP system does its part
for C only by guaranteeing AID. It's the application programmer's responsibility
to ensure the transaction program preserves consistency.

Isolation

The third property of a transaction is isolation. We say that a set of transactions
is isolated if the effect of the system running them is the same as if it ran
them one at a time. The technical definition of isolation is serializability: an
execution is serializable (meaning isolated) if its effect is the same as running
the transactions serially, one after the next, with no overlap in
executing any two of them.
A classic example of a nonisolated execution is a banking system, where
two transactions each try to withdraw the last $100 in an account. If both
transactions read the account balance before either of them updates it, then
both transactions will determine there's enough money to satisfy their
requests, and both will withdraw the last $100. Clearly, this is the wrong
result. Moreover, it isn't a serializable result. In a serial execution, only the
first transaction to execute would be able to withdraw the last $100. The second
one would find an empty account.
Notice that isolation is different from atomicity. In the example, both
transactions executed completely, so they were atomic. However, they were
not isolated and therefore produced undesirable behavior.
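
The withdrawal example can be replayed deterministically. The sketch below is a toy model, not a real TP system: `run` is a hypothetical helper that executes a list of read and write steps for two transactions, T1 and T2, against a single account holding $100.

```python
def run(schedule, balance=100, amount=100):
    """Execute a schedule of (transaction, operation) steps on one account.

    A "read" stores the current balance in the transaction's private copy.
    A "write" withdraws `amount` if that private copy showed enough money.
    Returns (final_balance, total_cash_dispensed).
    """
    seen = {}        # each transaction's private copy of the balance
    dispensed = 0
    for txn, op in schedule:
        if op == "read":
            seen[txn] = balance
        elif op == "write" and seen[txn] >= amount:
            balance = seen[txn] - amount
            dispensed += amount
    return balance, dispensed

# Serial execution: T1 runs to completion before T2 starts.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]

# Nonisolated execution: both transactions read before either one writes.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

serial_result = run(serial)            # (0, 100): only T1 gets the money
interleaved_result = run(interleaved)  # (0, 200): $200 paid out of a $100 account
```

Both schedules run every step of both transactions, so each transaction is atomic. Only the interleaved schedule produces a result that no serial order could produce.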
If the execution is serializable, then from the point of view of an end user
who submits a request to run a transaction, the system looks like a standalone
system that's running that transaction all by itself. Between the time he
or she runs two transactions, other transactions from other users may run. But
during the period that the system is processing that one user's transaction, it
appears to the user that the system is doing no other work. But this is only an
illusion. Actually running transactions serially would be too inefficient,
because the system has a great deal of internal parallelism that can only be
exploited by running transactions concurrently.
If each transaction preserves consistency, then any serial execution (i.e.,
sequence) of such transactions preserves consistency. Since each serializable
execution is equivalent to a serial execution, a serializable execution of the
transactions will preserve database consistency, too. It is the combination of
transaction consistency and isolation that ensures that executions of sets of
transactions preserve database consistency.
The database typically places locks on data accessed by each transaction.
The effect of placing the locks is to make the execution appear to be serial. In
fact, internally, the system is running transactions in parallel, but through
this locking mechanism the system gives the illusion that the transactions are
running serially, one after the next. In Chapter 6 on locking, we will describe
those mechanisms in more detail and present the rather subtle argument why
locking actually produces serializable executions.
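
As a rough illustration of the idea, the sketch below guards an account with a single in-process lock from Python's threading module. This is a coarse stand-in and an assumption of this example: a database locks individual data items on behalf of each transaction, and the `Account` class here is hypothetical.

```python
import threading

class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        # Hold the lock across the read, the check, and the write, so no
        # other withdrawal can read the balance in between.
        with self._lock:
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False

account = Account(100)
results = []

def worker():
    results.append(account.withdraw(100))

# Two concurrent attempts to withdraw the last $100.
threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

outcomes = sorted(results)  # one withdrawal fails, one succeeds
```

Whichever thread acquires the lock first withdraws the money, and the other finds an empty account, which is the outcome of a serial execution.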

Durability

The fourth property of a transaction is durability. Durability means that when
a transaction completes executing, all of its updates are stored on a type of
storage, typically disk storage, that will survive the failure of the TP system.
So even if the transaction program or operating system fails, once the transaction
has committed, its results are durably stored on a disk and can be found
there after the system recovers from the failure.
Durability is important because each transaction is usually providing a service
that amounts to a contract between its users and the enterprise that is
providing the service. For example, if you're moving money from one account
to another, once you get a reply from the transaction saying that it executed,
you really expect that the result is permanent. It's a legal agreement between
the user and the system that the money has been moved between these two
accounts. So it's essential that the transaction actually makes sure that the
updates are stored on some nonvolatile device, typically disk storage in
today's technology, to ensure that the updates cannot possibly be lost after the
transaction finishes executing. Moreover, the durability of the result often
must be maintained for a long period. For example, some tax regulations allow
audits that can take place years after the transactions were completed.
The durability property is usually obtained via a mechanism that starts by
having the TP system write a copy of all the transaction's updates to a log file
while the transaction program is running. When the transaction program
issues the commit operation, the system first ensures that all the records written
to the log file are out on disk, and then returns to the transaction program,
indicating that the transaction has indeed committed and that the results are
durable. The updates may be written to the database right away, or they may
be written a little later. However, if the system fails after the transaction commits
and before the updates go to the database, then after the system recovers
from the failure it must repair the database. To do this, it rereads the log and
checks that each update by a committed transaction actually made it to the
database. If not, it reapplies the update to the database. When this recovery
activity is complete, the system resumes normal operation. Thus, any new
transaction will read a database state that includes all committed updates. We
describe log-based recovery algorithms in Chapter 8.
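
The logging and recovery steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not one of the recovery algorithms the text refers to: the log is a file of JSON lines, the database is a plain dictionary, and the function names are hypothetical.

```python
import json
import os
import tempfile

def append(log_path, record, force=False):
    """Append one record to the log; with force=True, push the log to disk."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        if force:
            log.flush()
            os.fsync(log.fileno())  # ask the OS to write the file to stable storage

def log_update(log_path, txn, key, value):
    # Written while the transaction program is running.
    append(log_path, {"type": "update", "txn": txn, "key": key, "value": value})

def commit(log_path, txn):
    # The transaction counts as committed only after this forced write returns.
    append(log_path, {"type": "commit", "txn": txn}, force=True)

def recover(log_path, database):
    """Reread the log and reapply every update made by a committed transaction."""
    with open(log_path, encoding="utf-8") as log:
        records = [json.loads(line) for line in log]
    committed = {r["txn"] for r in records if r["type"] == "commit"}
    for r in records:
        if r["type"] == "update" and r["txn"] in committed:
            database[r["key"]] = r["value"]  # harmless if already applied
    return database

log_path = os.path.join(tempfile.mkdtemp(), "tp.log")
database = {"A": 500, "B": 200}  # database contents at the moment of the crash

log_update(log_path, "T1", "A", 400)
log_update(log_path, "T1", "B", 300)
commit(log_path, "T1")               # T1 commits, but its updates never reach the database
log_update(log_path, "T2", "A", 0)   # T2 is still running when the system fails

recovered = recover(log_path, dict(database))
```

Recovery restores T1's updates because its commit record is in the log, and it ignores T2's update because T2 never committed. Running `recover` a second time leaves the result unchanged.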
