Managing transactions in distributed databases

Material Information

Managing transactions in distributed databases
Ahmad, Aimen
Place of Publication:
Denver, CO
University of Colorado Denver
Publication Date:
Physical Description:
vii, 51 leaves : ; 28 cm.


Subjects / Keywords:
Distributed databases ( lcsh )
Computer algorithms ( lcsh )
Computer algorithms ( fast )
Distributed databases ( fast )
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )


Thesis (M.S.)--University of Colorado Denver, 2011. Computer science
Includes bibliographical references (leaf 51).
General Note:
Department of Computer Science and Engineering
Statement of Responsibility:
by Aimen Ahmad.

Record Information

Source Institution:
University of Colorado Denver
Holding Location:
|Auraria Library
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
786171428 ( OCLC )


This item is only available as the following downloads:

Full Text


MANAGING TRANSACTIO N S IN DISTRIBUTED DATABASES b y Aim e n Ahm a d B Sc, Al J a b a l Al-G a rbi Univ e r s ity, 2000 A t hesi s submitte d to t h e Univ e r s it y of Color a do D e nv e r in p a r t ia l fulfillm e n t of the r e quir e m ents for the degree of Maste r of Scie nce Compu te r S c i e nce 2011


This thesis for the Master of Science degree by Aimen Ahmad has been approved by Chlebus Bogdan Tom Altman II I I 'Z ( czo \ \ Date


Ahmad, Aimen (Computer Science) Managing Transactions in Distributed Databases Thesis directed by Associate Professor Chlebus Bogdan ABSTRACT We consider efficient algorithms to manage transactions in distributed database systems. The emphasis is on commit protocols and their message com plexity. We propose means to improve the performance of Two Phase Commit Protocol. This abstract accurately represents the content of the candidate's thesis. I recommend its publication. Signed


Fi g ur es T a bl es 0 Ch a pter 1. Introduc tion 20 Distribute d D a t a b ases 201 Tra n s p a r e n cy 0 0 202 The Ar c hit ecture C ONTENTS 2 0 3 D es i g ning Di stributed D a t a b ase Sy s t e m s 204 Fr ag m e ntation 0 0 0 0 0 205 Allocation Alternatives 2 0 6 Data R e plication s 0 0 0 20 7 Homogen e ou s and Het e ro ge n e ou s D a t a b ase 30 Tr a n sac tion Conc epts 0 3 1 F a ilur e T y p es 0 0 0 0 3 2 R ec ov e r y Ope r a tion s 0 303 Tr a n sac tion Log 3 0 3 0 1 Chec kpoin t in g 3 0 3 0 2 Local r ec ov e r y m a n age r 3 0 3 0 3 Stor a g e Structure 3.4 Tr a n sac tion St a t es 0 IV Vl Vll 1 2 2 3 7 9 11 11 13 15 17 1 8 19 20 20 22 23


3.5 Transaction Isolation . . 3. 6 Conflicts between transactions 3. 7 Serializability . . . . 3 8 Transaction Isolation and Atomicity 3.9 The implementation of isolation levels 3 .10 Distributed Transactions 4. Commitment Protocols ... 4 1 Two-Phase Commit Protocol 4.2 Failure Handling in Two Phase Commit Protocol 5. The evaluation of Two-Phase Commit Protocol 5.1 The Motivation 5.2 The New Presumed Commit Protocol 5.2.1 The Protocol Design 5.2.2 The Correctnes 5.2.3 The Performance References . . . . v 25 26 27 27 28 30 32 33 36 42 42 43 43 46 49 51


FIGURES Figure 2.1 Distributed database schema architecture. 3 1 State transition diagram. . . . . 4 1 Coordinator Slave A l gorithms Diagrams in 2PC Protocol 4 2 Central ized 2PC Protocol Communication Structure 5.1 Two-Phase Commit Protocol ... 5.2 New Presumed Commit Protocol Vl 4 24 34 36 42 46


TABLES Table 3.1 Conflict Matrix. 26 vii


1. Introduction Distributed databases are critical for many modern information processing applications. They possess many properties in order to ensure reliability of transactions. For instance, when data is distributed its availability and reliabil ity are increased in addition to improvement in access time to the information. Distributed databases employ many types of protocols such as commitment pro tocols and replication protocols The function of these protocols is to manage data and transactions to ensure consistency of data across sites of the distributed database. Aside from consistency performance is the most important aspect of distributed database systems which is i mpacted by several components of the system one of which is the commitment protocol used to manage transactions amongst sites in the system The performance of these protocols can vary widely. The Two-Phase Commit protocol and its variants Presumed Commit and Pre sumed Abort are well-known and practical commit protocols for distributed databases. 1


2. Distributed Databases A distrubted database, in contrast to a centralized database receives its name based upon its having one or mor e tables divided into partitions or repli cated with the storage of these partitions and copies spread across different sites. In being distributed, it i s also possible for tables to be fragmented across differ ent sites. There exist many other alternatives for distributing data. 2.1 Transparency Transparency in a distributed database management system indicates that a user is never made aware of many of the design-details of the system. For instance the user does not need to know which tables will be replicated or how many fragments exist in regards to storage of the data. There are many aspects of distributed transparency, such as: Location Transparency The details of where tabl es and fragments are stored must be hidden from users. When a table or fragment is stored at a remote site, users should be able to utilize thi s remote table or fragment as if it were stored locally. Fragmentation Transparency The details regarding the fragmentation or number of fragments must be 2


hidden from users. Users shou ld be ab l e to utilize the data as though it were all stored l ocally. Replication Transparency Users should not be aware of how many cop ies exist or on wh i ch sites. Location and Replication Transparencies This is a combination of both l ocat i on and replication transparency that s h ou ld follow the requirements o f both. [1] 2.2 The Architecture Two important aspects of distributed database architecture are: Information Architecture In a centralized database system, the information in the system shou ld conform to the standard of the American National Standard Institute (ANS I ) known as the ANSI/SPARC standard. This architecture introduces three l ayers of abstraction: external scheme, conceptua l scheme, and internal scheme The ex ternal scheme i s for the system's users and is known as the user s view, which allows the u sers to access the informati on in the underlying database. The User's view is created from the concept u al scheme which comprises the who l e database. The conceptua l scheme in cludes the information about the database. The lowest level of abstraction, which represents the physical data, is the internal scheme. The sty l e of architecture of a centralized database system is not sufficient for a 3


distributed database system because the distributed database system must be able to int eract with a multiple sub-database systems and requires the use of more than one concept ual scheme. Each sub-database system has a physical scheme and a localized conceptual scheme which are not provided by the archi tecture of a centralized database system. [1] The user s view in a distributed database system architecture is comprised of the views from multiple localiz ed conceptua l schemes in order to generate a new level of abstraction known as a g lob a l conceptual scheme, as depicted in figure 2.1. General conceptua l scheme provides information about all of the tabl es External Schema External Schema Global ConceJJtual / Local Conceptual Schema Local Conceptual Schema I Physical Schema P}cysic al Schema Figure 2.1: Distributed database schema architecture. in the distributed database system environment, such as primary keys foreign keys and constraints. In a distributed database system architecture there also exists other important information such as the location of table storage, table 4


fragmentation, and the number of copies or fragments. The representation of this information is kept within a globa l data dictionary that contains additional information beyond that of the global conceptual scheme including information regarding data fragments data replication, and data location. The global data dictionary also includes sub-modu les: Globa l Conce p t u a l S c heme: Includes information about the table, columns, data types column constraints, primary keys foreign keys, etc. D a t a Directory: Containts information about the location of data fragments including site location site name, and IP address which facilities location trans parency in the distributed database environment. [1] Fragmentation Directory: Contains information about the data fragments in the system such as identifying a horizontal fragment's column and whether it is a part of a vertical fragment which facilitates fragmentation transparency R eplication Directory: Contains information about the number of table and fragments copies, which, when combined with the information provided by the replication and data directories, allows access to any copy of a table or frag ment in the system. This facilitates replication transparency in the distributed database system [1] Network Directory: Contains information about the topology and communi cation link speed used to provide network transparency. S oftwar e A rchitecture The software architecture is the second piece of the distributed database envi ronment architecture consisting of two modules : the application processor and the data processor. The Application processor is responsib l e for controlling the 5


user interface and distributed transaction management. The data processor handles local query optimization local transaction execution support and local transaction recovery The data processor also provides services that are neces sary to contact a local database environment in a distributed situation, such as data read, data write query request, etc. The application processor and data processor can cooperate with each other using the communication network. [1] There are two approaches to software implementation for distributed database environments: top-down and bottom-up. Components of the Application Processor The application processor contains two main subsystems: a global trans action manager and a distributed transaction manager. The global trans action manager contains five subsystems: the user interface module, the decomposer the merger the optimizer and the planner. It also contains the global data dictionary. The global transaction manager s function is to manage the requests of queries and commands. The user interface module accepts users requests and translates them into an internal presentation for processing. The results are then passed to the decomposer, which sep arates the sub-requests so that they can be processed by different data processors. Once this has completed, the decomposer will analyze the re quest in order to identify the name of table or fragments, the columns and predicts. The decomposer then consults the global data dictionary to look-up the location(s) of the tables and fragments. Once the request has been ensured the data processor runs the request. The optimizer handles the sub-requests so that the planner has an optimized execution plan. This 6


plan is then executed by the g l obal transaction manager. The destributed execution manager is responsible for executing the setup of the plan in distributed executing plan by coordinating with the local execution managers contained within the data processors at the local sites. Once the local sites return their results to the distributed transaction manager, the results are given to the merger in the global transaction manager for assembly. The merger performs the procedure to construct the final view which includes the joining of rows from two p l aces in a given table. Components of the Data Processor The data processor is comprised of the local execution manager and sev eral necessary services used for manipulating data. It is important to keep in mind that the distr ibuted database system environment provides nec essary communications at local sites such as end-to-end massage delivery and hides all the details of communication from the rest of modules of each local sites. The local execution manager is responsible for processing requests for data that resides at its site and returning the result to the distributed execution manager. The results depend on the type of request. For example if the request is a query the result will be the actual data. Likewise, if the request is a command the result will be information about the data such as the number of rows 2.3 Designing Distribute d D a t abase Systems In order to benefit from parallel-processing of data, it is necessary to take advantage of computing resources in the database system. This is one of the 7


primary advantages of a distributed database. The designers of distributed databases must make crit i cal decisions in order to find optimal performance such as whether to keep every table intact or break some tables into chunks known as fragments or partitions. Another possible decision is whether to store data locally or distribute it across several sites in the system. [1] Distributed data can have several of the following attributes: 1-Non-replicated, non-fragmented 2-Fullyreplicated (all tabl es). 3-Fragmented. 4-Partially-replicated (some tables or fragments) 5-Mixed (any combination of the above). The following concepts are relevant to consider: Localized Data: Data is kept at a sing l e site. Distributed Data: Some data or fragments are stored or replicated at different sites. These are the definitions of the attributes of distributed data: The definitions of alternatives are as follows: Non-replicated and non-fragmented Allows the designers of distributed databases to distribute tables in var ious sites without fragmenting or replicating the tables. This is usually done to place tables close to their intended sites of usage. Fully-replicated Allows the designers of distributed databases to ensure that each table exists in each site of the system. This results in the best case of localized 8


query processing and optimal performance. Fragmented Allows the designers of distributed databases to break tables into multiple pieces for storage in different sites. Examples of styles of fragmentation include: Vertical fragmentation Horizontal fragmentation Hybrid fragmentation Partially-replicated Allows the designers of distributed databases to replicate some of the tables across multiple sites in order to optimize performance for more frequently accessed data. Mixed Allows the designers of distributed databases to utilize fragmentation followed by replication as desired based upon a particular data set's usage. [2] 2.4 Fragmentation There are important reasons that designers choose fragmentation when de signing distributed database systems. They are: 1-When entire relations are stored at a single site or replicated across all sites, there are two cases of disadvantage. The first case is when there exist a large 9


number of in stances of remote access to non-r eplicated tables. The second case results in unnecessary update operations. Fragmentation can be used to miti gate both of these disadvantages. [2] 2-Fragmentation a llows dealing with the fragments as units so that there can be more than one transaction running concurrent l y from parallel execut ion. A classic example of this i s the subdivis ion of queries into sub-quer i es for concur rent exec u t ion. This results in a greater system throughput. [2] 3-In a distributed system, the relation is an important unit within t h e system and one which i s not simple to consider. The applicat ion view can work with subsets of t h e relation, which means that the access of the appl ication is not concerned with the entire relation. [2] Fragmentation In dividing the table into fragments, there are severa l possibilities: Vertical fragmentation: Separating g roup s of columns from the original table. This requires including the primary k ey with each group in order to reconstruct the original table throu g h recombination. Horizontal Fragmentation: Can be performed on the original table or ex isting fragments of the original table by separating gro ups of row s based upon the value of one or more columns This r equ ir es being abl e to reconstruct the original table. This type of fragmentation i s useful for secur it y and privacy con cerns There are two approaches to horizontal fragmentation: -Primary : B ased upon the value of one or more columns D erived : B ased upon the fragmentation of another tabl e This is usually done for effici ency for tables which are frequently combined. Storing these fragments 10


in the same site will aid in the joining of the tables. Hybrid fragmentation: Based upon a combination of the previously-mentioned methods of fragmentat i on There are two approaches for performing hybrid frag mentation. The first inv o l ves first performing horizontal fragmentation and then performing vertical fragmentation. The second involves first performing vertical fragmentation and then performing horizontal fragmentation The advantage of using hybrid fragmentation is its efficiency. The disadvantage is that it is the most expensive approach in terms of reco n struction of the origina l table. 2.5 Allocation Alternatives When data tabl es are fragmented designers must determine how to allocate these fragments across s i tes by choosing whether to replicate the fragments in order to increase the ability to access the data when failures occur. Replicat ing fragments across more than one s ite a ll ows parallel-execution of read-only queries. 2.6 Data Replications In distributed database systems data r eplication is an important operation. After fragmentation designers must choose to replicate these fragments across the sites in the system. This replication can have many benefits. System Availability: Failure of one s ite will not result in inaccessibility of the data. [2] High Performance: Paralle l -access and lik e l y l ocal i zed availabi lity produce performance increases. Scalability: Spreading geographic distribution smooths system performance to produce acceptab l e response times across all sites. 11


Application R equire m e nt: Some appl ications require the existence of mul tiple copies of data, which replication provides. The copies are referred to as physical data items. There are many factors that impact rep l ication that are related to the design of the distributed database or updating data synchronously across sites after a commitment operation. These factors are: Database Design Within a partially-replicated database, the data may vary from one data item to another with the possiblity that there are no copies for some data items As a result transactions that only access non-replicated data items are known as local transactions because they can be executed locally. A transaction executed upon multiple data items is called a global transac tion. The global transaction is executed after accessing more than one site. Database Cons i stency A database is mutually consistent if all copies of data items have identical values. Mutual cons i stency has two criteria which are strong consistency and weak consistency. Strong consistency requires that all copies are mutu ally consistent when an update operation commits while weak consistency does not. Where updates are performed Where an update is performed first is an additional factor of replication 12


If an update is performed first in the master copy (primary copy), the update is a centered update. If an update is performed first in another copy, the update i s a distributed update. Update Propagation After the update operation has been performed a write operation on some copy or the primary copy (master), the updated value in a data item must propagate across s it es which include this data item. The propagation operation is performed by two types of propagation protocols: eager and l azy. The eager protocol requires that as soon as a transaction has committed, the update must be appli ed at a ll s it es. The lazy protocol does not have this requirement. D egree of Replication 'fransparency The replication protocol gives limit e d replication transparency if it requires knowing the master site where the transaction was submitted. There are protocols which g iv e full replication transparency by using a transaction manager at eac h s ite. The user does not have to submit the transaction to the master site in this case because it is submitted to the local transaction manager. 'fransaction cons i stency i s obtained when the genera l history is serial i zable This happens when there exists local serialization of a variable's value across operations includ ed in the transactions 2. 7 Homogeneous and Heterogeneous Database A homogeneous distributed database system cons i sts of all sites having the same database management system software with each s it e being aware of the 13


others to cooperate in processing users' requests. A heterogeneous distributed database system consists of some sites using different management system soft ware resulting in limited cooperation amongst sites. [3] 14


3. Transaction Concepts A transaction in general, is a group or collection of operations as an indi visible sequence of operati ons that appear as one logical unit from the point of view of a user though thi s log i cal unit may contain a variety of operations. For example when a bank costumer transfers a particular amount of money from his savings account to his check in g account, he uses some function of an application. This function appears to be a single operation to the user of the database system even though there are many operations involved. Thus, a transaction is a unit of program that accesses and updates data items. Transaction is collocation of operations which are written by a programmer. High level languages like SQL C++ or J ava are used to write the transaction. The collo cation operations of transactions are between a statement to begin a transaction and one to end a transaction The transaction code has the property of being atomic which means it must be executed at a single pass In case of a failure it becomes necessary to cleanup the effects of the transaction If these effects were to re main the database wou ld be inconsistent in terms of va lu es of data items. This is unacceptable as the data would l ose integrity. To ensure the integrity of the data, the distributed database system must maintain the properties of atomicity, consistency, isolation, and durability for transactions. These properties must be provided by database systems to ensure that the transaction works ideally even if some failure occurs. The properties are defined as follows: Atomicity: All or none of the operations of a given transaction are executed. 15


Isolation: When multiple transactions run concurrent ly the database system must ensure that each transaction does not impact other transactions in such a way that all transaction run serially. Therefore, the conflict never ar i ses. [8] For examp l e if there are two transactions, transaction! and transaction2, the system must ensure that they never run at the same time and the same resource like CPU. Thus, transaction starts after the end of transaction2 or transaction2 starts after the end of transaction!. Therefore, transaction will be unaware of the execut ion of transaction2. In addition, isolation ensures predictable in ter leave between the operations of two transaction. [3] Consistency When the transaction is executed in isolation, transaction consis tency will be guaranteed As a result the databases will be consistent. Thus transaction consistency preserves database consistency. Durability When the transaction has been executed successfully the updates of this transaction will be applied to the database even if a system failure hap pens. These system failures can happen a fter a transaction completes success fully while the updates are still in main memory. To protect against this kind of failure a recovery system ensures the durability of the transaction by either the updates of the transaction being written to disk before the transaction finishes executing or the information of the update being written to disk with sufficiency to reconstruct the update when the system restarts after the failure. This mech anism is the recovery system. There are two kind s of failur es which are classified as so ft failure and hard fail ure. [3] 16


3.1 Failure Types Distributed database systems need to be able to deal with failures. There fore there are few terms that they need to be clarified as follows: Soft Failures Causes the loss of data in nonpersistent storage rather than on the disk. Soft failures can be caused by loss of power, misbehavior of the operating system, database system bugs or transaction issues. These are classified as soft failures because the persistent storage remains intact Hard Failures Causes the loss of data in non-volatile storage. Hard failures can be caused by loss of power I/0 errors, and media faults. These types of failure can be classified by their frequency. [1] Types of Frequencies of Failures Transaction Failures: Can occur many times in a single minute in systems such as banking and a irlin e reservation. The recovery will be in a fraction of a second. [1] System failure: Happens many times in a week. The recovery will be in min utes. Disk failure: Can happen one or two times a year. The recovery can take hours. Communication Link Failure: This kind of failure can happen quite fre quently possibly by a congested communication link. [1] Distributed database systems are responsible for ensuring the properties of atom icity and durability for transactions by use of a commitment protocol. Thus 17


when a failure happens during transaction execution, the commitment proto col guarantees atomicity and durability by a recovery mechanism (redo or roll forward/undo or roll-back). Before discussing details of the recovery mechanism, it is necessary to discuss the details of data exchange (memory management) between main memory and non-volatile storage as follows: A transaction performs its operations such as reading and writing on different types of data items l ocated in volatile memory Therefore, a write operation does not exist immediately in non-volatile storage. This value is written in volatile storage and then exchanged with non volatile storag e This exchange is accomplished by two operations fetch and flush. Fetch operations copy data from non-volatile storage to volatile storage. If transactions attempt to read a data item that does not exist in volatile storage, a fetch operation must be executed. A flush operation usually is preformed when there is not sufficient spac e available in volatile storage for a new data item which will result in the item being copied from volatile storage to non-volatile storage. The operations of the recovery mechanism are as follows: [5] 3.2 Recovery Operations Redo Operation A redo is a process for reapplying the changes of a transaction to a previous copy of database data from non-volatile storage. These changes are applied before a hard failure happens. A database system can ensure atomicity and durability of a transaction by writing the updates to non-volatile storage before commit ting the transaction. At that point if a hard failure happens resulting in some updates being preformed on the disk and the transaction stopping, the recovery 18


mechanism will execute the redo process. [1] Undo Operation When a transaction suffers from a soft failure in such a way that the transaction is committed while some of its updates are still in volati l e memory, meaning that not all of the updates were pushed yet, the recovery mechanism executes an undo process causing all of the updates to be eliminated. The redo and undo processes must obtain the information that the transaction wrote about pro cessed data items before a failure occurred As a result this information must be stored in log files to ensure the atom i city and dll:rability of the transaction. This log file must be stored in stable storage to prevent lo sing its information if a hard failure happens. [5] In describing the stabl e storage, several aspects of the transaction log must be presented: 3.3 Transaction Log The transaction log is a sequential file that includes the va lues of data items that have been updated by transactions. The updated values of data items are stored in sequent i al log files as records. This file is sequential because the records are recorded sequentially. [1] The transaction log is divided into two parts. The first part is called the log tail and it exists in volatile storage. It contains the most recently added records. The second part exists in non-volatile storage. When a record is added to the log, the write operation moves the contents of the log tail to non-volatile stor-19


age. Mov in g data to a non-volatile log at the time of a write operation is called piggybacking. [9] There are two purposes for using the log The first is to support the commit ment protocol when failures occur, resulting in the need to redo or undo. The second i s recovering the database when t h e power fails or a crash happens. The l og file provides information for recovery. For examp l e this information may include < transactionstart > < transactioncommit >, < transactionabort > ,< delete>,< updat e >, and the t im e for each operati on and its ID. The l og file must be preserved in storage in such a way that any power failure will not imp act its information. This storage is called stable storage. [1] 3.3.1 Checkpointing A checkpo in t is a process which the local recovery manager follows to stop the transacti on during a n y kind of f a ilure. When the local recovery manager gets to thi s point in a lo g file to redo or undo it will save t im e and star t from this point. This point is lik e a wall whic h indi cates that the database is consistent at thi s point. Without chec kpoin ts, the local recovery manager has to search in l og files across all the transactions to find out where the transacti on stopped. There are three steps of a checkpoint: 1-Write a begin chec kpoin t into the log. 2 Co ll ect the checkpo in t data into the stabl e storage 3-Write the end c heckpoin t record into the log. 3.3.2 Local recovery manager The l ocal recovery manager exists in each site in a distributed database system and is considered a part of the commitment protocol. The function of 20


the local recovery manager is to ensure atomicity and durability of the local transaction. The unit of transfer of data between memory and storage is called a block or page which is the smallest storage unit of database The size of this unit equals 8KB in SQL SERVER. When there is a need to read a row from a table it is necessary to transfer the page which includes this row into memory. Database management systems transfer eight contiguous pages together. The storage unit in memory must match the storage unit in the storage. Thus, the main memory is divided into equal sizes blocks called frames. Each frame can accommodate one page. To speed up read and write operations, database man agement systems reserve a number of frames as a cache or buffer and exchange data between the stable storage and the buffer by action of the local buffer manager. Read operations can be preformed when the page in an operation has complete data. When this page already exists in the buffer of the database it is unnecssary to access the stable storage. A write operation is considered complete when data is in the buffer in memory. When a database system has to recover a transaction that was aborted, the local recovery manager reads the log file of this transaction and collects the necessary information. [1] On behalf of the transaction the local recovery manager needs to read pages to complete the execution of a transaction operation Thus, it issues fetch commands to the database buffer manager to acquire the page. In turn, the database buffer manger checks the database buffer in volatile storage first. If it finds the page, it will return it. If not, it will fetch this page from stable storage and store it in the database buffer. If there is no room in the database buffer to store the fetched page the database buffer manager selects a page to write back into 21


stable storage to make a place in the database buffer. [2] After providing the data to the local recovery manager for execution of some operation the local transaction manager runs through the steps or requests of the transaction. I t will pass these steps (operations) one at a time to the local scheduler to perform these operations. The local scheduler will subsequently pass the requ est back to the local recovery manager. The buffer manager can serve the local recovery manager by writing back some pages. This process can be performed by sending a push command to the buffer database manger by the local recovery manager. [1] 3.3.3 Storage Structure There are three types of storage in which data is stored: Volatile Storage Also known as memory this type of storage is called volatile storage be cause the loss of power results in lo sing its contents As a result volatile storage is used for temporary storage of information to serve the proces sor. This kind storage is insufficient for log files for recovery from crashes because of the risk of lo sing the log files. Volatile storage can be used by transactions to roll-back in normal execution since the transaction is still running. Non-volatile Storage This type of storage uses magnetic devices that maintain data integrity 22


eve n with the loss of power. These devices will los e data if a hardware failure occurs. Stable Storage Stable storage is con s id ered to be a n important component in distributed database systems, especially for commitment protocols. Because trans action recovery ca nnot be performed without log files log files must be preserved in such away that the log will never lose its information. This type of storage is reliable because of its use of redundant arrays of inde pendent disks resulting in more than one copy of the data. Each log i cal data item has more than one physical data item. [3] 3.4 Transaction States When a transaction fail s to comp l ete execution the transaction will enter an abort state. When transaction comp l etes it s execut ion successfully, the transac tion is calle d committe d Some states of a transaction depend on its execut ion as the diagram depicts in figure 3.1. A t r ansitional state appears a8 a one-line circle while a two-line c ircl e indi cates a terminal state. An idle state occurs when the transaction i s not running. The transaction i s expressed by the arrow in which the tail appears to l eave a state and the tip appears to enter a state. These states are: Active State: Also calle d the initi a l state. While a transaction is running it is conside r ed to be in an act iv e state and not all changes a r e preformed as depicted in figure 3.1. 23


Figure 3.1: State transition diagram. Partially Committed: Also known as before committed, the transaction enters in this state immediately after executing its last sentence. Failed State: For some reason, such as deadlock the transaction enters this state if it could not process its execution. As a result it will be forced to enter an aborted state. Aborted State: A transaction enters an aborted state when it terminates un successfu ll y and will be rolled-back. The database has been returned to the state it was in before the transaction started. Committed State: When the transaction comp l etes its execution successfully, 24


it will enter the committed state as its final state. [3] 3.5 Transaction Isolation Transaction processing systems allow multiple transactions to run concur rently. Concurrently running transactions can destroy the consistency of data items and possibly the whole database. This consistency violation happens when there is not control of interleaved operations on a data item There are advan tages of running transactions concurrently. There are improved throughput and resource utilization. Transactions contain many steps some of which need data from I/0 units. Therefore, when another transaction runs at the same time, the two cou ld exchange using different resources. This parallelism in execut ing transactions improves the performance of the database system throughput as well as increasing the utilization of I/0 units. There is also a reduction in writing time. In running serially short transactions need to wait for longer transactions to complete cuasing unpredictable delays. This can be prevented if transactions are run in parallel in such a way that there i s no violation in consis tency. There are also concurrency control schemes in database systems that can serve this propose. When multiple transactions are running these transactions run serially in such a way that the parallel transactions execution and consis tency are ensured. This proposes the need to schedule transactions serially. The operating system can utilize CPU context switching amongst transactions even though this does not ensure the consistency of the transactions. Thus, the database system can ensure the consistency of the database by using a concur rency control component responsible for scheduling. [3] 25


Table 3.1: Conflict Matrix. Read T2 WriteT2 Read Tl No Conflict Conflict Write Tl Conflict Conflict 3.6 Conflicts between transactions A conflict occurs between two transactions if both of them work with the same data it em and at l east one operation is write. As the conflict matrix shows in table 3.1, the two transaction s can read the same data item without any con flict but when one transaction performs a write operation on the data item the second transaction cannot access this item to read it. This prevents the conflict between transactions. If the transactions are not isolated the result will be anomalous in one of the followin g ways: Unpredictable Read This anoma ly occurs when a transaction reads the value of data item and the second transaction writes the value of this data item When the first transaction reads this data item aga in it will read an unpredictable value that the second transaction updated. Reading Uncommitted Data This anoma l y occurs when the second transaction reads the value of an un committe d data item by the first transaction and then computes this value to produce a new value. The new value i s committed, but the first transaction that had committed the data item was aborted. In this case, it is necessary to per-26


form a cascaded abort and cancel the new data item by the second transaction from the database. Overwritten Uncommitment This anoma l y occurs w h e n the second transaction writes and comm its t h e value of an uncommitted data item that the first transacti on produced. The value that the second transaction wrote is lost from the database becuase when the first transaction aborted, the second tran saction a l so aborted. [1] 3. 7 Serializability To e n sure the cons i stency of the d atabase the co n s i stency of transactions must be e n sured. To r eac h t hi s purpose, it is necessary to guarantee the seri alizability of schedules of transactions. Serializability means that the effects of the serialized schedul e are the same as t h e effects wou ld be if the transactions were performed serially. [4] To understand serializab ili ty it is necessary to know how the database environ ment handles conflicts. When a transaction enters the system it is ass i gned to a transaction monitor. The transaction monitor submits the transaction 's op erations to the schedu l er. The operation s function w ill determine whether the requested operation will pose a conflict with another operation. If the scheduler finds that the submitted operation does not conflict with any previous opera tion this operation i s adde d to the sched ule. If not the schedule must roll-back the transaction wh i ch includ es the submitted operat ion 3.8 Transaction Isolation and Atomicity When multiple tran sactions run con currently attention must be given to the impacts of failures t hat can affect those transactions. Ensuring atom i c ity of 27


each transaction in concurrent execut ion depends on the type of schedu l e The schedules types are: Recoverable Schedule Assume there ex i sts a partia l sched ul e without a comm it or abort command that includes two transactions, transaction! and transaction2 which have not executed commit commands If transaction2 read a data item that transac tion! wrote and is committed, transaction fails by soft failure. In this case transaction2 cannot be aborted because it was committed This schedu l e is non-recoverable because transaction2 was committed. To ensure the atomi c ity of transaction s the sched ul e must be a r ecoverable sched ul e meaning that the transactions must be comm itted in the prop er order Cascadeless Schedule When a recoverable sc h e dul e is recovered it may be necessary to undo multiple transactions which may require a l arge amo un t of work. Let us assume that there are three transaction s where each depends on another. When one trans action i s aborted, the other transaction s will be aborted too in a recoverable schedu le. A casca d e l ess schedule i s one in which a cascad in g abort is not nec essary. This depends on the commit command of a transaction being executed before dependent transaction s perform read operations. 3.9 The implementation of isolation levels The purpose of u s in g concurrency control i s to provide a hi gh degree of concurre nct executio n amongst the transactions to view se ri a lizability. The important concurrency contro l schemes are : 28


Locking A transaction holds a so lit ary lock on the data item on which it needs to per form. The transaction must hold this lock until the work on this data item is completed. Therefore ser i alizability will be ensured so that the concurrency of execution cannot be harmed A Two-Phase Locking Protocol is widely used to ensure serializability. This protocol is called Two-Phase because it uses a growing phase and a shrinking phase. This protocol does not allow interleaving between lock acqu i s ition and lock release. When the transaction acquires all of the locks that it has requested the first phase of the protocol is complete The second phase begins when the first phase releases the a lock. During this phase the transaction releases the locks acquired during the first phase in a sequential manner and does not acquire any new locks. The first phase is called growing because the number of lock s has increased and the second phase is called shrink ing because the number of the locks has been decreased. The locking mechanism must satisfy lock compatibility, which includes a shared lock and exclusive lock. A shared lock can be acquired by more than one transaction because it is only for reading a data item. On the other hand, an exclus iv e loc k on a data item can only be acq uir ed by one transaction. Thus, the seria li zab ility of the schedule is satisfied. Timestamps This l eve l of isol ation assigns each transaction a timestamp that serves to coordinate access to a data item. When transactions enter the system, the timestamp ordering contro l gives this transaction a timestamp based upon the clock at the 29


time. Each timestamp is unique to each transaction since only a single transac tion is handled by the processor at a given time. According to the chronology of the transact ion s timestamps, the timestamp ordering allows coordinated access to data items in such a way that earlier transactions are committed before l ater transactions. Time stamps are assigned to transactions with read and write actions. [1] 3.10 Distributed Thansactions There are two kinds of transactions in a distributed database system. Global transactions access and updates data items across all sites in a distributed database system. Sub-transactions are ca ll ed local transactions. [3] Local transactions access and process data items at only one site in the system. A transaction in a distributed database system must preserve the properties of atomicity, isolation and consistency. The system structure of the distributed database system is as follows : System Structure In distributed database system each site has a local transaction manger with the function to ensure that this transaction meets the necessary properties. The managers across the sites work to accomplish the global transaction. There are two transaction managers: Thansaction Manger: Manages execution of the transaction in the local site. It is responsible for the following: Maintaining a log for recovery purposes. Participating in a concurrency scheme to coordinate access to data items in the s i te. 30


Transaction Coordinator: It is responsible for managing execut i on across the sites. [3] 31


4. Commitment Protocols Distributed database systems need commitment protocols to ensure prop erties of transaction lik e isolation and data integrity. The Two-Phase Commit Protocol extends the One-Phase Commit Protocol by reducing the number of windows of vulnerability, which are the periods of time when a slave s it e must wait on a coordinator decision. [2] The windows of vulnerability in the TwoPhase Commit Protocol are smaller because, during the first phase the s l aves sites simply vote and do not wait for the coordinator. The coordinator state will change. The Two-Phase Commit Protocol proceeds in two phases: voting and decision. The coordinator in the first phase collects the slaves votes. In the second phase the coordinator broad casts the general decision in which the s l aves commit or abort transaction. [7] The Two-Phase Commit Protocol ensures the atomicity property for distributed transactions. Synchronization is necessary to commit the transaction. Because the sched ul er may not be ready to commit one of the transactions that it sched uled it might be that another transaction in the schedule reads the value that a former transaction updates. The strict concurrency protocol does not allow reading to read any updated value until the updater of that value commits. It also does not allow the occurrence of deadlock in sub-transactions. A deadlocked site causes the transaction to enters an aborted state. This is called a unilateral abort. [2] 32


4.1 Two-Phase Commit Protocol The algorithms of the coordinator and slave are illustrated by the diagram of this protocol as depicted in figure 4.1. Before using this diagram, it is best to describe the terms of the different states of the coordinator and slave. For the coordinator the" Preparing state" or Wait state" means that the coord in ator is waiting for the slave sites to send messages. For a s l ave site Prepared state" or Ready state" means that the slave site sent a commit message and is waiting for new message from the coorinator. [2] If the coord inator makes a decision to execute the transaction it will enter to first phase of the protocol by sending a prepare message to the slave to commit the transaction. The first part of executing the transaction is that the coordinator writes a be gin commit' record in its log along with "Preparing states" while waiting for the slaves' votes. In the first phase of the protocol if a slave wants to commit the transaction, it writes a 'Ready' record in its log file and sends a commit mes sage to the coordinator. After that, the s l ave site enters the Prepared state. Otherwise, the s l ave writes an 'abort' record in its log file. After that, the slave sends an abort message to the coordinator. [2] The comm itment point has been reached when the coordinator receives the done message ( ready or not-ready ) from a ll slaves. If the coord inator receives at least one not-ready message from a slave in the first phase the coordinator broad casts the g lob a l abort message, cons id ered the start of the second phase of the protocol. The coordinator will g lob ally abort the transaction as soon as it re ceives acknow l edgement messages from a ll s l aves In this case, if the coordinator 33


receives more than one not-ready message from the slaves it will ignore them because it needs just one not-ready message to abort the transaction. Also the coordinator can make the decision to abort the transaction before it enters the first phase Then it globally sends the abort message to the slaves and enters the Aborting state. [2] As soon as the coordinator receives the not-ready message the coordinator writes an 'abort' record in its log. If the coordinator in the first phase receives toborts ..... bort mt.J:S,ill& Coordinator .. Prapr toCDnl mit"' mt.SH.J mtt.U..14t ra-ceiv.O.. .. nd AbortAck Slave Figure 4 .1: Coordinator Slave Algorithms Diagrams in 2PC Protocol a ready message from a ll of the s l aves it will write a commit record in its log and enter the second phase of the protocol. As soon as a slave sends a ready message it enters to "Prepared state. 34


In the second phase after the coordinator has received commit messages from all slaves, it will send a commit message to the slaves, and it enters the Com mitting state." This phase is called the decision phase. The coordinator awaits an acknow l edgement message from each slave site. When the coordinator re ceives a ll the messages it writes an 'end transaction' record in its log. There are two rules for the coordinator to reach a termination decision known as global commit rules They are: 1-If even one participant votes to abort the transaction, the coordinator has to reach a global abort decision 2-If a ll the slaves vote to commit the transaction, the coordinator has to reach a globa l commit decision [2] In the second phase after all slaves receive a commit message from the coor dinator, the s l aves apply the actions of the transaction, and acknow l edge the coordinator. After a s l ave enters the second phase and receives a global abort message, it acknowledges the coordinator by an abort message. In addition the slave sites that sent ready messages in the first phase cannot change their deci sions in the second phase. [2] When a s l ave is in the Ready state" in the second phase it co uld move to abort or commit the transaction based on the received message from the coord inator. In addition the globa l transaction decision is taken by the coordinator depending on the global comm i t rule. Finally the coordinator and the s l aves enter certain states and await a message to change from state to state. Therefore if the expected message is not received by any site in which it is a coordinator or slave, the timer runs out. 35


Centralized Communication in Two-Phase Commit Protocol The structure of the communication paradigm as illustrated in figure 4.2 is employed in implementing the Two-Phase Commit Protocol. It is called the centralized structure since the communication is only between the coordinator and slaves There is not any communicat ion amongst the slaves in any phase. Coordinator Preparing message Slaves Vote abort/ Votecommft Coordinator Globule commit/ Globule abort Slaves First Phase Second Phase Coordinator Committed/ Aborted Figure 4.2: Centralized 2PC Protocol Communication Structure 4.2 Failure Handling in Two Phase Commit Protocol In the first phase of this protocol, if the coordinator fails, the s lav es can abort the transaction by sending not-ready or vote-abort messages. In contrast, if the coordinator fails in the second phase, the slaves will be blocked To deal with the failures, the Two-Phase Commit Protocol must be extended to include a termination protocol that guides the sites to deal with these failures. 36


Termination Protocols in Two Phase Commit Protocol The function of the termination protocol is terminating transactions when a failure happens. These may be site failures or communication failures. The Two-Phase Commit Protocol treats both types in the same manner. [10]. When a site fails, the operational sites invoke a termination protocol to terminate the transaction by whether there is commitment of abortion. The purpose of invoking this termination protocol is to release the resources the transaction holds as soon as a failure is detected. A time-out mechanism detects the failure of a s l ave by the coordinator or detects the failure of the coordinator by the slaves The time-out occurs in the destination site when the source site cannot receive a response within the expected period. [2] Understa ndin g the termination protocol entai l s understanding the details about transaction states of the coordinator or s l ave The transition from state to state is performed by any site in the system by writing a record in its log and submitting a message in the message queue. Although it is appropriate to assume that three actions are performed by an atomic action, it is not reasonable to assume that the transitionin g of states and the transmitting of messages from the message queue are an atomi c action. For examp le, when the coordinator transitions from the "Preparing state" to the "Committing state," it might fails after send in g messages. As a result, some sites may not receive global commit messages. At this time the termination protocol must be invoked to terminate the transaction by commitment or abortion. There are two kinds of termination protocols: 37


Coordinator Termination Protocol This protocol is activated by the coordinator when it detects the failure of one or more s lav es. The coordinator has several cases to process: Case 1: A slave s failure is detected when the coordinator is in the "Before commit state. The slave failed during the transaction execution In this case, the coord inator writes an Aborting record in its log and sends a globa l abort message to all slaves. Case 2: A s l ave s failure i s detected when the coordinator is in the Preparing state." The slave site cou ld have been in the "Before commit state, "Aborting" or Prepared states. If the slave is in B efore commit, it never received any messages from the coord in ator. If the s l ave site is in the Abort state, it re ceived the prepare message and voted not-ready. If the slave is in the prepared state, the s l ave site f ailed to sent a ready message to the coord in ator. The co ordinator assumes that the slave s ite does not want to commit the transaction Therefore, the coordinator must proceed to abort the transaction. Case 3: A s l ave s failure is detected when the coordinator is in the Committing state." This signifies the fact that the coordinator did not receive an acknowl edgment message from the slave. There are two possibilities for this. The slave failed in either the "Prepared state" before it received the global commit mes sage or after receiving the commit message from the coordinator and before sending a commit acknow l edgment message to the coordinator. In both cases the coo rdinator must continue polling the s lav es for acknow ledgm ent messages before ending the transaction g lob ally. Case 4: A slave s failure is detected when the coordinator is in the "Aborting 38


state." This signifies that the coordinator did not receive an acknowledgment abort message from a sa lv e site and the s l ave may be in the "Before commit", Prepared", Aborting" or "Aborted state. Regardless of the slave's state, the coordinator must poll the slaves for acknowledgment messages to abort the transaction globa lly. Slave Termination Protocol: This protocol is activated by any s l ave site that detects that the coordinator has failed in order to terminate the transaction. Each slave site that detects the coordinator s failure takes a few steps: Case 1: The s l ave site detects the coord in ator's failure when it is in the "Before commit state. In this case, the coordinator failed when it was in the "Before commit", "Preparing", or "Abort in g state. The slaves must elect a new coor dinator to abort the transaction globa lly. Case 2: The coord inator s failure is detected when the s l ave site is in the "Abort ing" or "Before commit state. In this case the s l aves elect a new coor dinator to abort the transaction globa lly. Case 3: The coordinator's failure is detected when the s l ave site is in the "Pre pared state." The coordinator may have failed in the Preparing", Aborting or "Committing state. The coordinator failing in the "Aborting or Commit ting state will be handled in the same way. Therefore, there are two cases to be handled. In the first case, the coordinator failed in the Preparing state" before it sent the g lob al abort message or globa l commit message to the slaves. The slaves sites will be blocked while they are holding resources such as some rows or columns. For this reason, the Two-Phase Commit Protocol is called a blocked 39


protocol. What happens to the coordinator in this case is that the coord in ator decides to abort or commit the transaction and transfers to the Committing or Aborting state. Then the coordinator fails before the messages leaves the messages queue at the coordinator's site. The coordinator's decision has been applied on l y at the coord in ator's site and the slaves sites do not know anything about the coordinator's decision. In the second case, the coordinator fails when it is in the "Committing" or Aborting state. The coordinator fails before it sends a global comm it message or g lob al abort message or after receiving global messages by some of the slaves sites. In the last possibility the failure is detected by the slaves, necessitating their election of a new coordinator that discovers the coordinator's state at the time of its failure. The new coordinator sends a message to all the s l aves sites asking them to reply based upon the last message that they received from the failed coordinator. When the slaves sites reply to the new coordinator, they will have one of the following possibilities: First Possibility: No s l ave site has received a global abort message or global commit message so the s l aves sites will be blocked until the coordinator has been repaired. Second Possibility: Some slaves s ites received an abort message and the other slaves sites are still in the Prepared state." This means that the coordinator made a decision to abort the transaction and failed before it completed sending abort messages to all slaves. The transaction must be aborted by the new coor dinator. Third Possibility: The coordinator sends a global commit message but once it completed sending this message to all s lav es, it failed. In this case, the rest of 40


the s l aves s i tes are still in the "Preparin g state. When the new coord inator is elected, t h e transacti on will be committed. Forth Possibility: When a ll the s l aves sites received a g l obal abort message from the coord in ator, the coord inator failed. The s l aves s ites e l ect a new coordinator t hat will abort the transacti on. Last Possibility: When a ll the slaves s it es received a g l obal commit message from the coordinator, the coordinator failed. The slaves s i tes elect a new coordinator that will commit the transaction. There is no way for some slaves sites to receive a g lob al abort message and others to receive a g lob a l commit message. 4 1


5. The evaluation of Two-Phase Commit P rotocol The Two-Ph ase Commi t Protoc ol is expe n s iv e b eca u se it r e quir es three bro a d cas t s a nd ac knowl e d g m e nts as illu strated in fig ure 5.1. The c o s t of t hi s pro t o co l d e p e nd s upon the nu m b e r of sent m essages and t h e numb e r o f log r ec ord s on stabl e storage that a r e n ee d e d in the t w o phases of t h e protocol. In a dditi o n the pro t o c ol r e quir es a ddi t ion a l m essages if a f a ilur e o cc ur s The coordina tor First Phase Vot e Request s ThE; s laves ____ _____ F o rced I og Second Phase F o rced log _____ C O M M I T...:../ A B O R T ____ __,. F o rced log Figure 5 1: Two-Ph ase Commi t Protoc ol 5.1 The Motivation M a ny appl ications h a v e se rv e d a lot of u se r s by t h e ir fun c tion they m a y h a v e t h e ov erhead. Decr eas ing c on sume d time b y t h e t r a n sac tion pro cess i ng need s t o emphas i s the pro t o c o l whi c h c ommit s the distributed t r a n sact ion s to m a k e hi g h proto col's p e rform a n ce. These protocol s h ave work e d b y m essages to m a k e 42


connection amongst the sites. Increasing the number of messages means in creasing the possibility of communications failures. As a result the transactions need more messages to deal with failures occurrence. In addition, the commit protocol which produces many massages, consumes more time. Consequently the protocol is expensive in term of the cost of distributed transactions process ing. For these reasons there is a need to design c ommit protocol that it has ability to process the distributed transactions with less messages. In addition the protocol has to achieve the properties of proc e ssing the distributed transac tions which are atomicity isolation durability and as well as consistency. The proposed protocol is called New Presume Commit Protocol. 5.2 The New Presumed Commit Protocol The new protocol minimizes the execution overhead in the Two-Phase Com mit Protocol in terms of the exchanged messages. In addition the coordinator aims to release the held resources as soon as possible by taking decision by at least one abort, or abort acknowl e dgment messag e s Finally, the new protocol reduces the communicat ion failures because it already reduces some messages. 5.2.1 The Protocol Design The design of New Presume Commit Protocol is similar to the design of original protocol which is Two Phase Commit Protocol with some exceptions. The new protocol contains two phases to execute the distributed transactions with committed assumption, so for this reason is called the New Presumed Com mit Protocol. The first phase of the new protocol which is called voting phase includes one broadcast of messages from the coordinator to the slaves The first 43


broadcast means that the coordinator want to poll the s l aves whether they are ready or not ready to commit the submitted transaction. Subsequently, if a slave site ready to comm it the transaction, the assumption of the protocol is true. The coordinator has the same committed assumpt ion Then, the slave site writes commit record to his log file. If the sites are ready, that means there is no problem in the schedu l e of transaction such as deadlock problem. If all sites are ready to commit the transaction, the coordinator precedes to the second phase which all the slaves sites have the record in their logs files. In other words all slaves are ready. On the other hand, if a s l ave site in the first phase is not ready to commit the transaction, it sends abort message to the coord in ator. Because the slave site does not agree with the commit assumption of the protocol in the first phase the s l ave site has to send abort message to the coordinator. In addition, this site a l so h as to write abort record to its lo g before sends aborted message to the coordinator. If some slaves sites are not ready in the first phase, the coord inator needs just one abort message to make decision to abort the transaction. When a s l ave site is ready to commit the transaction, it holds the resources which needs (table, fragment, rows or columns) to apply the transaction in the sec ond phase. In distributed transaction processing, the resources are distributed amongst a lot of transaction s which need the same data. As a result, holding the resources for long time by a transaction takes the chance from another transac tion, and it makes delay which it is undesired in distributed database systems. When the coordinator delays to make an action, this causes the slaves sites hold 44


the recourse which is undesired. For this reason in the proposed protocol the coordinator has to make decision as fast as possible. For instance, if a site is not ready in two phases, the coordinator makes decision as soon as it receives the first abort message from slave site in the second phase of the new protocol. Releasing the recourse by the sites as fast as possible is one of the goals of the New Presumed Commit Protocol. In the second phase, when the coordinator sends the commit decision to the slaves sites, the slaves sites implement the transaction depend on the received commit decision. Before the slaves sites take action to implement the commit decision, the slaves sites have to take an action in the proposed protocol. The action entails that all slaves have to check its logs files to check if there is no commit record at the logs files of the salve site. If a slave site detects abort commit record in its log, immediately it sends abort message to the coordinator. The coordinator aborts the transaction as soon as it receives abort messages from a slave site. All sites release the resources which hold after they acknowledge the coordinator by abort acknowledgment messages. For the last action in The New Presumed Commit Protocol, it must ensure that all slaves sites have acknowledged the coordinator by the same kind of acknowledgment messages. To ensure the isolation property The New Presumed Commit Protocol uses Two-phase Lock Protocol as concurrency scheme to preserve the serializabil ity amongst all sub transactions. As a result there is no overlap between two transactions at the same time. This protocol as illustrated in figure 5.2 makes the transaction cheaper by reducing the number of sent messages in the first phase. The protocol includes 45


two phases, but in first phase the sites only reply to the coordinator when they abort the transaction. Therefore when a slave site commits a local transaction it does not send a message back to the coordinator. On the other hand, if some sites abort the transaction the coord inator needs just one abort message to make a decision to abort the whole transaction in the second phase In the second phase if the coordinator does not receive any message, it assumes The coordinator First Phase Vote ReqtJes t s The s lave s ______ Forcedlog Second Phase Forced I og _________ C.:....O c:__ M _I\i_;_11_ T....:../ A B.:....O.:....R T ________ ____.. Forced log Ackno-.v ledgrnent Figure 5.2: New Presumed Commit Protocol that all s l aves chose to commit the transaction. As a result, the coordinator broadcasts a general commit decision to all slaves. The slaves then apply the transaction and acknow l edge the coordinator by a commit acknowledgement message. 5.2.2 The Correctnes Quality benchmark of a commit protocol is achieving the transactions properties which are the cons i stency, isolation atomicity and durability. Therefore 46


the new protocol entails recording any action in a log file before it is achieved by a site. This technique gives the new protocol the ability to deal with any failure in such a way that it could ensure the properties of the distributed transaction The new protocol ensures isolation property for any distributed transaction. The new protocol uses concurrency-control component which is Two Phase Lock Pro tocol. The new protocol can ensure that all sub-transactions run serially in the first phase of the proposed protocol. In the write action there is no two trans action use the same data at the same time. Durability property is ensured by the new protocol. When the coordinator com mits the transaction successfully some soft failure can happen. By checking the logs files it could refresh all the transaction s updates. Although a failure happens, the new protocol ensures that any committed transaction is effective. The consistency property requires rules like integrity constraints. If a designer satisfies these rules when he designs distributed database, the protocol can achieve the consistency property. If the database is consistent before an ac tion of distributed transaction, it has to be in the same case after the update. Because the new protocol ensures the isolation execution of distributed trans action it can preserve that the database is in consistent after the execution of transaction. The new protocol can achieve the atomicity property because it can detect the soft failure at any site. In other words the new protocol saves all the updates at the log file before it takes an action. Therefore when the coordinator detects the soft failure, some updates are lost. The recovery operation recovers the failed sites by using undo or redo operations to rewrite all the updates which are done 47


the failed site. The recovery operations are one of the components of the new protocol. As a result all the updates are effective in the same time or not. Distributed transactions use commit protocols to ensure that all slaves com mit the transaction or not. Therefore, when a slave fails, it must be detected by the commit protocol. The New Presumed Commit Protocol aborts the transac tion in the following cases: Case 1: In the first phase if a slave votes to abort the transaction, it sends an abort message to the coordinator. In turn, the coordinator aborts the transac tion by making a general decision in the second phase. As a result the atomicity property is achieved in such a way that there is no any update happens. Case 2: If the abort messages are lost before being received by the coordinator the coordinator starts the second phase of executing the transaction. Although the coordinator sends commit messages to all slaves there may be one aborted slave in the transaction. The slave that aborted the transaction checks its log to look for a commit record. If it does not find this record it sends an abort acknowledgment message. If the coordinator receives at least one an abort mes sage from some slave the coordinator aborts the transaction. The coordinator achieves the atomicity property of the transaction in such a way that there is no any update happens. Case 3: If all slav e s vote to commit the transaction and a slave failed after voting the coordinator starts the second phase of executing the transaction and sends commit messages to all slaves. The failed slave is detected when the coor dinator does not receive an acknowledgment message from the failed slave. As a result if the coordinator receives at least one abort message from some slave, 48


the coordinator aborts the transaction. The coordinator achieves the atomicity property of the transaction in such a way that there is not any update happens. Case 4: If some site does not send acknowledgment message whether it is com mit message or abort message the coordinator can detect that the some site is failed by a failure. In this case the coordinator aborts the transaction. As a result the coordinator achieves the atomicity property of the transaction in such a way that there is no any update happens. Case 5: When the commit assumption is achieved by all sites in the transac tion and there is no any kind of a failure in this case the coordinator makes all the updates which are done by all sites of the transaction effective. In each case the coordinator aborts the transaction because it does not exist a consensus amongst slaves. 5.2.3 The Performance The Two-Phase Commit Protocol consumes (4 P) messages in committed and aborted transactions. The normal case of the proposed protocol when all slaves agree to commit the transaction. In this case there are three broadcast actions which are the messages from the coordinator, so the total numb e r of messages is (3 P) messages which is less than (4 P) If a slave aborts the transaction in the first phase the aborted transaction pro duces ( 3 P + 1 ) messages. The coordinator needs just one abort message to make a decision so as soon as it receives one abort message, it ignores additional abort messages. If a slave aborts the transaction in the second phase the aborted transaction produces ( 3 P + 1 ) messages. 49


The Summary We eva luat ed the Two Phase Commit Protocol in term of its performance. The Two-Phase Commit Protocol was evaluated in terms of its performance with concentration on improvin g its performance, resulting in the proposal of a New Presumed Commit Protocol. The new protocol produces fewer messages in the first ph ase of transaction-execution in both the committed and aborted transaction cases. The advantages of the proposed protocol are not just in reduction of the cost of performance but a l so in decreasing the possibility of communication failures in the first phase by d ecreas in g the number of messages sent 50


REFERENCES [1] S. K. Rahimi and F. S. Haug, Distributed Database Management Systems Hoboken: John Wiley and Sons, 2010 [2] M. T. Ozsu and P. Valduriez Principles of Distribut e d Database Syst e ms. 3rd ed. NewYork: Springer 2011. [3] A. Silberschatz, H. F. Korth and S. Sudarshan Databas e Syste m Conc e pts. 6th ed. NewYork: McGraw-Hill, 2011. [4] J. Walpole G. S. Blair D. Hutchison and J. R. Nicol Transaction mech anisms for distributed programming environments," SOFTWARE ENG J vol. 2 no. 5 pp. 169-177, SOFTWARE ENG J. Sep. 1987. [5] D. Chkliaev J Hooman and P van der Stok Mechanical verification of transaction processing systems in Proc. Formal Engineering M e thods 2 000. ICFEM 2000. Third IEEE International Conferenc e on. IEEE. doi : 10.1109/ICFEM.2000.873809, Sep. 2000 pp 89-97. [6] C. Mohan B. Lindsay and R. Obermarck Transaction Management in the R Distributed Database Management System, A CM Transaction on Databas e Systems, vol. 11, no. 4 pp.378-396 Dec. 1986. [7] M. L. Liu D. Agrawal and A Elabbadi "The Performance of Two Phase Commit Prorocols in the Persence of site Failures," Distributed and Parall e l Databases vol. 6. no 2 pp. 157-182 Apr. 1998. [8] K. P. Eswaran J. N. Gray R. A. Lorie, and I. L. Traiger "The Notions of Consistency and Predicate Locks in a Database System," Communications of th e ACM, vol. 19, no. 11, pp. 624-633. Nov. 1976. [9] G. K. Attaluri and K. Salem, "The presumed-either two-phase commit protocol," Know ledge and Data Engineering IEEE Transactions on, vol. 14, no. 5. pp. 1190-1196. Sep/Oct. 2002 [10] B. S. Boutros and B. C. Desai "A two-phase commit protocol and its performance, in Proc S e v enth International Workshop on. Database and Expert Systems Applications. IEEE. doi : 10.1109/DEXA.l996 558282 S ep. 1996, pp. 100-105. 51