Friday, August 26, 2016

Dead Lock - 2

What is a database deadlock? Provide an example and explanation of a deadlock in a database.



In a database, a deadlock is a situation that occurs when two or more different database sessions have some data locked, and each database session requests a lock on the data that another, different, session has already locked. Because the sessions are waiting for each other, nothing can get done, and the sessions just waste time instead. This scenario where nothing happens because of sessions waiting indefinitely for each other is known as deadlock.
If you are confused, some examples of deadlock should definitely help clarify what goes on during deadlock. And, you should probably read our explanation of database locks before proceeding since that will help your understanding as well.

Database deadlock example

Suppose we have two database sessions called A and B. Let’s say that session A requests and has a lock on some data – and let’s call the data Y. And then session B has a lock on some data that we will call Z. But now, lets say that session A needs a lock on data Z in order to run another SQL statement, but that lock is currently held by session B. And, let’s say that session B needs a lock on data Y, but that lock is currently held by session A. This means that session B is waiting on session A’s lock and session B is waiting for session A’s lock. And this is what deadlock is all about!
Let’s go through a more detailed (and less abstract) example of deadlock so that you can get a more specific idea of how deadlock can arise.

Database deadlock example in banking



Let’s use an example of two database users working at a bank – let’s call those database users X and Y. Let’s say that user X works in the customer service department and has to update the database for two of the banks customers, because one customer (call him customer A) incorrectly received $5,000 in his account when it should have gone to another customer (call him customer B) – so user X has to debit customer X’s account by $5,000 and also credit customer Y’s account $5,000.
Note that the crediting of customer B and debiting of customer A will be run as a single transaction – this is important for the discussion that follows.
Now, let’s also say that the other database user – Y – works in the IT department and has to go through the customers table and update the zip code of all customers who currently have a zip code of 94520, because that zip code has now been changed to 94521. So, the SQL for this would simply have a WHERE clause that would limit the update to customers with a zip code of 94520.
Also, both customers A and B currently have zip codes of 94520, which means that their information will be updated by database user Y.
Here is a breakdown of the events in our fictitious example that lead to deadlock:
  • 1. Database user X in the customer service department selects customer A’s data and updates A’s bank balance to debit/decrease it by $5,000. However, what’s important here is that there is no COMMIT issued yet because database user X still has to update customer B’s balance to increase/credit by $5,000 – and those 2 separate SQL statements will run as a single SQL transaction. Most importantly, this means that database user X still holds a lock on the row for customer A because his transaction is not fully committed yet (he still has to update customer A). The lock on the row for customer A will stay until the transaction is committed.
  • 2. Database user Y then has to run his SQL to update the zip codes for customers with zip codes of 94520. The SQL then updates customer B’s zip code. But, because the SQL statement from user Y must be run as a single transaction, the transaction has not committed yet because all of the customers haven’t had their zip codes changed yet. So, this means that database user Y holds a lock on the row for customer B. .
  • 3. Now, Database user X still has to run the SQL statement that will update customer B’s balance to increase it by $5,000. But, now the problem is that database user Y has a lock on the row for customer B. This means that the request to update customer B’s balance must wait for user Y to release the lock on customer B. So, database user X is waiting for user Y to release a lock on customer B.
  • 4. Now, the SQL statement being run by user Y tries to update the zip code for customer A. But, this update can not happen because user X holds a lock on customer A’s row. So, user Y is waiting for a lock to be released by user X.
  • 5. Now you can see that we have user X waiting for user Y to release a lock and user Y waiting for user X to release a lock. This is the situation of deadlock, since neither user can make any progress, and nothing happens because they are both waiting for each other. So, in theory, these two database sessions will be stalled forever. But, read on to see how some DBMS’s deal with this unique situation.

Database deadlock prevention

So now you have seen an example of deadlock. The question is how do DBMS’s deal with it? Well, very few modern DBMS’s can actually prevent or avoid deadlocks, because there’s a lot of overhead required in order to do so. This is because the DBMS’s that do try to prevent deadlocks have to try to predict what a database user will do next, and the theory behind deadlock prevention is that each lock request is inspected to see if it has the potential to cause contention. If that is the case, then the lock is not allowed to be placed.

Database deadlock detection

Instead of deadlock prevention, the more popular approach to dealing with database deadlocks is deadlock detection. What is deadlock detection? Well, deadlock detection is based on the principle that one of the requests that caused the deadlock should be aborted.

How does deadlock detection work?

There are two common approaches to deadlock detection: 1. Whenever a session is waiting for a lock to be released it is in what’s known as a “lock wait” state. One way deadlock detection is implemented is to simply set the lock wait time period to a certain preset limit (like 5 seconds). So, if a session waits more than 5 seconds for a lock to free up, then that session will will be terminated. 2. The RDBMS can regularly inspect all the locks currently in place to see if there are any two sessions that have locked each other out and are in a state of deadlock.
In either of the deadlock detection methods, one of the requests will have to be terminated to stop the deadlock. This also means that any transaction changes which came before the request will have to be rolled back so that the other request can make progress and finish.

Dead Lock - 1

What is a database lock in the context of SQL? Provide an example and explanation.

A database lock is used to “lock” some data in a database so that only one database user/session may update that particular data. So, database locks exist to prevent two or more database users from updating the same exact piece of data at the same exact time. When data is locked, then that means that another database session can NOT update that data until the lock is released (which unlocks the data and allows other database users to update that data. Locks are usually released by either a ROLLBACK or COMMIT SQL statement.

What happens when another session tries to update the locked data?



Suppose database session A tries to update some data that is already locked by database session B. What happens to session A? Well, session A will actually be placed in what’s called a lock wait state, and session A will be stopped from making further progress with any SQL transaction that it’s performing. Another way of saying this is that session A will be “stalled” until session B releases the lock on that data.
If a session ends up waiting too long for some locked data, then some databases, like DB2 from IBM, will actually time out after a certain amount of time and return an error instead of waiting and then updating the data as requested. But some databases, like Oracle, may handle the situation differently – Oracle can actually leave a session in a lock wait state for an indefinite amount of time. So, there are a lot of differences between different database vendors in terms of how they choose to deal with locks and other sessions waiting for locks to be released.

Database locking techniques

Database locks can actually be placed at different levels – also known as lock granularity – within the database.

Here is a list of the usual lock levels and types supported, and more information on what each technique means:

Database level locking


With database level locks, the entire database is locked – which means that only one database session can apply any updates to the database. This type of lock is not often used, because it obviously prevents all users except one from updating anything in the database. But, this lock can be useful when some major support update is necessary – like upgrading the database to a new version of the software. Oracle actually has an exclusive mode, which is used to allow just one user session use the database – this is basically a database lock.

File level locking

With a file lock level, an entire database file is locked. What exactly is a file in a database? Well, a file can have a wide variety of data – inside a file there could be an entire table, a part of a table, or even parts of different tables. Because of the variety of data stored inside a file, this type of lock level is less favored.

Table level locking


A table level lock is pretty straight forward – it means that an entire table is locked as a whole. This lock level comes in handy when making a change that affects an entire table, like updating all the rows in a table, or modifiying the table to add or remove columns. In Oracle, this is known as a DDL lock, because it’s used with DDL (Data Definition Language) statements like CREATE, ALTER, and DROP – basically statements that modify the entire table somehow or the other.

Page or block level locking

Block, or page, level locking occurs when a block or page that is part of a database file is locked. To read more about pages and blocks if you are not already familiar with them, then go here: Pages versus blocks.
Because the data that can be stored in blocks/pages can be wide and varied, page/block locking is less favored in databases today.

Column level locking


A column level lock just means that some columns within a given row in a given table are locked. This form of locking is not commonly used because it requires a lot of resources to enable and release locks at this level. Also, there is very little support for column level locking in most database vendors.

Row level locking

A row level lock applies to a row in a table. This is also the most commonly locking level, and practically all major database vendors support row level locks.

Are locks automatically used by databases?



When data is either deleted or updated locks are always used – even if a database user doesn’t write his/her SQL to explicitly say that a lock must be used. Many of the RDBMS’s out there today also have support to use the “FOR UPDATE OF” clause combined with a normal SELECT statement. The FOR UPDATE OF clause basically says that the database user intends to update some data – although the database user is not required to make changes to that particular data either. And, because the intent of updating data is declared, it means that a lock will be placed on that data as well.

Example of database locking

As a simple example of when locking would be used by database, suppose we have the following SQL:
UPDATE some_table SET some_field = "some_value" 
WHERE some_column = "XYZ";
The SQL statement above will lock the row or rows which have a value of “XYZ” for the column named “some_column”. The locking of the row(s) happens behind the scenes as part of the RDBMS software, and it prevents other database user sessions from updating the same row(s) at the same exact time as well.

Can data be read when a lock is in place?

It depends on the lock, since some locks are read-exclusive, which means that other sessions in the database can not even read the locked data – so if .

What is the point of database locking?

If it’s not already clear to you, the reason we have database locks is to prevent the potential loss of data that could happen if updates are applied concurrently – or at the same exact time. If two different database users are allowed to update the same data at the same exact time, then the results could be potentially confusing and disastrous. But if that same data were locked, then that issue would not arise, since only one user could update the locked data at a time.

What is lock contention?

One problem that occurs with having locks is that locks can cause what’s known ascontention, which means that because there are locks on the data, sessions that exist at the same time (concurrent sessions) are essentially competing for the right to apply updates on the same data, because that data may be locked by any given session. In the best case, lock contention means that some user processes run slower because a session is waiting for a lock. In the worst case, having sessions compete for locks can make sessions stall for an indefinite period of time.
When sessions do stall for an indefinite period of time, that is known as deadlock, which you can read more about here: Database deadlock.

Lock Escalation

You should also read about the concept of Lock Escalation, which is a built in feature of many RDBMS’s today.