A Simple Concurrency Use Case and Different Solutions to Handle It

Posted January 24, 2015 - 6 min read
Topics:  

When developing an SQL based web application, concurrent requests handling may be one of the most crucial issues that faces the application.

Within this post, we are going to discuss a simple concurrency use case and find out different ways to handle properly concurrent requests.

Use Case

Imagine that we have an application that allows users to manage their wallets.

At a given time, a user has decided to withdraw 200 from his wallet which has an initial balance of 700.

The user may send two money withdrawal requests from two different sessions at the same time.

Maybe a nasty user who want to play with us.

Let’s see what is happening:

Time t1:
Session1> SELECT blc FROM wallet WHERE uid = 1; (returns 700)
Session2> _

Time t2:
session1> _
session2> SELECT blc FROM wallet WHERE uid = 1; (also returns 700)

Time t3:
session1> UPDATE wallet SET blc = 500 WHERE uid = 1; (700 – 200 = 500)
session2> _

Time t4:
session1> UPDATE wallet SET blc = 500 WHERE uid = 1; (700 – 200 = 500)
session2>_

Is there something wrong?

Have you noticed that we just took out 400 from our wallet from a starting point of 700 and left the wallet with a balance of 500?

So 200 just vanished.

During application testing, running a single session, such kind of situations does not occur.

Furthermore, It can be hard to debug and requires rigorous testing to get noticed before deploying to production.

So the only measure we can take, is to be aware of it at the time when developing the application and to use different defense mechanisms that we will cover bellow.

Can We Use Transactions to Prevent This?

Unfortunately pure transactions can’t do anything to help us.

The only way I see is to avoid concurrency and this can be done by locking the table before starting the transaction.

Let me explain it.

In case of a pure transaction the outcome would be exactly the same:

Time t1:
session1>BEGIN;
session1>SELECT blc FROM wallet WHERE uid = 1; (returns 700)
session2>BEGIN;

Time t2:
session1>_
session2>SELECT blc FROM wallet WHERE uid = 1; (also returns 700)

Time t3:
session1>UPDATE wallet SET blc = 500 WHERE uid = 1; (700 – 200 = 500)
session2>_

Time t4:
session1>_
session2>UPDATE wallet SET blc = 500 WHERE uid = 1; (700 – 200 = 500)
gets stuck here (waiting for session 1)

Time t5:
session1>COMMIT;
session2>COMMIT;

Solutions

There exist many approaches to deal with concurrency.

Some approaches make use of the built-in database capabilities, others may be implemented on the application side.

Let’s see what we have in the menu:

  • Avoid read-modify-write cycles with calculated update values.

  • Apply database locking at the row level using for example SELECT FOR UPDATE mechanism for relational databases.

  • Serializable transactions.

  • Use optimistic concurrency control (also known as optimistic locking)

Avoiding Read-modify-write Cycle

The simplest solution to every problem is to eliminate its cause.

So in our case we need to get rid of using read-modify-write pattern.

What I mean is that our code can be rewritten to something like that:

Time t1:
session1>UPDATE wallet SET blc = blc - 200 WHERE uid = 1; (sets blc=500)
session2>_

Time t2:
session1>_
session2>UPDATE wallet SET blc = blc - 200 WHERE uid = 1; (sets blc=300)

This approach even works in case of concurrent transactions.

Because the first session would take a lock on the row and the second session would be waiting on the lock until the first session commits or rolls back.

One downside of this approach is that it can only be used for simple use cases.

It would not be useful if our application needs to do some complex logic based on the current balance to determine if the update should proceed, for example.

Using Database Row Level Locking

This approach will actually block any other query which is trying to modify the selected row (UPDATE, SELECT FOR UPDATE).

Pending transactions will pause until the row is released (committed or rolled back).

Let’s rewrite our example:

Time t1:
session1>BEGIN;
session1>SELECT blc FROM wallet WHERE uid = 1 FOR UPDATE; (returns 700)
session2>BEGIN;

Time t2:
session1>_
session2>SELECT blc FROM wallet WHERE uid = 1 FOR UPDATE;
// gets stuck and waits for session 1

Time t3:
session1>UPDATE wallet SET blc = 500 WHERE uid = 1; (700 – 200 = 500)
session2>

Time t4:
session1>COMMIT;
session2> // previous SELECT returns 500
session2>UPDATE wallet SET blc = 300 WHERE uid = 1; (500 – 200 = 300)


Time t5:
session1>_
session2>COMMIT

SERIALIZABLE Transactions

Using this method, both transactions would proceed normally.

The first transaction would commit fine.

However, the second transaction would fail with Deadlock error.

Time t1:
session1>SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
session1>BEGIN;
session2>SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
session2>BEGIN;

Time t2:
session1>_
session2>SELECT blc FROM wallet WHERE uid = 1; (returns 700)

Time t3:
session1>UPDATE wallet SET blc = 500 WHERE uid = 1;
// gets stuck here
session2>_

Time t4:
session1>_
session2>UPDATE wallet SET blc = 500 WHERE uid = 1;
// ERROR 1213 (40001): Deadlock found when trying to get lock;
// try restarting transaction

Time t5:
session1> // Query OK, 1 row affected (10.38 sec)
session2>_

Time t6:
session1>COMMIT
session1> // succeeds, setting blc=500
session2>_

Optimistic Concurrency Control

In relational databases optimistic control is always implemented on the application side for dealing with concurrency.

Other NoSQL databases include such mechanism (MVCC) out of box at the database engine.

The basic idea behind optimistic control is that:

  • Records should have a version column, for example last updated timestamp;
  • All write operations should have an extra WHERE clause that checks that the record has not changed since the record was read.

The mechanism then checks after write completed if any row was affected and if none it aborts the operation (transaction).

In our use case we need to add a column called for example version:

ALTER TABLE wallet ADD COLUMN version integer NOT NULL DEFAULT 1;

Then we need also to rewrite our transactions:

Time t1:
session1>BEGIN;
session1>SELECT version, blc FROM wallet WHERE uid = 1; (returns 1, 700)
session2>BEGIN;

Time t2:
session1>_
session2>SELECT version, blc FROM wallet WHERE uid = 1; (also returns 1, 700)

Time t3:
session1>UPDATE wallet SET blc = 500, version = 2 WHERE uid = 1 AND version = 1;
// 700 – 200 = 500. Succeeds, reporting 1 row changed.
session2>_

Time t4:
session1>_
session2>UPDATE wallet SET blc = 500, version = 2 WHERE uid = 1 AND version = 1;
// (700 – 200 = 500). Blocks on session 1’s lock.

Time t5:
session1>COMMIT;
session2>_

Time t6:
session1>_
session2>ROLLBACK;
//UPDATE returns zero affected rows.

Optimistic control can be used even in autocommit mode or in separate database queries without using transactions which can cause performance degradation.

Conclusion

As we have seen, you should always keep in mind concurrency handling during the developing your application.

When ignored, serious damages may be caused to your business.

Within this post, we have discussed many approaches to properly handle concurrency where race condition situations can occur.

In general there is no perfect choice to use. It depends on your scalability requirements as well as on your business requirements.