Cloud Platform Engineering, DevOps, SRE, Kubernetes, KOPS

Transactions (mis)management: how to kill your app

In this article I’m going to write about transactions propagation. I will demonstrate how poorly designed application and misunderstanding of transactions propagation may cause your application to crash. Well, in fact I should say how to saturate connection pool and deadlock your database layer, but there is no much difference. The effect would be same – application would become unresponsive when trying to access and DB resource until locked connections are timed out.

To benefit most from this article, prior experience and knowledge of how Spring transactions management works is recommended. In general, when dealing with transactions I recommend to be familiar a one level below application abstraction of transactions management. That is, how transactions and isolation level work on database level. This is outside scope of this article, but if you would like to read on this topic, there is also very good and free online e-book available from InfoQ.

Propagating transactions

Transactions allow to treat an operation or group of operations as a single unit of work. In principle, if one operation within that group fails, or some conditions are not met then the system (database) is restored to pre-transaction state. In case of database transactions, all CUD (create, update, delete) operations would be rolled back to original state. It is not unusual for software systems to execute multiple functions and database queries within a scope of a transaction and when transaction spans across different layers of application. That’s where correct propagation of transactions becomes crucial. Spring Framework offers great support for transactions management. It makes developer’s life easier by taking away boiler plate of code which would need to be written otherwise to handle database transactions. In essence, Spring’s @Transactional makes transactions handling simpler and more readable. @Transactional takes multiple optional parameters which makes it a very powerful and flexible annotation. Examples of those parameters are: isolation, propagation, readOnly, noRollBackFor, etc. Spring’s Propagation Enum defines seven types of supported propagation:

  • MANDATORY – Support a current transaction, throw an exception if none exists.
  • NESTED – Execute within a nested transaction if a current transaction exists, behave like PROPAGATION_REQUIRED else.
  • NEVER – Execute non-transactionally, throw an exception if a transaction exists.
  • NOT_SUPPORTED – Execute non-transactionally, suspend the current transaction if one exists.
  • REQUIRED – Support a current transaction, create a new one if none exists.
  • REQUIRES_NEW – Create a new transaction, and suspend the current transaction if one exists.
  • SUPPORTS – Support a current transaction, execute non-transactionally if none exists Out of those seven propagation behaviour, there is one which if used incorrectly could cause your application to become unresponsive. That is, REQUIRES_NEW. As mentioned before, it is not unusual for transactions to span across different layers of application.

For a purpose of demonstrating how bad REQUIRES_NEW could be for you, I wrote a (simple web application)[https://github.com/resilient-data-systems/transactions-how-to-kill-your-app]. It’s a simple app which allows user to get a quote, get statistics of a quote, and buy, you will find more detailed description on a github. The application is divided into several layers:

Facade layer initiates transactions and hides services, Services executes ‘business logic’ and communicates with DAO, DAO abstracts database access and queries.

If you like to follow the examples, you can check out the project and run the examples in your favourite IDE (I’m using Eclipse). I will run through two examples, one I will simply call ‘good transactions’ and the other ‘bad transactions’ 😉 As I mentioned, it is a web application with JSON endpoints, you can make a calls with scripts added to the project, curl or browser (directly or with a great Chrome extension Postman). You can start the application by running Jetty (in Eclipse, go to src/test/java/ then to go package uk.co.resilientdatasystems.thtkya.web.jetty and do ‘Run as Java Application’ on Jetty.java)

During start-up of the app, an in-memory DB (H2) will be started and populated with sample data; a C3P0 connection pool with max 6 connections will also be created. Note: connection pooling is an important mechanism for improving database access performance as establishing connections, instead of reusing is expensive (time is the currency).

Let’s carry on, I assume you already started the app, if so, go to console view (Window -> Show View -> Console), right-click and clear all previous logs.

Do a GET request on http://localhost:9090/good/quote/1

You will see on your console following logs:

20:07:49.736 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.goodtransactions.GoodTransactionsServicesQuoteFacade.getQuote]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
20:07:49.736 RDS DEBUG DataSourceTransactionManager - Acquired Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@5b672df6] for JDBC transaction
20:07:49.739 RDS DEBUG DataSourceTransactionManager - Switching JDBC Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@5b672df6] to manual commit
20:07:49.745 RDS DEBUG JdbcTemplate - Executing prepared SQL query
20:07:49.746 RDS DEBUG JdbcTemplate - Executing prepared SQL statement [SELECT id, quote FROM quote WHERE id=?]
20:07:49.765 RDS DEBUG JdbcTemplate - Executing prepared SQL update
20:07:49.765 RDS DEBUG JdbcTemplate - Executing prepared SQL statement [UPDATE quote_stats SET viewed = viewed+ ? WHERE quote_id=? AND channel=?]
20:07:49.767 RDS DEBUG JdbcTemplate - SQL update affected 1 rows
20:07:49.768 RDS DEBUG JdbcTemplate - Executing prepared SQL query
20:07:49.768 RDS DEBUG JdbcTemplate - Executing prepared SQL statement [SELECT id, quote_id, viewed, bought, channel FROM quote_stats where quote_id = ? AND channel=?]
20:07:49.769 RDS DEBUG DataSourceTransactionManager - Initiating transaction commit
20:07:49.769 RDS DEBUG DataSourceTransactionManager - Committing JDBC transaction on Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@5b672df6]
20:07:49.770 RDS DEBUG DataSourceTransactionManager - Releasing JDBC Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@5b672df6] after transaction
20:07:49.770 RDS DEBUG DataSourceUtils - Returning JDBC Connection to DataSource

What’s happening?

A HTTP GET call is routed to QuoteEndpointOne.java, that in turn calls Facade. Once the call reaches GoodTransactionsServicesQuoteFacade a transaction kicks in (line 1), then a connection is acquired and series of SQL queries are executed. Following that, the transaction is committed and JDBC connection is released. Because the Facade is annotated with @Transactional, any method (and subsequent as well) called through it will run within a scope of transaction.

So, it was sufficient to annotate only Facade with @Transactional, right? Right. What would happen if you annotated Services or DAOs with plain @Transactional? By default, propagation defaults to REQUIRED and no new transaction will be started if there is already one.

Ok, at the begin of the post, warned of misuse of REQUIRES_NEW. Take a look at the package ending .badtransactions. If you clean the console and open in browser http://localhost:9090/bad/quote/1 (or simply: do HTTP GET) you should see following logs:

22:03:32.662 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.getQuote]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
22:03:32.662 RDS DEBUG DataSourceTransactionManager - Acquired Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@683ca3a4] for JDBC transaction
22:03:32.663 RDS DEBUG DataSourceTransactionManager - Switching JDBC Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@683ca3a4] to manual commit
22:03:32.663 RDS DEBUG JdbcTemplate - Executing prepared SQL query
22:03:32.663 RDS DEBUG JdbcTemplate - Executing prepared SQL statement [SELECT id, quote FROM quote WHERE id=?]
22:03:32.664 RDS DEBUG AnnotationTransactionAttributeSource - Adding transactional method 'BadTransactionsStatsService.incrementQuoteViews' with attribute: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT; ''
22:03:32.664 RDS DEBUG DataSourceTransactionManager - Suspending current transaction, creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsStatsService.incrementQuoteViews]
22:03:32.668 RDS DEBUG DataSourceTransactionManager - Acquired Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@6d30db75] for JDBC transaction
22:03:32.668 RDS DEBUG DataSourceTransactionManager - Switching JDBC Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@6d30db75] to manual commit
22:03:32.668 RDS DEBUG JdbcTemplate - Executing prepared SQL update
22:03:32.668 RDS DEBUG JdbcTemplate - Executing prepared SQL statement [UPDATE quote_stats SET viewed = viewed+ ? WHERE quote_id=? AND channel=?]
22:03:32.669 RDS DEBUG JdbcTemplate - SQL update affected 1 rows
22:03:32.669 RDS DEBUG JdbcTemplate - Executing prepared SQL query
22:03:32.669 RDS DEBUG JdbcTemplate - Executing prepared SQL statement [SELECT id, quote_id, viewed, bought, channel FROM quote_stats where quote_id = ? AND channel=?]
22:03:32.670 RDS DEBUG DataSourceTransactionManager - Initiating transaction commit
22:03:32.670 RDS DEBUG DataSourceTransactionManager - Committing JDBC transaction on Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@6d30db75]
22:03:32.670 RDS DEBUG DataSourceTransactionManager - Releasing JDBC Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@6d30db75] after transaction
22:03:32.670 RDS DEBUG DataSourceUtils - Returning JDBC Connection to DataSource
22:03:32.671 RDS DEBUG DataSourceTransactionManager - Resuming suspended transaction after completion of inner transaction
22:03:32.671 RDS DEBUG DataSourceTransactionManager - Initiating transaction commit
22:03:32.671 RDS DEBUG DataSourceTransactionManager - Committing JDBC transaction on Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@683ca3a4]
22:03:32.671 RDS DEBUG DataSourceTransactionManager - Releasing JDBC Connection [com.mchange.v2.c3p0.impl.NewProxyConnection@683ca3a4] after transaction
22:03:32.671 RDS DEBUG DataSourceUtils - Returning JDBC Connection to DataSource

If you follow the log you can figure out following:

  • When we hit getQuote of BadTransactionsServicesQuoteFacade.java a transaction is started (lets call it A).
  • Then we call getQuote of BadTransactionsServicesQuoteService.java (still same transaction in use)
  • Next one is getStats of BadTransactionsStatsService. This one is annotated @Transactional(propagation=Propagation.REQUIRES_NEW). What happens here:
    • Take a look at line 6 of logs: Adding transactional method ‘BadTransactionsStatsService.incrementQuoteViews’ with attribute: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT; ”
    • Line 7: Initial transaction is suspended and new transaction is created (B).
    • Line 8: A new JDBC connection is acquired
  • Later transaction B completes, its JDBC connection is released. Transaction A is resumed and completed and its JDBC connection is released as well. It could be represented as following:

What’s the problem? The problem is that pool of database connections is finite – no matter how big you set it to be because databases themselves have limits. If transaction B cannot acquire new JDBC connection, then it will wait until there is a free database connection. As a result, any operation depending on database will be locked.

REQUIRES_NEW deadlock demo How to demonstrate differences between ‘goodtransactions’ and ‘badtransactions’? In the script folder of the project there is a script concurrentRequests.py. The script takes three arguments: number of threads, number of requests per thread and target URL.

Start the app, and from terminal execute the script:

marcin@RDS:~/development/rds/java/transactions-how-to-kill-your-app/scripts$ python concurrentRequests.py 20 100 http://localhost:9090/good/quote/buy/2
getting there...
done
marcin@RDS:~/development/rds/java/transactions-how-to-kill-your-app/scripts$

This script started 20 threads, and each thread requested http://localhost:9090/good/quote/buy/2 if you point your browser to http://localhost:9090/good/quote/stats/2 you will see in JSON response following fragment:

{"id":7,"quoteId":2,"viewed":0,"bought":2000,"channel":"WEB"}

“bought”:2000 – means all request have been completed.

Now, clear your Eclipse console and execute same script but pointing to different endpoint:

marcin@RDS:~/development/rds/java/transactions-how-to-kill-your-app/scripts$ python concurrentRequests.py 20 100 http://localhost:9090/bad/quote/buy/3
getting there...

Then if you check console logs you will see it will stop at one point:

08:07:42.685 RDS DEBUG DataSourceTransactionManager - Suspending current transaction, creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsStatsService.incrementQuoteBuys]
08:07:42.685 RDS DEBUG DataSourceTransactionManager - Suspending current transaction, creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsStatsService.incrementQuoteBuys]
08:07:42.688 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.688 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.689 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.690 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.691 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.692 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.693 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''
08:07:42.693 RDS DEBUG DataSourceTransactionManager - Creating new transaction with name [uk.co.resilientdatasystems.thtkya.badtransactions.BadTransactionsServicesQuoteFacade.buy]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT; ''

At this point, the DAO cannot get any new JDBC connection from connection pool because all resources are busy and cannot be freed – we got a deadlock.I configured C3P0 so that unreturned connections are timed out after 30 seconds. If you wait a bit python will start printing error messages and in your Eclipse console you will see:

Aug 13, 2014 8:08:16 AM com.mchange.v2.resourcepool.BasicResourcePool removeResource
INFO: A checked-out resource is overdue, and will be destroyed: com.mchange.v2.c3p0.impl.NewPooledConnection@5798795f
Aug 13, 2014 8:08:16 AM com.mchange.v2.resourcepool.BasicResourcePool removeResource
INFO: A checked-out resource is overdue, and will be destroyed: com.mchange.v2.c3p0.impl.NewPooledConnection@774d4142
Aug 13, 2014 8:08:16 AM com.mchange.v2.resourcepool.BasicResourcePool removeResource
INFO: A checked-out resource is overdue, and will be destroyed: com.mchange.v2.c3p0.impl.NewPooledConnection@30c5a84
Aug 13, 2014 8:08:16 AM com.mchange.v2.resourcepool.BasicResourcePool removeResource
INFO: A checked-out resource is overdue, and will be destroyed: com.mchange.v2.c3p0.impl.NewPooledConnection@5d38d07b
Aug 13, 2014 8:08:16 AM com.mchange.v2.resourcepool.BasicResourcePool removeResource
INFO: A checked-out resource is overdue, and will be destroyed: com.mchange.v2.c3p0.impl.NewPooledConnection@45de1237
Aug 13, 2014 8:08:16 AM com.mchange.v2.resourcepool.BasicResourcePool removeResource
INFO: A checked-out resource is overdue, and will be destroyed: com.mchange.v2.c3p0.impl.NewPooledConnection@74120f99

After the ‘overdue resource’ is destroyed the application will start processing other requests until it deadlocks itself again.

Summary

This article demonstrated how improper use of transactions could harm your application. Just image what would happen if you released a new feature without doing any performance test, and that feature would start new transaction under existing one (REQUIRES_NEW). If you run the demo you know what would happen 😉

There is another point I would like to touch: consistency and agreed convention on declaring transactions. I see it as a bad practice to declare transactions on various layers of application – make your mind and stick to it. Declare your transactions on Service layer, Facade or whatever you call it. My personal preference is to annotate Facades as that would cover all services running behind Facades. Besides, if you need one-off service/method to be treated differently, you can always annotate that service/method with for example @Transactional(propagation=NESTED)