Saturday, April 11, 2015

Microservices and transactions - an update

It's almost a year since I wrote my first thoughts on how transactions fit into the world of microservices and it's time for an update. I've had the pleasure of working in the field of fault tolerance and distributed systems for almost 30 years. In that time I've worked with some great friends and colleagues, within the same companies and across different ones, on transactions, both traditional atomic (ACID) transactions and extended transactions. Back when I was doing my PhD on transactions and replication, weak consistency replication was in its infancy but there was already a range of extended transaction protocols.

Over the years we've seen these transaction protocols move from research into standards and industrial usage, with efforts such as the OMG's Additional Structuring Mechanisms for the OTS and WS-Transactions from OASIS. Although not as pervasive as ACID transactions, these additions to a developer's repertoire have seen some uptake. Now I'm not someone who believes transactions of any form should be used in all situations, but neither do I believe that they are so bad as to be completely useless. Yet throughout the work we did for both Web Services and REST there were some groups that vehemently fought against transactions, often stating that applications should ensure that any transactional changes to state are isolated within a service and do not span multiple services, i.e., only local transactions should be supported.

As I said earlier, I don't believe that transactions, or even distributed transactions, are necessarily right for every application, or even for some applications that use them today. Transactions (let's assume ACID for now) provide a nice and simple model for building applications, particularly if the implementation you use supports nested transactions. It's only natural for a developer who finds this structuring mechanism useful to expand it across objects, services and even machines. In a closely coupled environment where transactions last a few seconds this continues to be a useful approach. However, as we've seen and discussed many times before, they become problematic in loosely coupled environments. Hence the development of certain extended transaction models.

Structuring your applications so that all of the state changes which occur do so within a single object or service is often a lot easier said than done, especially if you are building from components (services or objects) that have been developed over time by different groups or companies. It's easier to do if you build a Big Ball of Mud, which hopefully is not what you want to accomplish by going down the microservices route! Whether your stateful services interact directly with each other via RPC, say, or through a reliable, yet asynchronous messaging bus with queues and topics, such as JMS, it is fairly inevitable that your applications will have state updates which need to occur as some unit of work (note I didn't say "atomic" there). Some of these units of work will need to be atomic (though not necessarily ACID). Some will be fine with relaxed constraints, such as using forward compensation-based approaches. Yes, I'm sure we'll hear people suggesting that atomic transactions aren't useful at all in these environments due to performance problems, but if they spent the time understanding the kinds of optimisations that mature transaction implementations have had in place for decades, then perhaps they'd realise that although there may be some overhead, it's not as black-and-white as they may believe, or want you to believe. And please realise that XA is just one specific standard for transactions - it has its pros and cons, but any downsides you may have with XA shouldn't be assumed to carry over to the plethora of other transaction models and standards out there!
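To make the idea of a relaxed, compensation-based unit of work a little more concrete, here is a minimal sketch in Python. It is not any particular extended transaction protocol or product API: the `Step` type, the `run_unit_of_work` function and the in-memory "services" are all hypothetical names invented for illustration. Each step pairs some work with a compensating action, and if a later step fails the completed steps are compensated in reverse order.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One piece of work on a (hypothetical) service, plus how to undo it."""
    name: str
    action: Callable[[], None]
    compensate: Callable[[], None]


def run_unit_of_work(steps: List[Step]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        try:
            step.action()
            completed.append(step)
        except Exception:
            # Undo only the steps that actually finished, newest first.
            for done in reversed(completed):
                done.compensate()
            return False
    return True


# Hypothetical in-memory stand-ins for two services.
inventory = {"widgets": 5}
payments: List[int] = []


def reserve_stock() -> None:
    inventory["widgets"] -= 1


def release_stock() -> None:
    inventory["widgets"] += 1


def charge_card() -> None:
    # Simulate the second service failing.
    raise RuntimeError("payment service unavailable")


def refund_card() -> None:
    payments.clear()


ok = run_unit_of_work([
    Step("reserve", reserve_stock, release_stock),
    Step("charge", charge_card, refund_card),
])
```

Note what this does and does not give you: the stock reservation is visible to others until it is compensated, so there is no isolation, and the sketch assumes compensations themselves never fail. That is the kind of trade-off you accept when you relax the constraints.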

Let's return to the original topic: microservices and transactions. Where do (or should) the two come together? What I really don't want to repeat with microservices are the anti-transaction arguments we had for SOA. Get over it! Some applications will find transactions useful, whether atomic (ACID) or extended. Therefore, let's just assume that point for the rest of this discussion. As you develop your microservice(s) and hopefully take the approach of making each "do one thing well" as well as "be as simple as possible yet no simpler", you'll want to string them together; you'll want to have an invocation on one service trigger an invocation on another, or even more than one; you'll want to update the state of a number of services together. (I'll talk about weak consistency in a separate article.) You'll need to determine whether or not these updates have to occur atomically - just recognise the trade-offs this may mean for your application and services. As I've mentioned already, atomic transactions (local or distributed/global) aren't your only option though, and one of these additional protocols could be better suited to your services and the way in which they have been constructed. And of course you can mix-and-match: just because some groupings of services may be better suited to a compensation-based model does not preclude you from using atomic transactions elsewhere - or even with the same services for different operations.
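For the cases where the updates really do have to occur atomically, the classic approach is a two-phase commit across the participating services. The following is a deliberately simplified sketch in Python, assuming hypothetical `Participant` objects with `prepare`, `commit` and `rollback` operations. It leaves out everything that makes a real transaction manager hard to write (durable logging, failure recovery, timeouts and the optimisations mentioned earlier), so treat it as an illustration of the shape of the protocol only.

```python
from typing import List


class Participant:
    """Hypothetical service-side resource taking part in an atomic unit of work."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        # Phase 1: vote. A real participant would make its changes durable
        # before voting yes; here we only record the state in memory.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def rollback(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Commit everywhere only if every participant votes yes in phase 1."""
    if all(p.prepare() for p in participants):
        # Phase 2 (commit): every vote was yes.
        for p in participants:
            p.commit()
        return True
    # Phase 2 (rollback): at least one participant voted no.
    for p in participants:
        p.rollback()
    return False


orders, billing = Participant("orders"), Participant("billing")
committed = two_phase_commit([orders, billing])

stock, shipping = Participant("stock"), Participant("shipping", can_commit=False)
aborted = two_phase_commit([stock, shipping])
```

In the first call both participants end up committed; in the second, the single "no" vote from the shipping participant causes both to roll back. Mixing this with a compensation-based grouping elsewhere in the same application is perfectly reasonable.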

In short, what I hope anyone developing microservices will get from this is an understanding that transactions, both local and global, are not anathema to SOA/microservices. They may not be the default mechanism for you to choose when building your services, but they most certainly should be part of a good developer's palette. Having to implement equivalent capabilities in your infrastructure or the services themselves (consistency in the presence of arbitrary failures, opaque recovery for services, modular structuring mechanisms, spanning different communication patterns, etc.) is something you shouldn't have to do because it's a monumental effort in its own right. A transaction manager microservice is something that should be available in many enterprise environments!