Monday, May 6, 2013

When you need ACID and can't get it ...

What happens when you need traditional ACID transactions across your resource managers and they won't behave? By that I mean they're not two-phase aware so can't be driven by a coordinator to achieve all of the necessary properties. Of course if you've just one such resource manager (let's call it one-phase aware for now) then you can probably make use of the LRCO optimisation.

However, what if you've got more than one such resource manager? Well if you've checked out JBossTS in the past then you'll know that we allow you to enlist multiple one-phase resource managers, but there's a very big caveat. This really isn't an option that anyone should choose. Fortunately I blogged about a better option over 6 years ago (!): compensation transactions. There were a few follow up entries, but the original one is the place to start.

Now what got me revisiting this was the article I wrote earlier on banks, ACID and BASE. It wasn't so much the notion that banks don't use ACID as the fact that we still see a lot of popular (NoSQL) databases that don't support transactions. Of course some use transactions internally (local transactions), but what I'm talking about is what is often referred to as global transactions: those transactions that span multiple resource managers. When you want to send a message, update a traditional database and update a NoSQL instance all within the scope of the same transaction, in most cases you're out of luck. And this is something that many enterprise customers want to do or will want to do soon.

That is unless you can use LRCO, which may not be possible if the resource manager(s) don't support the necessary recovery semantics.

Therefore, you're left in a situation that has very few options. One would be to ignore the problem and assume that because failures are rare (they are rare, right?) it's unlikely to ever be a problem. Personally I wouldn't recommend this.

Another option would be to resolve any problems manually. Again, since failures are rare (we're sure, right?) the chances of having to do this are slim and if you do have to resolve then it's a good enough trade-off. Of course you've got to hope that you've got enough information to detect and handle the recovery. Again, this isn't something I'd recommend unless you're a company that can afford to employ people who do nothing each day other than resolve these problems. (Yes, they do exist.)

My recommended option is to use compensation transactions, as I outlined many years ago. They will automate the recovery (and detection) as well as allow you to seamlessly integrate with a range of resource managers which are "well behaved". I'm hoping that we'll get a chance to try these out soon with some enterprise applications that use NoSQL implementations that don't support global transactions. Once we've done this then it'll be a good reason for one of the team to come back and write something here.

No comments: