Monday, October 17, 2016

Achieving Consistency in a Microservices Architecture

Microservices are loosely coupled, independently deployable services. Although a well-designed service will not operate directly on shared data, it may still need to ensure that the data ultimately remains consistent. For example, the requirement to debit an account to pay for an online purchase creates a dependency between the customer and supplier account balances and the stock database. Historically, a distributed transaction has been used to maintain this consistency, which in turn employs some flavour of distributed locking during the data update phase. This dependency introduces tight coupling, higher latencies and greater lock contention, especially when failures occur, because the locks cannot be released until all services involved in the transaction become available again. Whilst the user of the system may be satisfied with this state of affairs, it should not be the only possible interaction pattern. A more common approach employs the notion of eventual consistency, where the data may sometimes be in an inconsistent state but will eventually reach the desired state: in our example, the stock level has been reduced, the payment has been processed and the item has been delivered.

I have, from time to time, seen blogs and articles that recognise this problem and suggest solutions, but they seem to mandate either that service calls naturally map onto a single data update or that the service writer picks one of the services to do the coordination, taking on the responsibility of ensuring that all services involved in the interaction eventually reach their target consistent state. See, for example, this quote from the article Distributed Transactions: The Icebergs of Microservices: "you have to pick one of the services to be the primary handler for the event. It will handle the original event with a single commit, and then take responsibility for asynchronously communicating the secondary effects to other services". This sounds feasible, but now you have to start thinking about how to provide reliability guarantees in the presence of failures, how to orchestrate services, and how to store extra state with every persistent update so that the activity coordinator can continue the interaction after failures have been resolved. In other words, whilst this is a workable approach, it hides much of the complexity involved in reliably recovering from system and network failures, which at scale will surely happen. A more robust design for microservice architectures is to delegate the coordination component of the workflow to a specialised service explicitly designed for this kind of task.

We have been working in this area for many years, and one set of ideas and protocols that we believe is particularly suited to microservices architectures is the use of compensatable units of work to achieve eventual consistency guarantees in this kind of loosely coupled, service-based environment. I produced a write-up of the approach and accompanying protocol for use in REST-based systems back in 2009 (Compensating RESTful Transactions), based on earlier work done by Mark Little et al. Mark also wrote some interesting blogs in 2011 (When ACID is too strong and Slightly alkaline transactions if you please ...) about alternatives to ACID when various constraints are loosened, and his summary is relevant to the problems facing microservice architects.

The use of compensations, coordinated by a dedicated service, will give all the benefits suggested in Graham Lea's article referred to earlier, but with the additional guarantees of consistency, reliability, manageability, reduced complexity etc in the presence of failures. The essence of the idea is that the prepare step is skipped and instead the services involved in the interaction register compensation actions with a dedicated coordinator:

  1. The client creates a coordination resource (identified via a resource url)
  2. The client makes service invocations passing the coordinator url by some (unspecified) mechanism
  3. The service registers its compensate logic with the coordinator and performs the service request as normal
  4. When the client is done it tells the coordinator to complete or cancel the interaction
    • in the complete case the coordinator has nothing to do (except clean up actions)
    • in the cancel case the coordinator initiates the undo logic. Services are not allowed to fail this step. If they are not available or cannot compensate for the activity immediately the coordinator will keep on trying until all services have compensated (and only then will it clean up)
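The steps above can be sketched in a few lines of Java. This is only an illustration under loud assumptions: the class and method names (CompensationCoordinator, register, complete, cancel) are hypothetical, everything lives in memory, and a real coordinator would be a separate service addressed by URL that persists registrations and waits between retries.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical in-memory coordinator; names and shape are assumptions, not the JDI protocol API. */
class CompensationCoordinator {
    /** A compensation returns true once the undo has succeeded, false if it must be retried. */
    interface Compensation {
        boolean compensate();
    }

    private final Deque<Compensation> registered = new ArrayDeque<>();

    // Step 3: a service registers its undo logic before performing the request.
    void register(Compensation compensation) {
        registered.push(compensation); // most recent registration is undone first
    }

    // Step 4, complete case: nothing to undo, only clean up.
    void complete() {
        registered.clear();
    }

    // Step 4, cancel case: keep trying until every compensation has succeeded.
    // Returns the number of passes needed. A real coordinator would pause between passes.
    int cancel() {
        int rounds = 0;
        while (!registered.isEmpty()) {
            rounds++;
            Deque<Compensation> stillPending = new ArrayDeque<>();
            for (Compensation compensation : registered) {
                if (!compensation.compensate()) {
                    stillPending.add(compensation);
                }
            }
            registered.clear();
            registered.addAll(stillPending);
        }
        return rounds;
    }
}

class CompensationDemo {
    public static void main(String[] args) {
        int[] stock = {10};
        int[] balance = {100};
        int[] attempts = {0};
        CompensationCoordinator coordinator = new CompensationCoordinator();

        // Stock service: register the undo, then do the work.
        coordinator.register(() -> { stock[0] += 1; return true; });
        stock[0] -= 1;

        // Payment service: its first compensation attempt fails, as if the service were down.
        coordinator.register(() -> {
            if (attempts[0]++ == 0) {
                return false;
            }
            balance[0] += 25;
            return true;
        });
        balance[0] -= 25;

        int rounds = coordinator.cancel();
        // prints "stock=10 balance=100 rounds=2"
        System.out.println("stock=" + stock[0] + " balance=" + balance[0] + " rounds=" + rounds);
    }
}
```

Note that the retry loop never gives up, which mirrors the rule that services are not allowed to fail the compensate step.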

We do not have an implementation of this (JDI) protocol but we do have an implementation of an ACID variant of it (called RTS) which has had extensive exposure in the field (and this can/will serve as the basis for the implementation of the JDI protocol). The documentation for RTS is available at our project web site. The nice thing about this work is that it can integrate seamlessly into Java EE environments and additionally is available as a WildFly subsystem. This latter feature means that it can be packaged as a WildFly Swarm microservice using the WildFly Swarm Project Generator. In this way if your microservices are using REST for API calls then they can make immediate use of this feature.

We also have a working prototype framework for doing compensations in a Java SE environment. The API is available on GitHub, where we also provide a number of quickstarts showing how to use it.

Finally, we have a solution where we allow the compensation data to be stored at the same time as the data updates in a single (one-phase) transaction, thus ensuring that the coordinator will have access to the compensation data. This technique works particularly well with document-oriented databases such as MongoDB.
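As a rough sketch of that last idea (not our implementation), the example below keeps the business data and the data needed to undo it in the same document, so one atomic write covers both. All names here (AccountDoc, AccountStore, pendingRefund) are hypothetical, an in-memory map stands in for the database, and the account is assumed to exist before it is debited.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical account document: business data and compensation data live side by side. */
final class AccountDoc {
    final int balance;
    final int pendingRefund; // compensation data: amount to give back on cancel, 0 if none

    AccountDoc(int balance, int pendingRefund) {
        this.balance = balance;
        this.pendingRefund = pendingRefund;
    }
}

/** Toy document store; each compute() call stands in for one atomic single-document write. */
class AccountStore {
    private final Map<String, AccountDoc> docs = new ConcurrentHashMap<>();

    void open(String id, int balance) {
        docs.put(id, new AccountDoc(balance, 0));
    }

    // The debit and the data needed to undo it are written together in one update.
    void debit(String id, int amount) {
        docs.compute(id, (key, doc) -> new AccountDoc(doc.balance - amount, doc.pendingRefund + amount));
    }

    // Complete: keep the debit and drop the compensation data.
    void confirm(String id) {
        docs.compute(id, (key, doc) -> new AccountDoc(doc.balance, 0));
    }

    // Cancel: apply the stored compensation. Repeating it is harmless because the refund is zeroed.
    void compensate(String id) {
        docs.compute(id, (key, doc) -> new AccountDoc(doc.balance + doc.pendingRefund, 0));
    }

    AccountDoc get(String id) {
        return docs.get(id);
    }
}
```

Because the compensation data travels with the update, a coordinator that retries the cancel step can call compensate more than once without refunding twice.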

Monday, June 13, 2016

Karaf Integration

Narayana 5.3.2.Final was introduced in Karaf 4.1.0-SNAPSHOT. You need to build Karaf from https://github.com/apache/karaf

Configuration

The Narayana configuration file can be found in <karaf-4.1.0-SNAPSHOT>/etc/org.jboss.narayana.cfg

Quickstart

First you need to install the Narayana transaction manager feature and the related features.
 karaf@root()> repo-add mvn:org.ops4j.pax.jdbc/pax-jdbc-features/0.8.0/xml/features
 karaf@root()> feature:install pax-jdbc-pool-narayana jdbc pax-jdbc-h2 transaction-manager-narayana jndi
 karaf@root()> jdbc:ds-create --driverName H2-pool-xa -dbName test test
 karaf@root()> bundle:install -s mvn:org.jboss.narayana.quickstarts.osgi/osgi-jta-example/5.3.2.Final

Run the commit example

karaf@root()> narayana-quickstart:testCommit

Run the recovery example

karaf@root()> narayana-quickstart:testRecovery -f
This crashes Karaf and leaves a transaction record to be recovered. You then need to restart Karaf and run the testRecovery command again.
bin/karaf
karaf@root()> narayana-quickstart:testRecovery

Admin tools

We are working on JBTM-2624 [1] and support the following commands:
narayana:refresh        Refresh the view of the object store
narayana:types          List record types
narayana:select type    Select a particular transaction type
narayana:ls [type]      List the transactions
narayana:attach id      Attach to a transaction log
narayana:detach id      Detach from the transaction log
narayana:forget idx     Move the specified heuristic participant back to the prepared list
narayana:delete idx     Delete the specified heuristic participant

[1] https://issues.jboss.org/browse/JBTM-2624

Friday, June 3, 2016

Narayana in Spring Boot

It’s been available for over a month now, so some of you might have used it already. But I’m writing this post in order to give a better explanation of how to use the Narayana transaction manager in your Spring Boot application.

First of all, Narayana integration was introduced in Spring Boot 1.4.0.M2, so make sure you’re up to date. At the time of writing, the most recent available version is 1.4.0.M3.

Once you have versions sorted out, it’s a good idea to try it out. And in the rest of this post I’ll explain the quickstart application and what it does. After that you should be good to go with incorporating it in your code. The source code of this quickstart can be found in our GitHub repository [1].

Enabling Narayana

To enable the Narayana transaction manager, add its starter dependency to your pom.xml:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jta-narayana</artifactId>
</dependency>
After that, Narayana will become the default transaction manager in your Spring Boot application. From then on, simply use JTA or Spring annotations to manage the transactions.

Narayana configuration

A subset of Narayana configuration options is available via Spring’s application.properties file. It is the most convenient way to configure Narayana if you don’t need to change many of its settings. For the list of possible options, see the properties prefixed with spring.jta.narayana in [2].
In addition, all traditional Narayana configuration options are also available. You can place jbossts-properties.xml in your application’s jar as well as use our configuration beans.

Quickstart explanation

Our Spring Boot quickstart [1] is a simple Spring Boot application. By exploring its code you can see how to set up Narayana for Spring Boot as well as how to configure it with the application.properties file.
We have implemented three scenarios for you to try: commit, rollback, and crash recovery. They can be executed using the Spring Boot Maven plugin. Please see the README.md for the exact steps for executing each example.

The commit and rollback examples are very straightforward and almost identical. They both start the transaction, save an entry containing the string you passed to the database, send a JMS message, and commit or roll back the transaction.
The commit example output should look like this:
Entries at the start: []
Creating entry 'Test Value'
Message received: Created entry 'Test Value'
Entries at the end: [Entry{id=1, value='Test Value'}]
And the rollback example output should look like this:
Entries at the start: []
Creating entry 'Test Value'
Entries at the end: []
The crash recovery scenario starts off the same as the other two, but then crashes the application between the prepare and commit stages. Later, once you restart the application, the unfinished transaction is recovered. Note that in this example we’ve added a DummyXAResource in order to allow us to crash the application at the right time. Feel free to ignore it, because it is there only for the purpose of this example.
After the application has crashed, your console output should look like this:
Entries at the start: []
Creating entry 'Test Value'
Preparing DummyXAResource
Committing DummyXAResource
Crashing the system
And after it is recovered the following should be printed:
Entries at the start: []
DummyXAResourceRecovery returning list of resources: [org.jboss.narayana.quickstart.spring.DummyXAResource@5bc98bd2]
Committing DummyXAResource
Message received: Created entry 'Test Value'
DummyXAResourceRecovery returning list of resources: []
Recovery completed successfully
Entries at the end: [Entry{id=1, value='Test Value'}]
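To make the gap between prepare and commit more concrete, here is a minimal, self-contained sketch of the ordering that crash recovery relies on. It is not the quickstart code and does not use the real javax.transaction.xa.XAResource interface: TwoPhaseParticipant, TinyCoordinator and RecoveryDemo are hypothetical names, and an in-memory list stands in for the durable transaction log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an XA resource; deliberately simpler than the real XAResource API. */
interface TwoPhaseParticipant {
    boolean prepare(); // vote: true means "ready to commit"
    void commit();
    void rollback();
}

/** Toy coordinator whose in-memory list stands in for a durable transaction log. */
class TinyCoordinator {
    private final List<TwoPhaseParticipant> preparedLog = new ArrayList<>();

    // Phase one: collect votes, then record who is waiting for a commit.
    // Returns false if any participant voted no and everything was rolled back.
    boolean prepareAll(List<TwoPhaseParticipant> participants) {
        for (TwoPhaseParticipant participant : participants) {
            if (!participant.prepare()) {
                participants.forEach(TwoPhaseParticipant::rollback);
                return false;
            }
        }
        preparedLog.addAll(participants);
        return true;
    }

    // Phase two, also what a recovery pass does: commit whatever the log says is outstanding.
    int commitOutstanding() {
        int committed = 0;
        for (TwoPhaseParticipant participant : preparedLog) {
            participant.commit();
            committed++;
        }
        preparedLog.clear();
        return committed;
    }
}

class RecoveryDemo {
    // Builds a participant that records what happens to it in the shared event list.
    static TwoPhaseParticipant named(String name, List<String> events) {
        return new TwoPhaseParticipant() {
            public boolean prepare() { events.add("prepare " + name); return true; }
            public void commit() { events.add("commit " + name); }
            public void rollback() { events.add("rollback " + name); }
        };
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        TinyCoordinator coordinator = new TinyCoordinator();
        coordinator.prepareAll(Arrays.asList(named("database", events), named("dummy", events)));
        events.add("crash"); // a crash here leaves two prepared participants in the log
        coordinator.commitOutstanding(); // a later recovery pass finishes the job
        // prints "[prepare database, prepare dummy, crash, commit database, commit dummy]"
        System.out.println(events);
    }
}
```

The point of the sketch is only that the commit decision is recorded before any participant is told to commit, which is what lets a restarted application pick up where it left off.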
Hope you'll enjoy using our transaction manager with Spring. And as always, if you have any insights or requests, feel free to post them on our forum.

Wednesday, February 3, 2016

Narayana Updates

Greetings from the Narayana team!

5.2.13.Final Released

We are very proud to announce the latest release of our project, and it's available for download now from http://narayana.io/
The release notes for this version are available here:
https://issues.jboss.org/browse/JBTM/fixforversion/12329358/?selectedTab=com.atlassian.jira.plugins.jira-development-integration-plugin:release-report-tabpanel
This release of Narayana was integrated into the WildFly application server as commit https://github.com/wildfly/wildfly/commit/f91666c2b46617229dd041e512488a379c83f16f under https://issues.jboss.org/browse/WFLY-6092. This means it will be in WildFly 10.1.0.Final (based on that project's currently allocated fixVersion in Jira).

5.2.12.Final Released

It's also worth mentioning that 5.2.12.Final was released as part of WildFly 10 Final, which is great too. That work can be seen via https://issues.jboss.org/browse/WFLY-5938.
The release notes for that version can be seen over here:
https://issues.jboss.org/browse/JBTM/fixforversion/12329351/?selectedTab=com.atlassian.jira.plugins.jira-development-integration-plugin:release-report-tabpanel

Upcoming work and requests for feedback

We are currently working on integration with various other frameworks. We could really do with some help understanding what features would be most beneficial to you. The areas we are looking at in particular are Spring, Camel and Karaf but we would be happy to discuss those or almost anything transaction related over on our forum:
Users: https://developer.jboss.org/en/jbosstm/content?filterID=contentstatus%5bpublished%5d~objecttype~objecttype%5bthread%5d
Implementers: https://developer.jboss.org/en/jbosstm/dev/content?filterID=contentstatus%5bpublished%5d~objecttype~objecttype%5bthread%5d

Performance

Alongside the usual selection of enhancements and bug fixes, we have been working on sharing performance figures comparing ourselves against a selection of other projects available in the open source community, with a view to checking that the release remains competitive. We haven't particularly been working on performance enhancements, but rather on the development of a microbenchmark of 2PC that is fair and consistent in our environment. You will almost certainly see different numbers in your particular environment, based on the spec of your machine etc., but we would expect the general ranking to be consistent. The tool we have found works for us is called JMH (a microbenchmark harness created by the OpenJDK project team, available from http://openjdk.java.net/projects/code-tools/jmh/).

We have attempted to configure each product on an equal footing by choosing sensible defaults for each tunable parameter and by ensuring that recovery is enabled, although we do configure Narayana with the journal store, which is our best performing transaction log storage mechanism. If you have any recommendations for other transaction managers or how to tune the configuration then please let us know so that we can update our test job.

The benchmark runs a transaction containing two dummy resources.

We will let the figures speak for themselves; suffice it to say that as more and more threads are thrown at the workload we scale better, showing that we have excellent control over parallelism.

The graph for this run is: