Integration: Messaging

We have been exploring ways to share customer information between these two applications:

How can these applications share customer information?

The solutions we've covered so far are:

  1. Shared database
  2. Extract/transform/load
  3. Web services (RPC/REST)

Web services are a nice way to decouple the applications, because they allow the applications to define and share a contract rather than taking a dependency on implementation details. But they do introduce other forms of coupling, especially around reliability.

Messaging

Messaging allows the applications to exchange information asynchronously, using well-defined contracts. The Marketing application would keep its own list of customers, and would accept messages from the Web Store application. Architecturally, it will look like this:

We can use queues to asynchronously share information between applications

The Web Store team would define the structure of a message - such as CustomerRegistered. They'd probably document not just the structure of the message, but some of the semantics around it (what does "registered" mean?).

The Marketing team would also define a message, such as CreateCustomer, which it would accept, along with the semantics of what "create" means. Note that our tone has changed from describing an event (something that has happened) to describing a command (something we want done).
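To make the event/command distinction concrete, here is a minimal sketch of what the two contracts might look like. Only the message names CustomerRegistered and CreateCustomer come from the discussion above; the field names, the JSON envelope and the to_wire helper are assumptions made up for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CustomerRegistered:
    """Event owned by the Web Store team: something that has already happened."""
    customer_id: str      # hypothetical field
    email: str            # hypothetical field
    full_name: str        # hypothetical field
    registered_at: str    # hypothetical field, ISO 8601 timestamp


@dataclass(frozen=True)
class CreateCustomer:
    """Command owned by the Marketing team: a request to do something."""
    email: str            # hypothetical field
    display_name: str     # hypothetical field
    source: str           # hypothetical field: which system asked for this


def to_wire(message) -> str:
    """Serialise a message as JSON, tagged with its type name (assumed envelope)."""
    return json.dumps({"type": type(message).__name__, "body": asdict(message)})


event = CustomerRegistered("42", "alice@example.com", "Alice Smith", "2011-04-27T09:30:00Z")
wire = to_wire(event)
```

The event is named in the past tense and carries what the Web Store knows; the command is named in the imperative and carries only what Marketing needs. Each team can change its internals freely as long as its own message keeps its shape and meaning.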

Integration using messages

The Web Store would behave like this:

  1. A user clicks the "register" button on a web page
  2. The customer is saved locally to the Web Store database
  3. The CustomerRegistered event is written to the queue.

The queue (generally something like MSMQ, ActiveMQ or RabbitMQ) would ideally be local to the machine Web Store is being served from. Web Store can then continue to process other requests. It doesn't care what happened to the event. It doesn't know how other applications intend to use the events. It just writes to the queue, and moves on.
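The Web Store side of this can be sketched roughly as follows. This is not MSMQ, ActiveMQ or RabbitMQ code: Python's in-process queue.Queue stands in for the local queue and an in-memory SQLite database stands in for the Web Store database. The register_customer function, the table layout and the message fields are all assumptions for illustration.

```python
import json
import queue
import sqlite3

# Stand-ins (assumptions): an in-process queue instead of a real local
# message queue, and in-memory SQLite instead of the Web Store database.
local_queue = queue.Queue()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, full_name TEXT)")


def register_customer(email: str, full_name: str) -> None:
    """Handle a click on the "register" button (hypothetical handler)."""
    # Step 2: save the customer locally.
    with db:
        db.execute(
            "INSERT INTO customers (email, full_name) VALUES (?, ?)",
            (email, full_name),
        )
    # Step 3: write the event to the local queue and move on. Nothing here
    # knows who will read the event, or whether anyone is online to read it.
    event = {"type": "CustomerRegistered", "email": email, "full_name": full_name}
    local_queue.put(json.dumps(event))


register_customer("alice@example.com", "Alice Smith")
```

Note that this sketch does not make the database write and the queue write atomic; a real implementation would need to decide what happens if the process dies between the two.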

Somehow, the message will make its way from a local queue on the Web Store machine to a queue on another server running some kind of transformer application. The transformer will handle the CustomerRegistered message, apply integration logic, transform it to a CreateCustomer command message, and write it to a queue destined for the Marketing application.
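At its simplest, the transformer is just a function from one message to another. The sketch below assumes JSON messages with made-up field names, and the "integration logic" (normalising the email address and stamping a source) is invented purely to show where such logic would live.

```python
import json


def transform(raw_event: str) -> str:
    """Turn a CustomerRegistered event into a CreateCustomer command."""
    event = json.loads(raw_event)
    if event.get("type") != "CustomerRegistered":
        raise ValueError(f"unexpected message type: {event.get('type')!r}")

    # Hypothetical integration logic: tidy up the email address and record
    # where the command came from.
    command = {
        "type": "CreateCustomer",
        "email": event["email"].strip().lower(),
        "display_name": event["full_name"],
        "source": "web-store",
    }
    return json.dumps(command)


incoming = json.dumps(
    {"type": "CustomerRegistered", "email": " Alice@Example.com ", "full_name": "Alice Smith"}
)
outgoing = json.loads(transform(incoming))
```

Neither application needs to know this function exists. The Web Store only knows about its event, and Marketing only knows about its command.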

From the Marketing team's point of view:

  1. A CreateCustomer message lands in a local queue. They have no idea how or why, just that it did.
  2. Code in the Marketing application picks up the message and writes the customer details to the MySQL database
  3. The message is removed from the queue

Note that steps 2 and 3 are typically done within a transaction; we only delete the CreateCustomer message from the queue when the new customer's details are safely committed to the MySQL database.
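Here is a small sketch of that "save and delete in one transaction" idea. To keep it self-contained, both the queue and the customer table live in a single in-memory SQLite database, so one local transaction covers both. That is an assumption for illustration only: with a real queue and a separate MySQL database you would need a transaction that spans both, or some other way to cope with redelivery. The inbox table, process_next and the message fields are made-up names.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
# "inbox" stands in for Marketing's local queue (an assumption).
db.execute("CREATE TABLE inbox (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
db.execute("CREATE TABLE customers (email TEXT PRIMARY KEY, display_name TEXT NOT NULL)")

good = {"type": "CreateCustomer", "email": "alice@example.com", "display_name": "Alice Smith"}
bad = {"type": "CreateCustomer", "email": "bob@example.com"}  # display_name is missing
db.executemany("INSERT INTO inbox (body) VALUES (?)", [(json.dumps(good),), (json.dumps(bad),)])
db.commit()


def process_next(conn: sqlite3.Connection) -> bool:
    """Process the oldest message; return False if the inbox is empty."""
    row = conn.execute("SELECT id, body FROM inbox ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    message_id, body = row
    command = json.loads(body)
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the message is only deleted once the customer row is safely saved.
    with conn:
        conn.execute(
            "INSERT INTO customers (email, display_name) VALUES (?, ?)",
            (command["email"], command.get("display_name")),
        )
        conn.execute("DELETE FROM inbox WHERE id = ?", (message_id,))
    return True


process_next(db)  # the first message is saved and removed
try:
    process_next(db)  # the second message violates NOT NULL
except sqlite3.IntegrityError:
    pass  # rolled back: the bad message is still waiting in the inbox

remaining = db.execute("SELECT COUNT(*) FROM inbox").fetchone()[0]
saved = db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

The failed message is not lost, but it will also sit at the head of the inbox until someone deals with it, so in practice you would want somewhere to move messages that keep failing.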

The transformer

The box I've labelled "transformer" above is a bit of an iceberg. It could be:

  • A $200,000 BizTalk installation
  • An NServiceBus DLL with a simple handler using pub/sub
  • An open source package like Apache Camel
  • A C# console app that uses System.Messaging directly
  • An intern who pastes the message into a Word document, prints it, faxes it to a data entry clerk, who then re-types it into an InfoPath form which emits XML compatible with the Marketing queue

Integration of all of your applications may be centralized or decentralized. Once you start adding many transformations, and you make it easy to expose applications to them, it's generally called a service bus.

The code that lives inside the "transformer" box tends to be pretty predictable, if complicated. Enterprise Integration Patterns is a good book (which I have read) about the kinds of things that happen in this layer.

Advantages

Messaging combines the best of our previous solutions:

  1. Like the ETL solution, neither application is (from a code point of view) aware of the other, nor does either require the other application to be online in order to function
  2. Like the Web Services solution, the application can control how requests are processed, and apply domain logic before data makes its way into the inner sanctum that is the database

Messaging can help to make our applications very reliable, since applications are designed to be completely decoupled from each other. They are decoupled not just from a "contract over implementation" point of view, but from an "uptime" point of view. I'm going to explore these coupling concepts more in another post.

Disadvantages

Developers generally have less experience with messaging for integration, so there will be a learning curve. This is also an area swimming with vendor sharks selling pricey products, so if you spend too much time on the golf course you could get stuck with an integration solution you really don't want.

Conclusion

Hopefully this brief tour of integration solutions gives you an idea of how they could apply in the real world.

If you've used messaging for integration, how did it go? If you thought about it but opted for another solution, why?


Welcome, my name is Paul Stovell. I live in Brisbane and work on Octopus Deploy, an automated deployment tool for .NET applications.

Prior to founding Octopus Deploy, I worked for an investment bank in London building WPF applications, and before that I worked for Readify, an Australian .NET consulting firm. I also worked on a number of open source projects and was an active user group presenter. I was a Microsoft MVP for WPF from 2006 to 2013.

David McClelland
27 Apr 2011

We used Tibco EMS at the last couple of places I worked at - and yes, it's a pricey behemoth. The problem with message queues that we ran into is that no one was responsible to make sure the queue was still being read from. When our application (which read a queue of commodity trades) was unable to process a message (example - invalid data that could not be parsed from the XML into a .NET DateTime field), we logged the problem and made sure the app would keep running smoothly for the traders. However, someone needed to check our logs, go into the Tibco message queue viewer/editor, repair the bad data in that particular message, and then "re-queue" the message.

The need for the "repair job" on the message in the queue was WAY beyond the scope of understanding of our userbase - all they knew is that the trades could be entered in an upstream system just fine, but didn't make it down to our app because "this app is too strict" on date field input!

And you wouldn't believe (or maybe you would!) how common it was for a user to fat-finger a bad date into that upstream system that interpreted all dates as text that only had to match the YYYYMMDD format.

28 Apr 2011

@David, NServiceBus is well worth taking a look at. Very easy to use, very powerful and wonderful to test too. It has a lot of built-in support for handling error situations - this article is a good place to start. From what you say it sounds like the message source could have been improved to reduce the number of bad messages.

28 Apr 2011

Thanks for the comment David. The visibility into bad messages is a good point to raise. I guess this is the cost of any asynchronous communication.

ETL scenarios usually work in a similar way; if I have to move thousands of records from one database to another, I'd want it to ignore failed rows and log them somewhere so I can manually investigate and fix them up. And just like your messaging issue, it's important that I actually remember to check the logs and actually do fix them up.

Synchronous communications (like WCF services) have an advantage in this regard, because the service will raise a fault and the calling application will be more aware of it. It's a high price to pay though.

Sean
28 Apr 2011

What a great series of articles. Very useful, so thanks! Nice blog template too.