Integration: Shared Database

In my last post, I introduced a scenario where we need to allow two applications to make use of the same customer information:

How do we make these applications talk?

The Web Store already has a SQL Server database. Since we're still designing the Marketing application, we could just make it use the same database. The result would look like this:

We make both applications use the same SQL Server database

This is probably the simplest solution that could possibly work, but it has a few downsides:

  1. Since changes to the schema could affect the other team, changes need to be co-ordinated.
  2. Storage and indexing need to be optimized for the access patterns of both applications, which might be hard to accommodate.

In essence, the key problem is coupling. We create a ripple effect any time we try to change a shared database.

Unless the applications are talking through a fa├žade, such as a stored procedure layer, it's difficult for one application to isolate itself from another.

As the enterprise grows, the effects of this become much worse. If more applications are built on top of this database, adding a column to a table could involve meetings between four or five teams, all with different priorities.

Mommy, where do DBA's come from?

Over time, the database morphs from being "just a place where the application stores its data", to the most critical piece of infrastructure in the organization. To deal with this, organizations often hire dedicated Database Administrators. Their job isn't just to keep the server running, but to act as strict guardians of any changes to the schema.

With DBA's come a strictly defined change control process. Instead of just adding a column to a table, an application developer might find themselves having to justify their case to a DBA, even if the column isn't used by any other application. The DBA might be busy responding to change requests from other teams, leaving the application developer blocked.

Chances are you've seen the shared database solution many times. It's probably the most common way of sharing data between applications (or in monolithic applications that should have been many small applications). What are some of the positive and negative experiences you've had with it?

A picture of me

Welcome, my name is Paul Stovell. I live in Brisbane and work on Octopus Deploy, an automated deployment tool.

Prior to founding Octopus Deploy, I worked for an investment bank in London building WPF applications, and before that I worked for Readify, an Australian .NET consulting firm. I also worked on a number of open source projects and was an active user group presenter. I was a Microsoft MVP for WPF from 2006 to 2013.

22 Apr 2011

I've had lots of positives with this approach, generally the "Marketing" application was a reporting/analytics suite. And for a lot of applications it is the best solution. There is a limit though, one company tried this approach and it failed miserably.

The problem was that they had millions of records to report against and had to aggregate 3 DBs together. The solution was slow, but the Product Owners agreed it was ok. That is until it hit production and customers complained in droves. End solution was to use ETL.

That aside, even if the speed had been ok, the size of the project made the "Marketing" and "Transaction" parts of the application move on different release cycles. The strong coupling caused problems implementing new features.

22 Apr 2011

Robert, would your comments change if the Marketing application were more than just reporting/analysis?

For example, it will probably need to augment customer data (to track which customers respond to which campaigns, or to add customers to an "unsubscribe" list). It will also have its own writeable domain related to campaigns and goal tracking.

22 Apr 2011

The is the worst solution of the scenario in my experience especially if the Marketing application is owned/developed by other team even if it would be just simple read-only app for reporting. The main reason is the shared DB in most cases means freeze of schema evolution which means generally also main application evolution freeze = not maintainable in longer time horizon.

I'm looking forward to next post in the series - it looks really interesting :)

22 Apr 2011

Paul, probably not if the data flow was only one way. Possibly if it was two way, but separating them and having an integration between them makes for loose coupling. This'll allow the two applications to be managed separately. At the smaller level, you would use Interfaces (contracts) and Adapters.

There was something in the DDD book saying that a single domain model for a business is often not possible, and when appropriate, separate models should be developed, possibly with different data stores.

A common scenario is that the two applications are owned by different departments, or just have different Product Owners. As we know, it's vital to have a single product owner per product. Their budgets and time lines will be different. Having each own the input end of the integration would allow each app to move forward independently (as long as they don't break the contract).

If IRL I had the info you gave, I'd first question why they are using a different stack. Probably would find out there is some friction between marketing and the web devs (or they got the golf and wine treatment).

As an aside, I was in a meeting a few weeks back where the application has gotten so large and coupled that we had a half hour meeting to discuss whether the next version number is 1.3.5, 1.3.100 or 1.4. The product also has components moving at different speeds. It just begs to be split up.

Daniel Sniderman
Daniel Sniderman
22 Apr 2011

How can you not have a shared database for any "non-trivial" sized organization. Any organization wants to say "Who are my customers"? "How much did we have in sales last year"? If every application has their own data - how can you answer that question?

You can pay now or pay later... And paying now is usually cheaper.

23 Apr 2011

@Daniel, The "Customer" record would be owned by one of the applications (eg the CRM). Other applications would reference that DB from their DB (either by integration or linked servers).

How can any non-trivial org not separate it's data?

Daniel Sniderman
Daniel Sniderman
25 Apr 2011

I guess we are arguing semantics. Whether one system or another "Owns" the data and references it via Integration or Services - you have a "Shared Database".

"Linked Servers"? That is a shared database in every sense of the word. The only difference is how you connect (via another server)

As a consultant (and earlier in my career) - I've seen too many clients that DON'T share their data - and duplicate everything and then try to have some sort of Data Mart/Warehouse try to get the summary data and it's ugly...

Fernando Adachi
Fernando Adachi
26 Apr 2011

I'd go with a new customer repository/database and make in the top of it a service layer for the Web Store, Marketing and future applications to consume. Since customer data is a shared resource, it has to be in an app that conforms to that scenario. If a small company, then use the approaches mentioned above.

Great topic Paul.

26 Apr 2011

Fernando, I just posted about a web service solution here:

Is that what you had in mind, or were you imagining a separate service layer that would also be used by Web Store?