Chapter 13: SOA Primer



Service-oriented architecture, or SOA, is a concept that seems to be almost universally misunderstood. Especially in the era of Web 2.0 and web-services, the definition of true SOA services has been muddled. I have encountered people who think they have a service-oriented architecture, but don’t. I’ve encountered people who have stumbled upon a service-oriented architecture in solving an architectural scaling problem, but didn’t know the architecture they had invented had that name. “That’s what all those people talking about SOA are talking about? That’s easy!” Because SOA is a buzzword, no one wants to admit that they don’t know what it means.

True SOA is unfamiliar to many application developers, because the need for SOA enters late in an application’s lifecycle. It is often only when a company has matured to enterprise level that the old, tried and true, “monolithic” approach to building websites can no longer cut it. The single web application and single database of the monolithic architecture at some point cannot stand up to the demands of the company’s success, and a wholly new architecture is needed to meet the challenge. Usually that architecture is SOA.

But what is SOA? When is it useful and what are the advantages? The goal of this and the following chapters is to lift the veil of confusion surrounding “services” as they relate to service-oriented architecture. We’ll start with the what. What does a service-oriented architecture look like? Then we’ll look at the why. What are some common situations where a SOA solution is appropriate and likely to be successful? Finally, we’ll look at the who. Who uses SOA and how can it benefit your organization?

We’ll save the how for the next chapters.

What Is SOA?

SOA is a way to design complex applications such that the complexity is more manageable. This is accomplished by splitting out major components into the building blocks of SOA, which are individual services.

Before defining that building block, the service, first let’s look at an analogous architectural jump that you’re no doubt already familiar with: object-oriented programming. Object-oriented programming, initially introduced in the 1960s and 1970s in Simula 67 and Smalltalk, swept the software development scene in the mid-1990s with C++, when programs were getting too large and complex to be maintained in procedural programming languages such as C. In procedural programming, you have large amounts of data, and lots of methods that process the data managed by the application. All functions are globally accessible, as is most data. Organization is imposed through naming conventions. For example, new_movie and delete_movie, rather than the more object-oriented Movie.new and Movie.delete. Eventually, as program size increased, procedural programs became increasingly fragile. It is especially difficult for multiple programmers to be working on the same procedural application at the same time, because so much of the program, both methods and data, is more or less global in nature. It’s hard for developers to avoid stepping on each other’s toes, as it’s not always clear where one program module ends and another begins.

Object-oriented programming solved this problem by separating collections of methods and data through language mechanisms rather than through convention. While convention can be powerful, it can also be strayed from, while language rules are hard and fast. In object-oriented programming, lines are drawn between different types of data and their related methods.

For each type of data, a class—one of the building blocks of object-oriented programming—is defined. That class contains all of the data and data structures particular to that type of data. Also, any methods that are related to the data are packaged up in the class as well. Various amounts of the class’s data (often all of it) and methods (often a good deal of them) are declared to be private, which makes that data and those methods off-limits to the rest of the program and to developers working in other parts of the application. In this way, the language itself leads to an architecture within an app, which makes it easier for developers to organize their work.

What’s left after all of this information-hiding (public methods and data) defines the API of the class. The API is the contract of the class, and also the barrier. It is a contract because it says, “Provide me with these inputs, and I will provide you with these outputs.” It is a barrier because it is the only way—albeit indirectly—to access the class’s private methods and data. You can only access a class’s data in ways that the public API allows.
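To make the contrast concrete, here is a minimal Ruby sketch of the same idea in both styles. The names (MOVIES, new_movie, delete_movie, and the Movie class with its rate and average_rating methods) are hypothetical and chosen only for illustration; they are not part of the application built earlier in this book.

```ruby
# Procedural style: organization exists only in the naming convention.
MOVIES = {} # global data, reachable from anywhere in the program

def new_movie(title)
  MOVIES[title] = { title: title, ratings: [] }
end

def delete_movie(title)
  MOVIES.delete(title)
end

# Object-oriented style: the language itself enforces the boundary.
class Movie
  attr_reader :title # public API: read-only access to the title

  def initialize(title)
    @title = title
    @ratings = [] # private data: no accessor is exposed
  end

  # Public API: the only way to add a rating.
  def rate(stars)
    @ratings << validate(stars)
    self
  end

  # Public API: a value derived from the private data.
  def average_rating
    return nil if @ratings.empty?
    @ratings.sum.to_f / @ratings.size
  end

  private

  # Private helper: off-limits to the rest of the program.
  def validate(stars)
    raise ArgumentError, "stars must be between 1 and 5" unless (1..5).cover?(stars)
    stars
  end
end

movie = Movie.new("Casablanca")
movie.rate(5).rate(4)
puts movie.average_rating # prints 4.5
```

Calling movie.validate(3) from outside the class raises NoMethodError, which is the barrier half of the contract: callers can reach the ratings only through rate and average_rating.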

SOA follows a similar jump in the level of organization as did object-oriented programming from procedural programming. But Ruby is already an object-oriented language, so what is SOA jumping from? In this context, object-oriented programming is to SOA as procedural programming is to monolithic application design.

Before we go further, monolithic application design must be defined. In fact, it is everything that has been discussed in this book up to this point. In monolithic application design, all of the functionality of a website is contained in the same code base and runs in the same process on the web server. Figure 13-1 illustrates this concept. All of the data is contained in a single database—the monolithic database—or is otherwise accessible in some way, without restriction, from the monolithic application. This should sound familiar because this application design is the sole subject of virtually every Rails programming book in print today.


Figure 13-1. Simplified view of a monolithic web application

Now that we know what a monolithic application is, we should define a service—the building block of SOA. Recall that in object-oriented programming, a class represents a slice of an overall application represented by a data type and all the methods related to that data. SOA is analogous: a service is a slice of a website related to a particular set of functionality and the related data.

The difference between the object-oriented and the service-oriented analogies is how you access the APIs of different objects or services. In an object-oriented program, objects pass messages to each other, but the message passing occurs within a single process on a single machine. In a service-oriented architecture, individual services pass messages back and forth, but they do so over the network.
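The following sketch illustrates that difference, under some loud assumptions. ShowtimeCatalog, ShowtimeClient, and FakeTransport are hypothetical names, the FakeTransport class merely stands in for a real network hop, and JSON is used as the wire format only to keep the example short (later chapters in this book use XML-RPC and REST instead).

```ruby
require "json"

# The code that actually answers the question. In a monolithic
# application, callers invoke it directly, in the same process.
class ShowtimeCatalog
  SHOWTIMES = { 1 => ["19:00", "21:30"] }.freeze

  def showtimes_for(movie_id)
    SHOWTIMES.fetch(movie_id, [])
  end
end

# Hypothetical stand-in for the network: it accepts a serialized
# request and returns a serialized response, as a remote service would.
class FakeTransport
  def initialize(handler)
    @handler = handler
  end

  def post(body)
    request = JSON.parse(body)
    result = @handler.public_send(request["method"], *request["params"])
    JSON.generate("result" => result)
  end
end

# Client-side proxy: same method name as the catalog, but the message
# is serialized, sent, and deserialized instead of dispatched locally.
class ShowtimeClient
  def initialize(transport)
    @transport = transport
  end

  def showtimes_for(movie_id)
    request = JSON.generate("method" => "showtimes_for", "params" => [movie_id])
    JSON.parse(@transport.post(request))["result"]
  end
end

local = ShowtimeCatalog.new.showtimes_for(1)
remote = ShowtimeClient.new(FakeTransport.new(ShowtimeCatalog.new)).showtimes_for(1)
puts local == remote # prints true
```

From the caller’s point of view the two calls look the same; what changes is that the second one crosses a serialization boundary, which is where a real network request would sit.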

So what does a service look like? Actually, it can look like anything. Just like a class in object-oriented programming is defined by its API, and everything else is hidden behind a layer of abstraction, so too with a service. What’s behind the API is anyone’s best guess, and to gain the benefits of modularity, such details should be the service client’s most remote concern. Of course, there is a template for the structure of a service that will be adequate for most problems. That architectural template at a very high level is shown in Figure 13-2.


Figure 13-2. Simplified view of requests to a SOA service

Hopefully, it is not too shocking that the high-level architecture of a service looks remarkably similar to that of the monolithic application. Similarly, the code within a class in object-oriented programming doesn’t look much different from code in procedural programming. In SOA, it’s not the structure of any particular service that matters, but rather the fact that the application has been organized into a number of simpler, more understandable components, each separated from the others, and communicating via well-defined, public APIs.

And, of course, Figure 13-2 presents a drastic over-simplification of a service “template.” In Chapter 14, a lower-level template will be provided for a Rails XML-RPC service based on ActionWebservice, the preferred mechanism for most SOA services. In Chapter 15, REST services will be introduced.

The great news is that if you do follow the service templates described in this book, all of the concepts already covered remain applicable. The design principles of a monolithic application and of a service within a service-oriented architecture are nearly identical.

Why SOA?

Many scenarios lead naturally to the need for a service-oriented design. Some scenarios, like a shared resource, demand it. Other scenarios, like the need for massive scaling, can be helped by a service-oriented architecture because a SOA design naturally segments databases. It also reduces local complexity, which can make a caching scheme easier to implement, further enhancing scalability. The notion of reducing local complexity can be a goal in and of itself, for which SOA is the means. In this section, each of these scenarios is explored in detail.

Shared Resources

Imagine that you are building a site that sells widgets. For that site, you’d have a database full of sales data, both products and orders. Next imagine that you took my advice in Chapter 4 and did not attempt to report directly out of the sales database, and instead created a separate data warehouse database, which transformed the 3NF or DK/NF sales and orders tables suited for OLTP into a star schema, which is better-suited for OLAP. This setup is shown in Figure 13-3. Each of these databases has a front-end interface for employees to access the contained data, be it sales data in the sales administration website, or aggregate reports in the reporting website.


Figure 13-3. Two databases in our system

There’s an additional wrinkle, though. Each front-end website needs to look up access control information to validate that the users who are logging in and carrying out actions are authorized. There are numerous solutions to this problem, each of varying merit. We’ll look at two candidate solutions that fail our litmus test for “enterprise” before settling on—and in the process defining—a service-oriented solution.

Synchronized tables

In Figure 13-4, we’ve seemingly solved this problem by reproducing the access control tables in both the sales and reporting databases. Each website is essentially its own monolithic application, with direct access to all the data it needs in its own local database. This works for a while, but now new employees need to be added to both databases through each front-end. Similarly, whenever an employee changes her password on the reporting site, she needs to remember to change her password on the sales site, too. Keeping two sets of passwords in sync might not seem too onerous, but three sets of synchronized data gets daunting, and 5 or 10 sets of data for each user to manually keep in sync is downright ludicrous. This solution doesn’t scale with the business.


Figure 13-4. Attempting to solve the access control problem by replicating tables

A shared database

Figure 13-5 illustrates another attempt at solving the problem of shared data. This time, we’ve gotten around the problem of keeping multiple databases in sync with each other. We’ve moved the access control tables out into a third database, which each site accesses to authorize users.

This technique is DRY in that we don’t need to repeat user data in each database. However, at the code level it’s not very DRY at all. Each website needs to duplicate the logic for authorizing a user and any other logic associated with those tables. If we find a bug in one site’s authorization implementation, we have to correct the code not only in that site, but in every site that uses the access control database tables.

Similarly, if we make a change to the table structure to improve performance or to allow for a great new feature, we need to modify every application that authorizes users to remain compatible with our changes. Again, in our example of two sites, this may not seem too onerous, but with 10 applications you’re out of luck. Propagating the changes through software code is hard enough. In a production environment you’ll also have to roll out with a very well-planned strategy to avoid needing a site outage to commit your database changes.

A third problem with this setup is that it makes correct caching impossible. Imagine that you’ve got tons of traffic coming to your sites, and each request needs to check authorization data. All that authorization activity is becoming a strain on the database. Fortunately, the list of authorized users is relatively small, so you decide to cache the entire users table in Memcache, clearing all users’ entries on appropriate events: when they change their password, or when an administrator deletes or changes their account in some meaningful way. Of course, the sales and reporting websites are completely unrelated code bases, so they don’t share the same Memcache cluster. Having shared memory between distinct applications is to be avoided under all circumstances, as there is never any guarantee, express or implied, that one application won’t manipulate that memory in a way that is counter to the expectations of another application. For example, two applications may choose the same cache key for different types of data, leading an application to load and display incorrect data, or to fail completely.

Therefore, with shared memory spaces out of the question, maintaining correct caches between separate applications requires that “appropriate events” for clearing each site’s user cache must be broadcast somehow from application to application so that a deleted user is deleted from the perspective of all caches. With our current design, when a user is locked out of one site, they can continue to access other sites until the stale Memcache entries expire with the passage of time. The broadcasting necessary to keep the caches maintained by completely separate applications in sync would be difficult to implement, and is not worth the while, as better alternatives exist.
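The cache key collision mentioned above is easy to reproduce. In this minimal sketch, a plain Ruby Hash stands in for a Memcache cluster shared by both sites; SalesSite, ReportingSite, and the "user:<id>" key format are hypothetical and exist only to show the failure mode.

```ruby
# A plain Hash standing in for a Memcache cluster shared by two
# independently written applications.
shared_cache = {}

class SalesSite
  def initialize(cache)
    @cache = cache
  end

  # The sales site caches account records under "user:<id>".
  def cache_user(id, login)
    @cache["user:#{id}"] = { login: login, locked: false }
  end

  def login_for(id)
    @cache.fetch("user:#{id}")[:login]
  end
end

class ReportingSite
  def initialize(cache)
    @cache = cache
  end

  # The reporting site happens to pick the same key format,
  # but stores a completely different kind of data under it.
  def cache_user(id, favorite_report)
    @cache["user:#{id}"] = { favorite_report: favorite_report }
  end
end

sales = SalesSite.new(shared_cache)
reporting = ReportingSite.new(shared_cache)

sales.cache_user(42, "alice")
reporting.cache_user(42, "weekly_revenue") # silently overwrites the sales entry

p sales.login_for(42) # prints nil: the sales site now reads reporting data
```

Neither application contains a bug when read in isolation; the error appears only because the two code bases share one memory space without sharing one set of conventions.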


Figure 13-5. Attempting to solve the access control problem by splitting common tables off into a new database

A service-oriented architecture

We finally come to the third and final solution to our problem of sharing user authorization information. This solution avoids duplication in databases and in code. It abstracts database changes away from the clients of those tables so that code changes to support a schema change only need to be made in one place. Finally, this solution does not pose a problem for maintaining cache correctness, as the previous attempt did. In fact, this solution is ready-made for horizontal scaling. The solution is—you guessed it—an access control service.

Figure 13-6 shows the service in the context of our other two applications. It is a complete Rails application, but rather than serve web pages, it accepts service requests from other applications for a predefined API, and returns service responses. In this example, the service supports a method authorize(), which would return true or false to the application trying to authorize the user. The makeup of the tables in the access control database is completely abstracted by this method. We could change the database schema completely, or even replace it with an LDAP database; so long as the implementation of the authorize() method within the service is updated, all of our client applications—here, the sales and reporting sites—can continue to authorize users without needing to make any changes.

In this example, we have extracted not just the repeated database tables, but also the repeated application code, all into a single place: the service. The service publishes an API, or a set of methods to which it can respond. In a sense a service API is just like a model class in a Rails application, but the methods execute on a remote server.
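A rough sketch of that abstraction follows. The text above specifies only that authorize() returns true or false, so the argument list (login, password, role), the two backend classes, and the can_view_reports? client method are assumptions made for illustration. DirectoryBackend is a hypothetical in-memory stand-in for an LDAP lookup, not real LDAP code, and passwords are compared in plain text only to keep the example short.

```ruby
# Backend 1: stand-in for the access control tables.
# rows maps login => { password:, roles: }
class TableBackend
  def initialize(rows)
    @rows = rows
  end

  def roles_for(login, password)
    row = @rows[login]
    row && row[:password] == password ? row[:roles] : []
  end
end

# Backend 2: hypothetical stand-in for a directory such as LDAP,
# with a completely different data layout.
class DirectoryBackend
  def initialize(entries)
    @entries = entries
  end

  def roles_for(login, password)
    entry = @entries.find { |e| e[:uid] == login && e[:secret] == password }
    entry ? entry[:groups] : []
  end
end

# The service's published API: clients see only authorize().
class AccessControlService
  def initialize(backend)
    @backend = backend
  end

  def authorize(login, password, role)
    @backend.roles_for(login, password).include?(role)
  end
end

# Client code, as the reporting site might write it. It does not
# change when the service swaps its storage.
def can_view_reports?(service, login, password)
  service.authorize(login, password, "reporting")
end

tables = AccessControlService.new(
  TableBackend.new("alice" => { password: "s3cret", roles: ["reporting"] })
)
directory = AccessControlService.new(
  DirectoryBackend.new([{ uid: "alice", secret: "s3cret", groups: ["reporting"] }])
)

puts can_view_reports?(tables, "alice", "s3cret")    # prints true
puts can_view_reports?(directory, "alice", "s3cret") # prints true
puts can_view_reports?(directory, "alice", "wrong")  # prints false
```

In a real deployment the call to authorize would travel over the network, but the point is the same: the schema lives behind the method, so replacing it touches one code base rather than every client.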


Figure 13-6. Solving the access control problem by introducing a service layer

Reduce Database Load

The database is usually the first bottleneck to show up in a project, and it is also generally the most persistent. You can add as many application servers as you want, but you are stuck with a single database server to handle all of your SQL queries.

Luckily, the amount of time spent processing SQL queries can be greatly reduced through a variety of techniques available to the application developer:

  1. Analyze slow queries with a query planner. Add indexes to speed up slow queries, or rewrite the queries to use already existing indexes.
  2. Recast expensive queries as database views. Then materialize those views for faster access, as described in the previous chapter.
  3. Replace database queues with external queue mechanisms such as Amazon’s Simple Queue Service (SQS).

Unfortunately, database tuning can seem like a constant uphill battle. Every new batch of code you release is likely to contain new SQL queries. Often you won’t even know explicitly what those queries are when you’re releasing them, because the SQL will be hidden behind the abstraction of ActiveRecord magic in Ruby code. Ideally, all of your database queries use indexes; however, new queries may mean new indexes are necessary. In your development or staging environments, where there’s less data in the database and less traffic, the queries—while slow—may seem to be an acceptable speed. You may not feel the pain of missing indexes until the queries are out in production and massive amounts of traffic are suddenly getting slammed against a query that runs too slowly in the context of your production database and all its contained data.

Similarly, a change in the application may lead users to use a feature they didn’t use before. A slow culprit query may have been in your application all along, but growing popularity may cause it to hose your entire site.

Slicing and dicing

One of the biggest benefits of a service-oriented architecture is that it allows you to bend the rule that says you can only have one database server serving traffic. A service represents a vertical slice of your site’s functionality, from the database up to the service API itself. This means that each separable slice of functionality can persist its data in a separate database on a separate physical machine. If you have two services, such as a Product Service and an Orders Service, each handling roughly half of the database load, then by splitting the application into a service-oriented architecture, you can, at the expense of added hardware, cut the load on each database server roughly in half. While some may take issue here and point out that a second database server may be expensive, whereas making software faster is free, in practice the cost of hardware pales in comparison to the cost of a good software developer’s time.

To illustrate this, let’s take the example of the monolithic application we developed in the first half of this book, shown in Figure 13-7. The Rails application has two components: code for dealing with movies (our product), and code for taking orders for movie tickets.


Figure 13-7. A monolithic application serving two functions

There are two good ways to split this monolithic application up. The first is to simply slice the database in half, leaving the movie information in one database, and the order information in another. Then model classes relating to orders are moved into their own new application, the Orders Service. A service API exposes the methods of those classes. Anywhere that model classes related to orders were referenced in the original application (now the Movies application), they are now delegated via the service API to the orders application. This architecture is shown in Figure 13-8.


Figure 13-8. A monolithic application split into an application and a service

The other way to split up a monolithic application like this is to make everything a service. The movies application and the orders application are both services: a Movie Service and an Order Service, respectively. Each contains only the logic pertinent to its own mission: movies and showtimes or orders for some type of product. Gluing the services together and providing the HTML web interface for customers is a thin front-end. This front-end has controllers and views, but no database. The thin front-end provides the user experience, but all the hard work is done behind the scenes by the services. This architecture is shown in Figure 13-9 and is the architecture preferred in this book.

It should be noted that with this architecture, movies and orders are completely decoupled. This means that the Order Service can be written to be completely generic, and could be reused regardless of what product is being sold. If your company starts selling video games or music, the Order Service shouldn’t need to be rewritten to support new types of products.
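The shape of this architecture can be sketched in a few lines of Ruby. Everything here is hypothetical: MovieService, OrderService, FrontEnd, and their method names are illustrative only, and both services are in-memory stand-ins for what would be separate applications reached over the network.

```ruby
# Stand-in for the Movie Service: knows about movies and showtimes only.
class MovieService
  SHOWTIMES = {
    7 => { movie: "Casablanca", starts_at: "19:00", price_cents: 950 }
  }.freeze

  def showtime(id)
    SHOWTIMES[id]
  end
end

# Stand-in for the Order Service: completely generic. It records an
# opaque product reference and knows nothing about movies.
class OrderService
  def initialize
    @orders = []
  end

  def place_order(product_ref:, quantity:, unit_price_cents:)
    order = {
      id: @orders.size + 1,
      product_ref: product_ref,
      quantity: quantity,
      total_cents: quantity * unit_price_cents
    }
    @orders << order
    order
  end
end

# The thin front-end: no database of its own, just glue and presentation.
class FrontEnd
  def initialize(movies:, orders:)
    @movies = movies
    @orders = orders
  end

  def buy_tickets(showtime_id, quantity)
    showtime = @movies.showtime(showtime_id)
    order = @orders.place_order(
      product_ref: "showtime:#{showtime_id}",
      quantity: quantity,
      unit_price_cents: showtime[:price_cents]
    )
    format("Order %d: %d x %s at %s, total $%.2f",
           order[:id], quantity, showtime[:movie], showtime[:starts_at],
           order[:total_cents] / 100.0)
  end
end

site = FrontEnd.new(movies: MovieService.new, orders: OrderService.new)
puts site.buy_tickets(7, 2) # prints "Order 1: 2 x Casablanca at 19:00, total $19.00"
```

Because the Order Service sees only a product reference, a quantity, and a price, the same class could take orders for video games or music without modification.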


Figure 13-9. A thin front-end backed by two services in a service-oriented architecture

One anti-pattern that is not a service-oriented architecture, but still splits the database, is shown in Figure 13-10. Here, the tables related to movies have been separated from the tables related to orders, and each set is placed in a separate database. This does solve the initial problem of reducing database load. However, it is an anti-pattern because a great many parts of your application stack no longer work as intended.

The first problem is that Rails classes can no longer behave as you would expect them to. Our monolithic application has a has_many relationship between movie showtimes, now stored in the movies database, and orders, now stored in the orders database. This is perfectly valid from the perspective of Rails. However, as soon as you execute a query that spans both databases, such as the following, you hit a wall:

Order.find(:all, :include => :movie_showtime)

Second, the databases can no longer maintain referential integrity. While this is also true of the SOA examples in Figures 13-8 and 13-9, in those architectures the applications have been decoupled, so there can no longer be any expectation of referential integrity between the services (of course, within each service referential integrity would be maintained). In Figure 13-8, there is no has_many relationship in Rails between movie showtimes and orders because neither Rails application knows about both models. An order, in the decoupled Order Service, would maintain an external foreign key that the thin front-end knows is related to a movie showtime.

In Figure 13-10, showtimes and orders are represented by models in the same Rails application, which means the tables backing the models must maintain referential integrity. However, the split database guarantees that referential integrity cannot be enforced. In essence, this architecture removes even the possibility of maintaining referential integrity, although the application can still be written in a way that assumes it. This is the antithesis of the entire first half of this book.


Figure 13-10. A database split anti-pattern

You may be asking yourself why it is okay to lose referential integrity in the architectures shown in Figure 13-8 and Figure 13-9, but not in Figure 13-10. The difference is that in Figure 13-8 and Figure 13-9, there can be no expectation of referential integrity to begin with. Because the applications are split along the same lines as the database, there are no database-related model classes for the tables that exist in the other applications’ database. Each table’s model classes exist in the service application sitting atop the physical tables. Within the tables, rather than have traditional foreign keys where there can be an expectation of a join, we’ll store external foreign keys. These will be understood to exist in a separate system, and to access the related data, a service call will be required, not a join or other SQL lookup. However, because we’re dealing with an external system at the application level, we don’t assume that the data must be present. Instead, the service can return the equivalent of a 404 Not Found error, and the calling application should be written to gracefully handle such a scenario. In Chapter 16, when we build XML-RPC services that talk to each other, we’ll get a taste for how this works.
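As a minimal sketch of what handling an external foreign key might look like, consider the following. The names are hypothetical: MovieServiceClient is an in-memory stand-in for a remote service call, ShowtimeNotFound plays the role of the 404 Not Found response, and the Order struct holds only an id and an external showtime id.

```ruby
# Raised when the remote service reports that the record does not exist,
# the equivalent of a 404 Not Found response.
class ShowtimeNotFound < StandardError; end

# Hypothetical in-memory stand-in for a client of the Movie Service.
class MovieServiceClient
  def initialize(showtimes)
    @showtimes = showtimes
  end

  def find_showtime(id)
    @showtimes.fetch(id) { raise ShowtimeNotFound, "no showtime with id #{id}" }
  end
end

# An order stores an external foreign key: an id that is meaningful
# only to another system. No join is possible and none is assumed.
Order = Struct.new(:id, :external_showtime_id)

# The caller resolves the key with a service call and is written to
# cope with the related data being absent.
def describe_order(order, movie_service)
  showtime = movie_service.find_showtime(order.external_showtime_id)
  "Order #{order.id}: #{showtime[:movie]} at #{showtime[:starts_at]}"
rescue ShowtimeNotFound
  "Order #{order.id}: showtime details are no longer available"
end

movie_service = MovieServiceClient.new(
  7 => { movie: "Casablanca", starts_at: "19:00" }
)

puts describe_order(Order.new(1, 7), movie_service)
# prints "Order 1: Casablanca at 19:00"
puts describe_order(Order.new(2, 99), movie_service)
# prints "Order 2: showtime details are no longer available"
```

The second order would be a referential integrity violation inside a single database. Across services it is simply a case the caller must expect.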

The myth of database replication

A common counterargument to SOA as a solution for managing database load is the contention that databases can be replicated to balance load, just like application servers. Except in rare read-only situations, nothing could be further from the truth.

Why is this so? First let’s examine the problem (database load) and the proposed solution (database replication). Figure 13-11 shows two configurations. On the left, a single database is connected to two application servers. We find that the database is heavily loaded, so we attempt to rectify the problem by replicating the database. In reality we would want to direct all writes at the master and allow both the master and the slave to handle reads; in this diagram, for simplicity, we have directed half of the traffic to the master and half of the traffic to the slave.


Figure 13-11. A database under load from two applications; the load is split by replicating the database

This looks like it should work. However, there is always a tradeoff between speed and consistency. If you have one or more slave machines that need to maintain the same data as the master, and you expect query results to be consistent, you will pay a heavy price in waiting for the databases to get synchronized—including network overhead—before you can trust any data retrieved from any slave machine.

Figure 13-12 shows the steps required for a write to the master database followed by a read of the same data from the slave database. With a replicated database, there are additional network operations to lock the slave database, send the data to be written over the wire, and then unlock the slave database. This blocks not only the read on the slave, which cannot return a value until the master has unlocked it, but also the write on the master, which cannot be deemed successful until it has been propagated successfully to all slaves.


Figure 13-12. Breakdown of a write and read with replication

Figure 13-13 shows the same scenario, but with a single database. Here, there are no network operations to synchronize databases for writes, because there is only a single database. A read following a write returns immediately after the write is deemed successful locally. In this scenario, there are two steps between initiating a write and the write succeeding, and two steps between initiating a read and obtaining the result. Further, none of these steps requires a network hop. On the other hand, with a replicated database, it takes six steps after the write is initiated for the write to succeed. The read waits for five steps before returning a result.


Figure 13-13. Breakdown of a write and read without replication

So in a situation where writes are frequent, replicating a database can actually slow your application down rather than speed it up. The load on the database machines may appear to be lower, but that’s because the machines are sitting around, waiting for locks to be lifted and for data to pass over the network. The database in Figure 13-13, while it appears from CPU and disk metrics to be more heavily taxed, will actually be outperforming the replicated database because it is much more efficient. It doesn’t sit around waiting for others to catch up, but rather slogs forward using all the CPU and disk cycles it can to serve application requests.

Scalability II: Caching Is Tricky

Centralizing the logic for a single “concern” of our application into a single place also eases our problem of cache correctness and horizontal scaling. To scale horizontally, we need to ensure that the database is not our bottleneck, and that we can handle more load simply by adding more application servers. To accomplish this, we need to cache data somewhere.

But caching can be a tricky proposition. Before relying on a cache, we need to be highly confident that the data in the cache is accurate. A service-oriented architecture can help us gain that confidence. Because all access to the data in question must arrive through the service API, we don’t have to look very far to find all the places where the data may change. This makes it much easier to cache data at the right times and be reasonably assured that we’re clearing our cache at the right times, too.

While correct caching in a monolithic application is certainly possible, correctness is harder to guarantee because there’s much more code around that has the potential to do something bad and break the cache correctness. In the monolithic application, any code, at any time, can access and modify the access control tables—or any other tables in the database, for that matter. You may have spent a lot of time in a monolithic application, carefully designing an abstraction for authentication that relies on caching to boost performance, but one day a new programmer comes along and forgets to use your abstraction. He accesses the database tables directly, and suddenly your carefully crafted authorization API no longer works as expected. Data has changed underneath the caching abstraction and the cache no longer matches the actual data. In a service-oriented architecture, such a catastrophe is impossible, because only the authorization service itself has direct access to the authorization tables. There is a physical separation of the data belonging to one service from all other services and applications. There is no opportunity for unrelated code to make modifications to a service’s data.
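The following sketch shows why a single gatekeeper makes cache correctness tractable. It is illustrative only: CachedAccessControlService is a hypothetical name, one Hash stands in for the access control tables and another for Memcache, and a read counter is included purely so the effect of the cache is visible.

```ruby
class CachedAccessControlService
  attr_reader :database_reads

  def initialize(users)
    @users = users       # stand-in for the access control tables
    @cache = {}          # stand-in for Memcache
    @database_reads = 0  # instrumentation for this example only
  end

  # Public API: reads go through the cache.
  def authorize(login, password)
    stored = @cache.fetch(login) { @cache[login] = read_password(login) }
    !stored.nil? && stored == password
  end

  # Public API: the only code path that changes a password, so it is
  # also the only place the cache entry needs to be cleared.
  def change_password(login, new_password)
    @users[login] = new_password
    @cache.delete(login)
  end

  private

  # Only the service touches its own persistence layer.
  def read_password(login)
    @database_reads += 1
    @users[login]
  end
end

service = CachedAccessControlService.new("alice" => "old-secret")

service.authorize("alice", "old-secret") # cache miss: one database read
service.authorize("alice", "old-secret") # cache hit: no database read
service.change_password("alice", "new-secret")

puts service.authorize("alice", "old-secret") # prints false
puts service.authorize("alice", "new-secret") # prints true
puts service.database_reads                   # prints 2
```

Every write path is visible in one class. In a monolithic application the equivalent of change_password could be bypassed by any code that updates the table directly, and the cache would never learn of it.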

Just as the monolithic application can be run on multiple servers to balance load, the service application can run on as many physical machines as necessary to handle the load of all the client sites that need to authorize users. Each machine would have a Memcache server contributing to a shared cluster. Whereas we would not allow a heterogeneous set of applications to participate in the same Memcache cloud to cache and expire user data, we now have a single code base for authorizing users contained within the service, so the problem disappears. Figure 13-14 shows a service cluster of three machines, all running the Access Control Service and a shared Memcache cluster.


Figure 13-14. A service can scale horizontally while maintaining a provably correct cache

Reduce Local Complexity

When your application is designed as a set of services and service consumers, your organization—i.e., your company or business unit—gains something that is lacking in a company whose software is monolithic. Your business organization gains team modularity along the same module splits as your software. This is a good thing, because modularity gives teams autonomy, and autonomy means less need for communication to get any particular job done. That can spell huge productivity gains for the organization as a whole.

How is this possible? Consider the example we began the chapter with: a sales site and a reporting site, both clients of an authentication service. Let’s make this example a bit more realistic by splitting the sales site into two services: one for product information and a second for order data.

The services have a plain old HTML interface for administrative uses, accessible only within the firewall. These admin interfaces use the authentication service to validate employees and grant them access to the administrative features. The services also have a “service API,” which a single front-end consumer website consumes, combining the features of both services to present a unified view of the data to customers. When the front-end needs to access product data, it contacts the Product Service. To build and execute an order, it contacts the Orders Service.

All of this is hidden from the site’s visitors, but it is plain as day to those developing the website. There are five clearly delineated compartments of code:

  1. Authentication service
  2. Product service
  3. Orders service
  4. Reporting website
  5. Externally facing consumer website

This translates into five teams of developers. Because the services all have well-defined APIs, consumers of the services don’t need to talk to the service teams unless the API is no longer satisfying their needs. Similarly, service teams can work within their code base without worrying that their changes may result in unknown consequences for others. As long as the service continues to behave as specified by the published API, clients of the service can remain blissfully ignorant of any architectural changes going on within the service. Thus, every team can innovate within their own bubble in a way they could not when all the source code and database tables were intertwined in a monolithic application. The same was true in the 1990s with the collective leap from procedural to object-oriented programming. And we’ve said nothing about the dramatic reduction in overhead accrued while resolving source control conflicts in a monolithic environment where everyone is simultaneously modifying the same files. Figure 13-15 shows an example of a Rails service split, with admin functionality contained within the service applications.


Figure 13-15. A likely Rails service split, with admin functionality contained within the service applications

In Summary

Simply put, a service is a vertical slice of functionality: database, application code, and caching layer. Within a service, you should follow the same principles of design that we covered in the first half of this book. Externally, the service-oriented architecture provides a number of additional benefits:


Each service is responsible for a single business function. The amount of code is minimized to a quantity understandable by a single person.


Isolation

A service’s application code is separated and isolated from all other application code. Its database is separate from other services’ databases. Only the service application itself has direct access to its persistence and caching layers. A service is a gatekeeper for its data.

Uniform access

A service is accessed through a published API, and likewise, it accesses other services through their published APIs. The API is the only interface to access a service’s data.


A service lives inside the firewall, and trusts any client that has physical access to make service requests.


Isolation and uniform access mean that cache correctness is much easier to guarantee. The number of places where persisted data can change is reduced to the service code. Thus, the use of caching can be maximized, which reduces demand for database resources, and in turn allows the application to scale to greater loads simply by adding more application servers.


  1. Examine your application for vertical slices of functionality. Which are good candidates to be separated out as services? Why?
  2. For each candidate service identified, what database tables drive the functionality?
  3. For each table, what are the foreign keys in and out of the service? How might you deal with foreign keys in isolated databases?