Monday, October 09, 2006

For better or worse, SOA (service-oriented architecture) continues to be the current industry fad. As SOA continues along the “hype curve” (my name for what Gartner calls the hype cycle), more and more people are starting to realize that SOA isn’t a silver bullet, and that it doesn’t actually replace n-tier client/server or object-orientation.

 

What will most likely happen over the next couple of years is that SOA will fall into the “pit of disillusionment” (the part of the hype curve that I think of as the “pit of despair”), and many people will decide, as a result, that it is totally useless. This will happen, in no small part, because some organizations are investing way too much money in SOA now, while it is overly hyped – and they’ll feel betrayed when “reality” sets in.

 

After a period of disrepute, SOA may then rise to a “plateau of productivity”, where it will finally be used to solve the problems it is actually good at solving.

 

Some technologies don’t live through the “despair” part of the process. Sometimes the harsh light of reality is too bright, and the technology can’t hold up. Other times, a competing technology or concept hits the top of its hype curve, derailing a previous technology. Over the next very few years, we’ll see if SOA holds up to the despair or not.

 

This is a pattern Gartner has observed for virtually all technologies over many, many years. If you think about any technology introduced over the past 20 years or more, almost all of them have followed this pattern: over-hyping, over-reacting-to-reality and finally used-as-a-real-solution.

 

My colleague and mentor, David Chappell, recently blogged about some of the realities people are discovering as they actually move beyond the hype and try to apply SOA. It turns out, not surprisingly, that achieving real benefits in terms of reuse is much harder than the SOA evangelists would have anyone believe.

 

I think this is because SOA focuses on only one part of the problem: syntactic coupling. SOA, or at least service-oriented design and programming, is very much centered around rules for addressing and binding to services, and around clear definition of syntactic contracts for the API and message data sent to and from services.

 

And that’s all good! Minimizing coupling at the syntactic level is absolutely critical, and SOA has moved us forward in this space, picking up where EAI (enterprise application integration) left off in the 90’s.

 

Unfortunately, syntactic coupling is the easy part. Semantic coupling is the harder part of the problem, and SOA does little or nothing to address this challenging issue.

 

Semantic coupling refers to the behavioral dependencies between components or services. There’s actual meaning to the interaction between a consumer and a service.

 

Every service implements some tangible behavior. A consumer calls the service, thus becoming coupled to that service, at both a syntactic and semantic level. At the syntactic level, the consumer must use the address, binding and contract defined by the service – all of which are forms of coupling. But the consumer also expects some specific behavior from the service – which is a form of semantic coupling.

 

And this is where things get very complex. The broader the expected behavior, the tighter the coupling.

 

As an example, a service that does something trivial, like adding two numbers, is relatively easy to replace with an equivalent. Such a service can even be enhanced to support other numeric data types with virtually no chance of breaking existing consumers. So the semantic coupling between a consumer and such a service is relatively light.

 

Another example is credit card verification. Obviously the internal implementation of this behavior is much more complex, but the external expectations of behavior remain very limited. Like adding two numbers, verifying a credit card is a behavior that accepts very little data, and returns a very simple result (yes/no).

 

Contrast this with many other possible business services, such as shipping an order, or generating manufacturing documentation. In these (quite common) scenarios, the service performs, or is expected to perform, a relatively broad set of behaviors. The result is a whole group of effects and side-effects – all of which should be considered as black-box effects by any caller. But the more a service does, the less “black-box” it can be to its callers, and the tighter the coupling.

 

And this leaves us in a serious quandary. There’s a high cost to calling a service. There’s a lot of overhead to creating a message, serializing it into text (XML), routing it through some communications stack onto the wire, getting the electrons across the wire through some protocol (probably TCP) and all the attendant hardware involved, picking it up off the wire on the server, routing it through another communications stack, deserializing the text (XML) back into a meaningful message and finally interpreting the message. Only then can the service actually act on the message to do real work.

 

Worse, that’s only half the story, because most people are creating synchronous request/response services, and so that whole overhead cost must be paid again to get the result back to the caller!
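
To make the round trip concrete, here is a minimal sketch in Python. It is an illustration only: the message names (AddRequest, AddResponse) and the function names are invented for this example, and an in-process function call stands in for the network hop, so the communications stacks and the wire are not modeled at all. Even so, the single line of real work is surrounded by serialization and parsing on both sides.

```python
import xml.etree.ElementTree as ET


def serialize_request(a: int, b: int) -> bytes:
    """Consumer side: build the message and serialize it into XML text."""
    root = ET.Element("AddRequest")  # hypothetical message name
    ET.SubElement(root, "a").text = str(a)
    ET.SubElement(root, "b").text = str(b)
    return ET.tostring(root, encoding="utf-8")


def handle_request(wire_bytes: bytes) -> bytes:
    """Service side: deserialize, interpret, do the work, serialize the reply."""
    request = ET.fromstring(wire_bytes)
    a = int(request.findtext("a"))
    b = int(request.findtext("b"))
    total = a + b  # the only real work in the whole exchange
    response = ET.Element("AddResponse")  # hypothetical message name
    ET.SubElement(response, "sum").text = str(total)
    return ET.tostring(response, encoding="utf-8")


def call_add_service(a: int, b: int) -> int:
    """Synchronous request/response: the overhead is paid in both directions."""
    request_bytes = serialize_request(a, b)
    response_bytes = handle_request(request_bytes)  # stands in for the network
    return int(ET.fromstring(response_bytes).findtext("sum"))


if __name__ == "__main__":
    print(call_add_service(2, 3))
```

Everything in this sketch other than the `a + b` line is cost paid just to reach the service and get the answer back.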

 

Before going further, let me expand on this “overhead cost” concept to be more precise.

 

I worked for many years in manufacturing. In that industry there’s the concept of cost accounting – people make their living at tracking costs. They divide costs into overhead, setup and run (there are other models, but this one’s pretty standard).

 

To make this somewhat more clear, I’ll use the metaphor of baking cookies.

 

Overhead costs are all the salaried people, the buildings, equipment and so forth: costs that are paid whether widgets are manufactured or not. When baking cookies, this is the cost of having a kitchen, a stove, electricity, natural gas, and of course the person doing the baking. In most homes these costs exist regardless of whether cookies are baked or not.

 

Setup costs are applied overhead. They are costs that are required to build a set of widgets, but they are only incurred when widgets are being manufactured. These costs include setting up machines, programming devices, getting organized, printing documents, etc. When baking cookies, this is the cost (in terms of time) of getting out the various ingredients, bowls, spoons and other implements. It is also the cost of cleaning up after the baking is done – all the washing, drying and putting-away-of-implements that follows. These costs are directly applied to the process, but are pretty much the same whether you bake one dozen or ten dozen cookies.

 

Run costs are those costs that are incurred on a per-widget basis to make a widget. This includes the hourly rate of the workers manning the assembly line, the materials that go into the widget and so forth. When baking cookies, this is the time spent by the baker, the cost of the flour, eggs and other ingredients consumed in the process. Ideally it would include the amount of electricity or natural gas used to run the stove as well. Obviously detailed run costs can be hard to determine in some cases!

 

When calculating the cost of your cookies, each of these three costs is added together. The run rate is easy, as it is per-cookie by definition. The setup rate is variable – the more cookies you make in a batch the lower the relative setup cost, and the fewer cookies the higher the relative setup cost. Overhead is typically aggregated – the annual overhead cost is known, and is divided by the number of cookies (and other things) made over a year’s time. Obviously there’s lots of wiggle room in this last number.
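
The arithmetic in that paragraph can be written down as a small Python sketch. The function name and every number below are made up purely for illustration. The point is only the shape of the calculation: run cost is charged per cookie, setup cost is spread over the batch, and overhead is spread over a year's output.

```python
def cost_per_cookie(run_cost: float, setup_cost: float, batch_size: int,
                    annual_overhead: float, annual_units: int) -> float:
    """Per-unit cost = run cost + share of setup cost + share of overhead."""
    if batch_size <= 0 or annual_units <= 0:
        raise ValueError("batch_size and annual_units must be positive")
    setup_share = setup_cost / batch_size           # shrinks as the batch grows
    overhead_share = annual_overhead / annual_units  # aggregated over the year
    return run_cost + setup_share + overhead_share


if __name__ == "__main__":
    # Hypothetical figures: 0.25 of ingredients per cookie, 6.00 of setup and
    # cleanup per batch, 1000.00 of annual overhead spread over 4000 cookies.
    one_dozen = cost_per_cookie(0.25, 6.00, 12, 1000.00, 4000)
    ten_dozen = cost_per_cookie(0.25, 6.00, 120, 1000.00, 4000)
    print(one_dozen, ten_dozen)
```

With these invented figures a cookie from a one-dozen batch costs 1.00, while a cookie from a ten-dozen batch costs roughly 0.55. Only the setup share changed.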

 

For my purposes, in discussing services, the overhead rate isn’t all that meaningful. In our industry this is the cost of the IT staff, the servers, the server room, electricity and cooling and so forth.

 

But the setup rate and run rate become very meaningful when talking about services.

 

Calling a service, as I noted earlier, incurs a lot of overhead. In cost-accounting terms this is the setup cost of the call, and it is relatively constant: you pay about the same whether you send 1 byte or 1024 bytes to or from the service.

 

The run rate is the actual work done by the service. Once the message is parsed and available to the service, then the service does real, valuable work. This is the run rate for the service.

 

In manufacturing it is always important to manage the overhead and setup costs – they are a “pure cost”. The run rate cost must also be managed, but it is directly applicable to a product, and so that cost can be factored into the price you charge. Perhaps more importantly, your competitors typically have a comparable run rate (materials and labor cost about the same), but the overhead can vary radically.

 

To switch industries just a bit, this is why Walmart does so well (and is so feared). They have managed their overhead and setup costs to such a degree that they actually do focus on reducing their run rate (in their case, the per-unit acquisition cost of items).

 

Coming back to services, we face the same issue. Typically we deal with this using intuition rather than thinking it through, but the core problem is very tangible.

 

Would you call a service to add two numbers? Of course not! The setup/overhead cost would outweigh the run cost to such a degree that this makes no sense at all.

 

Would you call a service to ship an order, with all the surrounding activities that implies? This makes much more sense. The setup/overhead cost becomes trivial when compared to the run cost for such a service.
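
A rough way to put numbers on that intuition is to ask what fraction of a call's total cost is setup. The Python sketch below uses invented timings (5 ms of round-trip setup, a microsecond of work for addition, half a second of work for shipping an order); they are assumptions for illustration, not measurements.

```python
def setup_share(setup_ms: float, run_ms: float) -> float:
    """Fraction of a service call's total cost that is setup rather than work."""
    total = setup_ms + run_ms
    if total <= 0:
        raise ValueError("total cost must be positive")
    return setup_ms / total


if __name__ == "__main__":
    SETUP_MS = 5.0  # hypothetical cost of serializing, sending, parsing, replying

    add_numbers = setup_share(SETUP_MS, run_ms=0.001)  # trivial work
    ship_order = setup_share(SETUP_MS, run_ms=500.0)   # substantial work

    print(f"add two numbers: {add_numbers:.1%} of the call is setup")
    print(f"ship an order:   {ship_order:.1%} of the call is setup")
```

Under these assumptions nearly all of the addition call is setup, while setup is about one percent of the ship-an-order call.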

 

And yet coupling has the opposite effect. Which of those services can be more loosely coupled? The addition service of course, because it performs a very narrow, discrete, composable behavior.

 

Do you even know what the ship-an-order service might do? Of course not, it is too big and vague. Will it trigger invoicing? Will it contact the customer? Will it print pick lists for inventory? Will it update the customer’s sales history?

 

I would hope it does all these things, but very few of us would be willing to blindly assume it does them. And so we are forced to treat ship-an-order as something other than a black box. At best it is gray, but probably downright clear. We’ll require that the service’s actual behaviors be documented. And then we’ll fill in the gaps for what it does not provide, or doesn’t provide in a way we like.

 

(Or, failing to get adequate documentation, we’ll experiment with the service, probing to find its effects and side-effects and limitations. And then we’ll fill in the gaps for the bits we don’t like. Sadly, this is the more common scenario…)

 

At this point we (the caller of the service) have become so coupled to the service that any change to the service will almost certainly require a change to our code. And at this point we’ve lost the primary goal/benefit of SOA.
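
Here is a minimal Python sketch of how that coupling sneaks in. All of the class and function names are hypothetical. Both services satisfy exactly the same syntactic contract, and a caller can swap one for the other without a compile-time or binding error, yet the caller's unstated expectation about invoicing holds for only one of them.

```python
from typing import List, Protocol


class ShipOrderService(Protocol):
    """The syntactic contract: one operation, one argument, one result."""

    def ship_order(self, order_id: str) -> bool: ...


class WarehouseShipping:
    """Hypothetical service that also triggers invoicing as a side-effect."""

    def __init__(self) -> None:
        self.effects: List[str] = []

    def ship_order(self, order_id: str) -> bool:
        self.effects.append(f"pick-list:{order_id}")
        self.effects.append(f"invoice:{order_id}")
        return True


class CourierShipping:
    """Hypothetical replacement with the same contract but no invoicing."""

    def __init__(self) -> None:
        self.effects: List[str] = []

    def ship_order(self, order_id: str) -> bool:
        self.effects.append(f"pick-list:{order_id}")
        return True


def fulfil(service: ShipOrderService, order_id: str) -> bool:
    """A caller that silently assumes shipping also produces an invoice."""
    return service.ship_order(order_id)


if __name__ == "__main__":
    for service in (WarehouseShipping(), CourierShipping()):
        ok = fulfil(service, "A1")
        print(type(service).__name__, ok, "invoice:A1" in service.effects)
```

Both calls succeed and return the same result, so nothing at the contract level reveals that the second service left the order uninvoiced. The difference lives entirely in behavior the caller was depending on.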

 

Why? How can this be, when we’re using all the blessed standards for SOA communication? Maybe we’re even using an Enterprise Service Bus, or BizTalk Server or whatever the latest cool technology might be. And yet this coupling occurs!

 

This is because I am describing semantic coupling. Yes, all the cool, whiz-bang SOA technologies help solve the syntactic coupling issues. But without a solution to the semantic, or behavioral, coupling it really doesn’t get us very far…

 

What’s even scarier is that the vision of the future portrayed by the SOA evangelists is one where we build services (systems) that aggregate other services together to provide higher-level functionality, like assembling simple blocks into more complex creations that in turn can be assembled into more complex creations or used as-is.

 

Except that each level of aggregation creates a service that provides broader behaviors – and by extension tighter coupling to any callers (though the setup vs run costs become more and more favorable at the top level).

 

To bring this (rather long) post to a close, I want to return to the beginning. SOA is heading down the steep slope into the pit of disillusionment. You can head this off for yourself and your organization by realizing ahead of time, right now, that SOA only addresses syntactic issues. You must address the much harder semantic issues yourself.

 

And the tools exist. They have for a long time. Good procedural design, use of flow charts, data flow diagrams, control diagrams, state diagrams: these are all very valid tools that can help you manage the semantic coupling. Unfortunately the majority of people with expertise in these tools are nearing retirement (or have retired) – but the tools and techniques are there if you can find some old, dusty books on procedural design. Just remember to include the setup/overhead cost vs run cost in your decisions on whether to make each procedure into a "service".

 

SOA solves some serious and important issues, but it is overhyped. Fortunately the hype is fading, and so we can look forward (perhaps 18 or 36 months) to a time when we can, with any luck, start focusing on the “next big thing”. Maybe, just maybe, that “big thing” will be some new and interesting way of addressing semantic coupling.


Monday, October 09, 2006 6:34:14 PM (Central Standard Time, UTC-06:00)
Thank you.. You express what I've been thinking every time I read about SOA but I could not express it this well.. I have much of this same coupling problem when I think about workflows.. I mean what is the point of being able to "rearrange" or introduce new modules when most of the time the modules are tightly coupled?
Steve Perry
Monday, October 09, 2006 10:09:26 PM (Central Standard Time, UTC-06:00)
I really admire your insight and courage. Somehow I believe SOA is just an offshoring conspiracy (“reuse”, big design, big project, so, offshoring). On the other hand, SOA is just text-based processing that has been there for more than twenty years (at least it was there when Unix started ;-)
Tuesday, October 10, 2006 3:27:15 AM (Central Standard Time, UTC-06:00)
Hi Rocky, Here's my thoughts (bit long, sorry).

First off, 'SOA' is an ambiguous term. You can mean one or all of the following when you use it (1) An Integration pattern or (2) A form of Component/Module design or even (3) a type of Distributed Architecture (if you're insane/brave). It's a sliding scale of how much semantic info you need for each.

If we put that to one side, then it comes down to the fact that the syntactical definition is very much tied to how much semantic understanding we can get from the 'contract' of the Service. If this was always just about data we would use a form of Data Definition Language, such as XSD to describe it. This means we just have to deal with coupling of the data (which is still hard), and not all of the Services behaviour. On the data contract we can do basic things like check type, ordered sequence and even (gasp!) scalar boundaries. It's like a database, but with ODBC serialized to unicode and routed through port 80 to stop it going too fast :-)

The semantics for a data contract typically come from either a shared understanding written by a 3rd party (i.e. a common spec that we both understand, perhaps domain specific, say XBRL which is logically 'above' the syntactical service endpoint) or something that requires both ends figuring out, but is obvious enough, i.e. xmlns:PostalAddress. These 'semantic contract' schemas seem to be more popular with people that like writing schemas rather than with people that like using schemas.

That leaves non-data services, or rather proxies for things that do stuff other than just CRUD on data contracts. Kind of like objects then. The reality is there is no real data definition language way of sharing this behaviour, because the problem definition isn't bound or closed like data can be. The side-effects and consequences of a 'DoOrder' Service often involve rules, logic and transforms that leak their way across the service boundary. We are back into the cloudy world of advice like 'multiple inheritance is bad' and 'goto considered harmful' - all human subjective stuff and not 'machine constraints checkable'.

This doesn't make behaviour based Services useless, it just turns them from an 'Integration' technique to a 'Componentization' pattern, and that tends to break more often and quicker than people expect.

SOA reuse, and any Gartner trough of despair, is probably due to crossing this line of 'integration' to 'componentization' without realizing the true cost of change.
David
Tuesday, October 10, 2006 10:04:14 AM (Central Standard Time, UTC-06:00)
Very well written. I have a question, though. If we need to perform the following actions - trigger invoicing, contact the customer, print pick lists, and update the customer's sales history - why would we call a service that *might* do these actions?

Wait, I think I can answer this one myself. You are talking about systems that are already in place, have been for quite some time, and have evolved to do these things. Then people start using the service and either need something else done or don't need all the "services" the object provides.

This is a very common problem and is certainly not limited to SOA. All developers have had to fix "established" code that has these problems. What you end up with is code that resembles a ball of twine and you're darned lucky if it doesn't have more than a few knots inside.

How do you solve the problem? Some would say to force the service objects to perform one and only one service. This is all well and good. But then you hit the overhead problem you so very well described.

What I have determined is this. Each application has a purpose and needs "services" performed. Other applications may need similar services. But the second you decide to combine the services provided to both applications into one object, you open the door to unexpected behavior. Heck, sometimes (usually) even the same application needs a service to behave differently for various situations.

Tuesday, October 10, 2006 10:28:43 AM (Central Standard Time, UTC-06:00)
Rocky - I very much enjoyed reading your article. I've passed it along to my project members as essential reading... "Chicken Soup for the SOA" :)

You've given me inspiration to improve my semantic coupling documentation for this project. I can't remember the last time I got excited about that!!!

Cheers, and keep up the great work!

--matt
Matt Patterson
Tuesday, October 10, 2006 10:59:09 AM (Central Standard Time, UTC-06:00)
Hi Rocky,

Great post. In the first part, you touched on some next-gen topics like semantics. I recently wrote about the difference between semantics and syntax here:

http://blog.1530technologies.com/2006/09/what_is_the_sem.html

I am interested to see if we as developers can learn anything from the semantic web when creating our own service, or vice-versa. Perhaps Ontologies can aid companies in providing that semantic rosetta stone for services?

I look forward to seeing how this plays out.

Cheers,
Griffin
Tuesday, October 10, 2006 11:43:48 AM (Central Standard Time, UTC-06:00)
Mark, it is true that I was mostly talking about systems that are in place.

However, it is important to keep in mind that the SOA "direction" is toward building systems by composing services. And so even if you assume green field development, where you build a set of discrete services, the whole point is to create higher order services that aggregate lower order services. So you'd still end up with the ShipOrder service.

So my point (reading between the lines) is that the long-term goal of composable services is problematic. The more you compose services to create higher order services, the more you have to deal with semantic coupling.

As others have noted, this is not a new problem. This is _exactly_ the same problem that plagued procedural design with COBOL and FORTRAN, and is a primary driver for the rise of OO design over the past couple decades.
Tuesday, October 10, 2006 8:13:04 PM (Central Standard Time, UTC-06:00)
David, very interesting thoughts.

I think semantic coupling or semantic interoperability though harder, is achievable by following three approaches
1. Vertical domain-centered business vocabularies

2. Horizontal canonical cross-vertical frameworks like ebXML,UBL, etc.,

3. Semantic Web-based ontological frameworks


http://vikasnetdev.blogspot.com/2006/10/semantic-coupling-or-semantic.html
Wednesday, October 11, 2006 6:29:43 AM (Central Standard Time, UTC-06:00)
Vikas,

Those vocabs and specs lead to better data interop - which is still some way from being able to reuse someone's Service without a priori info. The standards work has always been better at the 'data shape' side, EDI/X12 has been around longer than XML and sliced bread (almost).

The narrower the domain vocab the more constrained the behavior has to be, to the extent that the service could not be reused beyond its original remit. Ontologies describe relationships that can be machine navigated, but still require human interpretation due to the contracts being 'open', i.e. RDF has described what the data is for but not what I can permissibly do with it.

I guess my main point was that Services that share semantic understanding around behavior, rather than data in/out will always be harder to reuse. If you do reuse them, then they will break compatibility sooner too. If you control all the end-points then it gets easier, but then you've just built yourself a distributed architecture, sharing classes not contracts...

I'm optimistic about where it can go in the future, but Rocky is correct that the reasons for selling the original SOA vision of reuse were not strictly kosher, and due a trough.
Wednesday, October 11, 2006 11:08:21 AM (Central Standard Time, UTC-06:00)
Well being much like Rocky, pragmatic in nature, all of the key buzz / hype words like the following: SOA, Enterprise, Proven, architect etc annoy me to no end; however, at the same time, it is the use of these words that has promoted salaries to ridiculous levels and allowed many opportunities for all of us.... so then I become torn. I was "architecting" applications as a developer long before the term architect became popular... upper management will pay me more to be an architect than they will to be a "developer". So terms like SOA become necessary in the art of confusion / delusion to sustain growth in salaries and the promise of a better future just on the other side of the horizon.

I would still argue that good design / good code written by people passionate about their job will far outperform / outlast anything written by a career resume builder.

Oh... incidentally, our new EAI team (formed 2-3 months ago) has recently changed its name to the SOA team, so I was informed. Our productivity amazingly hasn't changed and I didn't even feel any sudden surges of motivation :)
Thursday, October 26, 2006 10:29:10 PM (Central Standard Time, UTC-06:00)
AWESOME post Rocky, absolutely AWESOME! It's posts like these that help keep me in check with just how little I know about doing software correctly and how much there is to learn. Steve Perry's comments about workflows hit the nail on the head too!
Saturday, December 16, 2006 1:55:23 PM (Central Standard Time, UTC-06:00)
Great post. I'm fairly new to SOA, but was thinking along the same lines.

In terms of "aggregate services" though, why not let the client do the aggregation? Say you have a series of WS methods - Add, Subtract, Multiply, Divide. Instead of a Response DoEquation(Request) aggregate, why not a Response[] DoEquation(Request[]) queue? Obviously the client defines order of operation and you'd have to serialize events or delegates (or use reflection to do that), so that you could know how to arbitrarily pass the result of one operation to the next item in the queue. Then the client's business process defines the semantic coupling instead of the service -- which I would imagine that it already does, and you don't have to worry about multiple calls across the wire.

I haven't tried this yet (maybe when I do I'll find out why I haven't heard anyone talking about it), but it seems plausible.
Brad
Wednesday, January 03, 2007 9:55:23 AM (Central Standard Time, UTC-06:00)
There certainly is the idea of the client/consumer doing the aggregation. This is a type of mashup, only with services instead of web content.

The problem with client mashups is that the client still ends up making numerous calls to services to combine their functionality. If the services are too fine-grained this will result in very poor performance, but if the services are too coarse-grained you lose flexibility, or the client ends up calling a big service just to do a small task (and then lives with the side-effects of the big service).

These are challenging issues, there's no doubt about it.
Saturday, January 20, 2007 9:29:27 PM (Central Standard Time, UTC-06:00)
Your explanation of the semantic coupling problem (not to mention cost accounting, through the baking cookies analogy) was very well thought out, Rocky. SOA is definitely no panacea, or put another way, software developers' jobs are safe for another ten years.

There are only two points I can see where these arguments against SOA fall a little short.

1. Unquestionably, I'd choose pushing two operands onto a stack register and calling the ADD instruction over the entire serialization / deserialization, 7 layer OSI stack transport over a network overhead that SOA (and other distributed application method calls) involves. A classroom full of computer science undergrads would have trouble finding a slower way of sending a missive back-and-forth if they tried.

Before I rule out messaging I can't overlook that demand has proven to be the mother of specialization, and that today's programming frameworks and hardware have become exceptionally good at messaging. In the same way stem cells of the human body specialize themselves to accomplish amazing functions, so have vendors innovated the technology to narrow this efficiency gap. My cell phone can send a text message faster than I can make it add two numbers together (there's a calculator under a submenu someplace), and there are now XML parsers that come microcoded on a chip.

2. XML, XSD, WSDL: sure, these address the syntax of the messaging contract and the abstraction of addressing and binding to a service endpoint, but they are still young. I don't think of this as the endpoint. Additional recommendations (and draft works in progress) continue to tackle the real world issues practitioners are encountering.

WS-Policy is the foundational spec for annotating services with a policy, a list of behavioral assertions. At low levels these address things such as the authentication model and encryption expectations, and it's easy to only see these as reshaping the message to its channel. Policy doesn't have to stop there, a service endpoint's policy can describe (agreed upon) behavioral capabilities that its implementation of "ShipAnOrder" performs (like sending a copy of an invoice to arbitrary mailing addresses). Consumers, with their "shopping" list of business requirements can discover and rate those service endpoints that can meet their semantic expectations.

The formality of agreement over policy assertions is where many industries haven't yet reached maturity, and I'd look forward to companies in various industry groups coming together on these definitions to create reusable service ecosystems, as one more milestone before SOA can reach its productivity phase.

- Derek
Derek Harmon