Thursday, July 21, 2005

I am often asked whether n-tier (where n>=3) is always the best way to go when building software.

 

Of course the answer is no. In fact, it is more likely that n-tier is not the way to go!

 

By the way, this isn’t the first time I’ve discussed this topic. You’ll find previous entries on this blog and an article at www.devx.com where I’ve covered much of the same material. Of course I also cover it rather a lot in my Expert VB.NET and C# Business Objects books.

 

Before proceeding further, however, I need to get some terminology out of the way. There’s a huge difference between logical tiers and physical tiers. To avoid confusion, I typically refer to logical tiers as layers and physical tiers as tiers.

 

Logical layers are merely a way of organizing your code. Typical layers include Presentation, Business and Data, the same as the traditional 3-tier model. But when we’re talking about layers, we’re only talking about logical organization of code. In no way is it implied that these layers might run on different computers or in different processes on a single computer or even in a single process on a single computer. All we are doing is discussing a way of organizing code into a set of layers defined by specific function.

 

Physical tiers, however, are only about where the code runs. Specifically, tiers are places where layers are deployed and where layers run. In other words, tiers are the physical deployment of layers.

 

Why do we layer software? Primarily to gain the benefits of logical organization and grouping of like functionality. Translated to tangible outcomes, logical layers offer reuse, easier maintenance and shorter development cycles. In the final analysis, proper layering of software reduces the cost to develop and maintain an application. Layering is almost always a wonderful thing!

 

Why do we deploy layers onto multiple tiers? Primarily to obtain a balance between performance, scalability, fault tolerance and security. While there are various other reasons for tiers, these four are the most common. The funny thing is that it is almost impossible to get optimum levels of all four attributes – which is why it is always a trade-off between them.

 

Tiers imply process and/or network boundaries. A 1-tier model has all the layers running in a single memory space (process) on a single machine. A 2-tier model has some layers running in one memory space and other layers in a different memory space. At the very least these memory spaces exist in different processes on the same computer, but more often they are on different computers. Likewise, a 3-tier model has two boundaries. In general terms, an n-tier model has n-1 boundaries.
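
To make the layer/tier distinction concrete, here is a minimal Python sketch. The names (`LAYERS`, `count_boundaries`, the tier labels) are purely illustrative and not taken from any framework. The same three layers are mapped onto one, two or three tiers, and the number of boundaries falls out as the number of tiers minus one.

```python
# Layers are fixed by how the code is organized; tiers are chosen at deployment.
LAYERS = ["Presentation", "Business", "Data"]

def count_tiers(deployment):
    """Number of distinct places (processes or machines) the layers run in."""
    return len({deployment[layer] for layer in LAYERS})

def count_boundaries(deployment):
    """An n-tier deployment has n - 1 boundaries."""
    return count_tiers(deployment) - 1

# Three deployments of the same layered code (tier names are hypothetical).
one_tier = {"Presentation": "workstation", "Business": "workstation", "Data": "workstation"}
two_tier = {"Presentation": "workstation", "Business": "workstation", "Data": "db-server"}
three_tier = {"Presentation": "workstation", "Business": "app-server", "Data": "db-server"}

for name, deployment in [("1-tier", one_tier), ("2-tier", two_tier), ("3-tier", three_tier)]:
    print(name, "boundaries:", count_boundaries(deployment))
```

Note that the layer list never changes between the three deployments; only the mapping does.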

 

Crossing a boundary is expensive. It is on the order of 1000 times slower to make a call across a process boundary on the same machine than to make the same call within the same process. If the call is made across a network it is even slower. It follows that the more boundaries you have, the slower your application will run, because every call that crosses a boundary pays this penalty.
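
As a rough back-of-the-envelope illustration, the following Python sketch applies the "order of 1000 times" figure to a single request. It is a deliberately simple additive model: the unit cost, the number of calls per hop and the function name `request_cost` are all assumptions made for the example, not measurements.

```python
# Abstract time units; only the ratio matters.
IN_PROCESS_CALL = 1      # assumed cost of one in-process call
BOUNDARY_FACTOR = 1000   # "on the order of 1000 times slower", per the text
CROSS_BOUNDARY_CALL = IN_PROCESS_CALL * BOUNDARY_FACTOR

def request_cost(calls_per_hop, boundaries):
    """Cost of one request that passes through three layers (two hops).

    Each hop is either in-process or crosses a boundary, and the layers
    on either side of a hop talk to each other calls_per_hop times.
    """
    hops = 2
    in_process_hops = hops - boundaries
    return calls_per_hop * (in_process_hops * IN_PROCESS_CALL
                            + boundaries * CROSS_BOUNDARY_CALL)

one_tier = request_cost(calls_per_hop=50, boundaries=0)    # 50 * 2
two_tier = request_cost(calls_per_hop=50, boundaries=1)    # 50 * (1 + 1000)
three_tier = request_cost(calls_per_hop=50, boundaries=2)  # 50 * 2000
print(one_tier, two_tier, three_tier)
```

Even in this simple model the in-process work becomes a rounding error as soon as one boundary is introduced, which is why the number of calls that cross each boundary matters so much.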

 

Worse, boundaries add raw complexity to software design, network infrastructure, manageability and overall maintainability of a system. In short, the more tiers in an application, the more complexity there is to deal with – which directly increases the cost to build and maintain the application.

 

This is why, in general terms, tiers should be minimized. Tiers are not a good thing; they are a necessary evil required to obtain certain levels of scalability, fault tolerance or security.

 

As a good architect you should be dragged kicking and screaming into adding tiers to your system. But there really are good arguments and reasons for adding tiers, and it is important to accommodate them as appropriate.

 

The reality is that almost all systems today are at least 2-tier. Unless you are using an Access or dBase style database your Data layer is running on its own tier – typically inside of SQL Server, Oracle or DB2. So for the remainder of my discussion I’ll primarily focus on whether you should use a 2-tier or 3-tier model.

 

If you look at the CSLA .NET architecture from my Expert VB.NET and C# Business Objects books, you’ll immediately note that it has a construct called the DataPortal which is used to abstract the Data Access layer from the Presentation and Business layers. One key feature of the DataPortal is that it allows the Data Access layer to run in-process with the business layer, or in a separate process (or machine) all based on a configuration switch. It was specifically designed to allow an application to switch between a 2-tier or 3-tier model as a configuration option – with no changes required to the actual application code.
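
The idea of a configuration-driven switch can be sketched in a few lines of Python. This is not the CSLA .NET API; every name here (`CustomerData`, `LocalProxy`, `RemoteProxy`, `make_portal`, the `portal_mode` setting) is hypothetical, and the "remote" call is only simulated by serializing the request and response rather than by talking to a real application server.

```python
import pickle

class CustomerData:
    """Stand-in for the Data Access layer (hypothetical)."""
    def fetch(self, customer_id):
        # A real implementation would query the database here.
        return {"id": customer_id, "name": f"Customer {customer_id}"}

class LocalProxy:
    """Runs data access in the caller's process (2-tier)."""
    def __init__(self, data_access):
        self._data_access = data_access

    def fetch(self, customer_id):
        return self._data_access.fetch(customer_id)

class RemoteProxy:
    """Simulates a boundary crossing by serializing request and response.

    A real remote proxy would send these bytes to an application server.
    """
    def __init__(self, data_access):
        self._data_access = data_access

    def fetch(self, customer_id):
        request = pickle.dumps(customer_id)
        response = pickle.dumps(self._data_access.fetch(pickle.loads(request)))
        return pickle.loads(response)

def make_portal(config):
    """Pick the proxy from configuration; callers never see the difference."""
    proxies = {"local": LocalProxy, "remote": RemoteProxy}
    return proxies[config["portal_mode"]](CustomerData())

def load_customer(portal, customer_id):
    # Business-layer code: identical for 2-tier and 3-tier deployments.
    return portal.fetch(customer_id)

local_result = load_customer(make_portal({"portal_mode": "local"}), 7)
remote_result = load_customer(make_portal({"portal_mode": "remote"}), 7)
print(local_result == remote_result)
```

The point of the sketch is that `load_customer` does not change when the setting does; only the deployment configuration decides whether a boundary is crossed.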

 

But even so, the question remains whether to configure an application for 2 or 3 tiers.

 

Ultimately this question can only be answered by doing a cost-benefit analysis for your particular environment. You need to weigh the additional complexity and cost of a 3-tier deployment against the benefits it might bring in terms of scalability, fault tolerance or security.

 

Scalability flows primarily from the ability to get database connection pooling. In CSLA .NET the Data Access layer is entirely responsible for all interaction with the database. This means it opens and closes all database connections. If the Data Access layer for all users is running on a single machine, then all database connections for all users can be pooled. (This does assume, of course, that all users employ the same database connection string, including the same database user id. That’s a prerequisite for connection pooling in the first place.)
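
The connection string prerequisite can be seen in a toy pool, sketched here in Python. This is a simplified model rather than how ADO.NET implements pooling; `ConnectionPool`, `FakeConnection`, `run_requests` and the connection strings are all made up for illustration, and the requests run one after another rather than concurrently.

```python
from collections import defaultdict

class FakeConnection:
    """Stand-in for a physical database connection."""
    def __init__(self, connection_string):
        self.connection_string = connection_string

class ConnectionPool:
    """Toy pool: idle connections are reused only for an identical connection string."""
    def __init__(self):
        self._idle = defaultdict(list)
        self.opened = 0  # number of physical connections created so far

    def acquire(self, connection_string):
        idle = self._idle[connection_string]
        if idle:
            return idle.pop()
        self.opened += 1
        return FakeConnection(connection_string)

    def release(self, connection):
        self._idle[connection.connection_string].append(connection)

def run_requests(pool, connection_strings):
    """Serve one request per connection string, sequentially."""
    for connection_string in connection_strings:
        connection = pool.acquire(connection_string)
        # ... a real Data Access layer would run its query here ...
        pool.release(connection)
    return pool.opened

# 100 requests sharing one connection string reuse a single connection.
shared = run_requests(ConnectionPool(), ["Server=db;User Id=app"] * 100)

# 100 requests with per-user credentials cannot share anything.
per_user = run_requests(ConnectionPool(),
                        [f"Server=db;User Id=user{i}" for i in range(100)])
print(shared, per_user)
```

With concurrent requests the shared pool would open more than one connection, but still far fewer than one per user.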

 

The scalability proposition is quite different for web and Windows presentation layers.

 

In a web presentation the Presentation and Business layers are already running on a shared server (or server farm). So if the Data Access layer also runs on the same machine, database connection pooling is automatic. In other words, the web server is an implicit application server, so there’s really no need to have a separate application server just to get scalability in a web setting.

 

In a Windows presentation the Presentation and Business layers (at least with CSLA .NET) run on the client workstation, taking full advantage of the memory and CPU power available on those machines. If the Data Access layer is also deployed to the client workstations then there’s no real database connection pooling, since each workstation connects to the database directly. By employing an application server to run the Data Access layer all workstations offload that behavior to a central machine where database connection pooling is possible.

 

The big question with Windows applications is at what point to use an application server to gain scalability. Obviously there’s no objective answer, since it depends on the IO load of the application, pre-existing load on the database server and so forth. In other words, it is very dependent on your particular environment and application. This is why the DataPortal concept is so powerful, because it allows you to deploy your application using a 2-tier model at first, and then switch to a 3-tier model later if needed.

 

There’s also the possibility that your Windows application will be deployed to a Terminal Services or Citrix server rather than to actual workstations. Obviously this approach totally eliminates the massive scalability benefits of utilizing the memory and CPU of each user’s workstation, but does have the upside of reducing deployment cost and complexity. I am not an expert on either server environment, but it is my understanding that each user session has its own database connection pool on the server, thus acting the same as if each user has their own separate workstation. If this is actually the case, then an application server would be beneficial because it would provide database connection pooling. However, if I’m wrong and all user sessions share database connections across the entire Terminal Services or Citrix server, then having an application server would offer no more scalability benefit here than it does in a web application (which is to say virtually none).

 

Fault tolerance is a bit more complex than scalability. Achieving real fault tolerance requires examination of all failure points that exist between the user and the database, and of course the database itself. And if you want to be complete, you must also consider the user to be a failure point, especially when dealing with workflow, process-oriented or service-oriented systems.

 

In most cases adding an application server to either a web or Windows environment doesn’t improve fault tolerance. Rather it merely makes it more expensive because you have to make the application server fault tolerant along with the database server, the intervening network infrastructure and any client hardware. In other words, fault tolerance is often less expensive in a 2-tier model than in a 3-tier model.

 

Security is also a complex topic. For many organizations however, security often comes down to protecting access to the database. From a software perspective this means restricting the code that interacts with the database and providing strict controls over the database connection strings or other database authentication mechanisms.

 

Security is a case where 3-tier can be beneficial. By putting the Data Access layer onto its own application server tier we isolate all code that interacts with the database onto a central machine (or server farm). More importantly, only that application server needs to have the database connection string or the authentication token needed to access the database server. No web server or Windows workstation needs the keys to the database, which can help improve the overall security of your application.
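
A small Python sketch of what this looks like in terms of deployed settings. The setting names, the address and the connection string are placeholders invented for the example, not real CSLA .NET configuration keys.

```python
# Hypothetical per-machine settings for a 3-tier deployment.
WORKSTATION_CONFIG = {
    "portal_mode": "remote",
    "portal_url": "http://appserver.example/dataportal",  # placeholder address
}
APP_SERVER_CONFIG = {
    "portal_mode": "local",
    # Placeholder credentials; only this machine ever sees them.
    "connection_string": "Server=db;Database=sales;User Id=app;Password=secret",
}

def machines_holding_db_credentials(configs):
    """Return the names of machines whose settings include a connection string."""
    return sorted(name for name, cfg in configs.items() if "connection_string" in cfg)

three_tier = {"workstation": WORKSTATION_CONFIG, "app-server": APP_SERVER_CONFIG}

# In a 2-tier deployment the workstation itself must hold the credentials.
two_tier = {
    "workstation": {
        "portal_mode": "local",
        "connection_string": APP_SERVER_CONFIG["connection_string"],
    }
}

print(machines_holding_db_credentials(three_tier))
print(machines_holding_db_credentials(two_tier))
```

In the 3-tier case the workstation only knows where the application server is; in the 2-tier case every workstation carries the keys to the database.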

 

Of course we must always remember that switching from 2-tier to 3-tier decreases performance and increases complexity (cost). So any benefits from scalability or security must be sufficient to outweigh these costs. It all comes down to a cost-benefit analysis.

 

Thursday, July 21, 2005 7:32:26 PM (Central Standard Time, UTC-06:00)
Rocky,

You have a real knack for explaining complex topics. Keep up the good work... some of us out here really need the help. :)

"Scalability flows primarily from the ability to get database connection pooling."

Wouldn't scalability also flow from the ability to add servers (server farm) to physical tiers? I know how much you like marketing hype :) based on reading your book and your blogs, but ... the marketing hype I always heard regarding reasons to move from client server applications to n-tier was the ability to add servers in physical tiers rather than have to throw $ at the expensive dbms server when client server hit the wall. I'm still in n-tier theory land (ramping up the learning curve), and you live in the real world... what's been your experience.

Mike
Thursday, July 21, 2005 11:19:44 PM (Central Standard Time, UTC-06:00)
I really wish you wouldn't state the obvious. Every time I hear you speak you state the obvious crap that we already know. To say that Pat Helland was the best in the SOA space, well shit, what about everything else he has done? For god's sake man, become informative, say something that somebody else hasn't already said. Shit, every time I see you it's about n tier this and soa that....how about reality?
Fred
Friday, July 22, 2005 12:49:18 AM (Central Standard Time, UTC-06:00)
Ahh, but obvious to whom I wonder hmm? ;)
Friday, July 22, 2005 1:14:53 AM (Central Standard Time, UTC-06:00)
It appears you talk of an application server as only hosting the data access layer (DataPortal) in your technology. I must say this is quite different from what I am used to. In my opinion, an application server should host both business logic and data access. This way adding a cluster of application servers would really improve scalability of an application. And about performance - adding a tier decreases only performance when considered from the point of view of a single user (increases response time), but the overall performance of the application when considering maximum load should not be much different from a 2-tier architecture. Of course, this is provided that the upper tiers (e.g. web servers) do not waste resources while waiting for responses from the lower tiers (e.g. application servers).
Magnus
Friday, July 22, 2005 2:50:17 AM (Central Standard Time, UTC-06:00)
Hi Rocky please guide me..

We have a 2-tier system, with no other option. It's a POS system where clients are connected to a server that hosts the database. Now I want to use CSLA .NET. Our clients are Windows Forms based (smart client, maybe). How should I distribute the layers, using remoting?
UI + BL + DataPortal on client
BL + DataPortal + database on server
Also, how can I achieve fault tolerance in single-server scenarios? Is there any link on the net where I can get some info, or any good books?
vishal
Friday, July 22, 2005 8:32:50 AM (Central Standard Time, UTC-06:00)
Rocky,

Jeeze... hope old Fred there didn't run you off. I was interested in your input regarding scalability achieved by adding servers to physical tiers (Web and Windows apps). I guess with Windows CSLA, the potential is PC -> application server (DataPortal) -> dbms. It would seem the DataPortal application server becomes the candidate for physical tier scalability. With Web CSLA, the potential is Browser Monstrosity :) -> Web server (UI and BL) -> DataPortal (CRUD) -> dbms. It was my understanding that one might create specific server types (i.e. web servers) for physical tiers. The web server farm would be built for IO intensive work, and application servers maybe for processing/CPU intensive work. It looks like with Web CSLA you always have BL coupled with the web server. I've always found the topic of server farm scalability complex.... just looking for input from someone who has been there... done that.

Mike
Friday, July 22, 2005 5:47:13 PM (Central Standard Time, UTC-06:00)
Is distribution really a config-time decision? Should we be making tools (like Indigo) that make it so easy to distribute that we can really do it all at config-time?

I think not.
Saturday, July 23, 2005 11:01:02 AM (Central Standard Time, UTC-06:00)
Obviously distribution is first a design-time consideration - then a deployment consideration. In other words, at deployment time you can only distribute layers if they were designed to be distributed. This means that at design time you must choose where the _option_ will be to distribute later. In CSLA .NET for instance, the option to distribute is designed such that the Business layer can sit on the client, but can also sit on the application server if so desired (because it is a mobile object model).

So while Indigo makes it easy to configure your distributed components, that is only useful if the layers were _designed_ to enable that distribution in the first place. Indigo is no more a miracle cure than any other network communication technology throughout the history of computer science.
Saturday, July 23, 2005 11:30:52 AM (Central Standard Time, UTC-06:00)
Mike, the scalability benefit you are talking about with the database server flows from the idea that the database server is often the hardest tier to scale out. In other words, it is comparatively quite hard to add more database servers to gain scalability. Thus most n-tier architectures (2+) tend to be geared toward shifting processing off the database server and onto another tier. Sometimes this is the client, sometimes this is an application server. See my more recent post for a comparison of thick client vs thin client 3-tier models. But in either case (and also in the mobile object case) a key benefit is that processing is shifted from the database tier onto some other tier. The core idea is that it is easier to scale the client workstations and/or the application servers than it is to scale out a database server.
Saturday, July 23, 2005 2:24:58 PM (Central Standard Time, UTC-06:00)
Rocky,

You said in your most current post:

"There are loose analogies in the web world as well, but personally I don’t find that nearly as interesting, so I’m going to focus on the intelligent client scenarios here."

I'm primarily interested in Web / CSLA at the moment, so maybe I can get you interested for just a comment or two. :)

First, using your excellent technique of starting with vocabulary first:

CSLA CRUD = all DataPortal (which can be remoted) standard methods
BL = everything else, method/code that is invoked locally (client PC with Win Forms and IIS web server for web apps).

With CSLA, a 4 tier web app would be Browser Monstrosity -> Web Server (Presentation & CSLA BL) -> Application Server (CSLA CRUD) -> dbms.

Let's assume I wanted to have specialized servers for their specific function (finely tuned web server/s for IO, and a different type of server for the application server/s). Note: I'm asking this as a basis of CSLA education... I buy your arguments and would always lobby for 3-tier web apps rather than 4-tier. Obviously all DataPortal CRUD is on the application server. It would seem to me that some process intensive BL code (that is not CRUD) would be better moved to the application server. I think I read one of your blogs once where you said "what else is there beyond CRUD" responding to a similar question, which I didn't understand. Let me try the following example.

Let's say I retrieved several thousand rows off the dbms (CRUD). Then, maybe I need to cycle through those rows (collection) for various reasons (BL). For whatever reason, I would rather have this processing done on the application server rather than the web server (this lingo is also confusing to me, because the application server is also driven by IIS which sounds like web server to me, but I digress). Doesn't this represent a situation where I would like some non-CRUD processed via the DataPortal? It seems like I read one of your blogs or CSLA comments regarding Command execution which goes through the DataPortal also. If you could help clarify all of this for me, I would greatly appreciate it.

Actually, what I was remembering was using a Command to execute a stored procedure through DataPortal.

http://www.lhotka.net/Articles.aspx?id=2e980a8b-8bdf-4f83-ab18-12e40c6bb04d

I'm asking about something a little different... i.e. I have a large collection that I would like to process through on the remoted application server.

Mike
Friday, October 14, 2005 3:05:55 PM (Central Standard Time, UTC-06:00)
I'm late to this posting but it's relevant to my current situation and so I will ask a question.

In developing a new WinForms application, is there still strong reasoning behind using a traditional client/server approach (2-tier), or is a "going-forward" service-oriented architecture approach more appropriate? Would it be appropriate to move the business logic layer to an app server and move the data access layer to the data tier and move data back and forth between service layers via messages?

I know there are currently some concerns about performance of large resultsets, but most of this can be managed with customized serialization methods.

Do we still do client/server for internal applications or should we change? I see so much value in having a procedural service layer over my domain model that I'm struggling with writing _another_ tightly bound client/server application.

Does anyone have any thoughts?
Sunday, May 07, 2006 12:17:29 PM (Central Standard Time, UTC-06:00)
ok
Tuesday, May 09, 2006 7:40:39 AM (Central Standard Time, UTC-06:00)
Rocky,you are the real great guy!

Finally, i came to find that N-tier for customer application and web application is quite different.

For a web application, the browser, web server, database structure (which should be classified as 2-tier from the viewpoint of the database) already has the merits of n-tier, e.g. scalability and fault tolerance. That's why people are confused when they find that a true 3-tier solution (e.g. J2EE) for the web often leads to unnecessary cost.

steve pan
Monday, December 25, 2006 6:40:46 AM (Central Standard Time, UTC-06:00)
Thanks for sharing your knowledge on the page: http://www.lhotka.net/WeBlog/PermaLink.aspx?guid=efa88d0a-2388-4909-bee1-c9bddb6e9868

With regards to this phrase:
"By employing an application server to run the Data Access layer all workstations offload that behavior to a central machine where database connection pooling is possible."

I would like to propose here that we separate the application layers from the database layer as long as it is possible. The main reason is that in some cases (with Oracle RAC for example), you want failure tolerance.
If the database server and the application server are on the same machine and that machine goes down, RAC and similar technologies can switch to another database server, but nothing will switch the application.

Separation of the servers leads to a somewhat more fault-tolerant application.

I don't consider the above to be a fact, it is just how I think. Feedback is appreciated.
Emmad
Monday, December 25, 2006 10:13:18 AM (Central Standard Time, UTC-06:00)
Yes, certainly. That is precisely what I'm talking about in my discussion: an application server that runs the ADO.NET code (or similar), which is a separate machine from the database server.

But that is only possible if good layering is done when the application is architected. Before you can debate where the data access layer should run, you must actually have a data access layer :)

Given a data access layer, then you can decide where to run that layer based on the requirements of your environment and application. You might run it on the client workstation, on an application server, or I've even seen people run that layer on the database server itself in some scenarios.

But don't confuse my "data access layer" with something like stored procedures, which I consider to be part of the "data storage layer".