Getting a little closer to SOA by Fabrice Marguerie
SOA. This acronym is everywhere and seems to be the next revolution. But how do we put it into practice? Are we going to have to forget most of what we know today and learn yet another technology? Moreover, is it really so revolutionary?
Doing SOA is à la mode. At least, it is a word to know if you don't want to look ridiculous at cocktail parties... However, you won't find a definition of SOA in this article. I will only introduce some rules, and we'll try to determine what's new compared to your current systems and habits, in order to help you make your applications ready for tomorrow's needs. I also hope to show you that SOA can be of some interest to you and that all of this is not only about promises...
I don't want to get into theories such as the one Microsoft pushes with Indigo (integrated with Windows Longhorn), which you may deploy in 2008 or so... (and a lot will probably have happened by then). In any case, there is neither an exhaustive definition nor a set of absolute rules. SOA is still maturing. Hence, this article will be more about practice than theory. I won't cover the entire SOA approach, but will only present my personal interpretation.
I want to believe that SOA can be a source of benefits before being too big a constraint. I'll stick to a pragmatic approach that will help you steer your developments towards SOA and design your applications today so that they are ready for tomorrow's evolutions.
I'll start with a brief introduction to SOA. We'll try to determine whether it is a simple evolution or a totally new way to design our applications. We'll try to understand if there is any conflict between SOA and OOP, then we'll review some rules.
Nota Bene: in this article, you will not only find considerations about SOA but also about N-tier architectures and some best practices.
SOA, SOA... Yet another acronym for some technology that Microsoft and co. are trying to impose on us for their own benefit...
Well, not exactly. It's more about philosophy than technology. Moreover, it may be a philosophy you already apply instinctively. I hope you'll be able to judge for yourselves by the end of this article.
SOA means “Service-Oriented Architecture”. According to DotNetGuru, SOA is:
“Une vision d'un système destinée à traiter toute application comme un fournisseur de services”, Symposium DNG 2003. (“The vision of a system in which each application is seen and managed as a service provider”)
In fact, HP and IBM have been speaking about SOA for some time with a different meaning. Microsoft swears Longhorn will be the ideal platform for SOA, according to their own definition.
SOA is a state of mind. It steers our design towards interoperability and reuse. SOA can be implemented right now, even in your centralized applications. That is also the message I'd like to convey here.
Modern applications aren't monolithic anymore and have to integrate smoothly with the rest of the enterprise information system. This means interacting with existing systems (platforms and applications), and designing new business or technical modules with future reuse in mind.
The main concepts of SOA are:
Reuse and composition, which make it possible to share modules between applications and to exchange data between applications.
Permanence, which implies supporting current and future technologies.
Flexibility, since every application lives, has a precise life cycle, can be enriched with new modules and has to answer new business needs.
Openness and interoperability, in order to share modules between platforms and environments.
Distribution, so that modules can be accessed remotely and can be centralized.
Performance, especially scalability.
This document will focus on defining the main criteria that will ease the transition of applications on those axes.
Of course, you may not be interested in all of those objectives. These theoretical rules have to be adapted, or forgotten if they add no value in your case. For example, we'll sometimes have to optimize applications for one specific platform (.NET, J2EE...) and sacrifice independence to gain performance.
Multi-tier architectures are now well accepted and widely implemented. They lie at the heart of our concerns when we consider SOA and its impact on the logical and physical layers of our applications. Our first diagram is a quick reminder of the classical three-tier architecture:
Interactions between the layers in a 3-tiered architecture
In order to take full advantage of such an architecture, some rules have to be respected. They concern the interactions between the layers and the design of each layer.
In order to maximize flexibility and reuse, layers have to be decoupled from one another.
The communication between layers depends on the physical architecture of the application. When layers are located on remote machines, .NET Remoting or WebServices can be used. When all the layers sit on the same machine, performance can be optimized by making the layers call one another directly in memory.
In any case, the different layers will have to exchange data, not object references (so that data can be grouped and round-trips between the layers can be reduced).
Then, two rules apply:
Data objects must be serializable into binary or text streams.
Data objects must be decoupled from their original data sources. A data object can be stored in many ways in many containers, hence it is important not to directly include the mapping code in the data objects themselves.
We'll come back to those data transfer objects later.
A data access layer is responsible for interacting with databases by executing selection or modification queries.
The code we put in such a layer is inherently coupled to a particular database. For example, a data access layer that has been designed for Oracle won't be compatible with SQL Server. Likewise, a data access layer that has been designed to store data in XML format won't be usable as is to store data in an LDAP directory.
In the following diagram, the BLL layer can use one of the specific DAL layers (which are specific to each database):
In order to guarantee that BLL and DALs stay independent, an abstraction mechanism must be used. This mechanism is described by the Factory pattern.
The Factory design pattern provides an interface to create object instances without specifying their concrete implementation classes.
This pattern allows the BLL to manipulate IDalInvoice objects coming from the DAL without depending on a specific DAL implementation. Here is how the Invoice DAL would be designed:
The BLL layer would only have to use the DalFactory in order to get a specific DAL implementation and hence manipulate this implementation through the IDalInvoice interface.
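The arrangement described above can be sketched in C# as follows. This is only an illustration, not the author's actual design: the article names IDalInvoice and DalFactory, but the GetInvoiceTotal member, the two concrete DAL classes, the provider key strings and the InvoiceLogic class are assumptions, and the database access is stubbed out.

```csharp
using System;

// Contract the BLL depends on. The member shown here is hypothetical.
public interface IDalInvoice
{
    decimal GetInvoiceTotal(string invoiceNo);
}

// Hypothetical SQL Server implementation (database access stubbed out).
public class SqlServerDalInvoice : IDalInvoice
{
    public decimal GetInvoiceTotal(string invoiceNo)
    {
        return 100m; // a real implementation would query SQL Server here
    }
}

// Hypothetical Oracle implementation (database access stubbed out).
public class OracleDalInvoice : IDalInvoice
{
    public decimal GetInvoiceTotal(string invoiceNo)
    {
        return 100m; // a real implementation would query Oracle here
    }
}

// The factory is the only place that knows the concrete DAL classes.
public static class DalFactory
{
    // The provider key is an assumption; it would typically come from configuration.
    public static IDalInvoice CreateInvoiceDal(string provider)
    {
        switch (provider)
        {
            case "SqlServer": return new SqlServerDalInvoice();
            case "Oracle": return new OracleDalInvoice();
            default:
                throw new ArgumentException("Unknown DAL provider: " + provider, "provider");
        }
    }
}

// BLL class: it only ever sees the IDalInvoice interface.
public class InvoiceLogic
{
    private readonly IDalInvoice dal;

    public InvoiceLogic(string provider)
    {
        dal = DalFactory.CreateInvoiceDal(provider);
    }

    public decimal TotalWithTax(string invoiceNo, decimal taxRate)
    {
        return dal.GetInvoiceTotal(invoiceNo) * (1 + taxRate);
    }
}
```

Switching from one database to another then amounts to changing the provider key, without touching InvoiceLogic.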
There are two philosophies: the Object-Oriented one and the Service-Oriented one. Maybe you already belong to the second one. Maybe you are an Object ayatollah. Of course, we are interested in the service-oriented approach in this article, but it is good to know where the differences lie. This is a recurring question, so let's have a quick look at both approaches so that you can recognize which one you belong to.
Here is a typical three-tier architecture with an object model:
We can notice the dependencies between the presentation layer and the business objects. The client code must interact with the object model of the business layer, which increases the coupling and requires a large number of calls between those layers. Such a number of method invocations between layers becomes a problem when business objects are located on a remote machine. Likewise, the number of business objects that the presentation layer has to manipulate reduces the independence between the layers and makes it difficult to learn how to use the business layer.
Code sample:
Customers customers = Customer.List();
Orders orders = customers[0].Orders;
Order order = orders.Add("ORDER001", customers[0]);
order.Lines.Add(new Product("53XYPR0D8"), 2);
orders.Save(); // If the update hasn't been done in real-time
We interact with the Customer, Order and Product business objects. Here we ask the Order business object to take our modifications into account. In some models, updates are executed directly on every call to the business objects. Calls to the server are emphasized by bold characters.
Here is a service oriented architecture that would rely on the same business objects:
You'll have noticed that we introduced a new abstraction layer called “Services”. The Presentation layer doesn't manipulate the business objects directly any more, but uses the services to access them. Business objects are located in class libraries that the services load into memory. Since the service layer and the business layer are now located in the same process, method invocations on our business objects don't suffer any overhead.
Services behave as black boxes: they give an abstraction of the object model and only expose a reduced set of features, which reduces the interchanges between layers.
Code sample:
CustomerData customerData = CustomerService.ListCustomers();
OrderData orderData = new OrderData();
OrderEntity order = orderData.Orders.Add("ORDER001", customerData.Customers[0].CustomerID);
order.Lines.Add("53XYPR0D8", 2);
CustomerService.AddOrders(orderData);
The CustomerService works with data objects such as CustomerData and OrderData. Clients prepare data objects and send them in one call. It is then up to the service to implement the updates. This approach reduces the number of calls to the server and groups data: the methods have become coarser-grained.
As we have seen, the “full object” approach has its limits. Contrary to what we could have expected, those limits also include reuse. In the “full object” model, the layers are coupled to one another. When we reduce this coupling and the dependencies by eliminating direct calls between objects of different layers, our modules become easier to reuse.
In the object model, the different layers are also coupled in time, during the whole life cycle of objects. We can consider that an object that is created and returned by a lower-level layer stays alive as long as it is used by the upper-level layer. This mustn't be the case in a service-oriented architecture, where calls between modules must support asynchronous and/or disconnected modes. However, objects don't lose their value: Object-Oriented programming and Design Patterns will simply be used inside each responsibility layer.
Rumor: SOA implies the death of OOP.
At least, this is a message Don Box spreads (Don Box is the architect of Indigo). Unless it is due to some journalist who wanted a catchy title... When we read articles whose title says “Don Box from Microsoft says SOA paradigms will eclipse object-oriented programming”, we understand where the trouble comes from. But let's clarify the situation: this is not true, SOA won't replace OOP. Whether it is a journalistic distortion or an awkward statement from Don Box, it's time to dispel the fears and be objective.
To contrast OOP and SOA is a bit like saying that OOP will be eclipsed by AOP (Aspect Oriented Programming). The least we can expect from someone having a professional integrity is to avoid excesses like “that's it, I discovered the new technology that will replace everything else!”. But I digress...
Moreover, we could even think of SOA as a comeback of pure Object-Oriented Programming. Do you remember object theory? Objects exchanging messages and calling one another? These ideas have been left aside, at least in C++, Java and .NET. Well, maybe SOA will bring them back into fashion.
Rumor: in SOA, business modules communicate by exchanging messages.
Well, I think we can design an SOA without any message and apply most of the SOA concepts anyway. Once again, you don't have to throw everything away nor to start again from scratch.
Rumor: we cannot speak about SOA without business processes, orchestration and transactions.
Do you really think your applications have become so complex overnight? Do you really need to consider all those problems, to use all those standards and learn all those programs today?
If you don't respect the SOA precepts, your applications are doomed to a tragic fate, your architects will burn in hell and your project managers will be sentenced to hard labor to reimburse the huge losses your company will soon accumulate. No, let's be serious: you have time, and maybe you are already an unwitting pioneer of SOA.
If you base your development on an approach close to Duwamish Books 7, you're on the right track. Duwamish is a sample application provided by Microsoft. The .NET version ships with Visual Studio .NET.
The Duwamish design is a bit old-fashioned, but it already introduced some concerns we find again in SOA. You'll notice the separation into layers, and above all the use of DataSets to transfer data between layers. DataSets introduce data objects (business entities) which are serializable into XML and disconnected from the data source.
Here is a simplified diagram that sums up the design of a .NET application that integrates the classical separation of concerns into layers and SOA:
What we can notice in this diagram:
Layers aren't tightly coupled to one another.
Layers exchange Data Transfer Objects (DTO).
Communications are terse, but messages can be bigger than in a pure object approach.
The Common module (or assembly) contains classes shared by all the other layers, such as data objects. Of course, this can only be true if all the layers are implemented using the same technology.
The whole development relies on a hand-made framework, not only on .NET or J2EE.
The Controls library contains graphical components used by the Presentation layer.
As you can see if you compare the diagrams, SOA can be introduced progressively, since it isn't completely opposed to existing application architectures.
Decoupling is the fundamental concept of SOA. We'll come across it on many occasions. For example, here are three aims of SOA where decoupling appears:
to reduce the coupling between modules (to reach a better reuse)
to reduce the coupling towards the platform and infrastructure (for a better interoperability)
to reduce the coupling between the client of a service and a specific implementation of this service (for a better flexibility)
The keyword in Service-Oriented Architecture is of course “Service”. The concept of “service” isn't so easy to define, even though many people have already tried. I'll humbly propose my own definition:
“A service is a module that can be invoked, that is assigned to a specific function and that offers a well defined interface”.
For example, you can draw a relationship with the different services a hospital provides.
It is important to understand that the “service” concept has a broader meaning than a simple “WebService”. We can do SOA without WebServices and, conversely, implementing WebServices doesn't mean we are doing SOA. WebServices are one technical solution to implement SOA, in the same way as message queuing, for example.
There are many kinds of services. A service can be local or remote. Examples: a service available via SOAP and WSDL on the Internet, a database, a directory...
A service is created in the context of an application (it is rare to publish services without any reason) but has to be designed so that it can be reused in other contexts. Hence, we'll try to anticipate the future uses of a service and the associated constraints. Similarly, services must be designed so that they can be used both locally and remotely. A late adaptation to remoting would require a significant amount of work. Respecting design rules such as the Facade or the Data Objects makes the move to remote applications much easier.
Of course, applications and services will behave differently in different contexts. For example, a service that has been optimized for remote calls won't be optimal for local calls, and vice versa. However, it is possible to be compatible with both usage modes. For example, if method signatures are offered in two versions (one that optimizes calls made within the same process, the other that allows layers to be physically distributed), it is possible to switch from local to remote mode without changing the code of the service.
For example, we could call void ListCustomers(DataSet) locally and DataSet ListCustomers() remotely. The first version enables the Presentation layer to pass a DataSet to be filled. This doesn't fit well with remote invocations since it would imply many inter-process or inter-machine calls. Hence, the second version returns a DataSet that will have been created and filled by the ListCustomers() method of our service, and then serialized to the client.
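The two signatures mentioned above could look like the following C# sketch. It is an illustration under assumptions: the table name, the column names and the hard-coded row are invented here and stand in for whatever the data access layer would really return.

```csharp
using System.Data;

public static class CustomerService
{
    // Local-friendly version: the caller supplies the DataSet to be filled.
    public static void ListCustomers(DataSet target)
    {
        Fill(target);
    }

    // Remote-friendly version: the service creates, fills and returns
    // the DataSet, which can then be serialized to the client.
    public static DataSet ListCustomers()
    {
        DataSet result = new DataSet("CustomerData");
        Fill(result);
        return result;
    }

    // Shared implementation. The hard-coded row is a stand-in for a DAL query.
    private static void Fill(DataSet data)
    {
        DataTable customers = data.Tables["Customers"];
        if (customers == null)
        {
            customers = data.Tables.Add("Customers");
            customers.Columns.Add("CustomerID", typeof(string));
            customers.Columns.Add("Name", typeof(string));
        }
        customers.Rows.Add("C001", "Sample customer");
    }
}
```

Both overloads share the same filling logic, so supporting the two usage modes costs little more than one extra method per operation.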
A service must play a unique and well defined role. In the following diagrams, we present the situation when a service needs to authenticate the caller.
A service that implements authentication
It would be better to isolate the authentication outside the service. In fact, this feature should become a service by itself.
After the authentication has been externalized
This way, the service only has a business role and relies on another service whose business role is to authenticate users. The added value of this factorization is that yet another service will be able to take advantage of the authentication service. It also avoids duplicating authentication code in each service. Hence, we respect the DRY principle (Don't Repeat Yourself): “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”.
This reasoning will also help us implement a Single Sign On (SSO) mechanism, which is further evidence that this refactoring was well-founded.
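As a rough C# sketch of this separation, a business service can receive the authentication service through an interface instead of embedding the check itself. All the names below (IAuthenticationService, InMemoryAuthenticationService, OrderService.CountOrders) and the hard-coded credentials are assumptions made for the example; passing a password on each call is a simplification, not a recommendation.

```csharp
using System;

// Contract of the externalized authentication service (hypothetical).
public interface IAuthenticationService
{
    bool Authenticate(string userName, string password);
}

// Stub implementation; a real one might query a directory or a database.
public class InMemoryAuthenticationService : IAuthenticationService
{
    public bool Authenticate(string userName, string password)
    {
        return userName == "alice" && password == "secret";
    }
}

// Business service: it delegates authentication and keeps only its business role.
public class OrderService
{
    private readonly IAuthenticationService authentication;

    public OrderService(IAuthenticationService authentication)
    {
        this.authentication = authentication;
    }

    public int CountOrders(string userName, string password, string customerID)
    {
        if (!authentication.Authenticate(userName, password))
        {
            throw new UnauthorizedAccessException("Authentication failed.");
        }
        return 0; // stand-in for the real business logic
    }
}
```

Any other business service can reuse the same IAuthenticationService, so the authentication code exists in only one place.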
Modules must be designed from an outside point of view: designers should take the point of view of a user of the module being designed. What is exposed to the user of this module is called a facade or the service interface, as opposed to the implementation, which contains the processing code.
A business module cut off
-> Simplify the interface of your business modules.
It is very important to simplify the interface our modules publish to the outside world. We must provide users with a public interface that doesn't force them to manipulate objects directly. That's why we have to use a facade.
-> The interfaces of our modules represent a business contract
The second aim is to separate the contract from its implementation. This makes it possible to change the implementation without breaking the contract. The same interface can be shared by many implementations, and services can be swapped without problems.
The join point between the service and its consumers is the facade. In this situation, it becomes easier to replace a service with an equivalent one, since the biggest difficulty is to establish the link between service providers and consumers.
The Facade design pattern simplifies and unifies the interface of a coherent and possibly autonomous sub-system. A facade is a simplified entry point of an API, which makes a set of classes easier to use.
To implement the facade, the only thing to do is to group all the features a business module offers and to expose this group as a list of methods, so that those methods aren't spread over many classes.
Recommendation: define module facades before implementing them
This is all the simpler since a facade can be defined from UML use cases. Behaviors that are functionally close will be gathered in one facade (which can work with many business objects). After all, when you define use cases, do you think about business objects or about behaviors (services invoked by users)?
Nice side effect: Martin Fowler uses the term “Remote Facade”, which implies that the facade model is well suited to distribution. By making calls coarser-grained, we reduce the round trips and hence improve performance.
Here are the methods we could put in the CustomerService facade:
CustomerData ListCustomers()
CustomerData ListCustomers(filter)
CustomerData GetCustomer(customerID)
void AddCustomers(CustomerData data)
void DeleteCustomer(customerID)
...
... and in the OrderService facade:
OrderData ListOrders(customerID)
OrderData GetOrder(orderID)
void AddOrders(OrderData data)
void DeleteOrder(orderID)
...
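One way to express such a facade in C# is an interface for the contract plus an interchangeable implementation, as sketched below. The method names come from the list above (the filtered ListCustomers overload is omitted for brevity); the parameter types, the CustomerEntity class and the in-memory implementation are assumptions for illustration, the dictionary standing in for the business objects and the data access layer.

```csharp
using System.Collections.Generic;

// Data objects: data only, no business logic.
public class CustomerEntity
{
    public string CustomerID;
    public string Name;
}

public class CustomerData
{
    public List<CustomerEntity> Customers = new List<CustomerEntity>();
}

// The contract: a coarse-grained list of methods gathered in one place.
public interface ICustomerService
{
    CustomerData ListCustomers();
    CustomerData GetCustomer(string customerID);
    void AddCustomers(CustomerData data);
    void DeleteCustomer(string customerID);
}

// One possible implementation; clients only depend on ICustomerService.
public class InMemoryCustomerService : ICustomerService
{
    private readonly Dictionary<string, string> namesById = new Dictionary<string, string>();

    public CustomerData ListCustomers()
    {
        CustomerData result = new CustomerData();
        foreach (KeyValuePair<string, string> pair in namesById)
        {
            result.Customers.Add(new CustomerEntity { CustomerID = pair.Key, Name = pair.Value });
        }
        return result;
    }

    public CustomerData GetCustomer(string customerID)
    {
        CustomerData result = new CustomerData();
        string name;
        if (namesById.TryGetValue(customerID, out name))
        {
            result.Customers.Add(new CustomerEntity { CustomerID = customerID, Name = name });
        }
        return result;
    }

    // Several customers are added in a single call.
    public void AddCustomers(CustomerData data)
    {
        foreach (CustomerEntity customer in data.Customers)
        {
            namesById[customer.CustomerID] = customer.Name;
        }
    }

    public void DeleteCustomer(string customerID)
    {
        namesById.Remove(customerID);
    }
}
```

Because consumers hold an ICustomerService reference, the in-memory class could later be replaced by a WebService proxy or a database-backed class without changing client code.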
Data objects only contain data, no business logic. They can be thought of as simple data structures that carry data from one layer to another. Of course, this strict definition can (should?) be adapted in some situations. We'll sometimes allow ourselves to attach minimal behavior to those objects.
Data Objects must respect some rules such as:
They must be serializable into XML
They must be independent of any implementation platform. (For example, a Data Object coming from a service that is implemented in Java must be consumable from a .NET application. XML serialization guarantees this point. In this example, both sides have an object, in .NET for the client and in Java for the service, that is an incarnation of the same XML schema.)
They must be independent of the data source. The best way to make sure that this point is respected is not to include any persistence code in the Data Objects.
Data Objects have many synonyms, such as Business Entity, Value Object or Data Transfer Object (DTO). The last one has the advantage of making explicit that such an object only contains an aggregation of data to be transferred, and no business logic.
The .NET DataSet is an object that respects those rules, which is why it is recommended. But scalar types such as integers, decimals, dates or strings aren't excluded. Any ValueType, such as a structure or an enumeration, is also valid. A last remark: be cautious with DataSets, since their default serialization scheme makes them difficult to use from J2EE. We have to choose an XML format that doesn't couple us to any specific platform.
Here is what the OrderData object can look like:
1 OrderHeader (OrderNo, CustID, OrderDate, ShipDate)
1 Customer (CustID, Name)
1..n Line (ProductID, Quantity)
Such an object can carry all the data an order contains in one call. Hence, we avoid splitting our invocations into one for the header, one for the customer information, and one for the order lines. A method that would send back this data object could be: OrderData GetOrder(orderId).
One last note, to warn you about Object-Relational mapping frameworks that don't allow you to serialize data into XML. There are plenty of them, and they are often ill-suited to service-oriented architectures. Object-Relational mapping tools often take for granted that the entities you manipulate are linked to data sources that are updated when objects are modified. In SOA, the persistence responsibility falls to a specific layer: persistence code is certainly not located in the entity itself.
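The OrderData structure above could be written in C# along these lines, using the standard XmlSerializer to show that it travels as plain XML. The field names follow the outline above; the field types, the OrderDataXml helper class and the resulting XML layout are assumptions made for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class OrderHeader
{
    public string OrderNo;
    public string CustID;
    public DateTime OrderDate;
    public DateTime ShipDate;
}

public class Customer
{
    public string CustID;
    public string Name;
}

public class Line
{
    public string ProductID;
    public int Quantity;
}

// Data object: one header, one customer, one or more lines. No persistence code.
public class OrderData
{
    public OrderHeader Header = new OrderHeader();
    public Customer Customer = new Customer();
    public List<Line> Lines = new List<Line>();
}

// Hypothetical helper showing that the data object serializes to and from XML.
public static class OrderDataXml
{
    public static string ToXml(OrderData order)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(OrderData));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }

    public static OrderData FromXml(string xml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(OrderData));
        using (StringReader reader = new StringReader(xml))
        {
            return (OrderData)serializer.Deserialize(reader);
        }
    }
}
```

Nothing in OrderData refers to a database or to a particular service, so the same class can be filled by any data access layer and sent to any client that understands the XML.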
By the way, try asking yourself these questions:
Can I save my data objects in different formats?
Do I control the method invocations that are made to support persistence? Are those calls frequent? When do they occur?
Is it possible to reuse my data objects on another platform? Without the object-relational mapping tool?
Are my business objects going to connect directly to the database?
What happens when I distribute my presentation layer or the clients of my service?
Some people may find that I gave an unconventional and simplistic view of SOA. Well, this will only make me proud, since my aim was to avoid the complexity SOA may impose on us.
I deliberately left aside asynchronous invocations, disconnected mode, transactions, business processes, service orchestration and security in order not to make the message too complex. These points shouldn't be neglected, but it is possible to consider them later, without making our developments 100% SOA compatible right from the start. However, I invite you to look at these features, at least to know where we are heading and to keep in mind the constraints you may have to take into account in the future. For example, applications should be designed so that they can work in an asynchronous or disconnected mode even if that is not needed for now.
Here are some topics you can examine to go further:
Shadowfax, which will become the new reference application from Microsoft. It aims to show how an SOA solution can be set up right now.
Indigo, the WebServices-based inter-application communication infrastructure that will be integrated into Windows Longhorn (a version will be released for Windows XP and Windows 2003 as well)... but that you won't deploy for some years yet... Keep a distant eye on it anyway...
Things about BPM (Business Process Management) and BPEL (Business Process Execution Language)
Biztalk and equivalent products to orchestrate Business Processes
In conclusion, don't hesitate to adapt SOA to the real constraints of your applications (not the opposite). Theories are here to guide us, not to restrain us.
Who is Fabrice Marguerie?
Fabrice Marguerie is a .NET architect for Masterline. Fabrice works as a consultant on design and implementation missions. He has created and developed Masterline's .NET framework. He also writes a weblog in English: http://weblogs.asp.net/fmarguerie and administrates the web site: http://sharptoolbox.com
His company: Masterline
Founded in 1989, Masterline is a service and consultancy company which mainly targets three markets: business intelligence, e-business and SAP. Masterline offers technology consultancy, engineering and outsourcing services whose first aim is to create value for its clients. Masterline builds its skills (.NET, Java, Object and UML...) around dedicated competence centers and has been a Microsoft partner since 1994.
Translator (French to English): Thomas GIL
Copyright DotNetGuru January 2004
Shadowfax: http://workspaces.gotdotnet.com/shadowfx
PetshopSOA: http://dotnetguru.org/article.php?sid=286
Duwamish: Documentation – Powerpoint slides
Udi Dahan weblog:
http://udidahan.weblogs.us/archives/012683.html - I'll see you when you get there
http://udidahan.weblogs.us/archives/012770.html - But that trick never works
15 seconds - Realizing a Service-Oriented Architecture with .NET: a simple and efficient web service oriented introduction http://www.15seconds.com/issue/031215.htm
The Pragmatic Programmer: if it is not yet your bedside book, go and get it right now! To be read again and again. The fifth chapter, "Bend, or Break", speaks about SOA before the word even existed. The authors also have a web site: http://www.pragmaticprogrammer.com/