A really useful post by Edwin, Ronald and Demed: Start Small, Grow Fast.
Start Small, Grow Fast
by Edwin Biemond, Ronald van Luttikhuizen, and Demed L’Her
A set of pragmatic best practices for deploying a simple and sound SOA footprint that can grow with business demand.
Published January 2012
Table of Contents
Does your organization have middleware administrators?
Scripting versus manual configuration
Searching instances – composite sensors and bpel:exec
Negative testing – an absolute requirement
Other administrative considerations
Build a cluster – even if only with a single node
Think about your domains
Hardware, database and OS choices
Purging & Backup – how will you handle database growth?
Centrally storing artifacts in MDS
Canonical models – keep it simple
Service design guidelines and naming conventions
Wrap frequently used Oracle SOA Suite APIs in simpler custom APIs
Building Agility – using DVMs and Business Rules
In an ideal world, everyone would start with Service-Oriented Architecture (SOA) in a very systematic and strategic fashion: hiring a team of developers and architects, surveying existing assets in the enterprise, coming up with an exhaustive enterprise-wide architecture, and deriving from there the corresponding hardware and software needs. As you can imagine, such a comprehensive approach requires considerable amounts of time, resources, and upfront planning that not everyone can afford. We often see companies looking at deploying an integration infrastructure in a much more nimble, tactical way; for instance to simply interconnect two applications. At the same time they would like to take the opportunity to lay a foundation that can eventually grow and support a more comprehensive SOA. The good news is that these two objectives are not irreconcilable.
This document highlights a short set of simple and pragmatic best practices that will help you build a solid foundation for the future. These practices span several domains: administration, infrastructure, development, and architecture.
You might find it surprising that this article begins with administrative considerations, since SOA efforts generally originate from the development or architecture side. Unfortunately, operations teams are very often involved too little and too late in SOA projects. Developers often dump a project into the operations team’s lap at the last minute, with unrealistic deadlines. This typically results in dramatically revised project timelines, or projects that go live without the required infrastructure in place, putting at risk months and months of careful development work.
A concrete recommendation would be to include in your virtual team at least one system administrator from the get-go. Ensure that the operations team is trained on the technology and that the team gets involved in the sizing, security, manageability, and acceptance criteria for go-live.
Remember that in the end your management team will measure your success by your adherence to end-to-end project timelines as well as by the stability and performance of the application – rather than by the elegance of the internal composite and BPEL design. An able and willing operations team is integral to the success of any SOA project.
Does your organization have middleware administrators?
The skills and responsibilities of administrators vary from organization to organization. Administering middleware components such as Oracle SOA Suite and Oracle Service Bus (OSB) is a different ball game from administering databases, networks, or file systems. Make sure your organization employs or hires administrators who know their middleware, even when the environment is small and easily manageable. Since SOA components run in Oracle WebLogic Server and use an Oracle Database, application server administrators or database administrators often are the ones who end up taking over SOA administrative responsibilities. Ideally, the administration of production environments should not be left up to developers – for skills, priorities and accountability reasons.
There are three options for deploying SOA composite applications:
- using an Integrated Development Environment or IDE (Oracle JDeveloper);
- using an administration console (Oracle Enterprise Manager);
- using scripts (WebLogic Scripting Tool and Ant).
Administrators rarely have (or want) IDEs in their environments, which restricts that first deployment option to developers. In test, acceptance, and production environments administrators can use either Oracle Enterprise Manager or Ant and WebLogic Scripting Tool (WLST) scripts. While deployment through Oracle Enterprise Manager might seem feasible when dealing with only a few composites, scripting should always be the preferred approach. Stepping through deployment dialogs can quickly become tedious, but more importantly, it increases the chance of human error (wrong partition, wrong configuration plan, etc.). Start early in the project with the creation of Ant and WLST scripts for building, deploying, testing, and starting your Service Component Architecture (SCA) composites and Metadata Services (MDS) repository. The Oracle JDeveloper integration directory and the bin directory of the Oracle SOA Suite Home contain custom (SCA) Ant tasks that can be used for this purpose. WLST has specific commands for SCA composites as well. See Oracle Fusion Middleware Administrator’s Guide for Oracle SOA Suite and Oracle Business Process Management Suite for details.
SOA composites are dependent on many other services and assets, such as databases, web services, Enterprise JavaBeans (EJB), or messaging providers, such as Advanced Queuing (AQ) or Java Message Service (JMS). As composites flow from one environment to another (Development, Test, Acceptance, and Production – DTAP) the references to these dependencies must be updated. A web service that is invoked from a particular composite might be running on different ports in the test and production environments. A best practice is to use the out-of-the-box Oracle SOA Suite configuration plans to capture all environment-specific information – such as web service endpoints, Java EE Connector Architecture (JCA) settings, and Oracle Web Services Manager (OWSM) policies – in a separate configuration file instead of hardcoding them in your composites. Oracle WebLogic also supports WebLogic Server deployment plans, but these are meant for Java 2 Platform Enterprise Edition (J2EE) and Oracle WebLogic deployment descriptors and can’t be used with SOA composites.
Another best practice is to avoid creating a separate configuration plan per composite per environment. For example, suppose the number of SOA composites you deploy increases to 145 (not an unrealistic number). With four environments (DTAP) you would then have 145 * 4 = 580 separate configuration plans to maintain! It is better to consolidate these into a single configuration plan per environment, so that the same plan is used to deploy all SOA composites to that specific environment. If these files become too big and hard to maintain you might have to look at something more granular – for instance, one plan per SOA partition per environment. Eventually this will result in greater consistency across projects, and significantly fewer chances for error.
Figure 1: A typical deployment scheme leveraging one configuration plan to account for environmental differences
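The idea of one plan per environment can be sketched in a few lines of Python. This is only an illustration of the search-and-replace principle and of the arithmetic above, not the actual configuration plan format: the host names, the PLANS dictionary, and both helper functions are assumptions made up for this example.

```python
# Illustrative sketch only: host names and helpers are hypothetical,
# and this is not the Oracle SOA Suite configuration plan file format.
ENVIRONMENTS = ["development", "test", "acceptance", "production"]

# One set of search/replace pairs per environment, shared by all composites.
PLANS = {
    "development": {"http://localhost:8001": "http://dev-soa.example.com:8001"},
    "test": {"http://localhost:8001": "http://test-soa.example.com:8001"},
    "acceptance": {"http://localhost:8001": "http://acc-soa.example.com:8001"},
    "production": {"http://localhost:8001": "http://prod-soa.example.com:8001"},
}


def apply_plan(url, plan):
    """Replace a design-time URL prefix with the environment-specific one."""
    for search, replace in plan.items():
        if url.startswith(search):
            return replace + url[len(search):]
    return url  # nothing environment-specific in this URL


def plans_to_maintain(composites, environments, per_composite):
    """Number of plan files to maintain under each strategy."""
    return composites * environments if per_composite else environments


print(plans_to_maintain(145, len(ENVIRONMENTS), per_composite=True))   # 580
print(plans_to_maintain(145, len(ENVIRONMENTS), per_composite=False))  # 4
print(apply_plan("http://localhost:8001/soa-infra/services/default/Orders",
                 PLANS["production"]))
```

The same four entries serve every composite, which is why adding a 146th composite adds no new plan files.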
Scripting versus manual configuration
A SOA environment typically consists of different components and intermediaries, such as OSB, Oracle SOA Suite, databases, and Java/JEE applications. Their respective configurations are not centralized, but are split over various files and repositories. Those configurations can be security settings (LDAP, WS-Security headers), messaging settings (JMS, AQ), database settings (JDBC), logging settings, endpoint settings (WSDL URLs), etc. If you configure each of these systems by hand it will quickly become very hard to track what you are doing. In addition, as you expand your SOA footprint and need to provision new environments, you will want them to be as close as possible to the existing ones to facilitate maintenance and troubleshooting; nothing is more frustrating and time-consuming than environment-specific problems!
The recipe to consistency is scripting: script as many configuration tasks as possible from the beginning. While this requires some upfront investment, it will ensure that your provisioning procedures are consistent and that all the expected benefits derive from there: easier maintenance and troubleshooting, faster turnaround time, and simpler updates. Also, configurations should be documented in a well-known central location, such as a team wiki, where they will be readily available to all stakeholders.
Concrete examples of tasks that you should consider scripting:
- Creation and management of users, groups, and roles in an LDAP repository using LDIF scripts.
- Creation and configuration of WebLogic Domains using WLST.
- Creation and configuration of JMS and JDBC artifacts using WLST or configuration based on Domain templates.
- Installation of all software, using silent installs and response files. This applies to the roll-out of new environments as well as to the addition of new nodes to an existing cluster.
WLST is a great tool for this. More on WLST can be found in Oracle Fusion Middleware Oracle WebLogic Scripting Tool.
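As a small example of the first item in the list, an LDIF file for users and groups can be generated from a script instead of being typed by hand. The sketch below is plain Python rather than WLST; the base DNs, the user, and the group name are assumptions chosen for illustration, and your directory may require different object classes or attributes.

```python
# Illustrative sketch: the DNs, user, and group below are hypothetical.
def user_entry(uid, common_name, surname, base_dn="ou=people,dc=example,dc=com"):
    """Build one LDIF entry for a user."""
    return "\n".join([
        f"dn: uid={uid},{base_dn}",
        "objectClass: top",
        "objectClass: person",
        "objectClass: organizationalPerson",
        "objectClass: inetOrgPerson",
        f"uid: {uid}",
        f"cn: {common_name}",
        f"sn: {surname}",
    ])


def group_entry(name, member_dns, base_dn="ou=groups,dc=example,dc=com"):
    """Build one LDIF entry for a group with its members."""
    lines = [
        f"dn: cn={name},{base_dn}",
        "objectClass: top",
        "objectClass: groupOfNames",
        f"cn: {name}",
    ]
    lines += [f"member: {dn}" for dn in member_dns]
    return "\n".join(lines)


def build_ldif(entries):
    """LDIF separates entries with a blank line."""
    return "\n\n".join(entries) + "\n"


admin_dn = "uid=soaadmin,ou=people,dc=example,dc=com"
ldif = build_ldif([
    user_entry("soaadmin", "SOA Administrator", "Administrator"),
    group_entry("SOAOperators", [admin_dn]),
])
print(ldif)
```

Keeping the list of users and groups in one script means every new environment gets exactly the same directory content.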
Searching instances – composite sensors and bpel:exec
Every now and then a business user or administrator wants to know what happened to a specific request that entered the system, or wants information about the state of a certain process. Answering such questions might be fairly easy in low volume environments – a couple of SOA composites and a few dozen instances. It’s a radically more difficult exercise when you have dozens or hundreds of SOA composites and tens of thousands or even millions of instances. It is helpful if the business user knows the exact date of a fault or request. Usually, however, you will want to be able to find the information using specific data, such as an invoice number or an employee name.
Composite sensors are a convenient and out-of-the-box mechanism to enable the search of specific instances based on specific fields contained in the payload. Think of composite sensors as database indexes.
As with database indexes, you should not go overboard – only add necessary composite sensors. Evaluate the need on a per composite basis. For synchronous and short-lived composites that are frequently invoked, one should consider turning off audit trails and simply re-invoking the service in case of an error. In such cases, searching for specific instances does not provide much added value, only overhead. Long-running processes, on the other hand, are excellent candidates for composite sensors.
On a side note, bpel:exec activities can be used to set composite instance titles from BPEL components. Such instance names can also be used to pinpoint instances. However, setting instance titles this way can be somewhat dangerous for a variety of reasons: XPath expressions inside bpel:exec activities are not verified at compile and deploy time and will need to be thoroughly tested. A change in the namespaces or element names on which the instance title is based can cause runtime faults, and most of all, this embedded code is difficult to debug and manage.
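The database index analogy for composite sensors can be made concrete with a toy sketch. This is not the Oracle SOA Suite API; the SensorIndex class, the invoiceNumber field, and the instance IDs are all hypothetical and only show why a lookup by business key is cheap once the value has been captured when the instance is created.

```python
from collections import defaultdict


class SensorIndex:
    """Toy stand-in for a composite sensor: indexes instances by one payload field."""

    def __init__(self, field):
        self.field = field
        self._by_value = defaultdict(list)

    def record(self, instance_id, payload):
        # Capture the field value at instance creation, like an index entry.
        if self.field in payload:
            self._by_value[payload[self.field]].append(instance_id)

    def find(self, value):
        # Direct lookup instead of scanning every instance payload.
        return list(self._by_value.get(value, []))


index = SensorIndex("invoiceNumber")
index.record(10001, {"invoiceNumber": "INV-42", "amount": 120})
index.record(10002, {"invoiceNumber": "INV-43", "amount": 75})
index.record(10003, {"invoiceNumber": "INV-42", "amount": 120})

print(index.find("INV-42"))  # [10001, 10003]
```

As with a real index, every recorded field costs storage and write time, which is why sensors should be added only where searching is actually needed.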
Negative testing – an absolute requirement
The single most important piece of advice in this article may be this: make sure to extensively test the problems you are likely to encounter in production. Ninety percent of the time, as deadlines loom, people keep on polishing and iterating on their design, reducing the time left for testing. Resist the temptation and keep these extra bells and whistles for the next milestone – bullet-proofing your projects is far more important! At minimum, you should be testing and devising recovery procedures for the following:
- Infrastructure database going down (or a network failure);
- Target system going down (or a network failure);
- Target system returning unexpected data (such as an application-level error message).
It’s important to know how the system will behave in the above situations, and how to remediate them. These problems, in accordance with Murphy’s Law, will typically happen when you are away from the office on vacation, so make sure you document:
- Recovery procedure (in what order to stop and restart services; can instances be re-submitted as-is or does any compensation logic need to take place first, etc.).
This is your insurance for a good night’s sleep and uninterrupted business. A wiki is a perfectly acceptable place to document such important information, so long as you ensure that it is kept current and always available for your administrators. Ideally, you should also script the remediation procedures (Ant or WLST), but documentation is the absolute minimum.
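A minimal sketch of what such negative tests might look like, written in plain Python with stub targets instead of real systems. The outcome labels, the classify_outcome function, and the stubs are assumptions for illustration; in a real project these checks would run against your composites through your test framework of choice.

```python
# Illustrative sketch: the stubs simulate the failure modes listed above.
def classify_outcome(call_target):
    """Decide how a caller should react to whatever the target does."""
    try:
        response = call_target()
    except ConnectionError:
        return "retry"            # target or network down: resubmit later
    if not isinstance(response, dict) or "status" not in response:
        return "manual-review"    # unexpected data: do not guess
    if response["status"] != "OK":
        return "business-fault"   # application-level error message
    return "ok"


def target_down():
    raise ConnectionError("target system unreachable")


def target_application_error():
    return {"status": "ERROR", "message": "Customer is on credit hold"}


def target_garbage():
    return "<html>503 Service Unavailable</html>"


def target_healthy():
    return {"status": "OK"}


for stub in (target_down, target_application_error, target_garbage, target_healthy):
    print(stub.__name__, "->", classify_outcome(stub))
```

Each outcome label should map to a documented recovery procedure, so that the test suite and the operations wiki describe the same situations.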
Other administrative considerations
EIS: Enterprise Information System
The term denotes (integration points to) systems, including messaging middleware (AQ, JMS, MQSeries), packaged applications (EBS), and other systems and platforms (e.g., FTP Server, File System, Mail Servers).
When you use one of the many JCA Resource adapters in OSB or Oracle SOA Suite you need to create a resource adapter plan which contains all the EIS connections of that adapter. This plan (an XML file) will be created in a folder on the Admin Server. It also needs to be made available to each and every managed server. An alternative to creating copies on each local managed server is to use a shared storage, such as an NFS share.
Oracle SOA Suite 11g has SOA partitions (somewhat comparable to “BPEL domains” in Oracle SOA Suite 10g, without the tuning part). Partitions come in handy from an architectural and administrative point of view. Partitions are a mechanism to categorize composites into various groups, much like files are organized in folders. How you want to categorize them is up to you – this can be based on function, business domains, etc. The additional value of partitions is that they make it possible to execute various tasks, such as activation and deployment in bulk-mode, for all composites contained within a partition. You should consider using partitions for easier manageability.
On the infrastructure side, your overall objective should be to build a base system that you can easily expand (to handle more load or to increase your fail-over capabilities) without having to re-architect the entire system. We highly recommend that you take the time to read and familiarize yourself with the full-fledged Oracle Enterprise Deployment Guide and Oracle Fusion Middleware High Availability Guide. For now, what follows is a short set of must-have considerations.
Build a cluster – even if only with a single node
While the current requirements might not yet call for full high-availability (HA) or high-throughput of messages, things are likely to change in the future. If you are successful in your roll-out (which we do not doubt for a minute!) you will soon see more demand for your services. At that point you will need to ensure that you can scale out to handle the load and possibly implement non-functional requirements, such as high-availability.
To achieve HA and higher throughput you have two main options:
- WebLogic cluster, no shared storage. The fact that you have more than one server means that your operations will go on (albeit at reduced capacity) should one server die. However, this configuration does not completely prevent message loss. If a server fails you will have to manually restart it (after taking care of the event—such as a hardware failure—that might have caused it to go down in the first place). When the server starts up again it tries to recover by reading all the transaction logs and the JMS file persistence files. If this fails (for instance, because of corrupted disks) you might end up losing some in-flight requests/messages. Also you need to keep the domain on all servers in sync: for example, your resource adapter plans should be located on every server.
- WebLogic cluster with shared storage. In this case all the Managed Servers store the transaction logs and the JMS file persistence files on a shared storage. You would typically enable the server migration option here. Server migration allows the WebLogic Node Manager to migrate the Managed Server from the machine that just failed to a new one (this takes about a minute), add the Virtual IP to this machine, and start the Managed Server. The migrated Managed Server will then resume operations using transaction logs and the JMS file persistence files from the shared storage. This is the best setup to ensure no interruption of service and no loss of transactions.
Once you have more than a few services on your SOA platform and your services are used more intensively, HA will no longer be a nice-to-have option: it will become an absolute must-have.
It is important to note that there is no easy path to grow a single server install into a cluster—you will have to reinstall. This is why it is critical to always start with a cluster. If you do not have enough hardware for a 2-node cluster you can even build a single-machine cluster; this works, too, and ensures that you can expand in the future.
A few more tips:
- Use individual IP addresses for all components. While you can install WebLogic Admin Server, Node Manager, and the Managed Servers using a single IP address and different port numbers, it is not a good idea in production. Instead, give every component its own IP (or VIP in server migration) and use standard ports. The Node Manager can use the machine IP and all the WebLogic Servers have their own IP (you can add extra IPs to a single network card). This way, you can run a cluster on one machine and you can move the Admin Server, or one of the nodes of the cluster, to a different machine at a later point (remove the IP from the NIC, add it to the new machine, and start the server up).
- Use JVM monitoring tools. Leverage Garbage Collection tools such as JRockit Mission Control that can help you to find out if you need to assign more memory to the JVM. You will also be able to tune or detect problems in your Oracle SOA Suite cluster.
- Leverage WebLogic channels. Channels allow you to separate cluster traffic (Oracle Coherence, synchronization, Node Manager traffic, Admin Server traffic) from production traffic. By using a different segment and switch, you reduce the chance that heavy production network traffic will negatively impact cluster integrity or give way to a “split brain” scenario.
- Load balancer. Use a load balancer in an Oracle SOA Suite cluster to divide traffic between services and detect outages of servers in the cluster. You can use either a hardware or software load balancer (e.g., Oracle HTTP Server). For defining EJB and JMS endpoints you can add all the Managed WebLogic Server URLs to the T3 URL to achieve load balancing.
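For the last tip, the T3 URL that lists all Managed Servers is a comma-separated list of host:port pairs after the t3:// scheme. The helper below is a small sketch that builds such a URL from an inventory of servers; the host names and port are assumptions for illustration.

```python
# Illustrative sketch: host names and ports are hypothetical.
def cluster_t3_url(servers):
    """Build a T3 URL that lists every Managed Server as host:port."""
    if not servers:
        raise ValueError("at least one Managed Server is required")
    return "t3://" + ",".join(f"{host}:{port}" for host, port in servers)


managed_servers = [("soa1.example.com", 8001), ("soa2.example.com", 8001)]
print(cluster_t3_url(managed_servers))
# t3://soa1.example.com:8001,soa2.example.com:8001
```

Generating the URL from the same server inventory that your provisioning scripts use avoids stale endpoint lists when a node is added to the cluster.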
Think about your domains
We assume here that you are already familiar with the concept of WebLogic Server Domains. Domain topologies can range from the simplest (both Admin Server and Managed Server on the same server; no Node Manager) to the most complex (Admin Servers separated from Managed Servers, Node Manager, multiple WLS Domains to divide applications and services in functional domains, etc.). There is no such thing as a “one-size fits all” domain topology; there are too many variables changing from one project to another: available hardware, functional requirements, etc. However, here are a few tips and guidelines that we can provide in this area:
- Use Node Manager. Node Manager is lightweight and very convenient to manage your servers (e.g., stop and start) and necessary for server migration.
- Do not combine Admin and Managed Servers. While the developer install allows you to combine Admin and Managed Servers in a single instance of WebLogic, you should not choose that topology for production environments. However, these two WebLogic instances can be co-located on a single hardware box if necessary.
- Deploy your software and services to Managed Servers. Don’t deploy software and services to Admin Servers.
- Leverage domains to partition environments with different lifecycles or significant functional differences. While you should avoid having too many domains (which would introduce administration overhead), keep in mind that all servers in a given domain need to be managed, upgraded, and tuned together in lock step. For these reasons you should use separate domains to deploy applications from different families (e.g., a domain for Identity and Access Management, a domain for JEE/Java applications, and another one for SOA), applications with different upgrade cycles (all servers in a domain need to be on the same WebLogic Server version), applications with significant functional differences that might call for different tuning (synchronous versus asynchronous for instance), or applications managed by different organizations (e.g., in-house development versus partner-provided development versus off-the-shelf applications).
- Dedicate the most performant hardware to Managed Servers. If you have heterogeneous hardware you should reserve the fastest machines for the Managed Servers (the ones actually processing requests). Admin Servers are used for configuration, deployment, monitoring, etc. As such, their usage is less frequent and often less demanding performance-wise. You might need to make an exception here if you are heavy users of Oracle Enterprise Manager, which can also become resource intensive, especially in high-volume environments. Note that we are talking about performance here – not reliability. For obvious reasons your Admin Servers should also be deployed on very reliable hardware. You might also consider co-locating the Admin Server on the same machine as a managed server to save on hardware resources.
We advise you to read the Oracle Enterprise Deployment Guide (EDG) for in-depth information on domain topologies and options. Keep in mind that the EDG lays out a very exhaustive deployment architecture that may be more complex than is necessary for your situation.
Hardware, database and OS choices
The best OS to run Oracle SOA Suite is usually the one you have the most experience with: from Windows to Linux and Solaris. As of this writing, a typical server to run SOA would be a 64-bit quad-core machine with 16 to 32 GB of RAM. Linux, Solaris, and Windows are the operating systems we most often encounter.
The database is a key requirement for Oracle SOA Suite and more specifically for components that need to persist state, such as BPEL. (On the other hand, Oracle Service Bus makes little use of the database unless you use reporting and OWSM policies.) While Oracle SOA Suite supports a variety of databases for its infrastructure, and can integrate with many more through the adapter, it is highly recommended to deploy on Oracle Database. It is by far the most-traveled path, and Oracle SOA Suite contains many optimizations for that platform, such as support for partitioning, etc. Also, usage of RMAN and Data Guard on the database server is always a good idea.
Try to reproduce your expected production load on an acceptance environment (which should be an exact clone of the production environment). This will ensure that your system can meet expected business requirements, and it is also the best place to train your administrators.
During load testing, make sure to document a few key statistics. For instance, the average and maximum response time for a given service, or the fact that a given call should result in X instantiations of a given downstream composite. You should also monitor database growth during the tests and use these results to extrapolate the size of the database you will need in production. Testing is the only reliable way to make this assessment since the amount of data to dehydrate is specific to your composites.
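The extrapolation itself is simple arithmetic once the load test has produced a growth figure. The sketch below assumes linear growth per instance and a fixed retention period before purging; all numbers are invented for illustration and should be replaced by your own measurements.

```python
# Illustrative sketch: every figure below is a made-up example value.
def projected_size_gb(measured_growth_mb, test_instances, daily_instances, retention_days):
    """Extrapolate dehydration store size from a load-test measurement.

    Assumes growth is linear in the number of instances and that instances
    are purged after retention_days.
    """
    per_instance_mb = measured_growth_mb / test_instances
    return per_instance_mb * daily_instances * retention_days / 1024


# Example: the test created 10,000 instances and the database grew by 512 MB.
# Production is expected to see 200,000 instances a day, kept for 30 days.
size = projected_size_gb(512, 10_000, 200_000, 30)
print(round(size, 1), "GB")  # roughly 300 GB
```

Add headroom on top of the result for indexes, peaks, and purge runs that fall behind schedule.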
Purging & Backup – How will you handle database growth?
Every time a SOA composite is invoked, data is added to the dehydration store in the database. This is where the state is persisted, and is key to transactionality, auditing, etc. The rate at which the dehydration store will grow will directly correlate to the volume of data the Oracle SOA Suite is processing. One thing is certain: at some point you will have to purge. The purging procedure should be tested early on and be part of your acceptance criteria – do not postpone this consideration until you are live and running out of database space!
Purging is a complex topic that goes beyond the scope of this document. However, there are four main options for deleting instances from the dehydration store, listed below from the simplest to the most advanced:
- Oracle Enterprise Manager Console
- Looped purge
- Parallel purge
- Database partitioning
The EM Console option is mostly for deleting specific instances on an individual basis – it is not a purging option per se. High-volume environments should seriously consider database partitioning, and it is important to note that this option requires careful upfront planning (it will require a partitioning license at the database level as well as the re-partitioning of the schemas, therefore precluding the adoption of this strategy post go-live). Low to moderate volume environments can use looped or parallel purging at periodic intervals (for instance, every 24 hours).
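To give a feel for what a looped purge does, here is a toy sketch using an in-memory SQLite database. It is not the purge script shipped with Oracle SOA Suite: the instances table, its columns, and the batch size are assumptions, and the only point is the loop that deletes old, completed instances in small batches until none are left.

```python
import sqlite3

# Illustrative sketch: table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, created_day INTEGER, state TEXT)")
conn.executemany(
    "INSERT INTO instances (created_day, state) VALUES (?, ?)",
    [(day, "running" if day == 3 else "closed") for day in range(1, 11)],
)
conn.commit()


def looped_purge(connection, cutoff_day, batch_size):
    """Delete closed instances older than cutoff_day, one small batch at a time."""
    total = 0
    while True:
        cursor = connection.execute(
            "DELETE FROM instances WHERE id IN ("
            " SELECT id FROM instances"
            " WHERE state = 'closed' AND created_day < ? LIMIT ?)",
            (cutoff_day, batch_size),
        )
        connection.commit()  # commit per batch to keep transactions small
        if cursor.rowcount == 0:
            break
        total += cursor.rowcount
    return total


deleted = looped_purge(conn, cutoff_day=8, batch_size=2)
remaining = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
print(deleted, "deleted,", remaining, "remaining")  # 6 deleted, 4 remaining
```

Note that the running instance from day 3 survives even though it is older than the cutoff; a purge must never remove instances that are still in flight.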
Even with the fastest hardware, the best infrastructure, and a great operations team, poorly written software will restrict performance. In this section we will look at design-time considerations to take into account to ensure that SOA composites are performant and maintainable.
Centrally storing artifacts in MDS
Service reuse is key to capitalizing on the benefits of SOA. In order to consume a given service, your project must have access to the service’s interface, usually defined through WSDLs and XSDs. The simplest option is to import these artifacts and to save them locally – in your project. However, this will eventually result in maintaining as many copies of these artifacts as you have consuming projects. Whenever the contract, interface, or version of that service changes, each consuming project will need to re-import and re-deploy the associated WSDLs and XSDs. The local storage of these shared resources is not a very solid approach in the long run: the duplication of the artifacts not only complicates management and synchronization but also has an impact on the overall memory utilization on the server. Oracle Metadata Services (MDS) is a runtime repository, installed out-of-the-box with Oracle SOA Suite, that can be used to centrally store and share reusable artifacts, such as service contracts (WSDLs), canonical data models (XSDs), and fault handling configuration (fault policies and fault bindings) across multiple composites. Composites and components can refer to MDS artifacts using a logical URL (starting with oramds://).
A best practice is to use MDS to keep a single copy of these artifacts, shared across multiple composites (versus copying them into each and every composite). This will avoid unnecessary redeployments, increase the speed of service changes, reduce overall memory consumption, aid in avoiding invalid composites, and save a lot of time on maintaining your artifacts.
Subsequent sections in this document will highlight the use of MDS for various specific purposes.
Canonical models – keep it simple
Even if you are going to start with only a couple of services, chances are that they are going to be touching some core (business) objects – objects that you are going to encounter again and again in your composites and various future projects. For instance, if you are going to connect your customer portal to your CRM you should spend some time looking at what the schema for your customer objects should be. You don’t need to boil the ocean and define canonical models for every single entity you will be working with in this first project. It is perfectly acceptable to rely on 1:1 mappings for peripheral entities that are used less often. Also keep in mind that using canonical models will have some performance impact: to go from format A to format B you will need to transform from A to Canonical and from Canonical to B, therefore introducing one extra transformation. However, having a canonical model for your core objects is critical, and some careful planning will go a long way in ensuring future re-use and consistency:
- Start by drafting your canonical customer objects in the context of your simple initial project. Think about namespaces you want to use and limit yourself to the complex type level. You can make a schema with basic elements of certain sizes or types and re-use that in your complex types.
- MDS is a good place for storing your canonical model (as previously discussed).
- Bringing another system into the picture is a good way to validate your initial choices, for instance, a system that might be part of your next integration project. Will your canonical model still work with this system?
- You should not introduce a network hop just for canonicalization/de-canonicalization: if you are already using OSB to do service virtualization, it is the best place for these transformations. Otherwise you could also use a Mediator within your composite to do the conversion.
- Canonical models are often shared between composites and other systems and services (implemented outside Oracle SOA Suite). Consider using a local model for composites when composites invoke each other. That way, a change in the canonical model won’t break the integration between composites. Use the canonical data model on the integration and mediation layer (typically OSB if it is deployed).
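The trade-off described above (one extra transformation per message, but far fewer mappings as systems are added) can be sketched as follows. The field names for the CRM, canonical, and portal formats are assumptions made up for this example.

```python
# Illustrative sketch: all field names below are hypothetical.
def crm_to_canonical(crm_customer):
    """Map the CRM's format to the canonical customer."""
    return {
        "customerId": crm_customer["CUST_NO"],
        "fullName": f'{crm_customer["FIRST"]} {crm_customer["LAST"]}',
    }


def canonical_to_portal(customer):
    """Map the canonical customer to the portal's format."""
    return {"id": customer["customerId"], "displayName": customer["fullName"]}


def point_to_point_mappings(systems):
    """Directed mappings needed when every system maps to every other one."""
    return systems * (systems - 1)


def canonical_mappings(systems):
    """Each system only maps to and from the canonical model."""
    return 2 * systems


crm_record = {"CUST_NO": "C-1001", "FIRST": "Ada", "LAST": "Lovelace"}
print(canonical_to_portal(crm_to_canonical(crm_record)))
# {'id': 'C-1001', 'displayName': 'Ada Lovelace'}

for n in (2, 5):
    print(n, "systems:", point_to_point_mappings(n), "direct vs",
          canonical_mappings(n), "canonical")
```

With only two systems the canonical route costs more mappings than a direct one, which is why 1:1 mappings remain acceptable for peripheral entities; the benefit appears as more systems share the same core objects.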
Another best practice is to keep your service interface definition as lean as possible. Instead of passing in full every business entity your composite operates on, pass only the logical identifiers of these entities. The identifiers can be used inside your composites to retrieve the business entities using elementary services. This mechanism is known as the “Claim Check” pattern. It promotes loose coupling (since changes in business entities (XSDs) do not break service interfaces), can reduce payload sizes in audit trails and dehydration stores, and makes sure that the business entities being operated on are up-to-date. Instead of dragging possibly stale entities through our services, we retrieve the actual data if and when it is needed.
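A minimal sketch of the Claim Check idea, in plain Python: the request carries only identifiers, and the process looks up the entity when it needs it. The in-memory CUSTOMERS store, the get_customer function (standing in for an elementary service), and the field names are all assumptions for illustration.

```python
# Illustrative sketch: the store, service, and fields are hypothetical.
CUSTOMERS = {"C-1001": {"name": "Ada Lovelace", "tier": "silver"}}


def get_customer(customer_id):
    """Stand-in for an elementary service that returns the current entity."""
    return dict(CUSTOMERS[customer_id])


def approve_order(request):
    """The request holds identifiers only; data is fetched when needed."""
    customer = get_customer(request["customerId"])
    return f"Order {request['orderId']} approved for {customer['name']} ({customer['tier']})"


# The request is created with identifiers only, not the full customer entity.
request = {"orderId": "O-77", "customerId": "C-1001"}

# The customer changes while the (possibly long-running) process is waiting.
CUSTOMERS["C-1001"]["tier"] = "gold"

print(approve_order(request))  # Order O-77 approved for Ada Lovelace (gold)
```

Because the request never contained the customer record, the process cannot act on a stale copy, and a change to the customer schema does not affect the request format.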
Contract-first, bottom-up or meet-in-the-middle?
So far in this article we have assumed a bottom-up approach to SOA, one that is organic, driven from an integration project, and without extensive upfront planning. Consider taking a contract-first or meet-in-the-middle approach for those interfaces exposed to the outside world. Many good articles and blogs discuss these approaches.
When following a bottom-up development approach (for instance, by exposing the WSDL generated by a JCA adapter wizard) you have little control over some attributes, such as naming, namespace, operations, etc. This might be fine for your internal interfaces, but it is less desirable for those outside interfaces that need to be long-lived, logical, and self-describing (naming-wise), and that should not change too often. When you develop services contract-first or meet-in-the-middle, you can control the naming conventions of the WSDL and XSD artifacts, and you can easily add more than one operation per service. Contract-first and meet-in-the-middle development also allows you to reference the canonical data model instead of ending up with auto-generated schemas.
Some additional best practices:
- Use abstract WSDLs: Contract-first WSDLs of services can be so-called “abstract WSDLs,” i.e., they only contain messages and portType – no implementation details, such as binding or service. Put this WSDL in MDS and use (reference) it from there to avoid dealing with environment specifics such as concrete endpoints. You can download the concrete WSDL (i.e., the implementation) from Oracle Enterprise Manager after the composite has been deployed. Rename it to xxxImpl.wsdl and add this next to the abstract WSDL in MDS. This WSDL can be used by other composites that reference the service when you deploy composites to a particular environment using, for example, configuration plans. Note that you might have to validate and fix the WSDL and XSD imports from the downloaded WSDL.
- You do not need to expose auto-generated WSDLs: If you have already created your composite and are not satisfied with the WSDL that was auto-generated as a result, use a Mediator to map from your ideal (contract-first) WSDL and XSDs to that auto-generated WSDL and XSDs. In this case the Mediator helps you take a meet-in-the-middle approach to reconcile contract-first and bottom-up approaches.
- Protect your core logic from external changes with a Mediator: When directly calling an external service that you do not own (versus going through an intermediation layer such as Oracle Service Bus) you might want to use a Mediator with a transformation. This way, if the external service ever changes you only need to update the Mediator, not the business logic inside your BPEL or BPM components. You can use the same approach for transformations between the “official” canonical data model and “local” elements used in your composites.
- Use MDS to avoid startup problems deriving from dependencies: Composites might reference each other. When starting Oracle SOA Suite, composites are activated one by one – there is no explicit control on the activation order (unlike WebLogic deployments such as EJBs, for instance). During startup, composite references need to be resolved; otherwise a composite will fail to start. Such a situation could arise when a referenced external service is down or a composite that is referenced by another composite is activated in a later stage during startup of Oracle SOA Suite. Using MDS, you can make sure that the composites are independent of each other and external services that are outside your span of control. The composites will read the WSDL and XSDs of other services from the MDS instead of trying to retrieve their endpoints. MDS therefore decouples services and avoids invalid composites at startup of the Oracle SOA Suite. 
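The Mediator guideline in the list above (protecting core logic from external changes) boils down to keeping one small translation step between the external format and the canonical one. The sketch below shows that shape in plain Python. The field names (`cust_no`, `first`, `last`, `customerId`, `fullName`) and both functions are made up for illustration; in a composite, the mapping would live in the Mediator's transformation.

```python
def external_to_canonical(external: dict) -> dict:
    """The only place that knows the external provider's field names (hypothetical)."""
    return {
        "customerId": external["cust_no"],
        "fullName": f'{external["first"]} {external["last"]}',
    }


def core_logic(canonical_customer: dict) -> str:
    """Business logic that depends on the canonical model only."""
    return f'Welcome, {canonical_customer["fullName"]}'
```

If the external provider renames `cust_no`, only `external_to_canonical` changes; `core_logic` is untouched.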
Service design guidelines and naming conventions
Individual software developers have their own styles; that is the case for every craft. One style isn’t necessarily better than another, and teams can accommodate different styles for some variety – as long as it doesn’t result in fundamental inconsistencies and unmanageable code. Bottom line: every development team should agree on baseline coding conventions.
The goal is to make the process of on-boarding new developers as smooth and quick as possible as your SOA effort grows. Uniformity in coding style can help since new developers know what to expect. This also helps operational teams who might have to delve into the code or simply follow the audit trails of various composites.
Two important aspects of style in SOA projects are: a) how services are designed (what components together form a composite), and b) naming conventions for composites, components, services, and references.
We recommend that your list of coding conventions should cover at least the following aspects:
- Language conventions: if you are located in a non-English speaking country you need to set clear guidelines in this area. A lot of terms are already in English (e.g., the BPEL activity names). Should you use English for everything or use as much of your native language in programming (e.g., variable names and activity names)? Again, there is no one-size-fits-all solution. However, using English is often a safe bet. Remember that your operations teams (which might reside in another country if your company is global) will be exposed to certain things, such as variable names when they appear in audit trails. External resources (consultants, Oracle support, etc.) might also need to be involved at some point.
- Naming conventions for composites, services, references, and components: e.g., should we use some short identifier as a prefix for every composite?
- Naming conventions for BPEL and BPM components and activities: This is especially important since discovering what happened in long-running BPEL and BPM instances can be somewhat tricky if the number of executed activities grows long. Consistent naming of scopes, sequences, and other activities can greatly help.
- Naming conventions for composite sensors: If you use composite sensors for searching specific instances through Oracle Enterprise Manager, or use the Oracle SOA Suite APIs, you might want to have specific naming conventions for composite sensor names.
There is no need for a forty-page document of coding conventions – a wiki page often meets the requirements.
Wrap frequently used SOA Suite APIs in simpler custom APIs
The SOA Suite APIs are designed to be very generic, with good reason. The goal was to allow them to be used from multiple tools, multiple platforms, in multiple ways. However, that genericity comes with a price: options that may be irrelevant to your specific environment, naming that might be too abstract, etc.
It soon pays off to wrap the generic out-of-the-box Oracle SOA Suite APIs—especially the TaskQueryService and TaskService—in your own APIs, which you can then expose to your clients. Give the APIs self-describing names and make them single-purpose if possible. This will simplify things for your developers and reduce the potential for errors. Another positive side-effect is that your wrapper API will shield clients from potential changes in the out-of-the-box Oracle SOA Suite APIs when they upgrade to a newer version of Oracle SOA Suite.
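As an illustration of the wrapping idea, the sketch below puts a single-purpose facade in front of a deliberately generic query interface. Note that `GenericTaskClient` is a hypothetical stand-in written for this example; it is not the Oracle TaskQueryService or TaskService API, whose real signatures are not reproduced here. The field names and the `InvoiceApprovalTasks` wrapper are likewise assumptions.

```python
class GenericTaskClient:
    """Hypothetical generic task-query API (NOT the real Oracle interface)."""

    def __init__(self, tasks):
        self._tasks = tasks

    def query(self, filters: dict, columns: list, order_by=None, page_size=100):
        # Generic: any filter, any column list, optional ordering and paging.
        rows = [t for t in self._tasks
                if all(t.get(k) == v for k, v in filters.items())]
        if order_by:
            rows.sort(key=lambda t: t[order_by])
        return [{c: t[c] for c in columns} for t in rows[:page_size]]


class InvoiceApprovalTasks:
    """Single-purpose wrapper with a self-describing method name."""

    def __init__(self, client: GenericTaskClient):
        self._client = client

    def open_tasks_for(self, assignee: str) -> list:
        # Callers no longer choose filters, columns or ordering themselves.
        return self._client.query(
            filters={"assignee": assignee, "state": "ASSIGNED",
                     "type": "InvoiceApproval"},
            columns=["task_id", "title"],
            order_by="task_id",
        )
```

Client code calls `open_tasks_for("jcooper")` and never sees the generic options. If the underlying API changes in a later release, only the wrapper body needs to follow.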
Building Agility – using DVMs and Business Rules
Having too many different versions of the same composite in your environment can be difficult to manage. Two or three versions of the same service pose no problem and will support backwards compatibility and agility (service consumers are not forced to migrate to new versions all at once). Ten or twenty versions, on the other hand, will just be overwhelming, and will result in loss of control. Unfortunately a redeployment with the same version number means that in-flight instances for that composite will be marked as stale. Consider deploying as a new version if you still have in-flight instances.
The deployment of a new composite version does not always result from added or changed functionality. It can also be triggered by a minor design change or update.
A best-practice is to use Domain-Value Maps (DVMs) and Business Rules to encapsulate fast-changing logic (business rules) or values and lookups (DVM). These components can be updated at runtime without having to redeploy the composite, and therefore without stopping all in-flight instances. Start using DVMs and Business Rules when you can foresee that changes might be needed in the future. Do not hardcode such logic in BPEL and BPM activities or Mediators.
Consider the example of a service that picks up a file and feeds its content into another service. Initially we might have to start with PDF files, reading files with the *.PDF extension and mapping the content to an application/pdf mimetype. This pair of values could be hard-coded in a Mediator, but this would require changing the XSLT and redeploying the composite when we want to support an additional extension/mimetype combination. Instead, if we use a DVM, we can add any new extension/mimetype combination at runtime using the SOA Composer web console – without any modification and redeployment of the composite.
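The difference between a hard-coded pair and a lookup table can be sketched in a few lines of Python. The dictionary below only plays the role of a DVM for illustration; `mime_dvm`, `mime_type_for` and the fallback value `application/octet-stream` are assumptions made for this example, not part of the product.

```python
# Stand-in for a DVM: rows that can be edited at runtime (assumption).
mime_dvm = {"pdf": "application/pdf"}


def mime_type_for(filename: str, dvm: dict) -> str:
    """Look up the mimetype for a file extension instead of hard-coding the pair."""
    extension = filename.rsplit(".", 1)[-1].lower()
    # Unknown extensions fall back to a generic type (a choice made for this sketch).
    return dvm.get(extension, "application/octet-stream")
```

Supporting a new extension is then a data change, for example `mime_dvm["docx"] = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"`, while `mime_type_for` itself stays as it is.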
Another example is the evaluation of a discount level based on customer status. These levels are likely to change in the future, and probably faster than the overall business logic implemented in your composite. The best way to capture this is through a Business Rule. This will allow the various discounting rates to be updated without a redeployment of the composite. Consider capturing these rules using the new 11g Decision Tables – they are much more intuitive, and powerful enough for a wide range of use cases.
Another useful feature of Business Rules in this context is the ability to specify an effective starting date for rules. Some business rules (and changes to existing ones) should only be valid after a certain date (e.g., the date when the rule or regulation becomes effective).
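The discount example and the effective-date feature can be pictured together as a small decision table in which every row carries the date from which it applies. The Python sketch below is only a model of that idea, not how Oracle Business Rules stores or evaluates rules; the statuses, rates and dates are invented for illustration.

```python
from datetime import date

# Each row: (customer status, discount rate, date from which the row applies).
# All values are illustrative assumptions.
DISCOUNT_RULES = [
    ("GOLD", 0.10, date(2012, 1, 1)),
    ("GOLD", 0.15, date(2012, 7, 1)),
    ("SILVER", 0.05, date(2012, 1, 1)),
]


def discount_for(status: str, on: date, rules=DISCOUNT_RULES) -> float:
    """Return the rate of the most recent row that is already effective on `on`."""
    applicable = [(start, rate) for s, rate, start in rules
                  if s == status and start <= on]
    if not applicable:
        return 0.0  # no effective rule yet: no discount (assumption)
    return max(applicable)[1]  # the row with the latest effective date wins
```

A new rate can be added ahead of time with a future effective date; calls made before that date keep returning the old rate.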
One of the common pitfalls in service and process modeling occurs when the logic to handle all exceptional behavior is bundled together with the “normal” process logic.
Take the example of an invoice processing flow. We can add a human task and assign it to the finance department in the event that the invoice amount is higher than the approved amount. Such scenarios can be referred to as business faults. These “exceptions” have meaning to the business and differ per business service. Modeling these activities in SOA composites is perfectly legitimate.
On the other hand, there is another class of faults: technical or system faults (incorrect payloads, unavailable endpoints, etc.). The logic to handle these can also be captured inside the same SOA composite. However, such logic is unrelated to a functional or business need and will probably be the same for several (or all) composites. Adding the same fault-handling logic for technical errors to every composite leads to duplicate logic in SOA composites and a mix of boiler-plate code and functional logic. The boiler-plate code clutters your SOA composites and can make their design look like a highly complex printed circuit board.
The more SOA composites you create the harder it is to change the generic fault-handling logic if it is embedded in all composites. Also, the mix of boiler-plate code and business logic makes it difficult to maintain oversight. A best-practice is to implement the handling of technical and system faults outside the SOA components in separate fault policy files. Oracle SOA Suite offers an out-of-the-box fault-policy framework for this.
You can either add fault-policy configuration per composite, or centrally in MDS. If your focus is on starting small and growing fast, it is recommended to create a generic fault-handling mechanism and store it in MDS so it can be reused. This makes it easier to change the fault-handling for your composite (it is defined in one place), and adding fault-handling capabilities to new composites becomes easier by just referring to the MDS artifacts.
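The benefit of defining technical fault handling once can be shown with a small Python model. This is a sketch of the principle only: it does not reproduce the format or behavior of the Oracle SOA Suite fault-policy files, and `RemoteFault`, `InvalidPayloadFault`, `FAULT_POLICY` and `invoke_with_policy` are hypothetical names.

```python
class RemoteFault(Exception):
    """Technical fault, e.g. an unavailable endpoint."""


class InvalidPayloadFault(Exception):
    """Technical fault, e.g. a message that fails validation."""


# One shared policy: fault type -> (action, maximum attempts). Values are assumptions.
FAULT_POLICY = {
    RemoteFault: ("retry", 3),
    InvalidPayloadFault: ("human_intervention", 1),
}


def invoke_with_policy(operation, policy=FAULT_POLICY):
    """Run `operation`; handle technical faults according to the shared policy."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return ("ok", operation())
        except tuple(policy) as fault:
            action, max_attempts = next(
                v for k, v in policy.items() if isinstance(fault, k))
            if action == "retry" and attempts < max_attempts:
                continue
            # Retries exhausted, or the policy asks for a person straight away.
            return ("human_intervention", str(fault))
```

Every caller wraps its invocation in `invoke_with_policy`, so changing the retry count for `RemoteFault` is a single edit to `FAULT_POLICY` rather than a change to each composite's logic.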
Before you start building composite services, think about what can fail, and about the consequences of that failure. Do you need to compensate to correct duplicated entries – if possible? Or can you use XA (global transactions), which will automatically roll back the transaction? When using XA it is important to test and check that every component takes part in this transaction. Otherwise, adapters could commit (possibly in a retry) before the global transaction is committed or rolled back, resulting in duplicate entries. When using AQ or JMS as a starting point for the global transaction, a best-practice is to configure a max retry and error queue to avoid infinite loops.
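The "max retry and error queue" advice guards against a message that can never be processed and is redelivered forever. The following Python sketch models that safeguard with an in-memory queue; it does not use AQ or JMS, and `MAX_RETRIES`, `drain` and the message layout are assumptions chosen for illustration.

```python
from collections import deque

MAX_RETRIES = 3  # assumed delivery limit; in practice this is configured per queue


def drain(queue: deque, error_queue: list, handler) -> None:
    """Process (message, deliveries) pairs; park a message after MAX_RETRIES failures."""
    while queue:
        message, deliveries = queue.popleft()
        try:
            handler(message)
        except Exception:
            if deliveries + 1 >= MAX_RETRIES:
                error_queue.append(message)  # stop redelivering a poison message
            else:
                queue.append((message, deliveries + 1))  # redeliver later
```

Without the `MAX_RETRIES` check, a message that always fails would be put back on the queue indefinitely and the loop would never end.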
Finally, you should consider from the beginning which fault-handling and prevention capabilities should reside in the service bus (OSB) layer versus the process layer (BPEL and BPM). 
Oracle SOA Suite allows you to deploy unit tests alongside your composites. You should leverage this feature for the very same reasons that you require your Java developers to deliver unit tests for all their code. In addition, these tests are crucial in ensuring that an infrastructure patch or upgrade has not introduced any regression. It is advised to make unit testing an integral part of your build process using the specific Ant tasks shipped with Oracle SOA Suite for running unit tests. See the testing references at the end of this article for background information.
The line between software architecture and design is not always clear. In this section we identify some best practices that transcend composite design, and discuss best practices for shaping, categorizing, and governing your services.
A simple composite that returns the number of working days between two dates is not the same, in terms of business value and granularity, as a composite that implements a bank’s mortgage business process.
Not all services are equal. People often use prefixes to denote the type of a service: business service, process service, elementary service, composite service, and so on.
If you want to expand your SOA efforts, you will need a structure to divide your services into various categories. This is necessary because different rules apply to different types of services. Examples of such guidelines are: simple service composition of two automated services should be done in OSB for performance reasons; elementary services should not contain process logic; and business services should be implemented in BPEL or BPM.
A best practice is to have a simple and concrete service categorization when you start your SOA effort. Have an overall view of the types of services, and map these to the middleware components to realize them and identify the rules that govern the different service types.
You don’t need to devise a full-blown SOA Governance scheme, but a few naming conventions and categories will help.
Granularity of services
How granular should services be? This is one of the most frequently asked questions in the SOA space. There is no single, definitive answer, but there are a few guidelines that might help to find the answer that is right for your situation.
Thinking about granularity from the start of your SOA effort is important. Although refactoring is a part of life for software developers, we don’t want to structurally refactor all our composites at every stage, simply because we haven’t thought through how small or big our composites should be. The most important aspect in determining granularity is probably reusability: if a piece of software is (or will be) used by more than one consumer, this piece of software should be a separate service. If not, we can incorporate this piece of software in a larger composite to hide it.
More specifically, for SOA composites this would mean:
- If a component (e.g., a business rule, a BPEL component, etc.) is reusable, create a dedicated SOA composite to contain it.
- If not, embed the non-reusable component in an existing SOA composite or merge it with another SOA composite to create something that is (re)usable.
This approach will also help to hide small pieces of software that the outside world doesn’t need to know about – this is also called “encapsulation” or “implementation hiding.” You should only expose components as services when they are actually invoked by the outside world; other components should not be exposed. 
While re-usability is an important key for defining the granularity of SOA composites, it isn’t the only factor. You should also take into account organizational and administrative considerations. For instance:
- Rate of change: fast changing components should be separated from more stable ones.
- Availability: it makes sense to create a separate SOA composite for something that needs to be available 24/7 in order to keep it separate from other components that are less critical and not in constant use.
- Ownership: ideally, a service or process should have a single (business) owner. If more than one owner is identified, the composite might be too big.
Decoupling through events
Synchronous interactions tend to be simpler to understand and often constitute the bulk of simpler integration projects. However, asynchronicity is key to scalability. In addition, businesses tend to think in terms of events: think of the arrival of a new invoice, the check-in of a bag for a flight, or the hiring of a new employee. It makes sense for your SOA to mimic this model.
Consider the following:
- Human interactions cannot be synchronous – they must be asynchronous.
- In a synchronous world your resilience is dictated by the weakest component in the system: a failure from that component implies the need to reprocess all the steps in the chain.
- In a synchronous world your scalability is dictated by the slowest service in the chain: what happens if a service that can handle a burst of millions of messages in an hour invokes a service with a lower throughput?
- The list of interested parties in any given message might be long – and not finite (it might grow over time). Are we going to extend a Mediator (fan-out) for every new service that wants to be notified of these messages?
Building asynchronicity through business events or messaging will provide the level of decoupling needed to cope with the above. More specifically:
- Decoupling. Publishing services don’t need to be aware of consuming services and their operations. Publishing services can merely publish events and go back to processing the next without having to worry about who might need the event. This reduces the number of interdependencies between services.
- Ease of deployment and administration. Messages will not be lost if the receiving end is down. The messaging middleware will store these events until the receiver is up again in case of durable subscriptions (supported by AQ and JMS; not by EDN). Administration teams can take some services offline without having to stop the entire SOA infrastructure.
- Throttling. Message queues can hold a buffer of messages when more messages are produced than receivers can handle. Of course, this is only useful for temporary bursts. If such a situation is permanent, there is no magic: you will need to scale up your consumers. Also, remember that a queue is not a bottomless pit: it can get full as well (e.g., when storage runs out).
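The decoupling and buffering points in the list above can be modeled with a very small in-memory broker. This Python sketch is not EDN, AQ or JMS: `EventBroker` and the event and subscriber names are hypothetical, nothing is persisted, and the queues are unbounded, which a real queue is not.

```python
from collections import defaultdict, deque


class EventBroker:
    """In-memory stand-in for messaging middleware with one queue per subscriber."""

    def __init__(self):
        self._queues = defaultdict(dict)  # event name -> {subscriber name: deque}

    def subscribe(self, event_name: str, subscriber: str) -> None:
        self._queues[event_name].setdefault(subscriber, deque())

    def publish(self, event_name: str, payload: dict) -> None:
        # The publisher does not know who, if anyone, is listening.
        for queue in self._queues[event_name].values():
            queue.append(payload)

    def receive(self, event_name: str, subscriber: str) -> list:
        """Called by a subscriber once it is up; returns everything buffered for it."""
        queue = self._queues[event_name][subscriber]
        messages = list(queue)
        queue.clear()
        return messages
```

Adding a new interested party is a `subscribe` call; the publishing side is not changed. A subscriber that was offline simply calls `receive` later and gets the events that accumulated in the meantime.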
Versioning can mean different things in SOA: artifacts versioning (the usage of a versioning control system to store and manage your software artifacts), or runtime versioning (to put a different label on services that behave differently). While the two are not completely disconnected, each typically evolves at its own pace. This section focuses on runtime versioning.
Consider a DocumentService that is only capable of storing PDF documents in version 1.0, adds support for DOCX in version 1.1, and eventually supports all extensions in version 2.0. What is interesting is that multiple versions of the same service can coexist in production at any given point in time. This allows consumers to upgrade at their own pace: whenever they need access to the new capabilities (support for the new doc types) and they are ready to handle them on their end.
Oracle SOA Suite is flexible in terms of versions and will let you roll out your favorite versioning conventions, be it “3” or “3.0.0”. This flexibility also means that you should put in place some convention to ensure consistency. Beyond the consistency question, lack of clarity in that area could result in more serious problems: for instance, the introduction of a new service that doesn’t preserve backward compatibility but whose version number doesn’t clearly reflect that fact.
A best-practice is to include versioning information (i.e., the version number) in contracts (WSDL) and message definitions (XSD). A good way to achieve this is to include version numbers in your namespaces. The W3C also uses this approach in the namespaces of XSLT, XSD, and so on.
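One possible convention, sketched here in Python, is to append the major version to the namespace as a trailing `/vN` segment. Both the convention and the `http://example.com/...` namespace are assumptions made for illustration; your own scheme may place the version differently.

```python
import re


def versioned_namespace(base: str, major: int) -> str:
    """Build a namespace that carries the major version (assumed '/vN' suffix)."""
    return f"{base}/v{major}"


def major_version_of(namespace: str) -> int:
    """Read the major version back from a namespace built with the same convention."""
    match = re.search(r"/v(\d+)$", namespace)
    if match is None:
        raise ValueError(f"no version in namespace: {namespace}")
    return int(match.group(1))
```

With such a convention, `versioned_namespace("http://example.com/services/document", 2)` yields `http://example.com/services/document/v2`, so a consumer bound to the `v1` namespace cannot accidentally receive messages in the incompatible `v2` format.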
Some versioning tips and tricks:
- Decide whether you want to change the version number when you introduce bug fixes for which the contract and interface haven’t changed.
- Consider overwriting the default service version for short-lived and fully automated services if the contract and interface are backwards compatible. Note that it is advised to upgrade services during quiet hours when the service is not used.
- Specify the maximum number of concurrent versions you will support for services, typically two or three. The key is to find the right tradeoff between manageability (administrators don’t want to have too many processes deployed at any given time) and flexibility (your service consumers will want some lead time to be able to adapt to the new version of a given service).
- Agree upon a fixed period for which services remain active after the deployment of a new version, e.g., consumers are guaranteed that a given version will be live for three months after the publication of a newer version.
- Define a simple service lifecycle for your services and keep track of service status (e.g., on a wiki). An example could be: identified → under development → tested → in production → deprecated → unavailable. Publish such information for your operations and development teams and (potential) consumers.
- Define a versioning scheme. For example, x.y.z, in which x stands for major releases, y for minor releases, and z for bug fixes.
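Two of the tips above, the x.y.z scheme and the cap on concurrent versions, lend themselves to a small worked example. The Python sketch below assumes that a change in the major number signals a break in backward compatibility, which is a common reading of x.y.z but is a convention you would have to agree on yourself; all names are hypothetical.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)


def is_backward_compatible(deployed: Version, candidate: Version) -> bool:
    """Assumed rule: same major release and not older than the deployed version."""
    return candidate.major == deployed.major and candidate >= deployed


def versions_to_retire(live: list, max_concurrent: int = 3) -> list:
    """Oldest versions that exceed the agreed number of concurrent versions."""
    ordered = sorted(live)
    return ordered[: max(0, len(ordered) - max_concurrent)]
```

Parsing into integers matters: compared as plain strings, "1.10.0" would sort before "1.9.0", whereas `Version.parse("1.10.0") > Version.parse("1.9.0")` holds.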
Registry and repository
A SOA environment consists of several services that are integrated to provide business value. It is usually impossible for a given individual to have complete knowledge of the entire service landscape and dependencies. That’s where a registry and repository come in handy. A registry is basically a listing of all services and their key characteristics, while a repository stores important physical service artifacts such as WSDLs, XSDs, SLAs, and so on. Stakeholders can use a registry and repository to gain insight into available services and how to consume them.
Here are some pointers on the service information that you should document when you start your SOA initiative:
- Organize your information by service type: e.g., differentiate between business service, organizational service, and application service.
- Document dependencies between services: what other services invoke a particular service? On what other services does that service rely?
- For every service, list service name, owner, version, state (identified, in development, in production, deprecated, etc.), and description.
- For automated services, add the associated WSDLs, XSDs, and endpoint locations for every environment.
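Whatever tool holds the registry, the pointers above amount to one record per service plus its dependencies. The Python sketch below shows such a record and the "who invokes this service?" question answered from it. The field names, the `consumers_of` helper and the sample services are illustrative assumptions, not the data model of any particular registry product.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    name: str
    owner: str
    version: str
    state: str          # e.g. identified, in development, in production, deprecated
    service_type: str   # e.g. business, organizational, application
    description: str = ""
    depends_on: list = field(default_factory=list)  # names of services it relies on


def consumers_of(registry: dict, service_name: str) -> list:
    """Which services invoke `service_name`? Derived from the recorded dependencies."""
    return sorted(r.name for r in registry.values()
                  if service_name in r.depends_on)
```

Recording only the outgoing dependencies of each service is enough: the reverse view (its consumers) can be computed, so it does not have to be maintained by hand in a second place.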
When you start a SOA effort you do not always have access to a full Governance repository, such as Oracle Enterprise Repository (OER). In this case, a wiki can be an appropriate choice to document your SOA landscape. However, as your SOA effort grows, keeping track of things like dependencies in a simple wiki page will become increasingly difficult. At that point you will probably need to upgrade to a full-fledged repository tool, such as OER, and seed it with the information captured on the wiki. The most important thing is to ensure that this information is always current by making updates a mandatory part of bringing any service to production.
This article explores a list of topics that readers should consider before engaging in a deployment of Oracle SOA Suite. All these questions are fairly easy to address when considered early enough, but can become surprisingly problematic in later stages of a SOA project. As is often the case, awareness, consistency, and documentation are the best recipes for success with Service-Oriented Architecture.
The current version of Oracle SOA Suite at the time of this writing is 11.1.1.5.0 (11gR1 PS4). All the Oracle documentation links below point to that version of the documentation library. We also provide the navigation path from the top-level library. We encourage you to visit the Oracle SOA Suite product page on OTN: http://www.oracle.com/technetwork/middleware/soasuite and navigate to the “documentation” tab to find the latest and greatest version of the documentation library.
- Oracle SOA Suite 11.1.1.5.0 (PS4) documentation library
- Enterprise Deployment Guide (EDG) for SOA
(doc library > Oracle SOA Suite, Business Process Management and Web Services > Oracle SOA Suite > Enterprise Deployment Guide)
- Oracle Fusion Middleware High Availability Guide
(doc library > Cross Suite Documentation > Cross Suite Documentation)
- Fault handling in Oracle SOA Suite 11g
ODTUG Kaleidoscope 2011 presentation by Ronald van Luttikhuizen and Guido Schmutz
- Oracle Fusion Middleware Administrator’s Guide for Oracle SOA Suite and Oracle Business Process Management Suite 11g: Chapter 5 Deploying SOA Composite Applications
(doc library > Oracle SOA Suite, Business Process Management and Web Services > Oracle SOA Suite > Administrator’s Guide)
- Oracle Fusion Middleware Oracle WebLogic Scripting Tool
(doc library > Use the Oracle WebLogic Scripting Tool (WLST))
- Canonical Best Practices
Blog post by Chris Judson
- Improve visibility with end-to-end tracing and composite sensors
Blog post by Demed L’Her
- Oracle Fusion Middleware Installation Guide for Oracle SOA Suite and Oracle Business Process Management Suite 11g – Appendix A: Oracle SOA Suite and Oracle Business Process Management Suite Installation Screens
(doc library > Oracle SOA Suite, Business Process Management and Web Services > Oracle SOA Suite > Installation Guide)
- Best practices 2 – Web Services
Blog post by Ronald van Luttikhuizen
- SCA to eliminate “over-servicing”
Blog post by Ronald van Luttikhuizen
- AIA 11g: Best Practices for Decoupling Services and Avoiding Invalid Composites at Server Startup
Blog post by Gerhard Drasch
- Overview of Eventing in Oracle SOA Suite 11g
ODTUG Kaleidoscope 2011 presentation by Ronald van Luttikhuizen
- Events and SOA
Blog post by Ronald van Luttikhuizen
- Best Practices for testing Oracle SOA Suite 11g based systems
Presentation by Guido Schmutz
- Testing Oracle Business Rules using Rules Designer
Blog post by Bob Webster
- Oracle Domain Value Maps and Business Rules runtime edit with SOA Composer
Blog post by Eric Elzinga