
Taxonomy for Web Service Sources

Various taxonomies for web services are possible.  A focus on technology might produce classifications such as transport (e.g., HTTP, JMS), representation (e.g., SOAP, REST), response handling (e.g., blocking, asynchronous via polling, asynchronous via callback).  A focus on purpose might look more like data source vs. computational vs. legacy API exposure.

In the Web 2.0+ era, web services are proliferating wildly.  The mashup community is generating huge demand, satisfied in part by sites such as ProgrammableWeb, where over 1,000 web services can be found, and online services are increasingly opening their platforms via APIs.  Enterprise-level SOA (Service Oriented Architecture) initiatives, while slowed by a weak economy, are also beginning to consume external services as well as expose their own services for internal use.

In recognition of this proliferation, I believe a new taxonomy is required that addresses the source of services from the perspective of the user, orthogonal to technology or purpose.  By “user” in this context, I am referring to the human developer of any application that consumes web services.

Source Taxonomy

The figure illustrates a draft web service taxonomy where services are classified by the nature of their sources or providers as seen by potential users.

[Figure: ws-source-taxonomy]

Classification:  Ownership

Ownership is the distinction between services that are sourced within the user’s organization (e.g., company, business unit) versus those sourced by parties unaffiliated with their organization.

Internal:  “Services sourced within the user’s organization implying some potential for control over their implementation.”  Examples include web service APIs to internal legacy systems as part of a SOA project.

External:  “Services sourced independent of the user’s organization.”  Examples include information services (e.g., news, market quotes, credit report) and APIs to platforms like Twitter or SalesForce.

Classification:  Provision

Provisioning refers to the relationship between the entity supporting the web service endpoint from the user’s perspective (i.e., the provider) and the entity that supplies the functional implementation of the web service (i.e., the source).  When the provider is the source, the service is said to be original.  Conversely, if the provider is some form of third party intermediary between user and source, the service is said to be syndicated.

External / Original:  “External services that are called directly by a user’s application.”

External / Syndicated:  “External services for which users call a third party provider or syndicator which would then call the original source on their behalf.”  Presumably in this type of structure, the syndicator would add some value for acting as intermediary.  For example, a syndicator could serve as a common front for many original service sources thereby presenting an additional interface abstraction and a common point of billing and support.

Internal / Original vs. Syndicated:  Based on the foregoing definitions, the notion of a syndicated internal service seems oxymoronic.  The taxonomic intent is to enable larger enterprises to make the distinction, for example, between point-to-point calling of a legacy API (i.e., original source) versus the use of an intermediate hub, messaging service, or some other abstraction layer (i.e., a form of syndication).

Classification:  Differentiation

This classification addresses the potential for multiple sources to provide functionally equivalent services and how the user perceives the relative value of those sources.  A service is referred to as a commodity if it is possible for multiple sources to provide functionally equivalent implementations of that service, whether or not multiple such sources actually exist.  In contrast, a service is referred to as branded if there can be only one source of that service.

External / Original / Commodity:  “Services provided by an original source that can also be offered by other sources.”  The functional equivalence of these services can enable a user to select a source based on non-functional factors such as price, performance, and reliability.  Examples might include data services such as weather or financial market data.  In this scenario, the commodity sourcing decision must be performed by the user.  Despite functional equivalence, each source may present differing interfaces to which the user must code.

External / Syndicated / Commodity:  “Services provided by a syndicator fronting for potentially multiple functionally equivalent sources.”  The key value of commodity syndication lies in the fact that functional equivalence does not imply interface equivalence.  For a given commodity service, a commodity syndicator has the opportunity to normalize interfaces across functionally equivalent sources, thus providing the user with a single stable interface per service.  This scenario would support the commodity sourcing decision being made either by the user or transparently by the syndicator.
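As a rough illustration of interface normalization, the Python sketch below fronts two functionally equivalent quote sources behind one stable call.  Everything in it is hypothetical: the two source functions, their return shapes, the Quote type, and the QuoteSyndicator class are assumptions made for the example, not real APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Quote:
    """Normalized result returned to the user, whatever the source."""
    symbol: str
    price: float


# Two hypothetical sources: functionally equivalent, different interfaces.
def source_a_get_quote(ticker: str) -> dict:
    return {"ticker": ticker, "last": 101.5}


def source_b_lookup(sym: str) -> tuple:
    return (sym, "101.50")


# One adapter per source maps its native interface onto Quote.
def adapt_a(symbol: str) -> Quote:
    raw = source_a_get_quote(symbol)
    return Quote(symbol=raw["ticker"], price=float(raw["last"]))


def adapt_b(symbol: str) -> Quote:
    sym, price_text = source_b_lookup(symbol)
    return Quote(symbol=sym, price=float(price_text))


class QuoteSyndicator:
    """Single stable interface in front of several equivalent sources."""

    def __init__(self, adapters: Dict[str, Callable[[str], Quote]]) -> None:
        self._adapters = adapters

    def get_quote(self, symbol: str, source: Optional[str] = None) -> Quote:
        # Sourcing decision made by the user: call the named source only.
        if source is not None:
            return self._adapters[source](symbol)
        # Sourcing decision made by the syndicator: try each source in
        # registration order and fall through to the next on failure.
        last_error: Optional[Exception] = None
        for adapter in self._adapters.values():
            try:
                return adapter(symbol)
            except Exception as exc:
                last_error = exc
        raise RuntimeError("no source could answer the request") from last_error


syndicator = QuoteSyndicator({"source_a": adapt_a, "source_b": adapt_b})
quote = syndicator.get_quote("ACME")
```

The user codes only against get_quote and Quote.  Passing a source name corresponds to the user making the sourcing decision; omitting it leaves the decision to the syndicator.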

External / Original / Branded:  “Services provided by an original source that can only be available from that source.”  The most common types of these services are APIs to specific applications or platforms.  For example, consider writing a mashup for your back office that uses the SalesForce API.  You cannot lightly decide to call a different CRM application, since it is highly unlikely that its API will be functionally equivalent at the service level, not to mention the fact that your company’s data lives at SalesForce.

External / Syndicated / Branded:  “Services that can only be available from a single source, but are accessed through a syndicator.”  This class is included for taxonomic completeness although it is unclear what significant value the syndicator would provide in this case.  There may be some value in a single gateway to multiple branded services for billing, support, or auditing purposes, but this alone hardly seems compelling relative to the overhead.
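To make the classifications so far concrete, here is a minimal Python sketch of how a consuming application might tag the services it depends on.  The class and member names are illustrative assumptions, not part of any library or standard.

```python
from dataclasses import dataclass
from enum import Enum


class Ownership(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class Provision(Enum):
    ORIGINAL = "original"
    SYNDICATED = "syndicated"


class Differentiation(Enum):
    COMMODITY = "commodity"
    BRANDED = "branded"


@dataclass(frozen=True)
class SourceClass:
    """One position in the taxonomy, from the user's point of view."""
    ownership: Ownership
    provision: Provision
    differentiation: Differentiation

    def label(self) -> str:
        # Render in the "External / Original / Branded" style used above.
        parts = (self.ownership, self.provision, self.differentiation)
        return " / ".join(part.value.capitalize() for part in parts)


# Example: a branded platform API called directly by the application.
crm_api = SourceClass(Ownership.EXTERNAL, Provision.ORIGINAL,
                      Differentiation.BRANDED)
```

Calling crm_api.label() yields the same slash-separated form used in the headings of this section.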

Classification:  Session

This classification recognizes that certain logical operations may require multiple web service calls.  While this may seem like a technical distinction, its relevance to this taxonomy is in the context of commodity source selection.

… / Commodity / Stateless:  “Completely independent web service calls enabling commodity sourcing decisions on a per call basis if desired.”  This is the finest granularity of web service commoditization.  An example of this might be a request for a stock quote for a known ticker symbol.  A single call does the job and there are any number of functionally equivalent service sources for this information.

… / Commodity / Stateful:  “A logically related group of web service calls that all must be made to the same source, thus necessitating a single commodity sourcing decision for the group.”  An example might be obtaining a credit report on a company.  A first call requests a “list of similars” based on the company name.  The returned list includes a set of possible matches with additional data for disambiguation and source specific IDs.  After selecting the desired company from the list, the second call requests the actual report based on the ID.  The user may not care which source is used, but having made the sourcing decision for the first call, the rest of this conversation must return to the same source since it carries source specific information.
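The difference between per-call and per-conversation sourcing can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the round-robin selector, the source names, and the ID format are all hypothetical, and the functions return placeholder strings rather than calling real services.

```python
import itertools
from typing import Iterable, List


class RoundRobinSelector:
    """Hypothetical sourcing policy: rotate through equivalent sources."""

    def __init__(self, sources: Iterable[str]) -> None:
        self._cycle = itertools.cycle(list(sources))

    def choose(self) -> str:
        return next(self._cycle)


def stateless_quote(selector: RoundRobinSelector, symbol: str) -> str:
    # Stateless: every call is free to pick a different source.
    source = selector.choose()
    return f"{source}: quote for {symbol}"


class CreditReportSession:
    """Stateful: one sourcing decision covers the whole conversation."""

    def __init__(self, selector: RoundRobinSelector) -> None:
        self.source = selector.choose()  # decided once, then pinned

    def list_similars(self, company_name: str) -> List[str]:
        # Placeholder for the first call; the IDs are source specific.
        return [f"{self.source}:{n}" for n in (1, 2)]

    def get_report(self, source_id: str) -> str:
        # The second call must go back to the source that issued the ID.
        if not source_id.startswith(self.source + ":"):
            raise ValueError("ID was issued by a different source")
        return f"report for {source_id}"
```

A caller would create one CreditReportSession per logical request, call list_similars, pick an ID, and pass it to get_report on the same session object.  An ID obtained from one session is rejected by a session pinned to another source.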

Summary

The last 5 years have seen the rapid proliferation of available web services and a growing appetite of Web 2.0+ developers anxious to consume them.  Thus far, the focus has been on what mashups and service oriented applications can do and how to achieve them functionally.  Going forward, we will see increased attention to qualities of service, stability, and source redundancy analogous to that of cloud computing.  Lack of maturity in these areas is among the factors holding back enterprises from full scale consumption of external web services in their business applications.  Concepts such as syndication and commoditization can play a key role in breaking through this barrier.

The Redundancy Principle

Architecting complex systems includes the pursuit of “ilities”: qualities that transcend functional requirements, such as scalability, extensibility, reliability, maintainability, and availability.  Performance and security are included as honorary “ilities” since, aside from being suffix-challenged, they live in the same family of “critical real-world system qualities other than functionality”.  The urge to include beer-flavored took a lot to conquer.

Reliability, maintainability, and availability have some overlap.  For example, most would agree that availability is a key aspect of reliability in addition to repeatable functional correctness.  Similarly, a highly maintainable system is not only one that is composed of easily replaceable commodity parts, but one that can be serviced while remaining available.

As an architect, designing for availability can be great fun.  It’s like a chess game where you have a set of pieces, in many cases multiples of the same kinds.  Your opponent is a set of failure modes.  You know that in combating these failures, pieces will be lost or sacrificed, but if well played, the game continues.

We [Don’t] Interrupt this Broadcast

Every component in a system is subject to failure.  Hardware components like servers and disk drives carry MTBF (mean time between failures) specifications.  Communication media and external services are essentially compositions of components that can fail.  Even software modules may be subject to latent defects, memory leaks, or other unstable states however statistically rare.  Even the steel on a battleship rusts.  Failures cannot be avoided.  They can, however, be tolerated.

The single most effective weapon in the architect’s availability arsenal is redundancy.  Every high availability system incorporates redundancy in some way, shape, or form.

  • The aging U.S. national power grid provides remarkable uptime to the average household in spite of a desperately needed overhaul. At my house, electrical availability exceeds the IT-coveted five nines (i.e., 99.999%) and most outages can be traced to the local last mile.
  • The U.S. Department of Defense almost always contracts with dual sources for the manufacturing of weapon systems and typically on separate coasts in an attempt to survive disasters, natural or not.
  • The Global Positioning System comprises 27 satellites; 24 operational plus 3 redundant spares. The satellites are arranged such that a GPS receiver can “see” at least 4 of them at any point on earth. However, 3 suffice for a two-dimensional position fix, albeit with less accuracy.
  • Even the smallest private aircraft have magnetos, essentially small alternators that generate just enough energy to keep spark plugs firing in case an alternator failure causes the battery to drain. Having experienced this particular failure mode as a pilot, I was happy indeed that this redundancy kept my engine available to its user.

Returning to the more grounded world of IT, redundancy can occur at many levels.  Disk drives and power supplies have among the highest failure rates of internal components, hence the RAID technology and dual power supply modules found in many servers and other devices.  Networks can be designed to enable redundant LAN paths among servers.  Servers can be clustered assuming their applications have been designed accordingly.  Devices such as switches, firewalls, and load balancers can be paired for automatic failover.  The WAN can include multiple geographically disparate hosting sites.
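A back-of-the-envelope calculation shows why pairing components pays off.  The Python sketch below assumes that redundant copies fail independently and that any one surviving copy keeps the service up; real systems share power, software, and operators, so treat the result as an optimistic bound.  The function names are mine, chosen for illustration.

```python
def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components when any one suffices.

    Assumes failures are independent, which is rarely strictly true.
    """
    # The group is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime over a 365-day year for a given availability."""
    return (1.0 - availability) * 365 * 24 * 60


single = 0.99                               # one component at "two nines"
paired = parallel_availability(single, 2)   # two such components in parallel
five_nines_budget = downtime_minutes_per_year(0.99999)
```

Under these assumptions, two 99% components in parallel reach roughly 99.99%, and the five nines target leaves a budget of a little over five minutes of downtime per year.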

Drawing the Line

The appropriate level of redundancy in any system reduces to an economic decision.  By definition, any expenses incurred to achieve redundancy are in excess of those required to deliver required functionality.  In some cases, however, redundant resources used to increase availability may provide ancillary benefits (e.g., a server cluster can increase both availability and throughput).

Redundancy decisions really begin as traditional risk analyses.  Consider the events to be addressed (e.g., an entire site going down, certain capabilities being unavailable, or a specific application becoming inaccessible for some period of time).  Then determine the failure modes that can cause these conditions (e.g., a server locking up, a firewall going down, a lightning strike hitting the building).  Finally, consider the cost of each event as a function of its impact (e.g., lost revenue, SLA penalties, emergency maintenance, bad press) and the probabilities of its failure modes actually occurring.  The cost of redundancy to tolerate these failure modes can now be weighed dispassionately against its value.
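The comparison can be reduced to simple arithmetic.  The Python sketch below is one way to frame it, not a prescribed method: it assumes each failure mode can be given an annual probability and an impact cost, and that the redundancy under consideration fully removes the loss.  The type, the function names, and all figures are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    name: str
    annual_probability: float  # estimated chance of occurring in a year
    impact_cost: float         # estimated cost if it does occur


def expected_annual_loss(modes: List[FailureMode]) -> float:
    """Sum of probability-weighted impact across failure modes."""
    return sum(m.annual_probability * m.impact_cost for m in modes)


def redundancy_is_justified(modes: List[FailureMode],
                            annual_redundancy_cost: float) -> bool:
    """True when the loss avoided exceeds what the redundancy costs.

    Simplification: assumes the redundancy eliminates the loss entirely.
    """
    return expected_annual_loss(modes) > annual_redundancy_cost


# Illustrative figures only.
modes = [
    FailureMode("server lock-up", annual_probability=0.2, impact_cost=50_000),
    FailureMode("lightning strike", annual_probability=0.01, impact_cost=500_000),
]
loss = expected_annual_loss(modes)
```

With these invented numbers the expected annual loss comes to about 15,000, so redundancy costing 8,000 a year would be justified while redundancy costing 20,000 would not.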

As technologists, our purist hearts want to build the indestructible system.  Capture my bishops and rooks and my crusading knights will continue processing transactions.  However, the cost-benefit tradeoff drives the inexorable move from pure to real.

The good news is that many forms of redundancy within the data center are inexpensive or at least very reasonable these days given the commoditization of hardware and the pervasiveness of the redundancy principle.  Furthermore, if economics keeps you from realizing total redundancy, do not be disheartened.  We’re all currently subject to the upper bound that we live on only one planet.