Dimensions of Scalability

Designing for scalability is one of the primary challenges of system and software architecture.  For those of us who practice architecture, it’s also great fun thanks to the high number of variables involved, the creativity required to discover exploits, the pattern matching to apply tricks and avoid traps, and the necessity to visualize the system in multiple possible futures.

In the broadest terms, “Is it scalable?” = “Will it break under growth?”  A few manifestations that are a bit more useful include “Will performance hold up as we add more users?”, “Will transaction processing time stay flat as the database grows?”, and “Will batch processing still complete within the allotted window as the size of our account base, data warehouse, or whatever multiplies?”.  Architects imagine the kinds of demand parameters that might occur over the life cycle of the system and incorporate mitigation plans.

These examples all pertain to the performance characteristics of a system.  However, there are other dimensions of scalability that are equally important when considering that system in a business context.

Strategic Dimensions

  1. Performance Scalability:  “An observation about the trend in performance in response to increasing demands.”
    Demand can refer to any of several parameters depending on the system, such as the number of concurrent users, transaction rates, database size, etc.  Performance measures may include event processing time, batch throughput, user perception, and many others.  In any case, we consider a system to be scalable if we observe a flat or nearly flat performance curve (i.e., little or no performance degradation) as any given demand parameter rises.  In reality, even highly scalable systems are scalable only through some finite range of demand, beyond which some resource becomes constrained, causing degradation.
  2. Operational Scalability:  “An observation about the trend in effort or risk required to maintain performance in response to increasing demands.”
    This may be best illustrated by example.  Consider a web application that is experiencing sharp increases in usage and, as a result, a mid-tier performance bottleneck.  If the application was designed for mid-tier concurrency, the mitigation effort may be simply adding more application servers (i.e., low effort, low risk).  If not, then significant portions of the application may need to be redesigned and rebuilt (i.e., high effort, high risk).  The former case is operationally scalable.  As with performance scalability, operational scalability occurs in finite ranges.  Continuing the previous example, at some point the database may become the bottleneck, typically requiring more extensive remedial action.
  3. Economic Scalability:  “An observation about the trend in cost required to maintain performance in response to increasing demands.”
    We consider a system to be economically scalable if the cost of maintaining its performance, reliability, or other characteristics increases slowly (ideally not at all, but keep dreaming) as compared with increasing loads.  The first two types of scalability contribute here.  For example, squeezing maximum performance out of each server means buying fewer servers (i.e., performance scalability), and adding new servers when necessary is cheaper than redeveloping applications (i.e., operational scalability).  However, other independent cost factors can swing things, including commodity vs. specialty hardware, open source vs. proprietary software licenses, levels of support contracts, levels of redundancy for fault tolerance, and the complexity of the software itself, which impacts testing, maintenance, and release costs.  A toy model tying the three dimensions together follows this list.
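
To make these dimensions concrete, here is a minimal sketch in Python.  The per-server capacity and cost figures are invented for illustration, and the model assumes the application scales out near-linearly (performance scalability holds), so the mitigation for rising demand is simply adding servers (operational scalability) and cost grows no faster than demand (economic scalability).

    import math

    CAPACITY_PER_SERVER = 500   # concurrent users one server sustains (assumed figure)
    COST_PER_SERVER = 4000      # annual cost of one commodity server (assumed figure)

    def servers_needed(concurrent_users):
        # Operational scalability: the mitigation is "add servers" -- low effort, low risk.
        return math.ceil(concurrent_users / CAPACITY_PER_SERVER)

    def annual_cost(concurrent_users):
        # Economic scalability: cost should rise no faster than demand does.
        return servers_needed(concurrent_users) * COST_PER_SERVER

    for users in (1000, 10000, 100000):
        print(users, servers_needed(users), annual_cost(users))
    # Cost grows linearly with demand here; a super-linear curve would signal
    # an economic scalability problem.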

Rocky Roads

Since the underlying theme of these additional dimensions is business context, it should be noted that an architect rarely gets to mitigate every imaginable scalability risk.  Usually this is simple economics.  In the early days of an application, for example, the focus is functionality, without which million-user performance will never get the chance to be an issue.  Furthermore, until the application’s particular financial model is proven, heavy spending on scalability may be premature.

However, a good technology roadmap should project forward to anticipate as many scale factors as possible and have its vision corrected periodically.  Scalability almost always comes down to architecture, and an architectural change, which is pervasive almost by definition, is the last thing you want to treat as a hot-fix.

Rational Scalability

The Chief Technologist of any highly successful Web x.0 company must one day come to grips with the horror of success.  Such companies can find themselves swept between the compressed bipolar extremes of “make it function and be careful with pre-revenue cash” and “holy crap, half the Internet is hitting us”.  It’s like a wormhole suddenly bringing together two points in space that were previously vastly distant, evaporating in the process the time to prepare for arrival.  A good problem to have, right?  (clearing my throat, loosening my collar)  It certainly beats the alternative, but without a successful crossing, the alternative returns.

Extending the Wormhole

Several factors can influence the apparent flight time through the metaphorical wormhole, aside from writing big checks on speculation.

Functional Realism:  Understand what the business does and why.  This one seems more obvious than history would indicate.  A deep understanding of how users will approach the system helps bring likely scale issues into focus.  For example, if operation X is 10 times slower than operation Y, conventional wisdom in isolation says “focus on tuning X”.  However, if in the field Y will be called 10,000 times more often, perhaps Y should get the attention.
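
A quick back-of-envelope calculation makes the point; all of the figures below are hypothetical:

    # X is 10x slower per call, but Y is called 10,000x more often (assumed figures).
    time_x_ms, time_y_ms = 100.0, 10.0
    calls_x, calls_y = 1000, 10000000

    total_x = time_x_ms * calls_x   # 100,000 ms of aggregate work
    total_y = time_y_ms * calls_y   # 100,000,000 ms of aggregate work

    print(total_y / total_x)        # -> 1000.0: Y dominates total load by 1,000x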

Early Warning Radar:  Find leading indicators of usage up-ticks that are solid enough to trigger investment or other proactive steps.  At matchmine, our users drove traffic to our platform via a portfolio of media partners with whom we had B2B relationships.  The time it takes to establish these relationships provides a useful response buffer.

Capacity on Tap:  Maintain as much infrastructure in reserve as possible without paying for it.  For example, the bandwidth contract on our main pipes had minimum charges plus various overage terms.  However, all aspects of the technical infrastructure could handle about 100 times these minimums with only a billing change.  Other areas to consider are cloud computing (e.g., Amazon EC2) and edge caching (e.g., Akamai).

Architecture:  If a young Web x.0 company is fortunate enough to apply some serious architecture practice in its early days, the wormhole effect can be greatly reduced.  CPU, storage, and bandwidth are commodities that by and large can be added by releasing funds.  However, silicon, iron, and glass can only take an inadequate architecture so far, and good architecture doesn’t happen overnight or under fire.

Rational Scalability

One of the most challenging aspects of planning for scale is how to rationalize large resources against a need that hasn’t happened yet.  Few things will make a CFO’s spinal cord vibrate like a proposal to build a system that will support 10,000,000 users when you’re still at 1,000.  And is 10,000,000 even the right number?  Could it be off by an order of magnitude either way?  Overspending against this level of uncertainty simply converts technical risk into financial risk.

The key is to find something against which scale can be rationalized or, if possible, compartmentalized.  This may be along functional lines, classes of users, or qualities of service, for example.  Depending on the application, however, this may take a place of prominence on the “easier said than done” list.  The user base of many applications is simply huge and rather monolithic, often leading to other modes of partitioning (e.g., grouping user accounts into separate databases by alphabetical ranges).
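
For instance, a minimal sketch of that alphabetical-range partitioning might look like the following; the shard boundaries and database names are hypothetical:

    SHARDS = [
        ("a", "f", "db-accounts-1"),
        ("g", "m", "db-accounts-2"),
        ("n", "s", "db-accounts-3"),
        ("t", "z", "db-accounts-4"),
    ]

    def shard_for(account_name):
        # Route an account to its database by the first letter of its name.
        first = account_name.strip().lower()[0]
        for low, high, db in SHARDS:
            if low <= first <= high:
                return db
        return SHARDS[-1][2]  # digits and punctuation fall through to the last shard

    print(shard_for("Alice"))   # db-accounts-1
    print(shard_for("Trent"))   # db-accounts-4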

At matchmine, we could exploit the aforementioned B2B layer.  All of our users hit our platform in the context of using one or more of our media partners.  This provides a very natural partitioning opportunity.

Consider the matchmine Discovery Server.  This is the platform component that services content retrieval requests in all their variations.  These operations return only content from the partner’s view of the catalog: the subset of our catalog that they sell, rent, share, discuss, or otherwise service.  These subsets are much smaller than the whole, rendering them much easier to memory-cache.

Thus the architectural exploit.  The Discovery Server can be refactored into a controller and a set of servicing nodes.  The controller is the web service endpoint that handles front-line security, metrics collection, and request routing to nodes.  The nodes service the actual content retrieval operations, but on a partner-specific and highly cached basis.
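
Below is a toy sketch of that split, not the actual Discovery Server; all names and data are invented.  The controller fronts security, metrics, and routing, and each node answers retrieval requests from an in-memory cache holding only its assigned partners’ catalog views.

    PARTNER_TO_NODE = {"partner-a": "node-1", "partner-b": "node-1", "partner-c": "node-2"}

    class Node:
        def __init__(self, catalog_views):
            self.cache = catalog_views          # partner -> that partner's catalog subset

        def retrieve(self, partner, query):
            view = self.cache[partner]          # small enough to live entirely in memory
            return [item for item in view if query in item]

    NODES = {
        "node-1": Node({"partner-a": ["alpha film", "beta film"],
                        "partner-b": ["beta song"]}),
        "node-2": Node({"partner-c": ["gamma game"]}),
    }

    def controller(partner, query, api_key):
        if not api_key:                         # stand-in for front-line security
            raise PermissionError("unauthenticated")
        # (metrics collection would happen here)
        node = NODES[PARTNER_TO_NODE[partner]]  # request routing by partner
        return node.retrieve(partner, query)

    print(controller("partner-a", "film", api_key="demo"))  # ['alpha film', 'beta film']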

The caching dramatically improved performance over the monolithic approach, and it also provides the perfect hunting ground for rational scalability (a back-of-envelope sketch follows the list):

  • We know the sizes of content and supporting data for each partner. Therefore, we can determine the best mapping of partners to nodes based on the objective of memory-caching their data. Alternatively, we can size virtual servers to specific partners.
  • We know the user base size of each partner. Therefore, we have an upper bound for estimating Discovery Server usage per partner and thus can determine how many nodes to allocate per partner to handle their throughput.
  • We know the growth rates of content and user base of each partner. Therefore, we can predict how the foregoing two points will change over time since Discovery Server usage growth is bounded by these rates.
  • We know the total traffic hitting the Discovery Server. Therefore, we can determine how many controllers we’ll need. While controller traffic is non-partitioned, the controllers are functionally very light, stateless, and thus easy to scale.
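
Those observations translate directly into a back-of-envelope capacity planner; the node capacities and partner figures below are invented:

    import math

    NODE_RAM_GB = 32        # usable cache per node (assumed)
    NODE_PEAK_RPS = 2000    # requests/sec one node sustains (assumed)

    partners = {            # name: (content + supporting data in GB, expected peak rps)
        "partner-a": (6, 900),
        "partner-b": (10, 2500),
        "partner-c": (3, 400),
    }

    def nodes_for(size_gb, peak_rps):
        by_memory = math.ceil(size_gb / NODE_RAM_GB)          # fit the data in cache
        by_throughput = math.ceil(peak_rps / NODE_PEAK_RPS)   # carry the traffic
        return max(by_memory, by_throughput)

    for name, (size_gb, peak_rps) in partners.items():
        print(name, nodes_for(size_gb, peak_rps))
    # For growth planning, re-run with each partner's projected size and peak rate.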

As this example illustrates, rational scalability is the art of tying the various dimensions of growth to externally observable and/or predictable factors.  The natural regulators of B2B growth can assist greatly, whether the business itself is B2B or a B2B layer wraps the user base.  In a pure B2C play with a single large class of users, this may be somewhere between difficult and impossible, but the importance of trying cannot be overstated.