Deployment Strategies

There are various strategies for deploying a cloud application. Some of them are described below.

Red/Black Deployment

In a red/black deployment, a new server group running a newer version of the application is added to the cluster. The new server group keeps the name, stack, and detail elements but increments the version number.
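As a minimal sketch of the version increment described above, the following assumes server group names end in a `-v<NNN>` suffix (for example `myapp-prod-api-v041`). Both that format and the helper name `next_server_group_name` are illustrative assumptions, not something this text specifies.

```python
import re

# Assumed naming pattern: <name>-<stack>-<detail>-v<NNN> (illustrative only).
_VERSION_SUFFIX = re.compile(r"^(?P<prefix>.+)-v(?P<version>\d+)$")


def next_server_group_name(current: str) -> str:
    """Keep the name, stack, and detail elements; increment only the version."""
    match = _VERSION_SUFFIX.match(current)
    if match is None:
        raise ValueError(f"no version suffix found in {current!r}")
    version = int(match.group("version")) + 1
    # Zero-pad to at least three digits to match the assumed format.
    return f"{match.group('prefix')}-v{version:03d}"
```

Calling `next_server_group_name("myapp-prod-api-v041")` would yield `myapp-prod-api-v042` under these assumptions.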

This is also referred to as blue/green deployment.

Once deployed and healthy, the new server group is enabled and starts taking traffic. Only once the new server group is fully healthy does the older server group get disabled and stop taking traffic.

This procedure means deployments can proceed without any application downtime, provided, of course, that the application is built in such a way that it can cope with “overlapping” versions during the brief window where old and new server groups are both active and taking traffic.

If a problem is detected with the new server group, it is very straightforward to roll back. The old server group is re-enabled and the new one disabled.

Applications will frequently resize the old server group down to zero instances after a predefined duration. Rolling back from an empty server group is a little slower, but still faster than redeploying, and has the advantage of releasing idle instances, saving money and returning instances to a reservation pool where other applications can use them for their own deployments.
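The enable, disable, and rollback steps described above can be sketched with a small in-memory model. The `ServerGroup` class and the three functions are hypothetical names introduced for illustration; a real deployment tool would call a cloud provider or load balancer rather than flip fields on an object.

```python
from dataclasses import dataclass


@dataclass
class ServerGroup:
    name: str
    instances: int
    enabled: bool = False  # enabled groups receive traffic
    healthy: bool = False


def red_black_deploy(old: ServerGroup, new: ServerGroup) -> bool:
    """Shift traffic to `new` only if it is healthy; return True on success."""
    if not new.healthy:
        return False  # the old group keeps serving traffic
    new.enabled = True   # both groups briefly take traffic here
    old.enabled = False
    return True


def shrink_old(old: ServerGroup) -> None:
    """Release idle instances from a disabled group after a grace period."""
    if old.enabled:
        raise ValueError("refusing to shrink a group that is taking traffic")
    old.instances = 0


def roll_back(old: ServerGroup, new: ServerGroup, restore_instances: int) -> None:
    """Re-enable the old group, resizing it first if it was emptied."""
    if old.instances == 0:
        old.instances = restore_instances  # the slower path noted above
    old.enabled = True
    new.enabled = False
```

In this sketch, rolling back from an emptied group costs one extra resize step, which mirrors the trade-off between rollback speed and releasing idle instances.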

Alternatives to Red/Black Deployment

Variations on this deployment strategy include:

Rolling push

The machine image associated with each instance in a server group is upgraded, and each instance is then restarted in turn.

Rolling red/black

The new server group is deployed with zero instances and gradually resized up in sync with the old server group being resized down, resulting in a gradual shift of traffic across to the new server group.

Highlander

The old server group is immediately destroyed after being disabled. The name comes from the 1986 movie Highlander and its tagline, “There can be only one!” This strategy is usually only used for test environments.
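To make the rolling red/black variation above concrete, the following sketch computes a resize plan as a list of (old size, new size) pairs. The function name `rolling_red_black_steps` and the fixed step size are assumptions made for illustration; real tooling might resize by percentage and wait for health checks between steps.

```python
from typing import List, Tuple


def rolling_red_black_steps(total: int, step: int) -> List[Tuple[int, int]]:
    """Return (old_size, new_size) pairs for a gradual shift of capacity."""
    if total < 0 or step <= 0:
        raise ValueError("total must be >= 0 and step must be > 0")
    new_size = 0
    plan = [(total, new_size)]  # new group starts with zero instances
    while new_size < total:
        new_size = min(total, new_size + step)
        # The old group shrinks in sync, so combined capacity stays constant.
        plan.append((total - new_size, new_size))
    return plan
```

For example, `rolling_red_black_steps(10, 4)` returns `[(10, 0), (6, 4), (2, 8), (0, 10)]`.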

Cross-Region Deployments

Deploying an application in multiple regions brings its own set of concerns. At Netflix, many externally facing applications are deployed in more than one region in order to optimize latency between the service and end users.

Reliability is another concern. The ability to reroute traffic from one region to another in the event of a regional outage is vital to maintaining uptime. Netflix even routinely practices “region evacuations” in order to ensure readiness for a catastrophic EC2 outage in an individual region.

Ensuring that applications are homogeneous between regions makes it easier to replicate an application in another region, minimizing downtime in the event of having to switch traffic to another region or to serve traffic from more than one region at the same time.

Active/Passive

In an active/passive setup, one region is serving traffic and others are not. The inactive regions may have running instances that are not taking traffic—much like a disabled server group may have running instances in order to facilitate a quick rollback.

Persistent data may be replicated from the active region to other regions, but the data will only be flowing one way, and replication does not need to be instantaneous.

Active/Active

An active/active setup has multiple regions serving traffic concurrently and potentially sharing state via a cross-region data store. Supporting an active/active application means enabling connectivity between regions, load-balancing traffic across regions, and synchronizing persistent data.
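The difference between the two setups can be illustrated by how traffic is shared among healthy regions. This is a rough sketch under stated assumptions: the function `traffic_weights`, the priority-ordered region list, and the even split in active/active mode are all hypothetical simplifications, and real routing would typically account for latency and capacity.

```python
from typing import Dict, List, Set


def traffic_weights(regions: List[str], healthy: Set[str], mode: str) -> Dict[str, float]:
    """Return the share of traffic each region receives.

    `regions` is listed in priority order (an assumption of this sketch).
    """
    candidates = [r for r in regions if r in healthy]
    if not candidates:
        raise RuntimeError("no healthy region available")
    if mode == "active/passive":
        serving = candidates[:1]  # only the highest-priority healthy region
    elif mode == "active/active":
        serving = candidates      # every healthy region serves concurrently
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    share = 1.0 / len(serving)    # assumed even split across serving regions
    return {r: (share if r in serving else 0.0) for r in regions}
```

Removing a region from `healthy` models rerouting its traffic elsewhere: in active/passive mode the next region in the list takes over, while in active/active mode the remaining regions absorb the load.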