An Architectural Evolution for Networking in the Public Cloud

By Sherry Wei
Founder and CTO, Aviatrix
October 23, 2017

As cloud technologies evolve, so do the requirements for cloud networking. Now, it’s time for an entirely new architecture for networking in the public cloud.

The Shared Service Architecture for DevOps

Consider the evolution of Amazon Web Services (AWS). In 2006, AWS started with EC2 Classic, where there was not much networking to do: an EC2 instance had either a public IP address or an address in the 10.0.0.0/8 network.

Later, in 2009, AWS introduced Virtual Private Cloud (VPC). Today you cannot launch an EC2 instance without specifying the region, VPC, and subnet in which it will be deployed. You can think of a VPC as a virtual datacenter: it has routing, subnetting, an Internet gateway, a NAT gateway, and other capabilities. While it takes years to build a physical datacenter, it takes minutes to build a VPC with everything in it. That’s how convenient the public cloud has become.

AWS started with developers and with Internet-born or all-in-cloud companies. To run their production services in AWS, DevOps teams learned to isolate production accounts from development and testing/QA accounts. That led to multiple accounts and multiple VPCs.

To manage all the instances in different VPCs for patching, updating, and scanning, a hub-and-spoke network architecture has emerged. There is one shared service VPC or management VPC that hosts DevOps tools, and the shared service VPC connects to other spoke VPCs where workload EC2 instances are run.
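The hub-and-spoke layout described above can be sketched as a small data model. This is only an illustration: the class name `HubAndSpoke`, its methods, and the VPC names are hypothetical and do not correspond to any AWS or Aviatrix API.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class HubAndSpoke:
    """Toy model: one shared service VPC linked to each spoke VPC."""

    hub: str
    spokes: Set[str] = field(default_factory=set)

    def add_spoke(self, vpc: str) -> None:
        if vpc == self.hub:
            raise ValueError("the hub cannot also be a spoke")
        self.spokes.add(vpc)

    def links(self) -> List[Tuple[str, str]]:
        # One hub-to-spoke link per spoke; spokes are never linked to each other.
        return [(self.hub, spoke) for spoke in sorted(self.spokes)]

    def directly_connected(self, a: str, b: str) -> bool:
        # True only when one end is the hub and the other is a known spoke.
        return (a == self.hub and b in self.spokes) or (
            b == self.hub and a in self.spokes
        )


# Hypothetical VPC names, one per account type mentioned in the article.
topology = HubAndSpoke(hub="shared-services")
for name in ("prod", "dev", "qa"):
    topology.add_spoke(name)
```

In this sketch the DevOps tools in the hub can reach every spoke, while `topology.directly_connected("prod", "dev")` is false because no spoke-to-spoke link exists.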

This shared service architecture works well for operations within the cloud.

Enterprise Transit VPC Network

Enterprises with on-prem datacenters have different connectivity requirements when adopting AWS.

In the early stages, an enterprise might experiment with AWS using a few VPCs, each connecting to on-prem resources. As the deployment scales, building many connections to on-prem networks becomes difficult to manage. The AWS Global Transit VPC solution automates secure hybrid connectivity for a large number of VPCs that need to connect back to on-prem.

What’s Wrong with the Transit VPC Network

While the Transit VPC solution automates secure connectivity, it is not a sufficient network architecture for running an enterprise cloud operation.

That’s because enterprise cloud operation teams consist of DevOps folks and networking engineers, each with different skill sets and functional responsibilities. Issues with the Transit VPC solution include:

  • Skill set. Transit VPC requires in-depth knowledge of Cisco CSR, BGP, IPsec, and VRF to maintain the network. DevOps and less-experienced networking engineers find it challenging to be self-sufficient in operating even the cloud-only part of the network.
  • No isolation between VPCs. Transit VPCs automatically build full-mesh connectivity between VPCs, even when the VPCs are owned by different business groups. This approach undermines an enterprise’s need to isolate particular workloads for security purposes.
  • Double egress charges for spoke-VPC-to-spoke-VPC traffic. Packets traveling between spoke VPCs are charged twice: once leaving the spoke VPC and once leaving the Transit VPC.
  • Transit VPC becomes a performance bottleneck for all traffic. Because spoke-to-on-prem and spoke-to-spoke traffic all goes through the Transit VPC, the Transit VPC CSR becomes a performance bottleneck.
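To make the full-mesh and double-egress points concrete, here is a rough arithmetic sketch. The function names are hypothetical, and the per-GB rate is an illustrative placeholder rather than an actual AWS price; only the two-to-one ratio comes from the description above.

```python
def full_mesh_pairs(vpc_count: int) -> int:
    """Number of VPC pairs that can reach each other in a full mesh."""
    return vpc_count * (vpc_count - 1) // 2


def spoke_to_spoke_egress_cost(
    gigabytes: float, rate_per_gb: float, via_transit: bool
) -> float:
    """Egress charge for spoke-to-spoke traffic.

    Via the Transit VPC the traffic is billed on two egress events
    (leaving the spoke, then leaving the Transit VPC); a direct path
    is billed on one.
    """
    egress_events = 2 if via_transit else 1
    return gigabytes * rate_per_gb * egress_events


# Placeholder rate for illustration only; check current pricing before relying on it.
ILLUSTRATIVE_RATE_PER_GB = 0.02

via_transit = spoke_to_spoke_egress_cost(1000, ILLUSTRATIVE_RATE_PER_GB, True)
direct = spoke_to_spoke_egress_cost(1000, ILLUSTRATIVE_RATE_PER_GB, False)
```

Whatever rate is plugged in, `via_transit` comes out at twice `direct`, and `full_mesh_pairs` shows how quickly the number of implicitly connected VPC pairs grows as spokes are added.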

Introducing a New Architecture for Enterprise CloudOps

What enterprise cloud operations teams need is to decouple the cloud transport network from the cloud service layer, so that each architecture level—and the technical team that handles it—can work optimally within a hybrid cloud environment.

The result is the architecture shown in the diagram below.

 

This new architecture decouples the transport network architecture and the service network architecture, creating a “line” separating the two.

The transport network, which connects with the on-premises datacenter, is managed by the network team. It serves the requirements of connecting spoke VPCs to on-prem resources. Only traffic traveling to and from the on-prem resources goes through the transit VPC. In the transport architecture, the networking team is responsible for determining the type of connectivity — private Direct Connect or Internet-based IPsec — and building and operating it.

The service network meets the needs of DevOps teams. A shared service hub VPC connects to spoke VPCs to provide Ops teams with tools, updates, and patches without going through the transit hub VPC. For situations such as disaster recovery (DR), the service network can connect any two spoke VPCs on demand, without going through the transit hub.
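The split between the transport network and the service network amounts to a simple path-selection rule, sketched below. The function `choose_path` and the node names are hypothetical and are used only to illustrate the rule as described in this article.

```python
from typing import List

# Hypothetical node names used only for this illustration.
ON_PREM = "on-prem"
TRANSIT_HUB = "transit-hub"
SHARED_SERVICES = "shared-services"


def choose_path(src: str, dst: str) -> List[str]:
    """Return the hops a flow takes under the decoupled architecture."""
    if src == dst:
        raise ValueError("source and destination must differ")
    if ON_PREM in (src, dst):
        # Transport network: only on-prem traffic crosses the transit hub.
        cloud_side = dst if src == ON_PREM else src
        path = [cloud_side, TRANSIT_HUB, ON_PREM]
        return path[::-1] if src == ON_PREM else path
    # Service network: shared-service-to-spoke and spoke-to-spoke
    # traffic goes direct, bypassing the transit hub.
    return [src, dst]
```

For example, `choose_path("spoke-a", ON_PREM)` passes through the transit hub, while `choose_path(SHARED_SERVICES, "spoke-a")` and `choose_path("spoke-a", "spoke-b")` are single-hop paths.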

The new architecture is run by a central controller and requires no Border Gateway Protocol (BGP) within the cloud. BGP is used only between the transit VPCs and on-premises routers; the cloud teams do not need to use BGP at all. Instead, with a central controller, routing in the service architecture is policy-driven and software-defined.
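As a minimal sketch of what policy-driven routing from a central controller could look like, consider the toy class below. It is an assumption-laden illustration, not the Aviatrix controller API: the `Controller` class, its methods, the VPC names, and the CIDR blocks are all hypothetical.

```python
from typing import Dict, FrozenSet, Set


class Controller:
    """Toy central controller: routes exist only where a policy allows them."""

    def __init__(self, vpc_cidrs: Dict[str, str]) -> None:
        self.vpc_cidrs = dict(vpc_cidrs)
        # Unordered VPC pairs that a policy explicitly connects.
        self._allowed: Set[FrozenSet[str]] = set()

    def allow(self, a: str, b: str) -> None:
        for vpc in (a, b):
            if vpc not in self.vpc_cidrs:
                raise KeyError(f"unknown VPC: {vpc}")
        self._allowed.add(frozenset((a, b)))

    def is_allowed(self, a: str, b: str) -> bool:
        return frozenset((a, b)) in self._allowed

    def routes_for(self, vpc: str) -> Dict[str, str]:
        # Destination CIDR -> peer VPC, derived from policy rather than
        # learned through a routing protocol such as BGP.
        return {
            cidr: peer
            for peer, cidr in sorted(self.vpc_cidrs.items())
            if peer != vpc and self.is_allowed(vpc, peer)
        }


# Hypothetical VPCs and address ranges.
controller = Controller(
    {
        "shared-services": "10.0.0.0/16",
        "prod": "10.1.0.0/16",
        "dev": "10.2.0.0/16",
    }
)
controller.allow("shared-services", "prod")
```

With only that one policy in place, `controller.routes_for("prod")` contains a single route to the shared service VPC, and `controller.routes_for("dev")` is empty because nothing has been allowed for it.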

The Benefits of the New Architecture for Enterprise CloudOps

Decoupling the networking transport and service architectures offers several benefits for organizations implementing hybrid cloud or multi-cloud environments.

Internally, this cohesive architecture reduces friction between the network and DevOps teams. Network teams can focus on circuits and direct connections for hybrid connectivity, while cloud teams can manage cloud networking without proprietary Cisco CSR equipment or in-depth networking expertise.

From a technical perspective, the transport and service architecture layers combine into a single, consolidated architecture that makes it simple to add new VPCs where needed to support an enterprise’s growth.

The Aviatrix software-defined cloud networking architecture:

  • Aligns teams and skills. Aviatrix aligns with the DevOps architecture through shared service VPCs and spoke VPCs, which improves self-sufficiency and agility of cloud teams.
  • Provides isolation by default. No inter-VPC connectivity happens unless specified by policy.
  • Reduces egress charges. Inter-VPC traffic at the service architecture level incurs half the egress charge it would under Transit VPC.
  • Removes the single performance bottleneck. Inter-VPC traffic does not need to go through the Transit VPC.
  • Aligns with data replication. Inter-VPC traffic is direct and requires no extra hop.
  • Improves visibility and troubleshooting. Cloud teams gain visibility across hybrid and multi-cloud environments from a single, unified console.

In summary, the Aviatrix architecture is better suited to enterprise CloudOps teams: it enables them to operate their growing cloud network environment themselves, rapidly and easily, without having to master networking complexities or rely on networking specialists.

To learn more about making networking simple, secure, and automated in even the most complex cloud environments, visit the Aviatrix website.

