Transit VPC — Management and Troubleshooting

By Jorge Bonilla
Cloud Network Architect, Aviatrix
July 7, 2017

We have recently heard from many customers that the Transit VPC architecture is hard to manage and troubleshoot. On the management side, there is the third-party software at the hub plus Amazon’s VGW at every spoke. On the troubleshooting side, there are multiple software products, vendors and routing tables, and an overall lack of visibility.

On top of that, as cloud adoption increases, customers are now dealing with tens or hundreds of tunnels between their VPCs. Manual and complex solutions are not the right approach at that scale.

This blog only discusses the hub and spoke architecture; the full mesh architecture is a discussion for another day.

Management

The third-party hub software described in the Transit VPC document is usually a traditional firewall OS product ported to Amazon Web Services as an image. That software (1) carries all the legacy code that might or might not be relevant for your use case, and therefore ends up requiring a larger (read: costlier) instance type, (2) requires expertise to configure the firewall product and (3) usually requires manual configuration, external scripts or more third-party software to manage the “cloud networking” parts, like the VPC’s Route Table updates.

Aviatrix solves all those issues because it is native cloud networking. To start, deploying an Aviatrix Gateway takes a few clicks or an API call, and there is no need to configure the underlying OS. Gateway instances can be scaled up, down or out, depending on your application’s live demand, by simply changing the type of the existing gateway instances or deploying more gateway instances. And finally, creating a Peering between two gateways, along with its corresponding cloud networking configuration, takes even fewer clicks than the prior two steps.

The Aviatrix dashboard allows you to keep an eye on your whole network at a glance: What’s connected to what? Are the tunnels across your network up or down? Are they transmitting data? How much data has been transmitted? All these questions can be answered quickly with the dashboard.

With Aviatrix, creating 1 or 100 tunnels is the same experience, which allows for consistent deployment and management of all those tunnels. Scaling is all about being able to create the necessary infrastructure programmatically. With our APIs, large infrastructures can be created on demand and at scale.
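To make the "1 or 100 tunnels" point concrete, here is a minimal Python sketch of building a hub and spoke peering plan programmatically. It does not call any Aviatrix API: the `Peering` structure, the `plan_hub_and_spoke` function and the gateway names are all hypothetical, and each planned entry would still need to be handed to whatever call actually creates the peering.

```python
from typing import List, NamedTuple


class Peering(NamedTuple):
    """One hub-to-spoke tunnel to be created (hypothetical structure)."""
    hub: str
    spoke: str


def plan_hub_and_spoke(hub: str, spokes: List[str]) -> List[Peering]:
    """Return one peering per spoke, skipping duplicates and the hub itself."""
    seen = set()
    plan = []
    for spoke in spokes:
        if spoke == hub or spoke in seen:
            continue
        seen.add(spoke)
        plan.append(Peering(hub=hub, spoke=spoke))
    return plan


# The same call covers 1 spoke or 100 spokes (names are made up).
spokes = [f"spoke-gw-{i:03d}" for i in range(1, 101)]
plan = plan_hub_and_spoke("hub-gw", spokes)
print(len(plan), "peerings planned")
```

Because the plan is just data, it can be reviewed or diffed before anything is created, which is one way to keep deployments consistent as the number of spokes grows.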

Troubleshooting

When issues arise, as they tend to do, the multivendor approach rarely leads to quick, easy-to-solve support calls, and instead becomes a finger-pointing exercise. Using third-party vendors at the hub and VGW on the spoke side is a recipe for a long support call.

Aviatrix deploys the same type of gateway on both ends of a peering, which reduces the possibility of incompatibility and makes the troubleshooting approach consistent across hub and spoke.

The third-party hub+VGW solution usually requires manual changes to the VPC’s Routing Table. We all know that human errors happen, but the harder part is finding, among hundreds of VPCs, the error that was made just 5 minutes earlier because of lack of sleep or pressure to get the system up and running again. Automation is the clear solution to this issue, but then again, why reinvent the wheel when Aviatrix has already solved it? As part of the peering process, Aviatrix modifies the VPC’s routing table for you, eliminating the chance for errors in routing the relevant traffic to the respective gateway.
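As an illustration of why hand-edited routes are hard to audit across many VPCs, the following Python sketch compares the routes you intended with the routes actually present. Everything here is an assumption made for the example: the VPC names, the gateway identifiers and the dictionary shape are hypothetical, and a real check would first have to fetch the route tables from the cloud provider.

```python
from typing import Dict, List, Tuple

# A route table here is simply: destination CIDR -> target gateway id.
RouteTable = Dict[str, str]


def audit_routes(
    expected: Dict[str, RouteTable],
    actual: Dict[str, RouteTable],
) -> List[Tuple[str, str, str]]:
    """Return (vpc, cidr, problem) for every missing or misdirected route."""
    findings = []
    for vpc, routes in sorted(expected.items()):
        table = actual.get(vpc, {})
        for cidr, target in sorted(routes.items()):
            if cidr not in table:
                findings.append((vpc, cidr, "missing"))
            elif table[cidr] != target:
                findings.append((vpc, cidr, f"wrong target {table[cidr]}"))
    return findings


# Hypothetical data: spoke-b's route to the hub points at the wrong gateway,
# and spoke-c's route was never added.
expected = {
    "spoke-a": {"10.0.0.0/16": "gw-a"},
    "spoke-b": {"10.0.0.0/16": "gw-b"},
    "spoke-c": {"10.0.0.0/16": "gw-c"},
}
actual = {
    "spoke-a": {"10.0.0.0/16": "gw-a"},
    "spoke-b": {"10.0.0.0/16": "gw-a"},
    "spoke-c": {},
}
for finding in audit_routes(expected, actual):
    print(finding)
```

A check like this only finds errors after the fact; having the peering process write the routes itself removes the manual step where the errors are introduced.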

Troubleshooting cloud networking can be exceptionally daunting, since most of the information provided by the Cloud Provider is relevant to, well, the Cloud Provider. By design, there is little to no information about the overlay network, and therefore about the customer’s data, which is where the interesting bits are.

Aviatrix can generate overlay-level syslog messages and even packet captures relevant to the business traffic, along with latency data and max, min and average throughput, which can be polled via the API to feed your existing management and alerting tools.
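Once such numbers have been polled, feeding an alerting tool is mostly a matter of summarizing and thresholding them. The Python sketch below shows that step only. It assumes the samples have already been retrieved and makes no claim about what the API returns: the function names, tunnel names, units and threshold are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def summarize_throughput(samples_mbps: List[float]) -> Dict[str, float]:
    """Reduce a series of throughput samples to min, max and average."""
    if not samples_mbps:
        raise ValueError("no throughput samples to summarize")
    return {
        "min": min(samples_mbps),
        "max": max(samples_mbps),
        "avg": mean(samples_mbps),
    }


def tunnels_over_latency(latency_ms: Dict[str, float], threshold_ms: float) -> List[str]:
    """Return the names of tunnels whose latency exceeds the threshold."""
    return sorted(name for name, ms in latency_ms.items() if ms > threshold_ms)


# Hypothetical samples for one tunnel and latest latency per tunnel.
stats = summarize_throughput([10.0, 20.0, 30.0])
slow = tunnels_over_latency(
    {"hub<->spoke-a": 12.5, "hub<->spoke-b": 85.0, "hub<->spoke-c": 140.0},
    threshold_ms=80.0,
)
print(stats)
print(slow)
```

The output of both functions is plain data, so it can be forwarded to an existing monitoring or alerting system in whatever format that system expects.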

In conclusion, the Aviatrix Systems solution provides a sort of Transit VPC 2.0, with improved management, visibility and troubleshooting information that will make your large operation a whole lot easier.

