What are the best practices for maintaining and operating a Global Transit VPC?
Aviatrix Software-defined Cloud Routers are a cloud-native networking framework for public clouds. The solution is also recommended by AWS. It offers a number of benefits for customers that want to scale in the cloud. More information on how the solution works can be found on the AWS transit solution Answers page.
Once the Aviatrix solution is implemented, it is important to take a few (easy) steps to help operationalize your software-defined transit network:
Enable Gateway High Availability
Aviatrix Gateways should be deployed in an HA fashion. Depending on the criticality of the VPC connectivity, Aviatrix offers various levels of auto-healing features in case of downtime. For the most critical deployments, we recommend a cross-Availability Zone HA implementation. The controller takes care of switching between gateways if the primary tunnel or gateway were to go down.
Gateway High Availability is described here: https://docs.aviatrix.com/HowTos/gateway.html#high-availability
Controller Access Lockdown
It is a good idea to restrict controller access to the personnel who will be operating the Aviatrix solution. There is a documented checklist of items to consider, including automatic security group management, two-factor authentication and creating sub-user groups. The list of options for controller access hardening is described here: https://docs.aviatrix.com/HowTos/FAQ.html#how-do-i-secure-the-controller-access
This article covers the following topics:
- Enable Controller Security Group Management
- Use a signed certificate
- Remove less secure TLS version(s)
- Enable LDAP or DUO as a second factor to log in
- Create Read-only accounts
- Remove admin account login
Controller Backup (and Restorability)
The Controller (operating in the control plane) is a hardened EC2 appliance. It is a good idea to enable backup on the controller. This sets up a recurring backup job that captures the controller configuration and stores it in a secure S3 bucket. Note: a controller going down does not affect the data path (Gateways). Traffic will continue to be routed.
More information on how to set up controller backup and how to restore it is documented here: https://docs.aviatrix.com/HowTos/controller_backup.html
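Because restorability depends on the recurring backup job actually running, it can help to check periodically that the newest backup is recent. The following is a minimal sketch, not part of the Aviatrix product: it assumes a daily backup schedule and that you have already collected the timestamps of the backup objects (for example, from a listing of the S3 bucket). The function names and the 26-hour threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def newest_backup(object_times):
    """Return the most recent backup timestamp, or None if there are none."""
    return max(object_times, default=None)


def backup_is_stale(last_backup_time, now, max_age=timedelta(hours=26)):
    """Return True if there is no backup or the newest one is older than max_age.

    The 26-hour default is an assumption: a daily job plus a small grace period.
    """
    if last_backup_time is None:
        return True
    return now - last_backup_time > max_age


# Example with made-up timestamps standing in for the bucket listing.
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
times = [
    datetime(2024, 1, 8, 2, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 9, 2, 0, tzinfo=timezone.utc),
]
latest = newest_backup(times)
print(backup_is_stale(latest, now))  # newest backup is 34 hours old -> True
```

A stale result would be a reason to review the backup settings on the controller before a restore is ever needed.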
Troubleshooting Tools (Know Your Options)
The Aviatrix solution leverages its central controller model to help cloud and network teams quickly troubleshoot connectivity issues. Using these UI tools does not require special networking skills.
One obvious thing to check is the status of the tunnels. This can be found on the Peering page. If the tunnels are “UP”, there are no issues with the tunnel status. You can also run some diagnostics on the tunnels from this page.
The other tools in the debugging process enable the user to debug beyond Aviatrix components. You can run traceroute/ping tests from any gateway to specific hosts right from the controller UI using the Gateway Utility (Troubleshoot -> Diagnostics -> Network tab). You could also consider a connectivity test using the Network Connectivity Utility, or use the Packet Capture utility to download a PCAP file from any gateway for further analysis.
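To complement the controller tools, an application-level check run from an instance inside a VPC can confirm whether a specific TCP port on a remote host is reachable across the transit network. The sketch below is a generic Python helper, not an Aviatrix utility; the function name and the example address in the usage note are hypothetical.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, `tcp_reachable("10.20.0.15", 443)` would test HTTPS reachability to a host in another VPC. A False result only says the connection failed from that instance; the gateway-side ping, traceroute and packet capture tools are still needed to find where the traffic stops.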
Leverage Visibility for Performance and Cost Control
Use the Aviatrix Controller to track the connectivity provisioned to date and monitor its performance. The Dashboard shows a geographical view or a topological view of the transit network and other connectivity provisioned via Aviatrix Gateways. It also provides network statistics by gateway. Set up the dashboard to reflect your preferred view.
The dashboard can also provide critical information on:
- Performance (bandwidth) tracking on the gateways.
- Cost control to make sure the gateways are optimally sized.
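One way to turn the bandwidth statistics into a sizing decision is to compare each gateway's peak throughput with the capacity of its instance size. The sketch below assumes you have exported throughput samples per gateway; the gateway names, capacity figures and the 20%/80% thresholds are all hypothetical and should be replaced with values appropriate to your deployment.

```python
def classify_gateway(samples_mbps, capacity_mbps, low=0.2, high=0.8):
    """Classify a gateway by peak utilization against its assumed capacity."""
    if not samples_mbps:
        return "no data"
    peak_utilization = max(samples_mbps) / capacity_mbps
    if peak_utilization >= high:
        return "undersized"  # close to capacity: consider a larger size
    if peak_utilization <= low:
        return "oversized"   # mostly idle: a smaller size may cut cost
    return "ok"


# Made-up samples (Mbps) and capacities for illustration only.
gateways = {
    "transit-gw-a": ([120, 300, 450], 5000),
    "spoke-gw-b": ([900, 950], 1000),
    "spoke-gw-c": ([400, 520], 1000),
}
report = {name: classify_gateway(s, cap) for name, (s, cap) in gateways.items()}
print(report)
```

Peak utilization is used here rather than the average so that short bursts are not hidden; a percentile would be a reasonable alternative if occasional spikes are acceptable.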
Set Up Alerts Using Log Analytics Platforms
The Aviatrix Controller and gateways can send packet statistics, security events and status events to log analytics platforms. Set up logging and alerting to provide proactive monitoring for your business.
Learn how to set up logging to log analytics tools: https://docs.aviatrix.com/HowTos/AviatrixLogging.html
Set up alerting in the logging platform. Here is an example of how to set up real-time alerts using Splunk: http://docs.splunk.com/Documentation/Splunk/7.1.1/Alert/DefineRealTimeAlerts
Leverage Automation and Network as Code
Aviatrix enables networking to be provisioned and managed along with your cloud deployments (Infrastructure as Code + Network as Code).
The controller is capable of orchestrating network actions through REST API calls. Some examples of how to leverage APIs are documented here: https://docs.aviatrix.com/HowTos/aviatrix_apis_datacenter_extension.html
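As an illustration of what such a call can look like, the sketch below builds (but does not send) a login request using only the Python standard library. The `/v1/api` path and the `action`, `username` and `password` form fields are assumptions based on the controller's form-style API and should be confirmed against the API documentation for your controller version; the host name and credentials are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(controller_host, username, password):
    """Build a POST request for a controller login call (assumed API shape)."""
    body = urlencode(
        {"action": "login", "username": username, "password": password}
    ).encode("utf-8")
    # Assumed endpoint path; verify it against the controller's API reference.
    return Request(f"https://{controller_host}/v1/api", data=body, method="POST")


req = build_login_request("controller.example.com", "ops-user", "not-a-real-password")
print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose. In practice, credentials would come from a secret store rather than source code, and the session identifier returned by the login would be passed to subsequent calls.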
If your team uses Terraform, Aviatrix has a Terraform provider to accomplish network as code: https://github.com/AviatrixSystems/terraform-provider-aviatrix
These items are suggested configurations to help you with Day 2 operations of the transit network and multi-cloud connectivity.
Set Up Email Alerting and SMTP Server
The Aviatrix controller allows you to set up a Status Change Event Alert email address. This is the email address (or alias) that will receive alerts on gateway health, tunnel health and other maintenance issues. You can also customize the SMTP server that sends out the emails. Note: If you do not configure your own SMTP settings, the controller uses an Aviatrix-owned SMTP server by default.
To make these changes, log in to the controller. Go to Settings -> Maintenance -> Controller -> Email. Here you can configure your email alerts and SMTP settings.
If you have any questions on any of these items or about the Aviatrix Solution in general, please contact [email protected]
Other resources: https://docs.aviatrix.com