VyOS HA in AWS

by | Jan 20, 2020 | Developer, Stratio | 2 comments

 

TL;DR

This post shows how to solve a recurring problem when using highly-available virtual routers in AWS: floating IPs.

The approach uses a Python script that lets the new master router claim an EC2 secondary private IP during failover.

 

Motivation

For certain AWS architectures we need to deploy a managed virtual router (EC2 instance) to handle tunnel termination, BGP sessions, NATing, etc. In a production environment, High Availability for these network functions is clearly a must, so that the failure of one router has minimal impact on the services.

I’ve chosen VyOS for this scenario since it is an open-source fork of Vyatta. VyOS is an operating system for network appliances with capabilities such as routing, firewalling, VPN, VXLAN, BGP peering, etc., which allows it to be used in projects with managed infrastructure. Its easy-to-use command-line interface and extensive documentation are also worth mentioning.

Another added complexity in these kinds of deployments is that AWS does not support multicast traffic, which VRRP relies on by default.

 

Architecture

A specific problem I faced when designing a solution with managed routers in AWS was NATing outgoing traffic from the on-premises private environment alongside the BGP sessions.

To replicate this scenario, I’ve set up a first tunnel against the AWS infrastructure and a second one between an on-premises VyOS router (shown without HA to simplify the diagram) and its highly-available AWS counterpart.

 

 

VIPA in AWS

For a regular active/passive cluster configuration like this one, we will need, apart from the routers’ IPs, a virtual IP address to float between them in a failover scenario.

AWS doesn’t provide this kind of floating IP, since every IP in use within the VPC range must be assigned to an EC2 instance.

To solve this problem, I’ve created a script (vrrp-master.py), to be configured on both routers, which claims (reassigns to itself) the IP designated as the VIPA during failover.

This script manages the VIPA assignment automatically, so assigning this IP manually is strongly discouraged in order to avoid human error (such as forgetting to allow reassignment).
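As a rough illustration of the "claim" step, here is a minimal sketch, not the original vrrp-master.py. It assumes a boto3 EC2 client is passed in, and the interface ID and region in the usage comment are hypothetical placeholders. The call itself, assign_private_ip_addresses with AllowReassignment=True, is the EC2 API operation that lets an address already attached to another interface be moved.

```python
"""Sketch: claim a floating private IP (VIPA) for this router's interface."""


def claim_vipa(ec2_client, eni_id, vipa):
    """Reassign `vipa` to the interface `eni_id` as a secondary private IP.

    AllowReassignment=True tells EC2 to move the address even if it is
    currently attached to the other router's network interface.
    """
    ec2_client.assign_private_ip_addresses(
        NetworkInterfaceId=eni_id,
        PrivateIpAddresses=[vipa],
        AllowReassignment=True,
    )


# Hypothetical usage on the router that has just become MASTER:
#
#   import boto3
#   client = boto3.client("ec2", region_name="eu-central-1")
#   claim_vipa(client, "eni-0123456789abcdef0", "100.80.33.100")
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real account.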

 

Considerations

Because this script uses the boto3 Python module (the AWS SDK for Python), we must install it on the VyOS router:

Since there is already private connectivity between the on-premises facilities and AWS (using a Customer Gateway attached to a Transit Gateway), we don’t want to assign a public IP to the EC2 instances. Therefore, we create an EC2 VPC Endpoint (e.g. com.amazonaws.eu-central-1.ec2), making sure that the “Private DNS Name” option is enabled so that the endpoint resolves to a private IP in the VPC.

At the time of writing, STS (Security Token Service) only allows the creation of a VPC Endpoint in the Oregon region (com.amazonaws.us-west-2.sts), so using STS roles in any other region would require the routers to have internet access, which is not really an option for production environments. To overcome this, we created a user with the following policy directly attached (limited to the “vyos-ha” user and VPC “vpc-0a6f6a161f5ae1fc2”):
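One way to confirm from a router that private DNS is doing its job is to check that the regional EC2 hostname resolves to an address inside the VPC. The following is a minimal sketch under assumptions: the VPC CIDR shown in the usage comment is a hypothetical value, and the helper name is made up for illustration. It tests CIDR membership rather than "is this a private address", because a VPC range such as 100.80.x.x is shared address space rather than an RFC 1918 range.

```python
import ipaddress
import socket


def resolves_inside_vpc(hostname, vpc_cidr, resolve=socket.gethostbyname):
    """Return True if `hostname` resolves to an address inside `vpc_cidr`.

    `resolve` is injectable so the check can be tested without DNS.
    """
    address = ipaddress.ip_address(resolve(hostname))
    return address in ipaddress.ip_network(vpc_cidr)


# Hypothetical usage (the CIDR is an assumed value, use your own VPC range):
#
#   resolves_inside_vpc("ec2.eu-central-1.amazonaws.com", "100.80.33.0/24")
```

If this returns False, boto3 calls from the router would be heading for the public EC2 endpoint, which is unreachable without internet access.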

Cluster

According to the VyOS website, cluster mode is the recommended method, since it lets us define a service as a cluster resource associated with the VIPA.

Unfortunately, the VyOS version available in the AWS Marketplace doesn’t allow unicast traffic in this mode:

VRRP

Luckily, unicast traffic for VRRP is implemented for the VyOS version in AWS.

Here is the VRRP configuration for both routers: 

Verifying the configuration

To check the VRRP status we can use this command:

To test the failover, we can restart the MASTER node:

 

Once the master node is powered off, the backup node will become the new master and the script described above will claim the VIPA through the EC2 VPC Endpoint:

 

When the previous stage finishes, the VIPA (100.80.33.100 in the example) will show up configured as a secondary IP on the eth0 NIC.

This can be verified by listing the eth0 interface within the router:

In the AWS console, the VIPA also appears in the new MASTER’s “Secondary private IP” field (EC2 instance, Description tab).
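The same check can be scripted instead of read off the console. This is a minimal sketch, assuming a boto3 EC2 client and a known interface ID (the ID in the usage comment is a hypothetical placeholder). It relies on describe_network_interfaces, whose response lists each private address of an interface together with a Primary flag.

```python
def vipa_is_secondary(ec2_client, eni_id, vipa):
    """Return True if `vipa` is attached to `eni_id` as a secondary private IP."""
    response = ec2_client.describe_network_interfaces(
        NetworkInterfaceIds=[eni_id]
    )
    for interface in response["NetworkInterfaces"]:
        for entry in interface["PrivateIpAddresses"]:
            if entry["PrivateIpAddress"] == vipa:
                # The VIPA must float, so it should never be the primary IP.
                return not entry["Primary"]
    return False


# Hypothetical usage after a failover:
#
#   import boto3
#   client = boto3.client("ec2", region_name="eu-central-1")
#   vipa_is_secondary(client, "eni-0123456789abcdef0", "100.80.33.100")
```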

 

 

vrrp-master.py

You can use this simple script to claim the VIPA (it must be copied to both nodes, e.g. with scp, and made executable).
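A script of this kind has to know which network interface should receive the VIPA. The sketch below is not the original vrrp-master.py; it shows one possible way to look up the primary interface ID from the EC2 instance metadata service. It assumes IMDSv1-style unauthenticated requests are allowed on the instance (instances that enforce IMDSv2 would need a session token first), and the function names are made up for illustration.

```python
from urllib.request import urlopen

# Link-local address of the EC2 instance metadata service.
METADATA_URL = "http://169.254.169.254/latest/meta-data/"


def fetch_metadata(path, timeout=2):
    """Read one metadata value as text (assumes IMDSv1 is enabled)."""
    with urlopen(METADATA_URL + path, timeout=timeout) as response:
        return response.read().decode().strip()


def primary_eni_id(fetch=fetch_metadata):
    """Return the ID of the instance's primary network interface."""
    mac = fetch("mac")  # MAC address of the primary interface
    return fetch(f"network/interfaces/macs/{mac}/interface-id")
```

The returned ID is what an EC2 call such as assign_private_ip_addresses expects as its NetworkInterfaceId argument.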

Conclusions

We’ve seen how to get around an AWS limitation when deploying a highly-available VyOS router.

Since AWS doesn’t provide floating IPs, the VIPA failover is done using the AWS SDK for Python (boto3) and a user with a restrictive policy. We couldn’t use STS since its VPC Endpoint is not available outside the Oregon region, and exposing the routers directly to the internet is unacceptable.

Unfortunately, we cannot use the VyOS cluster mode since unicast traffic is not currently supported in that mode in the latest AWS AMI version, so we have opted to use VRRP unicast instead.

Both routers were deployed with the VyOS AMI, so we need to install the boto3 module beforehand. This can be done by connecting them to an Internet Gateway (test) or by downloading the packages from a private, secured package repository (prod).

That’s all for now. I hope you’ve enjoyed it; if you have any trouble testing or deploying this architecture, feel free to leave a question in the comments section.
