Set up an OpenVPN server in Amazon EC2 with Ansible

Background

Recent changes in internet policy in the US have made the internet a much less friendly place, specifically with regard to our Internet Service Providers and what they are allowed to do with our traffic and data.

I decided I wanted to regain some control over my traffic/data by using a VPN to tunnel some of my lower volume traffic and DNS queries out of view of the snooping eyes of my ISP.

Additionally, I wanted to be able to connect to services running inside my home network when away from home, and I didn’t want to put a public-facing DNS entry against my home cable modem (I don’t have the ability to change my router’s IP and I’d like to keep it as private as possible).

My unique routing requirements prevented me from using a commercial VPN provider, but this GitHub gist also summarizes very well why you might not want to use one anyway: https://gist.github.com/joepie91/5a9909939e6ce7d09e29

Design

    ________                                              ____________
   |        | <-- Primary Client (Cable) 172.22.10.x --> |            |
   |   EC2  | <-- Backup Client (4G)     172.22.20.x --> | pfSense    |
   |        |                                            | 172.20.x.x |
   |________|                                            |____________|
        ^
        | Public Internet Client(s) 172.22.30.x
    ____|____
   |         |
   | Phones/ |
   | Laptops |
   |_________|
        

The trick here is I have a /16 network provisioned at my house (172.20.x.x) and I wanted to be able to access all the services in my house from the public clients (phones/laptops). HOWEVER, I have 2 WAN connections from my house, a primary cable modem, and a backup 4G wireless modem, and I want to make sure traffic will flow down the backup pipe if the primary is down.

This presented some challenges initially with just one OpenVPN server running on the EC2 instance. Client routing is handled internally by the OpenVPN server, and you have to tell it which client should be sent which traffic using the iroute config. So say traffic comes from one client (my phone) for 172.20.x.x, how would OpenVPN know which client should get that traffic? You set up a client to announce that it wants that traffic via an iroute entry in a file in a client config directory (ccd).

You will need client-config-dir /path/to/ccd/ in your server config file to enable ccd entries. ccd entries are basically included into server.conf, but only for the specified client. You put commands in ccd/client-common-name, and they are only included when the client's common-name matches the name of the file in ccd/. 
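As a rough illustration of what such a ccd entry looks like, here is a minimal Python sketch that writes one. The common name `pfsense-primary` and the temporary directory are hypothetical stand-ins; only the `iroute` directive and the rule that the file name must match the client's common name come from the text above.

```python
import tempfile
from pathlib import Path


def render_ccd_entry(network: str, netmask: str) -> str:
    """Return the body of a ccd file that claims a subnet for one client."""
    # iroute tells the OpenVPN server which connected client owns this subnet.
    return f"iroute {network} {netmask}\n"


def write_ccd_entry(ccd_dir: Path, common_name: str, network: str, netmask: str) -> Path:
    """Write ccd/<common_name>; the file name must match the client's common name."""
    ccd_dir.mkdir(parents=True, exist_ok=True)
    path = ccd_dir / common_name
    path.write_text(render_ccd_entry(network, netmask))
    return path


if __name__ == "__main__":
    # Hypothetical names: a temp directory stands in for /path/to/ccd/.
    ccd = Path(tempfile.mkdtemp()) / "ccd"
    entry = write_ccd_entry(ccd, "pfsense-primary", "172.20.0.0", "255.255.0.0")
    print(entry.name, "->", entry.read_text().strip())
```

The server config would still need its `client-config-dir` line pointing at that directory for the entry to be picked up.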

This created a problem for me, because I have two clients that I want to handle the same route. If both my primary and backup links connect and announce an iroute entry for 172.20.x.x, OpenVPN does not give me a mechanism for setting a priority (if this even works at all; I don’t remember what happens if two clients announce the same iroute).

The easiest way I found to solve this problem was to run multiple OpenVPN servers on my EC2 instance with separate subnets. Then I could use the Linux kernel routing table to add a route for the 172.20.x.x network via each of the two connected VPN clients, using a metric on the routes to only send traffic out the backup route if the primary is down.

So now when traffic comes in from my phone (172.22.30.x) destined for 172.20.x.x, the kernel handles routing it to the correct OpenVPN tunnel:

[ec2-user@ip-10-20-30-119 ccd_backup]$ ip route
default via 10.20.30.1 dev eth0
10.20.30.0/24 dev eth0  proto kernel  scope link  src 10.20.30.119
169.254.169.254 dev eth0
172.20.0.0/16 via 172.22.10.2 dev tun1  metric 100
172.20.0.0/16 via 172.22.20.2 dev tun0  metric 200
172.22.10.0/24 dev tun1  proto kernel  scope link  src 172.22.10.1
172.22.20.0/24 dev tun0  proto kernel  scope link  src 172.22.20.1
172.22.30.0/24 dev tun2  proto kernel  scope link  src 172.22.30.1

The kernel picks the route with the lowest metric (in this case tun1, which is the primary tunnel).
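The selection rule can be sketched in a few lines of Python. This is a simplified model (longest matching prefix, then lowest metric), not the kernel's actual implementation; the `Route` class and `pick_route` function are names made up for this illustration, and the table reuses the routes from the `ip route` output above.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    prefix: str      # destination network in CIDR form
    device: str      # outgoing interface
    metric: int = 0  # lower wins among routes to the same prefix


def pick_route(routes: List[Route], destination: str) -> Optional[Route]:
    """Simplified lookup: longest matching prefix first, then lowest metric."""
    dest = ipaddress.ip_address(destination)
    matches = [r for r in routes if dest in ipaddress.ip_network(r.prefix)]
    if not matches:
        return None
    return min(matches, key=lambda r: (-ipaddress.ip_network(r.prefix).prefixlen, r.metric))


if __name__ == "__main__":
    table = [
        Route("0.0.0.0/0", "eth0"),
        Route("172.20.0.0/16", "tun1", 100),  # primary (cable)
        Route("172.20.0.0/16", "tun0", 200),  # backup (4G)
    ]
    print(pick_route(table, "172.20.5.9").device)

    # When the primary client disconnects, its route is removed and
    # the backup route is the only one left for the home network.
    without_primary = [r for r in table if r.device != "tun1"]
    print(pick_route(without_primary, "172.20.5.9").device)
```

Failover in this model is simply the primary route disappearing from the table, which leaves the metric 200 route as the best match.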

You still need the iroute entry within OpenVPN so that when the server receives this traffic it knows which client to send it out.

There were some challenges setting up the kernel routing too. The routes depend on a (transient) client being connected, so I couldn’t just add them with Ansible during the initial setup. Luckily OpenVPN has a solution for this: client connect scripts (and client disconnect scripts), which execute when a client connects/disconnects.

This still required a little trickery, as the OpenVPN servers do not run as root for security reasons and root is required to edit the kernel routing table. I had to give each server’s user sudo permission to run the client connect and disconnect scripts as root.
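A connect/disconnect hook along these lines might look like the following Python sketch. It is not the author's script. It assumes the hook runs with root privileges via the sudo arrangement described above, that the `script_type`, `dev` and `ifconfig_pool_remote_ip` environment variables OpenVPN passes to such scripts are available, and that the client's virtual address is a suitable next hop (the real scripts may choose the gateway differently). `handle_event` and `build_route_command` are hypothetical names.

```python
import subprocess
from typing import List, Mapping

HOME_NETWORK = "172.20.0.0/16"  # the home /16 from the design above


def build_route_command(action: str, gateway: str, device: str, metric: int) -> List[str]:
    """Build the `ip route` command that adds or removes the home-network route."""
    if action not in ("add", "del"):
        raise ValueError(f"unsupported action: {action}")
    return ["ip", "route", action, HOME_NETWORK,
            "via", gateway, "dev", device, "metric", str(metric)]


def handle_event(env: Mapping[str, str], metric: int, dry_run: bool = True) -> List[str]:
    """Translate an OpenVPN connect/disconnect event into a route change."""
    # OpenVPN sets script_type to "client-connect" or "client-disconnect".
    action = "add" if env["script_type"] == "client-connect" else "del"
    command = build_route_command(action, env["ifconfig_pool_remote_ip"], env["dev"], metric)
    if dry_run:
        print(" ".join(command))
    else:
        subprocess.run(command, check=True)  # requires root to change the routing table
    return command


if __name__ == "__main__":
    # Sample environment for illustration; a real hook would pass os.environ.
    sample = {"script_type": "client-connect",
              "ifconfig_pool_remote_ip": "172.22.10.2",
              "dev": "tun1"}
    handle_event(sample, metric=100)
```

Each of the OpenVPN servers would use its own metric (100 for the primary, 200 for the backup in the routing table shown earlier), so the same hook can serve both.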

Implementation

I had one main requirement for the implementation: provisioning the VPN must be entirely scripted, such that tearing down and creating a new EC2 instance running the VPN server could be scheduled and run automatically and routinely.

I went with my old standby Ansible to accomplish this. The result is a handful of playbooks and roles.

You can check them out here: https://gitlab.com/slim-bean/ansible

Specifically, ec2_rotate_instance.yml is the playbook I run to spin up a new EC2 instance and rotate it in to replace the old one.

There is unfortunately a chicken/egg problem here, as I currently don’t have a playbook that would let you start from scratch. Through the evolution of this process that playbook fell behind and I removed it… Sorry… If you were somehow trying to use my setup, you can start with the ec2_rotate_instance.yml file and add tags or comment out the sections that deal with the previous instance until you have one up and running.

This is basically how it works:

  1. Find the current instance and save off the IP and instance ID into variables
  2. Provision a new instance
  3. Update it
  4. Install the 3 VPN servers with scripts/routing/sudo perms etc
  5. Reboot the instance and wait for it to come back up
  6. Once it’s up, go over to pfSense and update the primary VPN client with the IP of the new server, then poll until the tunnel comes up
  7. Then change the IP for the backup VPN client to the new server and wait for it to come up
  8. If they are both up, rename the tag on the new (incoming) VPN server to mark it as the main one, and terminate the old instance.
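The cutover in steps 6 to 8 can be sketched as a small Python control loop. This is only an outline of the flow, not the playbook itself: every callable passed in (`set_remote`, `tunnel_is_up`, `promote`, `terminate_old`) is a hypothetical placeholder for the corresponding Ansible task, and it assumes the second pfSense client to be moved is the backup one.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], attempts: int = 30, delay: float = 10.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll check() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


def rotate(new_ip: str,
           set_remote: Callable[[str, str], None],
           tunnel_is_up: Callable[[str], bool],
           promote: Callable[[], None],
           terminate_old: Callable[[], None],
           **poll) -> bool:
    """Move both pfSense clients to the new server, then retire the old one."""
    for client in ("primary", "backup"):
        set_remote(client, new_ip)
        if not wait_until(lambda: tunnel_is_up(client), **poll):
            return False  # stop here and leave the old instance running
    promote()        # retag the new instance as the main VPN server
    terminate_old()  # only reached once both tunnels are up
    return True
```

The point of the ordering is that the old instance is never terminated unless both tunnels have come up against the new one.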

pfSense

Configuring VPN

There were some unique challenges on my home network side. I have a pfSense device which controls my home network, and I love it, mostly…

One big downside is a total lack of an API or any programmatic way to configure it, which really put a damper on my requirement of scripting this entire process.

Until I found something called the pfSense PHP Shell: https://doc.pfsense.org/index.php/Using_the_PHP_pfSense_Shell

It’s not pretty… but after scouring the PHP source files for the web configurator pages and looking at a few other examples of how pages update configs, I was able to hack together my own “php shell” scripts for checking the status of the VPN and updating the remote server IP for VPN clients.

I then figured out it was possible to use Ansible with pfSense if you provide it a path to the Python binary, which can be configured in the hosts file. (You also need to go the normal route of adding an authorized_keys entry for your Ansible control machine. This is done in the GUI in the config for each user; add your pub key to the ‘admin’ user.)

So now I can have Ansible deliver those files to pfSense and use them as “macros” in the PHP shell, which Ansible then executes.

Configuring traffic to use the VPN

My home network setup is HIGHLY segmented (I think I’m up around 30 VLANs now), so I went through and picked which VLANs would egress via the VPN and which would go out the normal way (visible to my ISP).

High-bandwidth devices like the Xbox and Chromecast were sent straight out the network; VLANs with laptops and phones/tablets were sent out over the VPN.

In pfSense you can configure the gateway a network uses, so for the networks I wanted to go out the VPN I would choose the OpenVPN client interface as the gateway.

Additionally, to make this work you also have to do some source NAT’ing. Forcing the traffic out the VPN works, but the default outbound NAT rules in pfSense will assign the packets the WAN IP address and your packets will get lost forever. I work around this by using the Hybrid NAT mode in pfSense’s outbound NAT settings, adding rules so that the networks I’m sending out the VPN get the VPN client address as their outbound NAT address, which makes sure replies return to the correct place.

Conclusion

Cost

I’ve been running this for almost a year now and it’s been working very, very well. I throttled back to a t2.nano instance and have not had any issues with CPU throttling by Amazon. The t2.nano costs me just under $5 a month (which could be reduced by almost half if I reserve it for a 3-year period). Bandwidth costs just under 10 cents per gigabyte, and on average I’m using 30-50 GB a month, costing me $3-$5.

So currently I’m averaging just under $10 a month to run my own VPN service, and with it working as well as it has, I will likely reserve a t2.nano instance and cut the instance part of that roughly in half.
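For anyone who wants to plug in their own numbers, the arithmetic is a flat instance charge plus metered bandwidth. The Python sketch below uses the upper bounds quoted above ($5 a month for the instance, 10 cents per GB) rather than actual AWS price-list values, and treats a reserved instance as costing about half.

```python
def monthly_cost(instance_usd: float, gigabytes: float, usd_per_gb: float) -> float:
    """Total monthly bill: flat instance charge plus metered bandwidth."""
    return instance_usd + gigabytes * usd_per_gb


if __name__ == "__main__":
    # Upper bounds taken from the text: "just under $5" and "just under 10 cents".
    INSTANCE_USD = 5.00
    USD_PER_GB = 0.10

    for label, instance in (("on-demand", INSTANCE_USD),
                            ("reserved (about half)", INSTANCE_USD / 2)):
        low = monthly_cost(instance, 30, USD_PER_GB)
        high = monthly_cost(instance, 50, USD_PER_GB)
        print(f"{label}: ${low:.2f} to ${high:.2f} per month")
```

With these figures the bill goes from $8-$10 down to $5.50-$7.50 when the instance is reserved: a saving of $2.50 a month, since bandwidth is billed the same either way.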

Speed

The t2.micro and t2.nano instances do have some bandwidth limitations, and the limit seems to be around the 60-70 Mbps mark (it seemed the same to me for both the micro and nano). My cable internet connection is 60 Mbps down, and regular speed tests show that full bandwidth through the VPN without issue.

If you did need more throughput than that, you would need to step up to a t2.medium, which is quite a bit more expensive, but some basic speed testing I did right from the server showed throughput closer to 200-300 Mbps.

Other comments

About once a week I run the rotate playbook and get myself a brand new instance with a new public IP. I have a dynamic DNS entry updated by pfSense so I can find my EC2 instance from my road warrior laptops/cell phones.

All my DNS traffic generated at home or via the connected clients is sent out the EC2 instance, hiding it from any ISP (until you get to the Amazon level, but I’m a pretty small fish coming out of their Virginia datacenter).

The only issue I’ve run into on occasion is websites which don’t like traffic originating from an Amazon IP (LOOKING AT YOU LOWES YOU JACKWADS). These are pretty few and far between, and I can kind of understand why they are doing it, but generally things have been pretty flawless!!
