Amazon AWS: Return of Experience
After spending a good part of 2011 working with Amazon AWS, EC2 and so on (and ending up not migrating to it, for reasons unrelated to the quality of their service, which is good), I've been asked to write up my experience with it.
I haven't been very talkative on this blog in the past few months, so I figured I would share it with the world.
Disclaimer: It's biased, and only my own personal opinion. Take it as it is, and feel free to disagree.
This recommended setup gives a good picture of what AWS is all about, but it doesn't tell you two things:
- Your reliability is going to go down, big time. You need to design for failure, not because it's the right thing to do, but because failure will happen: an EC2 instance can go down at any moment, often without warning. It's no different from a regular server dying, except that it happens more often.
- The performance of a single EC2 instance is a lot lower than that of an equivalent physical server. That's not an issue for web nodes, because you can just scale them horizontally, but if you have a monolithic database it will drive you crazy.
That said, AWS gives you all the tools needed to work around those issues. I like the setup shown in this diagram, but I would make a few changes to it:
- Do the DNS yourself, with a third party like DNS Made Easy. DNS is easy enough, and important enough, that you don't want to hand it over to Amazon.
- The AWS load balancer is nowhere near as good as a good old haproxy. I would run a frontend haproxy+varnish (reverse proxy cache) in front of a farm of apache+php nodes, with your memcache instances next to them and the DB in the back.
Speaking of the DB: in my experience, this is the most difficult part to migrate to AWS. EC2 instances give you two types of storage: ephemeral (local disks attached to the instance at startup and flushed at shutdown) and EBS (persistent, network-attached storage). Ephemeral disks have roughly the performance of a regular desktop SATA drive, and EBS volumes fluctuate anywhere between a floppy disk and a SATA drive. If you have one big monolithic DB and you need more than 10,000 iops to keep your response time below 5 seconds, you'll suffer. I would advise you to:
- Reduce the usage of databases, relational or key-value, to a minimum. If something looks like a file and can be stored as a file, then it's a file. Put it in S3 and access it through the S3 API (or, if it's a public file, serve it via the CloudFront CDN). You also want to look into s3fs, which lets you mount an S3 bucket as a regular Linux mount point, very handy for rsyncing local storage to S3 without having to write an API client (see the sketch right after this list).
- If you must have a relational database, try to use either SimpleDB or RDS. I've spent way too much time trying to find the right combination of EBS volumes, aggregating them in RAID 1 or RAID 10. Same with ephemeral volumes: they are not persistent, they are lent to you by the hardware your instance happens to be running on at a given moment. I never really managed to move our huge Oracle DB to EC2. It worked, on EBS volumes, but the impact on performance was so high that we pretty much gave up. EBS volumes would have to improve by 500%, at least, before you can consider that option. However, I suspect that Amazon does not virtualize their RDS environments. They give you a database that has been designed to run on their infrastructure, and it's always going to be faster than doing it on your own. Plus, they take care of the replication, which is nice.
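As an illustration of the first point, here is a minimal sketch of pushing a file to S3 and reading it back with the boto Python library (the bucket name and paths are made up, and error handling is omitted):

```python
# Minimal sketch: store and retrieve a file in S3 with boto.
# The bucket name and file paths are hypothetical examples.
import boto

conn = boto.connect_s3()                      # picks up AWS credentials from the environment
bucket = conn.get_bucket('my-app-assets')     # assumes the bucket already exists

# Upload: "if it looks like a file, store it as a file"
key = bucket.new_key('reports/2012-01.csv')
key.set_contents_from_filename('/tmp/2012-01.csv')

# Read it back later, from any instance
key = bucket.get_key('reports/2012-01.csv')
key.get_contents_to_filename('/tmp/2012-01.csv')
```

The same bucket can then be mounted with s3fs, or fronted by CloudFront if the content is public.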
Regarding EBS volumes: I'm guessing that at some point you will want to benchmark those yourself, so maybe my notes will help. http://wiki.linuxwall.info/doku.php/en:ressources:articles:benchmark_ebs
A word about the network: when AWS was first designed, they had this ridiculous idea that instances should get a random hostname and a random IP at startup, and that customers shouldn't have any control over that. Well, in the real world, it sucks! And there have been enough complaints about it that Amazon released the VPC: a private network inside their EC2 environment. It's basically free (as in: no additional charge) and you can control the subnets and the IPs of your instances. You still have access to the same features a regular EC2 setup provides, with the exception of some instance types (the cluster ones) that are not available there. It takes a little bit of time to figure out the configuration of the routing and the internet gateway, because not all instances can access the internet by default. The VPC behaves more like a regular datacenter's network, where you have an entry point that receives the traffic and routes it to your nodes. You can then decide to protect your backend nodes from internet traffic (something you cannot really do with regular EC2), and even connect your office network to the VPC using a regular pptp endpoint. I haven't looked recently, but I think each availability zone has its own VPC these days, so you can (and should) have a mirror of your active VPC environment in a passive availability zone, ready for failover.
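To give an idea of the plumbing involved, here is a rough sketch of creating a VPC with one public subnet using the boto Python library. The CIDR blocks are arbitrary examples, and I'm paraphrasing the route-table calls from memory, so double-check them against the boto documentation:

```python
# Rough sketch: a VPC with one public subnet and an internet gateway,
# created with boto. CIDR blocks are arbitrary examples.
import boto.vpc

conn = boto.vpc.connect_to_region('us-east-1')

vpc = conn.create_vpc('10.0.0.0/16')                 # the private network
subnet = conn.create_subnet(vpc.id, '10.0.1.0/24')   # a subnet for the frontend nodes

igw = conn.create_internet_gateway()                 # needed for any internet access
conn.attach_internet_gateway(igw.id, vpc.id)

# Route this subnet's outbound traffic through the gateway;
# backend subnets simply never get such a route.
rt = conn.create_route_table(vpc.id)
conn.create_route(rt.id, '0.0.0.0/0', gateway_id=igw.id)
conn.associate_route_table(rt.id, subnet.id)
```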
Regarding the firewalling and security groups: this is one really nice feature of their infrastructure. You build your firewall policy into security groups. For example, with four security groups (frontend, web-nodes, memcache-nodes and storage-nodes), your policy will look like:
- frontend SG: accept from 0.0.0.0 to 80,443
- web-nodes SG: accept from frontend to 80
- memcache-nodes: accept from web-nodes to 1234
- storage-nodes: accept from web-nodes and memcache-nodes to 6543
And you can dynamically add instances into each group without having to specify their individual IPs. The firewall policy is completely abstracted from the physical implementation. It's very nice and flexible to use. I don't have any particular comment to make on that, except to keep it simple and clean (but that's true for every firewall).
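To illustrate, the policy above could be set up with boto more or less like this (the group names and ports are the ones from the example; passing src_group instead of an IP range is what ties a rule to another group rather than to individual addresses):

```python
# Sketch: the example security group policy, expressed with boto.
# Group names, descriptions and ports come from the policy above.
import boto.ec2

conn = boto.ec2.connect_to_region('us-east-1')

frontend = conn.create_security_group('frontend', 'public entry point')
web      = conn.create_security_group('web-nodes', 'apache+php farm')
memcache = conn.create_security_group('memcache-nodes', 'session cache')
storage  = conn.create_security_group('storage-nodes', 'database backend')

# Only the frontend is open to the world
frontend.authorize(ip_protocol='tcp', from_port=80,  to_port=80,  cidr_ip='0.0.0.0/0')
frontend.authorize(ip_protocol='tcp', from_port=443, to_port=443, cidr_ip='0.0.0.0/0')

# Everything else accepts traffic from another group, not from IPs
web.authorize(ip_protocol='tcp', from_port=80, to_port=80, src_group=frontend)
memcache.authorize(ip_protocol='tcp', from_port=1234, to_port=1234, src_group=web)
storage.authorize(ip_protocol='tcp', from_port=6543, to_port=6543, src_group=web)
storage.authorize(ip_protocol='tcp', from_port=6543, to_port=6543, src_group=memcache)
```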
About the AMIs: I chose to roll my own, based on a basic CentOS image. Building AMIs is easy enough if you start from an existing AMI (preferably EBS-backed) and customize it. The Amazon 64-bit AMI is a clone of RHEL and is good enough. You can probably reuse some of the init scripts available out there to populate your AMI at startup, but quickly enough you will want to write your own. It works as follows: when you launch an instance from an AMI, you have the possibility to pass user-data to the instance. In the user-data field, you can put a configuration file and have a script inside the AMI download and parse it. In the user-data, you put the backend server to connect to, what services to start, where to get more configuration from, and so on. There is no predefined format for that; you can design your own, as long as your script can parse it. The icing on the cake is to have an init script download the user-data, do some basic initialization (identify itself, download access keys, etc.), and then connect to a master server and complete the configuration of the instance using puppet or chef.
Each instance can access its own user-data at http://169.254.169.254/latest/user-data (provided you passed user-data when launching the instance). User-data is private to the instance, so you can put credentials in it, but one drawback is that anybody with access to the instance can read it. So if a hacker breaks into your website and runs a curl command from within the web interface, he will download your user-data. Don't put your root password in there!
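As a minimal sketch, assuming you chose a simple key=value format for your user-data (the format is entirely up to you), the init script could start with something like:

```python
# Sketch: read and parse user-data at boot time.
# Assumes the user-data passed at launch is a simple "key=value" file,
# which is an arbitrary format choice for this example.
import boto.utils

raw = boto.utils.get_instance_userdata()   # fetches http://169.254.169.254/latest/user-data

config = {}
for line in raw.splitlines():
    if '=' in line:
        key, value = line.split('=', 1)
        config[key.strip()] = value.strip()

# e.g. config.get('puppet_master'), config.get('services'), ...
```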
Using custom AMIs with user-data and initialization scripts is the most powerful feature of EC2. Once you can initialize instances on the fly and have them configure themselves and join the pool, the rest is easy. You plug whatever monitoring tool you use into your init script, and when the overall load of your pool is too high, you fire up a new instance that joins the pool automatically. It's not easy to achieve (and you will always have a few minutes of delay between the launch command and the availability of the instance), but it's definitely how EC2 should be used.
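Firing up such an instance from your monitoring hook is then a single API call. Here is a hedged sketch with boto, where the AMI id, key pair, security group and user-data content are all placeholders:

```python
# Sketch: launch a new web node that configures itself from user-data.
# The AMI id, key pair, security group and user-data are placeholders.
import boto.ec2

user_data = "role=web-node\npuppet_master=puppet.internal.example.com\n"

conn = boto.ec2.connect_to_region('us-east-1')
reservation = conn.run_instances(
    'ami-00000000',               # your custom CentOS-based AMI
    instance_type='m1.large',
    key_name='ops-keypair',
    security_groups=['web-nodes'],
    user_data=user_data,
)
instance = reservation.instances[0]   # will boot, self-configure and join the pool
```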
It also allows you to design for failure. Your instances will crash on a regular basis (this is exactly what Netflix's Chaos Monkey simulates), so you shouldn't store anything of value on them. Instead, have them download the latest revision of the PHP code as part of the initialization process, and you can restart a crashed node almost without feeling anything. Whether you use the AWS load balancer or haproxy, you can add and remove nodes without service interruption. Best solution: store the user session contexts on a backend cluster (memcache, redis, whatever), so you can load-balance incoming requests to any web node without losing the session when you switch from one node to another.
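For the session part, here is a small sketch with the python-memcached client (shown in Python to stay consistent with the other examples; the memcache address and session id are made up):

```python
# Sketch: keep session context in memcache instead of on the web node,
# so any node in the pool can serve the next request.
# The memcache address and session id are made-up examples.
import memcache

mc = memcache.Client(['10.0.2.15:11211'])

session_id = 'a1b2c3d4'
mc.set('session:' + session_id, {'user_id': 42, 'cart': []}, time=3600)

# Any other web node can now pick up the same session
session = mc.get('session:' + session_id)
```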
There are quite a few companies out there that provide AWS assistance. In my opinion, you will be better off architecting it yourself. Your needs are probably too specific to fit into a generic template, and AWS is already constraining you enough that you don't want to add another layer of abstraction to your infrastructure.
That's a lot of info, but I hope it clarifies your vision of AWS a bit. I should finish by saying that AWS is a lot of fun once you gain enough control over it. Let me know how it goes, and if I can help in any way.