A Non-Tech Introduction: Amazon Web Services

13 May 2014

There is a whole other side to Amazon that most people don't know about. Along with selling all kinds of physical goods, Amazon has also been providing a vital infrastructure service to the internet. What this means to internet consumers like you and me is that a lot of the internet depends on Amazon. Just how much of the internet? A couple of notable outages of this service in 2012 took down popular internet behemoths like Netflix, Pinterest, Airbnb and Instagram.

This post is going to give a non-tech-heavy introduction to this service. In order to understand why it exists and what its implications are, we'll need to explore a little bit more about how the internet works. But, in keeping with the rest of this series, this post won't require any background computer science knowledge.

Some background: How a website works

We need to understand a bit about how the internet works in order to appreciate the service that Amazon has created. To start off, here is one fundamental fact about the internet that is not immediately obvious to everyone: everything on the internet has to "live" on a computer.

This is a simplification in at least two ways. First, we're not really talking about "computers" like the kind you and I use on a day-to-day basis, but rather a more specific kind of machine called a server. These servers might have many similarities to your MacBook, but are generally dedicated to a specific task, such as holding or hosting a website.

Second, what does it mean for a website to "live" on a computer? Roughly, this means that a machine has to know what web pages it's supposed to hold, and how to get those pages to a consumer when needed. For example, a Facebook machine has to understand that it holds code about Facebook, and, when I try to access my Facebook profile, this machine has to be able to send back the information that I am requesting.

Diving a little deeper, when I type in a URL like facebook.com, my computer will try to find the correct machine on the internet to ask for the information I want (in this case, the Facebook home page). But how does it know where this machine is on the internet? The short answer is that the Facebook machine can be identified with an IP address. Every computer connected to the internet (whether it be Facebook's machines or your own personal laptop) has one of these numerical addresses, and so the challenge of finding another computer on the internet can be boiled down to some kind of map that tells you "this IP address belongs to Allen, and that IP address belongs to the URL www.facebook.com, which belongs to Facebook".
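For the curious, here is a toy sketch of that map in Python. It is not how the real lookup system (known as DNS) is built; it only shows the idea of turning a name into an address. The names and addresses below are made up for illustration, and the addresses come from ranges reserved for examples.

```python
# A toy "map" from names to IP addresses. Every address here is made up
# (taken from ranges reserved for documentation), not a real assignment.
ADDRESS_BOOK = {
    "www.facebook.com": "192.0.2.10",
    "allens-laptop.example": "198.51.100.7",
}


def look_up(name):
    """Return the IP address recorded for a name, or None if it is unknown."""
    # Names are not case-sensitive, so normalize before looking them up.
    return ADDRESS_BOOK.get(name.lower())


if __name__ == "__main__":
    print(look_up("www.facebook.com"))  # the made-up address stored above
    print(look_up("unknown.example"))   # None, because the map has no entry
```

In real life this map is far too large to live in one place, so it is spread across many machines, but the question being asked is the same: "which address goes with this name?"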

When my computer finds the Facebook machine and asks it for the information I want, the machine will then run Facebook's code and give a response back to my computer in the form of a package of HTML, JavaScript, and other code that my computer can then render in a browser like Chrome. The first Facebook machine I contact might not be able to generate the entire answer on its own; in that case, it can contact other Facebook machines to calculate a more complex answer before delivering the entire response back to my computer. (Splitting the workload of calculating a response across several different machines, each of which has a unique function, is a web systems infrastructure philosophy called service-oriented architecture.)
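Here is a rough sketch of that division of labor, with each "machine" played by a Python function. The function names, the photo file names, and the shape of the page are all invented for illustration; this says nothing about how Facebook's real systems are organized.

```python
def profile_service(user):
    """Hypothetical machine that only knows about profile details."""
    return {"name": user.title()}


def photo_service(user):
    """Hypothetical machine that only knows about photos."""
    return [f"{user}-photo-1.jpg", f"{user}-photo-2.jpg"]


def front_machine(user):
    """The first machine contacted: it gathers the pieces and builds one page."""
    profile = profile_service(user)  # ask the profile machine
    photos = photo_service(user)     # ask the photo machine
    images = "".join(f'<img src="{photo}">' for photo in photos)
    # Package everything as HTML for the visitor's browser to render.
    return f"<html><body><h1>{profile['name']}</h1>{images}</body></html>"


if __name__ == "__main__":
    print(front_machine("allen"))
```

The visitor only ever talks to the front machine; the other machines stay behind the scenes, each doing its one job.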

OK, so what's the problem?

There is one question we haven't addressed so far: if everything on the internet has to live on a machine, where are these machines physically? In the olden days (which was really not that long ago in years), companies wanting to host a website for themselves would have to buy these machines. They were big, expensive, and finicky, so companies would have to hire large teams to set everything up and maintain everything over time.

But then a bright idea came about: since everything is connected via the internet anyway, why don't companies that want these kinds of machines just rent them instead of buying them? If Company X wanted to launch a website, they could write all the code for that website, and then upload that code to a machine they've rented in the cloud. (The cloud, appropriately, is a nebulous concept, but let's think of it as a lot of machines available to serve different purposes via their connection to the internet.) These rented machines could then be connected to the internet and respond correctly if somebody was looking for www.companyx.com. This way, Company X could eliminate a lot of the work around buying and maintaining these expensive machines.

This is where Amazon stepped in with their Amazon Web Services (AWS) product. AWS is essentially the service of providing rentable machines connected to the cloud. True to Amazon's style in their consumer product, the cost of renting computing power via AWS is cheap...very cheap.

The nearly all-in-one solution

So let's say I've decided to start the Bits-n-Bites catering company, and want to create a website for it. After I painstakingly write all the code for it, I ask AWS to rent a small machine (small in terms of its computing power), and I upload the code for my website onto it. After a bit of systems infrastructure magic, it works, and people can visit bitsnbites.com. The machine wasn't even that expensive.

Later, after a lot of great press, that one small machine is no longer able to handle all the web traffic I'm getting, so I have AWS automatically scale all of my computing resources up (e.g. switch to bigger, more powerful machines). These cost more money, but are still very cheap.
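To make the idea of scaling up concrete, here is a small sketch of the decision involved: pick the smallest machine that can cope with the current traffic. The size names and capacity numbers are invented for illustration and are not real AWS machine types or limits.

```python
# Hypothetical machine sizes, smallest first:
# (name, number of visitors per minute the machine can serve).
MACHINE_SIZES = [("small", 100), ("medium", 500), ("large", 2000)]


def choose_machine(visitors_per_minute):
    """Return the smallest size that can serve the given traffic."""
    for name, capacity in MACHINE_SIZES:
        if visitors_per_minute <= capacity:
            return name
    # Nothing is big enough on its own, so fall back to the largest size.
    return MACHINE_SIZES[-1][0]


if __name__ == "__main__":
    print(choose_machine(50))   # quiet day before the press coverage
    print(choose_machine(800))  # busy day after the press coverage
```

Automatic scaling amounts to re-running a check like this as traffic changes, so that I pay for a bigger machine only when I actually need one.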

Then I realize I'm posting a lot of pictures of my food, so I look into dedicated storage for my (digital) images. AWS has that as a product too: S3 (for Simple Storage Service). This is also cheap.

A few months later, if I decide to start collecting lots of detailed data on my customers, I need a service that provides a database. AWS has that too, and even has different options to choose from (such as DynamoDB, RDS and, for "big data", Redshift). If I want to run analysis on this data, I can use AWS's Elastic MapReduce or Kinesis products, and if in the future I wanted to back up my data or find long-term storage for it, I could use the AWS product called Glacier. These are all also cheap.

There are a lot more products offered under the AWS umbrella, and amazingly, they are all cheap. In fact, they are getting cheaper: Amazon often announces price cuts to these services.

The combination of low price and wide feature set makes AWS extremely appealing for startups wanting to move quickly. In the olden days, creating a website and a web-based product would require a lot of back-end infrastructure work spent wrangling servers. Today, AWS provides a solution to this set of needs, and drastically reduces the cost (in time, money, and employees) needed to work on the internet. In fact, the ability to leverage "cloud computing" like what AWS offers is one of the reasons why there are so many tech startups today: it's just so much easier and cheaper now to get something working online. And to boot, as a startup grows, the resources of AWS can keep up with the company.

AWS is already a multi-billion dollar business that is pretty far ahead of its rivals (which include competing products from other tech giants like Google and Microsoft). Its low costs have meant that a lot of today's startups are using AWS. The consequences of this become apparent when there's an outage, like the ones in 2012 that simultaneously brought down some popular consumer web products. But as long as AWS provides a cheap and full-featured set of products, it will remain a popular choice for tech companies of all sizes.

When I started working in the tech industry and first learned about AWS, I was shocked at how much it offered, and how important it was to the backbone of the internet. It's an amazing product (from a technological but also a business standpoint) that not many consumers get to see.
