When I first started building my website, I didn’t spend a lot of time thinking about how I would host it. Originally, the site was a hybrid of Jekyll (a static website generator) and Sinatra (a barebones Ruby web framework). I used Jekyll to generate my HTML, and Sinatra to serve it and other assets while also handling some convoluted redirect logic.
Originally, I kept all of my source assets, markdown, and images in my Git repository and used a tiny, single-core DigitalOcean server to handle both my Jekyll builds and my Sinatra stack. To deploy a new version of the site, I’d make my changes locally, push them to GitHub, SSH into my cloud server, pull in my changes, rebuild the site, and restart Sinatra.
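In rough shell terms, that manual routine looked something like the sketch below. The host, repository path, and restart command are hypothetical placeholders, not my real setup, so the commands are assembled and echoed for review rather than executed.

```shell
#!/bin/sh
# Sketch of the old manual deploy flow. REMOTE, SITE_DIR, and the
# restart command are hypothetical placeholders; the commands are
# echoed for review instead of being run directly.
set -eu

REMOTE="me@my-droplet.example.com"   # hypothetical server
SITE_DIR="/home/me/website"          # hypothetical repo path on the server

PUSH_CMD="git push origin master"
DEPLOY_CMD="ssh $REMOTE 'cd $SITE_DIR && git pull && jekyll build && sudo systemctl restart sinatra'"

echo "$PUSH_CMD"
echo "$DEPLOY_CMD"
```

Every deploy meant repeating all of these steps by hand, with the expensive Jekyll build happening on the underpowered server itself.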
This worked for a few months, but the trouble was that as I wrote new posts, more and more images ended up in my GitHub repository. On top of this, the plugin I used to generate small, web-friendly versions of these source images was processing so many images each build that my underpowered cloud server would run out of memory before the build could complete.
Additionally, serving my images and other assets from the same machine that ran my web server created unnecessary load. Every page load meant a dozen or more requests to the same server, and given the relatively low power of my instance, a few simultaneous requests were enough to make the site essentially unavailable.
Eventually, I decided to start paying back the tech debt that my initial inexperience had created. My entire website was really just a collection of static files generated at build time - not at request time - so there wasn’t any reason to involve a web framework like Sinatra at all. The entire site could be hosted on anything capable of serving static files.
There are a few different services purpose-built for hosting static files, but Amazon’s Simple Storage Service (S3) is by far the most popular. As its name suggests, S3 is stupid-simple to use. You create a ‘bucket’ via their online console (or any of their countless client libraries), upload your files to it, and reference your S3 files’ URLs in your HTML.
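With the AWS CLI, that upload step can be a single sync command. In this sketch the bucket name is a hypothetical placeholder and `_site` is Jekyll’s default output directory; the command is echoed for review rather than run, since it needs a real bucket and credentials.

```shell
#!/bin/sh
# Sketch: pushing a built Jekyll site to S3 with the AWS CLI.
# "my-site-bucket" is a hypothetical bucket name; the command is
# echoed rather than executed.
set -eu

BUCKET="my-site-bucket"
BUILD_DIR="_site"   # Jekyll's default output directory

# --delete removes objects from the bucket that no longer exist locally,
# keeping the bucket an exact mirror of the latest build.
SYNC_CMD="aws s3 sync $BUILD_DIR/ s3://$BUCKET/ --delete"
echo "$SYNC_CMD"
```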
S3’s pricing model is based on a few things, but the two primary factors are the amount of space you use (you pay per GB) and the number of requests for each object per billing cycle. For most people this means that in addition to being very easy to set up, S3 is also rather cheap.
One of the great things about using Amazon Web Services (AWS) is that it’s relatively easy to integrate with other AWS services. In the case of S3, Amazon CloudFront is the perfect complement.
CloudFront is an “edge-caching” content delivery network (CDN). Combined with S3, it takes the contents of your ‘bucket’ and distributes them to servers all around the world. When a client requests a file from your CloudFront-cached S3 bucket, CloudFront determines the fastest place to serve the file from. For example, a user in Kuala Lumpur might request http://your.website.com/image.jpg and have it returned from a CloudFront server in nearby Singapore, whereas somebody in Auckland will receive the same http://your.website.com/image.jpg from a server in Sydney. All the while, the original image file sits on an S3 server in a Virginia data center. This makes the website’s performance relatively consistent regardless of a visitor’s location.
CloudFront’s pricing model is similar to S3’s - so using it probably won’t increase your bill significantly - but CloudFront also charges to “invalidate” objects in its cache. Say, for example, you make changes to index.html and push them to your S3 bucket. If you choose not to invalidate the cached copy, it can take up to 24 hours for the changes to propagate to all of CloudFront’s servers. This means users in some places will see the old version of index.html while users elsewhere see the new one. If you choose to invalidate, however, you can have the updated copies distributed across CloudFront in under an hour.
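From the AWS CLI, an invalidation for that index.html would look roughly like this. The distribution ID is a hypothetical placeholder, so the command is echoed for review rather than executed.

```shell
#!/bin/sh
# Sketch: invalidating a changed object in a CloudFront distribution.
# The distribution ID is a hypothetical placeholder; echoed, not run.
set -eu

DIST_ID="E1234EXAMPLE"
INVALIDATE_CMD="aws cloudfront create-invalidation --distribution-id $DIST_ID --paths /index.html"
echo "$INVALIDATE_CMD"
```

Since each invalidation path is billed, it’s worth invalidating only the objects that actually changed rather than the whole bucket.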
I haven’t gone into depth on the setup details of these AWS services mostly because there are hundreds of tutorials and guides that already do. I will say that the documentation has only gotten better since I first moved to AWS. S3 now has a built-in option for hosting a bucket as a website, which makes the process even simpler.
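As a sketch, enabling that website option from the AWS CLI looks something like the following. The bucket name and error page are hypothetical placeholders, and the command is echoed rather than run.

```shell
#!/bin/sh
# Sketch: turning on S3's static-website hosting for a bucket.
# Bucket and document names are hypothetical placeholders; echoed, not run.
set -eu

BUCKET="my-site-bucket"
WEBSITE_CMD="aws s3 website s3://$BUCKET/ --index-document index.html --error-document 404.html"
echo "$WEBSITE_CMD"
```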
Another thing to consider is that AWS is not the only option for cloud-based hosting. Both Google and Microsoft have really stepped up their cloud services game in the past couple of years, and their offerings may be cheaper or better suited to some people’s needs. I chose AWS because of its immense popularity: it essentially created the market, so there are more tools, tutorials, and people with experience using AWS than there are for any of its competitors.