Shadministration — Byte-sized adventures in tech

AWS Load-Balanced WordPress… The Storage Dilemma


In a preview of another blog post coming very soon, I’ve decided to do a short post on an architecture dilemma I encountered while configuring a load-balanced WordPress website in AWS – specifically regarding storage. My plan is to have two public EC2 web servers handling incoming web traffic via a load balancer. The database I’ll have running on RDS with a read-replica in another availability zone (AZ). Finally, I will offload my media files to an S3 bucket. I got all of that configured and working, but soon realized it was going to be a pain to update two or more servers manually every time I made a plugin or theme change, so I started searching for better ways to do this.

The most obvious solution, or so I thought, was to migrate the WordPress application files to an EFS (Elastic File System) volume and share it between the two web servers. This is actually AWS's recommended approach. However, EFS has a latency problem. I found article after article describing how people migrated to EFS and watched their site performance degrade to the point of being almost unusable. The culprit is the many small PHP files WordPress has to load to deliver a page, an access pattern EFS doesn't handle well. The usual fix is to cache the PHP files on each server with OPcache, and then disable (or redirect to the local volume) all logging and other write-intensive plugins/functions. I didn't really like this solution for a few reasons:

  1. Added complexity I don’t feel is necessary for two or three servers.
  2. Increased cost for EFS storage.
  3. If I want to run a plugin like Wordfence with a lot of logging, I'll need to modify the plugin to write somewhere else, which might not be simple or best practice.
  4. In many of the articles/posts I read, people stated that even after performance tuning, their websites still weren't particularly fast on EFS.
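For context, the OPcache tuning people describe usually boils down to a few php.ini directives along these lines (illustrative values only, not a configuration I tested):

```
; Illustrative OPcache settings for php.ini (example values, not tuned)
opcache.enable=1
opcache.memory_consumption=128        ; MB of shared memory for cached scripts
opcache.max_accelerated_files=10000   ; enough slots for WordPress core + plugins
opcache.revalidate_freq=60            ; seconds between checks for changed files
```

Even with this in place, every cache miss or revalidation still goes back to EFS, which is why the tuning only softens the latency problem rather than eliminating it.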

One solution I came across from A Cloud Guru was creating a master write-node and syncing the WordPress application files from that system to an S3 bucket via a cron job. The other nodes in the load balancer group would then sync the changes down from the S3 bucket to their own volumes every minute or so, again via cron. This seems like a good fit for my application. For my setup, though, I want to cut S3 out of the loop and pull the current WordPress application files directly from the master node's volume to the read-only node(s) using rsync over SSH. I'll show you specifically how I did this; just note that I am using a Bitnami WordPress image on AWS, so these steps may differ in other environments.

First, I created an SSH keypair on the master server (Web01) using the following command:

ssh-keygen -f ~/.ssh/wp_sync -q -P ""

This creates a keypair, “wp_sync” (the private key) and “wp_sync.pub” (the public key), in the ~/.ssh directory of the current user. According to Bitnami, you also have to add the following lines to the /etc/ssh/sshd_config file (so I did that):

RSAAuthentication yes
PubkeyAuthentication yes
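One detail worth spelling out: after editing sshd_config, sshd needs a restart (e.g. sudo systemctl restart sshd) for the change to take effect, and the public key (wp_sync.pub) has to be appended to ~/.ssh/authorized_keys on Web01 before Web02 can authenticate with the private key. Here's a minimal sketch of that last step, run against a throwaway temp directory so it's safe to try anywhere; on Web01 you would operate on ~/.ssh directly:

```shell
# Sketch (my addition, not verbatim from the Bitnami docs): append the
# public key to authorized_keys so the matching private key is accepted.
# Uses a temp dir for illustration; on Web01 the target is ~/.ssh/authorized_keys.
SSH_DIR=$(mktemp -d)
ssh-keygen -f "$SSH_DIR/wp_sync" -q -P ""              # same keygen command as above
cat "$SSH_DIR/wp_sync.pub" >> "$SSH_DIR/authorized_keys"
chmod 700 "$SSH_DIR"                                   # sshd rejects keys behind lax permissions
chmod 600 "$SSH_DIR/authorized_keys"
```

The chmod lines matter: sshd silently ignores authorized_keys files that are group- or world-writable, which is a common reason key-based logins mysteriously fail.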

While still on Web01, I installed rsync (sudo apt-get update && sudo apt-get install rsync).

Next, I copied the private key over to my read-node server (Web02) and dropped it in the same ~/.ssh/ directory on that instance. After installing rsync on Web02 and confirming that the security group for my servers allowed SSH to/from their respective subnets, I was ready to create a connection to the master server (Web01) from Web02. Below is the generic rsync command I am using:

rsync -avz -e "ssh -i ~/.ssh/wp_sync" --progress user@10.10.10.1:/opt/bitnami/apps/wordpress/htdocs/wp-content/ /opt/bitnami/apps/wordpress/htdocs/wp-content --delete

This command establishes the rsync connection over SSH using the private key file from Web01 and then recursively syncs the contents of the wp-content directory on Web01 to the same directory on Web02. The “--delete” flag at the end instructs rsync to remove files/directories in the destination directory if they are not in the source directory. To run this command periodically, I simply added it as a crontab entry; I currently have it scheduled to run every 5 minutes.
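For reference, the crontab entry on Web02 ends up looking something like the following (a sketch: the home directory path and log file location are my assumptions, and I've dropped --progress since there's no terminal attached when cron runs the job):

```
# Edit with `crontab -e` on Web02: sync wp-content from Web01 every 5 minutes,
# appending rsync's output to a log file for troubleshooting
*/5 * * * * rsync -az -e "ssh -i /home/bitnami/.ssh/wp_sync" --delete user@10.10.10.1:/opt/bitnami/apps/wordpress/htdocs/wp-content/ /opt/bitnami/apps/wordpress/htdocs/wp-content >> /home/bitnami/wp_sync.log 2>&1
```

The trailing slash on the source path is significant: it tells rsync to sync the contents of wp-content rather than the directory itself, which is what keeps the two paths mirrored.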

I plan to publish a more in-depth post soon detailing the architecture for my high-availability WordPress site and how I set it up on AWS, but I thought this was worth its own post. Hopefully this will help someone trying to do something similar – stay tuned for more!
