Architectural Design Reference for Large Zimbra Deployments in Amazon Web Services


This post documents a high-level architectural design reference for hosting large Zimbra deployments in Amazon Web Services cost-effectively, with a high degree of security, performance, redundancy and resiliency — and with very short RPO (Recovery Point Objective) and RTO (Recovery Time Objective) targets.

For single-server installations, you can still use the architectural recommendations here to deploy your single Zimbra server in a single Availability Zone, or you can increase redundancy and resiliency by adding, say, a second Zimbra LDAP/MTA/Proxy server in a second Availability Zone.

Background:
In 2018 I was asked by Synacor to present on AWS Hosting Best Practices at the North American Zimbra Partners Conference.  I had been hosting my own Zimbra Hosting Partner infrastructure on AWS and had helped a number of customers (with smaller on-premises Zimbra systems that were due for a hardware or software life cycle event) to migrate to AWS.  I’d even helped a number of Zimbra Hosting Partners to migrate their infrastructure to AWS.

In the more than two years since then, my own hosting farm has grown, and I’ve had the opportunity to work with several larger AWS multi-server deployments, so I thought it would be a good time to share some updated AWS hosting best practices.

AWS Costs and Cost Management:
AWS has incredible levels of security, redundancy and resiliency built in, so comparing on-premises hosting costs with AWS hosting costs is not an apples-to-apples comparison — unless you have fully redundant and fault-tolerant SANs, networking, firewalling, power and cooling spread over at least two data centers.  AWS continuously life cycles networking, compute, and storage hardware behind the scenes in a way that causes zero downtime.  If you don’t need those levels of redundancy and resiliency, that’s OK (you can probably stop reading right here!).

Your single biggest cost for hosting Zimbra on AWS will be storage, and my earlier blog post on Disk Reference Architecture for AWS shows you how to cut your vanilla storage budget, typically by more than 40%.

Compute will be your next biggest expense, and here the key to saving 40% to as much as 63% on compute costs is to “right-size” your Zimbra servers, and then buy Reserved Instances (or make a total spend commitment under AWS’s newer Savings Plans, which don’t limit you to specific instance types).

Once you get your storage and compute costs down, and factor in the cost of all of the hours your networking/infrastructure guru has to spend taking care of your own infrastructure, I expect you will find that AWS’s costs are actually quite comparable to, if not less than, on-premises hosting.  Of course, if you have recently purchased a lot of hardware and/or VMware licensing, or signed a long colocation lease, then that’s another story.

AWS Prerequisites and Basic Concepts:
AWS has their own lingo, so let’s document that now…

  • Availability Zone.  An “AZ” maps to a data center.  Separate Internet connectivity, power, cooling, etc.
  • Region.  A Region is made up of several AZs.  If one AZ dies, it won’t impact the other AZs.
  • Virtual Private Cloud (“VPC”). This is hard to define simply, because while a VPC describes the networks within a region and their routing tables, a VPC also includes constructs like Security Groups and Network ACLs (see below).  Essentially, our Reference Architecture here domiciles all of your Zimbra instances within a single VPC; think of a VPC as your own private network fabric within AWS. Note that all AWS instances get RFC 1918 private IP addresses assigned to their NICs. The NAT’ing (if any) is handled automagically by the VPC.
    • Public Subnets. These have an AWS-supplied gateway and routing table to support both inbound traffic from, and outbound traffic to, the public Internet.
    • Private Subnets.  Instances in a private subnet can talk to instances in public subnets and to each other, but the private subnet itself has no route to or from the public Internet.  Further, instances in a private subnet can never be given public IP addresses.
    • NAT Gateway. An AWS service you can configure for your private subnets, to allow only outbound connections from instances in the private subnet to reach the public Internet.  This is useful for mailbox servers to be able to run commands like “apt-get update && apt-get dist-upgrade” while preventing any bad actors on the public Internet from initiating a connection to your mailbox servers.
    • DHCP Options. AWS acts as the DHCP server for your Zimbra instances. We use specific DHCP options to control the settings in /etc/resolv.conf on each of the Zimbra servers.
  • Instance. An AWS virtual server. Instances and their EBS volumes live in a single AZ. Instances cannot be migrated across AZs (nor can EBS volumes), but this isn’t a bad thing.
  • Elastic Block Store (“EBS”). EBS volumes are block storage, i.e. hard disks, that live in just one AZ and which come in several flavors at different performance:cost combinations:
    • st1. Throughput-optimized but generally slow disks. These are half the cost of gp2 disks (you pay per GB provisioned).
    • gp2. Standard SSD disks. You get 3 IOPS for every GB you provision, so a 250GB disk provides 750 IOPS.  Like st1 disks, you pay per GB provisioned.
    • gp3. Standard SSD disks – Next Generation, introduced December 2020.  You get a base 3,000 IOPS regardless of the size provisioned, for 20% less cost than gp2 disks, and, like io2 disks, you can dial up the IOPS you need to 16,000 (for more money, of course).
    • io2. Provisioned IOPS disks. Here you pay a blended cost based on the number of GB provisioned plus the number of IOPS provisioned.  Need a 100GB disk with 10,000 IOPS?  No problem!
    • Fun Fact: EBS volumes are continuously replicating between at least two storage backends.  Like DHCP leases that expire, periodically a pair of replicating storage backends will end their mutual replication and find new storage backends with whom to replicate.  It’s a bit promiscuous, yes, but it all happens behind the scenes with no service interruptions to your instances, and it allows AWS, once again, to life cycle storage backends with zero downtime to customers.
  • Elastic IP Address (“EIP”). A static IPv4 address you obtain from AWS’s pool of available IP addresses that is for your exclusive use.  We’ll need an EIP for each of our MTA servers.
  • Object Storage (“S3” or Simple Storage Service). S3 “buckets” do not support POSIX filesystems like EBS volumes do.  They hold objects, like mail blobs and other file-like things, but how you interact with objects is different than how you interact with regular files.  Basically, you can create folders within a bucket, and then you can PUT, GET or DELETE objects within the root of the bucket or within any folders in the bucket.  Some key facts about S3 storage:
    • S3 buckets are replicated across all AZs within a Region, so data durability is designed for eleven 9’s (99.999999999%). That’s higher than even some of the most expensive SANs out there.
    • When you snapshot an EBS volume (disk), the snapshot goes to S3.  So, you can restore that snapshot to a new EBS volume in any AZ within a Region.  EBS snapshots are terrific to use for your disaster recovery plan if you are using EBS volumes for your Zimbra backup disks.  But, now that Zextras fully supports doing Zimbra backups to S3 directly (just released this month), you can save even more money on storage (and shorten your RTO) by migrating your disk-based backups to S3.  Zextras includes a command line utility to automate the conversion of disk-based backups to S3-based backups.
    • S3 supports auto-tiering, with lower performance tiers (e.g. “Infrequent Access”) costing less. You can even have a policy that moves really old objects to Amazon Glacier (not recommended except perhaps for backups of Zimbra Archive mailboxes).
    • AWS algorithmically throttles the rates at which you can issue PUT, GET and DELETE requests, so disk-like concepts such as IOPS don’t apply the way they do with block storage.  This is why, before you can use S3 storage within Zimbra, you need to create the /opt/zimbra/cache and /opt/zimbra/incoming directories on your mailbox servers (see the sketch after this list).
  • Security Groups. A set of inbound and outbound firewall rules that you apply to the network interfaces of your Instances within a VPC; in this architecture, one Security Group is applied uniformly to all of your Zimbra instances.  Security Groups are used to open public-facing ports to your Zimbra servers, and to allow inter-server traffic between your Zimbra servers.
  • Network ACLs.  These look like the policies in a Security Group, but they apply to individual subnets (i.e. at the router level, not the NIC level) within a VPC.  And, by default you can have only 20 rules within a single ACL.
  • Network Load Balancer.  A fault-tolerant, scalable Layer 4 managed service we will use for load balancing the inbound traffic to our public-facing Zimbra servers.
  • Identity and Access Management (“IAM”).  The service used to create Roles and Users with various permissions for accessing AWS resources.  Junior system administrators who log in to the AWS Console to manage your Zimbra servers should have an IAM account that, for example, only lets them start and stop the Zimbra servers, not destroy them; nor would you want these junior system admins to be able to modify your Security Groups.  In our architecture, we also use an IAM account that has no rights to log in to the AWS Console at all — this account can only access the S3 service, so Zimbra uses its credentials to access the S3 storage buckets programmatically.
  • Route 53. AWS’s hosted DNS service.  While you can use Route 53 to host your public zones, in our Zimbra architecture we need Route 53 to host the private forward (A records) and reverse (PTR records) zones for our Zimbra servers.  Any caching DNS server we install on the Zimbra servers themselves can then use Route 53 as a forwarder.
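
Before moving on, here is a minimal command-line sketch tying a few of these storage concepts together.  Treat it as illustrative only: the volume size, IOPS, throughput, bucket names and Availability Zone are assumptions, not recommendations.

  # Create a gp3 EBS volume with extra provisioned IOPS and throughput:
  aws ec2 create-volume --volume-type gp3 --size 500 --iops 6000 \
      --throughput 250 --availability-zone us-east-2a
  # For comparison, gp2 arithmetic: 3 IOPS per GB, so a 250GB gp2 volume gets 750 IOPS.

  # Create Region-wide S3 buckets for HSM volumes and backups (bucket names must be globally unique):
  aws s3 mb s3://mycompany-zimbra-hsm --region us-east-2
  aws s3 mb s3://mycompany-zimbra-backup --region us-east-2

  # Local directories Zimbra needs before it can use S3 storage (run on each mailbox server):
  mkdir -p /opt/zimbra/cache /opt/zimbra/incoming
  chown zimbra:zimbra /opt/zimbra/cache /opt/zimbra/incoming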

High-Level Architecture Plan – Goals
Zimbra scales from a single server by adding more resources to that single server (RAM, CPU and disk), and then eventually by moving certain Zimbra services to separate Zimbra servers.  So this plan presumes we need multiple mailbox servers to handle all of our thousands of mailboxes, with different Zimbra services (MTA, Proxy, LDAP) on different Zimbra servers.

In our architecture plan, we are worried about four things:

  1. Security.
  2. What to do when a single Zimbra server fails.
  3. What to do when an AWS Availability Zone fails.
  4. Maximizing the cost:benefit ratio: increasing performance, reliability and security while cost-effectively decreasing disaster recovery RPO (Recovery Point Objective) and RTO (Recovery Time Objective) targets.

High-Level Architecture Plan – Design
At its core, this plan is no different than the usual Zimbra multi-server architecture plan, comprised of multiple Zimbra MTA, Proxy, LDAP and Mailbox servers.  The first difference here is that we are going to spread these servers across at least two Availability Zones within a Region. The second difference is that we are going to use AWS Network Load Balancers to load balance all inbound connections.

A Note for companies with just a Single Zimbra Server: If you have a small deployment that can fit on a single server (say, 500 busy users or less), then you can still use this architecture for your single server in a single Availability Zone. I’d still keep the Network Load Balancer, because it will only cost about $20 per month or less, it will make migrations to a new or replacement Zimbra server easier, and Network Load Balancers provide some protection against malicious traffic.

The design at a high level is straightforward:

  1. You’ll create an AWS VPC with a /16 private network (like 10.0.0.0/16) and then create, say, smaller /26 subnets in each of the Region’s Availability Zones.
    1. If you decide to create both Public subnets (for the MTA servers) and Private subnets (for the Proxy, LDAP and Mailbox servers), you’ll need to deploy an AWS NAT Gateway so the Proxy, LDAP and Mailbox servers can download the Zimbra installer and get operating system and Zimbra updates.
    2. Hint: use the VPC Wizard so all of your routing tables and the default public gateway are configured correctly.
  2. You’ll create an AWS Security Group:
    1. To allow inbound traffic to only the public-facing ports/protocols on your Zimbra servers, like TCP ports 80, 443, 25, 587, 993 and 9071 (you may want to open more ports, for example for your TURN server, or if you support POP3S access);
    2. To allow inter-server traffic across all of your subnets, and;
    3. To allow all outbound traffic from your Zimbra servers to the public Internet.
  3. Within a single region, you’ll distribute Zimbra servers across at least two Availability Zones.
    1. Each Availability Zone will have at least one MTA, Proxy, LDAP MMR and mailbox server.
    2. Unless you have more than, say, 5,000 active mailboxes, you can collapse the MTA, Proxy and LDAP services onto a single Zimbra server (one MTA/Proxy/LDAP MMR server in each of at least two Availability Zones).
    3. For each of your MTA servers, you’ll obtain an Elastic IP address.  So, if you plan on having three MTAs in three Availability Zones, get three Elastic IP addresses.
      1. Hint: Check each Elastic IP to see if it is currently listed on any block lists, and if so, get rid of it and get a different one until you have Elastic IPs that are “clean”.
    4. Since there will be an LDAP MMR server in each Availability Zone, make sure the ldap_url and ldap_master_url localconfig attributes on all Zimbra servers list the LDAP server in the same Availability Zone first (see the sketch after this list)!
  4. You’ll have a single FQDN (like, “mail.mycompany.com”) for all of your inbound traffic. The A records for this FQDN will point to the public IP addresses of the AWS Network Load Balancer system (Layer 4, not Layer 7!).
    1. AWS’s Network Load Balancers create a load balancer instance in each Availability Zone in which there are Targets. So if you have MTA, Proxy and LDAP MMR servers in three Availability Zones, you will have three public IPs for the load balancer system, and so three A records for “mail.mycompany.com”, one pointing to each of these three IP addresses.
    2. The private IP addresses of the load balancer instances should be added to the global config attribute zimbraHttpThrottleSafeIPs, and their networks should be included in zimbraMtaMyNetworks (see the sketch after this list).
    3. Again, you can use AWS’s Route 53 DNS service to create the Forward (A records) and Reverse (PTR records) zones for all of the Private IP addresses used by the Zimbra instances and the AWS Network Load Balancer instances.
  5. Amazon S3 Buckets are Region-wide, so you’ll want to create a bucket for Zimbra Secondary Volumes (HSM), and another bucket for Zimbra Backups.
    1. Note: At this writing, the External Backup feature has just been released.  If you are still using EBS volumes for your Zimbra backup disks (see my blog post for optimal disk layouts on AWS), that’s OK; just create a Lifecycle Policy to take periodic snapshots of your backup disks for disaster recovery purposes.
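
To make a few of the items above concrete, here is a minimal command-line sketch of the Security Group and Zimbra-side settings.  This is illustrative only: the security group ID, hostnames and IP addresses are placeholder assumptions drawn from the example table below, so substitute your own values.

  # Example Security Group rule opening HTTPS to the world (the group ID is a placeholder):
  aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
      --protocol tcp --port 443 --cidr 0.0.0.0/0

  # On each Zimbra server, list the LDAP MMR server in its own Availability Zone first
  # (run as the zimbra user; shown here for a server in us-east-2a):
  zmlocalconfig -e ldap_url="ldap://ldap1.mycompany.com:389 ldap://ldap2.mycompany.com:389 ldap://ldap3.mycompany.com:389"
  zmlocalconfig -e ldap_master_url="ldap://ldap1.mycompany.com:389 ldap://ldap2.mycompany.com:389 ldap://ldap3.mycompany.com:389"

  # Trust the Network Load Balancer instances' private IPs (example addresses only):
  zmprov mcf +zimbraHttpThrottleSafeIPs 172.16.0.10
  zmprov mcf +zimbraHttpThrottleSafeIPs 172.16.0.74
  zmprov mcf +zimbraHttpThrottleSafeIPs 172.16.0.138
  zmprov mcf zimbraMtaMyNetworks "127.0.0.0/8 172.16.0.0/16"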

 

So what does this all look like?  Let’s examine individual servers and components in a system that spans three Availability Zones:

  • mail.mycompany.com
    • Function: AWS Network Load Balancer
    • Public IP: Three public IPs (AWS assigned), one for each NLB instance in each of the three Availability Zones.
    • Private IP: Three private IPs (AWS assigned), one for each NLB instance in each of the three Availability Zones.
    • Availability Zones and Subnets: us-east-2a :: 172.16.0.0/26 (Public); us-east-2b :: 172.16.0.64/26 (Public); us-east-2c :: 172.16.0.128/26 (Public)
  • proxy1.mycompany.com, proxy2.mycompany.com, proxy3.mycompany.com
    • Function: Zimbra Proxy Servers
    • Public IP: None. (Note: If you are worried about the Network Load Balancers all failing simultaneously, you can put the Proxy servers in a Public subnet.)
    • Private IP: Three private IPs (AWS assigned), one for each Proxy instance in each of the three Availability Zones.
    • Availability Zones and Subnets: us-east-2a :: 172.16.1.0/26 (Private); us-east-2b :: 172.16.1.64/26 (Private); us-east-2c :: 172.16.1.128/26 (Private)
  • mail1.mycompany.com, mail2.mycompany.com, mail3.mycompany.com
    • Function: Zimbra MTA Servers
    • Public IP: Three public IPs (AWS Elastic IP addresses that you acquire), one for each MTA instance in each of the three Availability Zones. Be sure to fill out the AWS form to remove Port 25 restrictions and get PTR records created for these servers!
    • Private IP: Three private IPs (AWS assigned), one for each MTA instance in each of the three Availability Zones.
    • Availability Zones and Subnets: us-east-2a :: 172.16.0.0/26 (Public); us-east-2b :: 172.16.0.64/26 (Public); us-east-2c :: 172.16.0.128/26 (Public)
  • ldap1.mycompany.com, ldap2.mycompany.com, ldap3.mycompany.com
    • Function: Zimbra LDAP MMR Servers
    • Public IP: None
    • Private IP: Three private IPs (AWS assigned), one for each LDAP instance in each of the three Availability Zones.
    • Availability Zones and Subnets: us-east-2a :: 172.16.1.0/26 (Private); us-east-2b :: 172.16.1.64/26 (Private); us-east-2c :: 172.16.1.128/26 (Private)
  • mailbox1.mycompany.com, mailbox2.mycompany.com, mailbox3.mycompany.com, mailbox4.mycompany.com, mailbox5.mycompany.com, mailbox6.mycompany.com
    • Function: Zimbra Mailbox Servers (Note: one of these will also be the Logger host.)
    • Public IP: None
    • Private IP: Six private IPs (AWS assigned), one for each Mailbox instance, with two mailbox servers in each of the three Availability Zones.
    • Availability Zones and Subnets: us-east-2a :: 172.16.1.0/26 (Private); us-east-2b :: 172.16.1.64/26 (Private); us-east-2c :: 172.16.1.128/26 (Private)
  • backup.mycompany.com
    • Function: Amazon S3 bucket for Zimbra backups. (You may want to sync your metadata here frequently, to shorten your RPO.)
    • Public IP: N/M
    • Private IP: None
    • Availability Zones and Subnets: S3 is a Region-wide service.
  • hsm.mycompany.com
    • Function: Amazon S3 bucket for Zimbra Secondary Volumes (Hierarchical Storage Management)
    • Public IP: N/M
    • Private IP: None
    • Availability Zones and Subnets: S3 is a Region-wide service.
  • public.mycompany.com
    • Function: A public, read-only Amazon S3 bucket for storing your customized App Banner and Login Banner logo files; perhaps your custom zimlets too.
    • Public IP: N/M (There is no A record per se, but https calls are routed correctly.  It’s magical…)
    • Private IP: None
    • Availability Zones and Subnets: S3 is a Region-wide service.
  • AWS NAT Gateway
    • Function: To allow the Zimbra servers on the Private subnets to “reach out” to the public Internet to get Zimbra and operating system patches and updates.
    • Public IP: N/M
    • Private IP: N/M
    • Availability Zones and Subnets: The routing tables and other attributes of your VPC are configured automatically when you use the Wizard to create a NAT Gateway.

Disaster Recovery Scenarios
How do we handle Disaster Recovery?  There are four key Use Cases:

  1. A mailbox is deleted/destroyed/corrupted.
    1. Just use the Zextras (Backup NG) Recover Deleted Mailbox method.
  2. A Proxy, MTA or LDAP server is corrupted or destroyed.
    1. Remove all LDAP and localconfig references to the affected server, and then at your leisure build a new one.
      1. The AWS Load Balancer system will immediately stop routing connections to a downed Proxy or MTA server.
      2. Zimbra’s mailbox servers will fail over to using the remaining MTA and LDAP servers.
  3. A mailbox server is corrupted or destroyed.
    1. Delete from LDAP all of the mailboxes, distribution lists and resource accounts that were domiciled on that server, followed by a “zmprov ds <fqdn>” of the downed mailbox server (see the sketch after this list), and then either:
      1. Build a replacement mailbox server (or use a warm standby server) and restore everything there from the S3 backup, or;
      2. Restore everything to the remaining mailbox servers from the S3 backup, build a replacement mailbox server at your leisure, and then live migrate mailboxes to even things out again.
        1. Only users whose mail was on this mailbox server will be impacted.  All other users will be able to use Zimbra normally during the recovery process.
  4. An AWS Availability Zone is corrupted or destroyed.  It’s never happened, but why not be prepared — just in case?
    1. Use the recovery steps in #2 and #3 above, combined.
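
As a hedged illustration of use case 3, the commands below sketch the clean-up of a failed mailbox server.  The server and account names are placeholders, and the exact Zextras restore commands depend on your Backup NG / zxsuite version, so treat this as a sketch rather than a runbook.

  # Find the accounts that were homed on the failed mailbox server:
  zmprov searchAccounts "(zimbraMailHost=mailbox3.mycompany.com)"

  # Delete each affected account, distribution list and resource from LDAP,
  # then remove the downed server itself:
  zmprov da user1@mycompany.com
  zmprov ds mailbox3.mycompany.com

  # Finally, restore the deleted accounts onto a replacement (or surviving)
  # mailbox server from the S3 external backup, using the zxsuite backup
  # restore commands appropriate to your Zextras version.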

One More Word of Caution About Disaster Recovery On Premises…
I have had a number of customers believe their on-premises Zimbra systems to be well-protected by products like Veeam.  Veeam is a great product, but using Veeam, or indeed any product that performs so-called “crash-consistent” backups, is problematic with Zimbra for two reasons:

  1. First, Zimbra keeps the working sets for all of its databases (LDAP, Lucene, MariaDB) in RAM, so a Veeam snapshot of a running Zimbra server’s disks won’t capture all the changes to these databases.  While MariaDB’s InnoDB databases are ACID compliant, transactions that had not been committed to disk at the time of the snapshot will be rolled back when MariaDB next starts (after you do a Veeam Restore), so you can lose the “harmony” between what’s in MariaDB and the blobs on disk (i.e. orphaned blobs and missing blobs).  I have seen Lucene mailbox indexes restored in an inconsistent state, requiring mailboxes to be reindexed (annoying, but no data loss).  And sometimes LDAP will fail to restart after a restore, requiring you either to restore LDAP from the previous day’s backup (losing a day’s worth of changes…) or to take an LDAP dump from the remaining LDAP MMR servers and import it into the restored LDAP server.  Typically, to avoid these complexities, I just build a new replacement LDAP server, but if you have just a single Zimbra server, you are stuck doing a restore.
  2. Second, if you build your Zimbra server running LDAP with a 100GB partition for /, and then large partitions for /opt/zimbra/store, /opt/zimbra/db, /opt/zimbra/index and /opt/zimbra/backup, your Veeam restore will always fail.  This is because Veeam has no concept of Linux sparse files, and Zimbra’s LDAP uses an 80GB (96GB on earlier versions) sparse file for /opt/zimbra/data/ldap/mdb/db/data.mdb.  If you run ls -alh against that file, it reports as 80GB, but if you run du -csh against that directory, you’ll see you are actually using far less than 80GB (see the example commands after this list).  At last summer’s AWS Summit in New York City, I had an opportunity to discuss this with a senior Veeam engineer, who confirmed that Veeam restores sparse files just fine, but as regular files.  Meaning the 100GB root partition, upon restore, will hold perhaps 40GB of log files, cached package updates and Zimbra software, plus an 80GB LDAP data.mdb file.  But of course 80GB + 40GB = 120GB, which doesn’t fit on a 100GB partition, so the Veeam restore fails.
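
A quick way to see the sparse-file discrepancy described above for yourself, using the same two commands mentioned in the text, on a running Zimbra LDAP server:

  # Apparent (sparse) size of the LDAP MDB file; reports the full 80GB:
  ls -alh /opt/zimbra/data/ldap/mdb/db/data.mdb
  # Actual blocks allocated on disk; typically far less:
  du -csh /opt/zimbra/data/ldap/mdb/db/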

Sure, you can shut down your Zimbra servers and then take a snapshot to avoid the first of these problems, but if your email needs to be up and running all the time…

 

Conclusions
With Amazon Web Services’ high levels of redundancy, resiliency, performance and security, we can build a very cost-effective multi-server Zimbra system in AWS, typically for less cost than an on-premises system with comparable levels of hardware redundancy and performance.

Further, we can make our system incredibly secure just by relying on AWS’s built-in security capabilities which have a zero-dollar cost.

Finally, this architecture provides extremely low RPO/RTO targets much more cost-effectively than most on-premises systems, without having to rely on any third-party backup tools (which in any event won’t always work as well as we would like them to).  There are various architectural options to balance costs against either shortening (or lengthening) your RPO/RTO targets; we are happy to help.

If you’d like help with your Zimbra deployment on AWS, just fill out the form and we’ll get right back to you!


Hope that helps,
L. Mark Stone
Mission Critical Email LLC
16 November 2020

The information provided in this blog is intended for informational and educational purposes only. The views expressed herein are those of Mr. Stone personally. The contents of this site are not intended as advice for any purpose and are subject to change without notice. Mission Critical Email makes no warranties of any kind regarding the accuracy or completeness of any information on this site, and we make no representations regarding whether such information is up-to-date or applicable to any particular situation. All copyrights are reserved by Mr. Stone. Any portion of the material on this site may be used for personal or educational purposes provided appropriate attribution is given to Mr. Stone and this blog.