Migrating a VMWare SharePoint Dev VM to Azure – Part 2

Update: I’ve added disk performance figures for an Azure ‘D’ series virtual machine.

In a previous post I described a process for migrating an on-premises, VMWare based SharePoint development environment up to Azure. Whilst functional in a pinch, performance of the resulting virtual machine left a little to be desired.

The bottleneck seemed to be disk performance. Unsurprising given that the original VM was designed and optimised to run on a single SSD capable of a large number of IOPS, rather than a single, relatively slow disk as offered by an A servies VM in Azure.

First of all, I thought I’d gather some metrics on the performance of the respective disks I was using. Microsoft offers a great tool called DiskSpd for doing exactly this.

Here are the salient results after running DiskSpd on some representative disks:

Storage I/O per sec MB/s Avg. Latency (ms)
Local SSD 28,813.07 225.10 2.22
Local HDD (7,200RPM) 514.08 4.02 124.87
Azure A series OS disk 515.71 4.03 123.36
Azure A series temp disk 67,390.80 526.49 0.95
Azure A series data disk 492.16 3.84 82.76
Azure D series OS disk 972.78 7.60 65.56
Azure D series temp disk 128,866.03 1,006.77 0.495

Little wonder the VM wasn’t performing brilliantly: we’ve got an incredibly heavy workload, (remember, this dev environment is running Active Directory Directory Services, DNS, SQL Server, SharePoint features including the Distributed Cache, Search & workflow as well as client applications like Office, a browser with dozens of tabs open and Visual Studio – just to name a few), all running off a single disk which is only capable of approximately 500 IOPS. The SSD it runs off when stored locally is able to handle this load as it offers a blistering 28,000+ IOPS but what can we do in Azure to improve the situation?

Use the temp disk

Looking at the DiskSpd testing results above, it’d be tempting to try to use the temp disk for as much of the load as possible – it offers an incredible 67,000+ IOPS afterall! (As an aside, I can only imagine this is either a RAM disk or something with crazy fast caching enabled, either way, it’s cool, huh?)

Unfortunately, Microsoft go to great lengths to point out that the temp disk assigned to Azure VMs isn’t persistent and shouldn’t be relied upon for storing data long term (more here):

Don’t store data on the temporary disk. This disk provides temporary storage for applications and processes and is used to store data that you don’t need to keep, such as page or swap files.

This rules the temp disk out for much of what we’re doing, however there’s a clever technique which enable us to configure SQL Server to use the Azure temp disk for it’s TempDB logs and files.

This is certainly something which would have a positive impact on performance (both the TechNet documentation on Best practices for SQL Server in a SharePoint Server farm and Storage and SQL Server capacity planning and configuration (SharePoint Server 2013) state that TempDB should be given priority for the fastest available storage) and the nature of the TempDB (that it’s re-created on start-up of SQL Server) lends itself well to this solution.

Other than storing TempDB, there isn’t anything else I could think of which was transient and would suit storing on the Azure temporary drive.

Add more disks

As the testing with DiskSpd shows, each data disk we attach to our Azure virtual machine provides ~500 IOPS. Different sized Azure VMs allow you to attach different numbers of data disks. We’re using a relatively large virtual machine (an A6) so can attach up to 8 additional disks. How we use these disks is where things get interesting:

  • We could attach disks, allocate each a drive letter, then move SQL databases and log files on to their own drives. For example; one drive for content database log files, one for search databases, one for content databases and one for all the other databases). This would give 500 IOPS to each.
  • We could attach multiple disks, then use Storage Spaces to pool them all, creating a single volume made up of multiple disks. This one disk would have a higher number of IOPS, although it isn’t a linear 500*number-of-disks relationship. I tested 1, 2, 3 and 4 disks pooled and found IOPS went from 492, to 963, to 1434, to 1919.
  • Finally, we could use a hybrid of the two outlined approaches: for performance dependent files pool multiple disks, for less IOPS sensitive data, use just a couple of pooled disks, and for latency agnostic requirements, use single disks.

Note: it’s important to remember that where Storage Spaces is sometimes used to provide resiliency through RAID like features such as Mirroring and Parity, we’re only interested in the performance improvements it gives us – resiliency is afforded through the underlying Azure infrastructure (whereby writes to VHDs are replicated synchronously to 3 locations in the current datacenter and potentially to a further 3 locations in a paired datacenter asynchronously).

Use an Azure virtual machine with faster storage

Instead of an ‘A’ series virtual machine, we could look at using the already available ‘D’ series. The primary difference these offer is an SSD based temp disk. From the performance figures the ‘D’ series disks return, we see a slight increase in performance of the OS disk and a frankly staggering performance of the temp disk. Although they’re considerably more expensive, the faster storage included with a ‘D’ series VM would clearly help.

Incidentally, Microsoft recently announced a new tier of virtual machines called the ‘G’ series. These are primarily billed as providing larger local SSD storage options. For our purposes, (where we can only really take advantage of the temp disk for our SQL Server’s TempDB), I don’t think this will help us much.

When they announced the ‘G’ series VMs, Microsoft also announced a new, SSD based, ‘Premium Storage’ offering. Details are trickling out but it sounds like we will be able to use employ those techniques already described, (attaching additional data disks – this time SSDs – and striping across them using Storage Spaces), to provide very fast storage for our SQL requirements.

Consider another cloud provider

The final approach to consider, if we want to provision a virtual machine with potentially faster storage, is an entirely different public cloud provider. Among the options is Amazon’s AWS offering. I haven’t used AWS a huge amount but their i2.xlarge VM instance offers 4 CPUs, 30GB of RAM and a single 800GB SSD which sounds a great fit for what we need.

SQL legend Brent Ozar recently performed some testing of AWS VM performance and compared and contrasted it with Azure. His findings, for the specific scenario he considers, show AWS to be cheaper and better performing.

Conclusion

We can either increase the IOPS available to our virtual machine for little additional cost by adding disks, striping across them & reconfiguring SQL to spread the load across them or we can look to take advantage of higher performance public cloud storage – though this may incur an increased cost.

For my own scenario, I can see that having a SharePoint development environment which works reasonably well and can be spun up quickly will enable developers to start work right away whilst appropriate hardware is ordered and provisioned for them to use on-premises in the longer term.