While our engineers have been hard at work preparing the bits for our site-to-site replication offering, I have been testing the technology in preparation for a slew of technical collateral on the feature. In addition to introducing Coho SiteProtect here on my blog, I want to share with you a quick overview of the architecture. You can find more on this feature at the Coho Data blog here and here. Stay tuned for more on this topic in future posts!
Replication is something I am extremely passionate about, and I’m very happy to talk about it with anyone who is interested. I’ve witnessed firsthand what having a solid DR plan can mean to a business. Today more than ever, businesses rely on one to deliver data to their customers in any circumstances, both predictable and unplanned.
Now, let’s dive into the architecture…
Coho’s SiteProtect replication implementation reflects the unique features of our patented scale-out system architecture. The two most notable elements of SiteProtect are dynamic data replication and lightweight snapshots.
Dynamic Data Replication
For Coho, replication is a core architectural pillar. It not only replaces technologies like RAID for data protection, but is also used to scale out the capacity of your cluster when you add nodes and to re-balance data across those nodes in times of congestion. Additionally, we use replication when decommissioning nodes, or when a node fails, to rebuild a replica of the data on the surviving nodes. Because we replicate objects in the Coho Bare Metal Object Store, we can do this virtually at the block level as new files are created or as old files are modified. We keep the data synchronously updated so that the workloads never skip a beat.
For data availability in the event of a disaster, we have extended this functionality to other clusters at remote sites. Because distance typically introduces latency and bandwidth challenges, we shift to an asynchronous approach for remote replicas. This prevents the performance issues you may see when the primary workloads are competing with synchronous replication traffic, not to mention saturating your network links.
Lightweight Snapshots
Our snapshot implementation leverages copy-on-write clones of the original VMs. That means the storage capacity consumed is proportional to the amount of data changed since the previous snapshot was taken. The DataStream replicates snapshots at regular, user-selected intervals, so each subsequent transfer sends only the changes since the previous one. Add to this the fact that we compress the data over the wire, and you’ll see a significant reduction in bandwidth usage.
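To make the idea of "replicate only what changed, then compress it" concrete, here is a minimal Python sketch. It is not Coho's implementation: a real copy-on-write system tracks changed blocks as writes happen rather than comparing two images after the fact. The block size, function names, and the use of zlib are all assumptions made for illustration.

```python
import zlib
from typing import Dict

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for this example


def changed_blocks(prev: bytes, curr: bytes,
                   block_size: int = BLOCK_SIZE) -> Dict[int, bytes]:
    """Return {block_index: data} for every block of curr that differs from prev."""
    delta = {}
    n_blocks = (len(curr) + block_size - 1) // block_size
    for i in range(n_blocks):
        start = i * block_size
        new = curr[start:start + block_size]
        old = prev[start:start + block_size]
        if new != old:
            delta[i] = new
    return delta


def pack_delta(delta: Dict[int, bytes]) -> Dict[int, bytes]:
    """Compress each changed block before it goes over the wire."""
    return {i: zlib.compress(data) for i, data in delta.items()}


def apply_delta(replica: bytes, packed: Dict[int, bytes], total_len: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new snapshot on the remote side from the old replica plus the delta."""
    # Resize the old replica to the new length, then overwrite the changed blocks.
    buf = bytearray(replica.ljust(total_len, b"\0")[:total_len])
    for i, blob in packed.items():
        data = zlib.decompress(blob)
        start = i * block_size
        buf[start:start + len(data)] = data
    return bytes(buf)
```

In this sketch, a snapshot that differs from its predecessor in one block out of a thousand sends roughly one block's worth of compressed data, which is the property the paragraph above describes.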
The real-world benefit is alignment with each application’s Recovery Point Objective (RPO). Replication can run as often as every few minutes or as rarely as every few days or weeks. Coho SiteProtect does not force you into a one-size-fits-all schedule.
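As a rough way to reason about which interval fits a given RPO, the sketch below uses a simple back-of-the-envelope model: with asynchronous, snapshot-based replication, the worst-case data loss window is about one snapshot interval plus the time it takes to transfer that snapshot's changes. The model, the function names, and the numbers are my own assumptions for illustration, not figures from Coho.

```python
def transfer_seconds(changed_bytes: float, compression_ratio: float,
                     link_bytes_per_sec: float) -> float:
    """Time to send one snapshot's changes, assuming a steady link and fixed ratio."""
    return changed_bytes / compression_ratio / link_bytes_per_sec


def worst_case_rpo_seconds(interval_s: float, changed_bytes: float,
                           compression_ratio: float,
                           link_bytes_per_sec: float) -> float:
    """Worst case: the site fails just before the in-flight snapshot finishes
    transferring, so the remote copy is one interval plus one transfer behind."""
    return interval_s + transfer_seconds(changed_bytes, compression_ratio,
                                         link_bytes_per_sec)


def meets_rpo(target_rpo_s: float, interval_s: float, changed_bytes: float,
              compression_ratio: float, link_bytes_per_sec: float) -> bool:
    """Check whether a schedule satisfies a target RPO under this simple model."""
    return worst_case_rpo_seconds(interval_s, changed_bytes, compression_ratio,
                                  link_bytes_per_sec) <= target_rpo_s
```

For example, under these assumptions a 15-minute interval with 1 GB of changes, 2:1 compression, and a 10 MB/s link gives a 50-second transfer and a worst-case window of 950 seconds, which would satisfy a 20-minute RPO but not a 15-minute one.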
Failover & Failback
To recover workloads, you simply clone the replicated copy into vCenter at the remote site. The clone immediately inherits the original snapshot/replication schedule, which lets you fail back when the original site comes back online. This provides a Recovery Time Objective (RTO) on the order of seconds for your critical workloads. If the workload already exists in vCenter, we simply update the storage configuration to reflect the latest replicated snapshot. If you want to run on an older snapshot, you can do that as well.
Finally, while a good disaster recovery plan is important, testing replicated data isn’t always easy. Replicated snapshots are immutable, so a simple clone of a snapshot can be used for DR testing. The clone can safely be discarded once DR testing is complete.
- Asynchronous, snapshot-based – provides fast recovery
- Active/Active sites – delivers efficiency
- Granularity at the virtual machine level – provides control
- SSL data transport – ensures security of your data
- Replicate only changed data – bandwidth efficient
- Compression – reduces bandwidth usage
For more information on Coho SiteProtect, click here!