
Why I’m Excited About The Coho DataStream 2.5 Release!


A lot of engineering work has gone into the Coho v2.5 software release. Add to that the fact that we now have 3 distinct hardware offerings, and we've got a pretty extensive portfolio. I've been involved with the testing on this release since the Alpha days, and I can honestly say it's our best release yet. I could tell from the beginning, based on my initial testing, that the code was much more robust than in some of the releases from 6 months to a year ago.

Here are the top 3 reasons why I’m most excited about this release:

#1 – Flashfit Analytics (Hit Ratio Curve)


We showed a technical preview of this at VMworld 2014 as well as Storage Field Day 6, and I think it's a really unique differentiator in the market right now. Our analytics are extremely detailed and can pinpoint, on a per-workload basis, the exact amount of flash that will benefit each workload. We can see so much detail about flash usage that we could make an educated guess about the application running in the workload. A bit more work is required before we do this, but the fact that we can says a lot about the level of detail captured here. The idea with Flashfit is that we give a customer the data to choose whether they have a sufficient amount of flash for their current working set, need to add more capacity (hybrid node), or need to add more performance (all-flash node). This will work its way into QoS and storage profiles as we move forward with development of the feature. When you combine this with the ability to choose an all-flash or hybrid node, we give the customer unparalleled economics and TCO/ROI.
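Coho hasn't published Flashfit's internals, but the classic way to build a hit-ratio curve like this from a block trace is Mattson-style stack-distance analysis. Here's a minimal (and deliberately naive) Python sketch of the idea:

```python
# Hypothetical sketch, not Coho's implementation: the Mattson stack-distance
# technique derives an LRU hit-ratio curve from a single pass over a trace.

def hit_ratio_curve(trace):
    """Given a sequence of block addresses, return a dict mapping
    cache size (in blocks) -> hit ratio under LRU."""
    stack = []       # LRU stack: most recently used block at the end
    distances = []   # stack distance of each reference
    for block in trace:
        if block in stack:
            # Depth from the top of the LRU stack = stack distance.
            distances.append(len(stack) - stack.index(block))
            stack.remove(block)
        else:
            distances.append(float("inf"))  # cold miss
        stack.append(block)
    total = len(trace)
    return {
        size: sum(1 for d in distances if d <= size) / total
        for size in range(1, len(stack) + 1)
    }

# A workload that cycles through 3 hot blocks plus one cold read:
trace = [1, 2, 3, 1, 2, 3, 1, 2, 3, 9]
curve = hit_ratio_curve(trace)  # curve[3] == 0.6: 3 cached blocks suffice
```

The key property the curve exposes is the "knee": past a certain cache size, adding more flash buys almost no additional hits, which is exactly the sizing signal described above.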

#2 – Data Age


The Data Age view is something we also previewed an early version of a while back. It's a bit more abstract, but interesting in that we are able to show a cluster-wide view of how old the data is. You'll find that this graph gives more supporting evidence around the flash working set on the system and shows that, in all but the busiest of customer environments, the amount of flash that's accessed frequently is a mere fraction of the total flash on the system. In other words, we give you real-time supporting evidence showing that: 1) you probably don't require an all-flash array; 2) if you decide to go with an all-flash option, you're paying a lot of money for a very, very small portion of your hot data. All of the rest would be better served by higher-density media.
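To illustrate the concept (the bucket boundaries and names below are my own, not Coho's), a data-age view boils down to bucketing blocks by time since last access:

```python
# Illustrative sketch with assumed bucket boundaries: group blocks by how
# long ago they were last accessed to approximate a cluster-wide Data Age view.
import time
from collections import Counter

def data_age_histogram(last_access, now=None, buckets=(60, 3600, 86400)):
    """last_access: dict of block -> last-access timestamp (seconds).
    Returns a Counter mapping an age-bucket label to a block count."""
    now = now or time.time()
    labels = ["<1m", "<1h", "<1d", ">=1d"]
    hist = Counter()
    for ts in last_access.values():
        age = now - ts
        for limit, label in zip(buckets, labels):
            if age < limit:
                hist[label] += 1
                break
        else:
            hist[labels[-1]] += 1  # older than the largest bucket
    return hist

now = 1_000_000
blocks = {"a": now - 5, "b": now - 500, "c": now - 50_000, "d": now - 200_000}
hist = data_age_histogram(blocks, now=now)
```

A histogram heavily skewed toward the oldest bucket is precisely the evidence described above: most data is cold, and only a small slice justifies the fastest tier.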

#3 – Scalability Improvements

When I first started at Coho, approaching a year-and-a-half ago now, we admittedly had some challenges around scalability. This new release introduces an improved global namespace that allows for orders of magnitude more objects in the cluster and thus many, many more VMs (workloads). I’m happy to have been a small part of reporting my findings and getting this prioritized and fixed. I can honestly say that we are truly living up to the promise of a scale-out storage system.

Well, that’s it for now. I’m curious which features of DataStream OS v2.5 you’re most excited about. Let me know in the comments!



“Pets” vs. “Cattle”… In the Context of Storage?


By lamoney (CC BY-SA 2.0), via Wikimedia Commons

I’ve been thinking a little bit (more than usual) lately about the crossroads we are at in the IT industry today. I’ve been reflecting back to some early posts that I shared way back when virtualization was the tech de rigueur. Not only that but the fact that my current company Coho Data is at the nexus of this crossroads, if you will. Since we talk “web-scale”, “scale-out” and a multitude of other buzzwords in today’s IT world, it’s interesting to explore some of those in the context of cloud and distributed systems that form the new reality of enterprise IT computing.

When dealing with cloud computing proximally or otherwise, it’s likely that you fall within either the VMware camp or the OpenStack camp (or both) today. Some would say these solutions are at opposite ends of the cloud software spectrum. You may also have heard the term: “Pets vs. Cattle” in reference to your servers, i.e. a Pet has a name, requires constant patching, updating and altogether expensive maintenance… whereas Cattle are nameless, can be removed from the system and replaced with new gear and be online again doing their job without skipping a beat.

Well, what if it were possible to have a zoo and a farm all in one? And what about for storage?!

Normally when you think of storage, its persistent nature requires it to be a Pet and not Cattle, but with today’s more modern storage architectures, I’d like to propose that this isn’t necessarily the case. You can have both persistence of data and statelessness of the underlying components at the same time. Bear with me for a minute while I reason through this…

With a scale-out, shared-nothing node architecture, you have the ability to add and remove nodes on the fly without worrying about the health of your data. As you scale to a larger number of nodes, you care about each node even less. Despite the fact that you have a greater quantity of data in the system, the importance of any one individual storage node is reduced. Add to that the fact that a well-built self-healing, auto-scaling system can heal itself faster when there are more “cattle” on the farm.
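A back-of-envelope calculation shows why healing speeds up with more cattle on the farm. Assuming replicas are spread evenly (the capacity and bandwidth figures below are illustrative assumptions, not Coho specs), every surviving node contributes rebuild bandwidth in parallel:

```python
# Back-of-envelope sketch with assumed numbers: in a shared-nothing cluster
# where replicas are spread across all nodes, every surviving node can
# contribute bandwidth to re-protecting a failed node's data in parallel.

def rebuild_hours(node_capacity_tb, nodes, per_node_gbps=2.0):
    """Time to re-replicate one failed node's data, assuming the work is
    shared evenly across the remaining nodes."""
    data_bits = node_capacity_tb * 8e12              # TB -> bits
    aggregate_bps = (nodes - 1) * per_node_gbps * 1e9
    return data_bits / aggregate_bps / 3600

small = rebuild_hours(20, nodes=4)    # 3 survivors share the rebuild
large = rebuild_hours(20, nodes=16)   # 15 survivors share the rebuild
```

With these assumptions, the 16-node cluster re-protects the same 20 TB five times faster than the 4-node cluster, which is the intuition behind "more cattle heal faster."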

As a function of this architecture you can also remove nodes in much the same way, allowing you to return leased equipment or install newer, denser, more performant nodes into the system, with everything working in a heterogeneous fashion and without skipping a beat. This is great from a TCO perspective as well. It’s much better than being locked in with a fixed amount of high-performance flash and capacity spinning disk for the next 3-5 year spending cycle. Extend this one step further and you can imagine automatically ordering new hardware to expand the system, adding it, then shipping the old gear back to the leasing company in a regular, predictable fashion.

One element of cloud-scale systems that allows this to happen is extensibility: being able to easily extend a system beyond its original reason for being. Typically this is enabled via APIs, and all of today’s next-generation storage systems are built from the ground up to support this type of integration. Being able to organically adjust to customers’ needs quickly by offering APIs, toolkits and frameworks is a key ingredient in delivering web-scale!

The interesting part of this whole discussion is that despite the importance of persistence in the storage world, given the right architecture we CAN indeed have the best of both worlds. Look at Coho’s scale-out enterprise storage architecture and you can see that we very much have a combination of the elements of both Pets and Cattle. We support the best from either approach, as any modern storage system should.

Here are some examples:

  • Pets like NIC bonding for high availability – we’re cool with that
  • Pets like to be managed carefully and thoughtfully – we build intelligence into our storage, but also give visibility to the admin
  • Cattle can be auto-scaled by just plugging in a new node and allowing the system to grow – we do this as well
  • Cattle are designed to accommodate failures – we build our failure domains across physical boundaries so that there is no single point of failure
  • Pets like to have constant uptime – refer to previous feature of cattle above; accommodating failure means the system stays online if a component fails
  • Pets like to have high availability – we do this as well, allocating a minimum of 2 physical nodes in a single chassis with a shared-nothing hardware design
  • Cattle work only when there is a shared-nothing architecture – utilizing independent nodes with object-based storage allows us to provide this as well!

At Coho we see the need for these differing approaches to computing from a storage perspective. We started out providing storage for VMware workloads, and customers seem to like how we’re delivering on that so far. In addition, we see the need to support OpenStack from a storage perspective as well and are currently offering a tech preview of our OpenStack support. As a matter of fact, if you’re interested in becoming a beta participant for OpenStack, you should definitely get in contact with us.

Thanks for reading!



Horizon View 6 – Reference Architecture

Since I joined Coho (back in March of last year; time flies), I’ve been hard at work delivering on our technical marketing solutions collateral, specifically with regard to VMware integrations, among many other duties. (Coho is a start-up, first and foremost.) My first order of business was to tackle a Reference Architecture for VMware Horizon View 6 on the Coho DataStream platform. This was not without its challenges, but through much blood, sweat and tears from Engineering and me, we finally have something of value!

Fast forward to now and we have another code release (our 2.4 release) under our belt, and we’re ready to share our findings and performance numbers for VMware VDI solutions on top of the Coho Data storage solution. I can’t wait to share all the advantages of Coho’s combination of speedy PCIe flash alongside our scale-out architecture, all the while leveraging some cool SDN hotness!

This is only the first in a long line of deep technical collateral coming rapidly down the pike to help our field, and especially our customers, truly leverage what “web scale” really means. Looking for more? Stay tuned!



Coho DataStream 2000f All-Flash Storage Offering


It’s been a while since my last post, but I couldn’t pass up this chance to talk about the latest news here at Coho. The announcements we made today have been in the works for a while, and I am proud to have been a small part of bringing the all-flash offering to market, having tested it extensively for the last several months. The series-C funding round is nice, too, of course!

Our all-flash offering enters a crowded market, to be sure, but instead of becoming just another all-flash storage vendor, we’re approaching all-flash a bit differently here at Coho. We believe that “all-flash” vs. “hybrid” shouldn’t be a decision the customer has to make on their own. With intelligence about your workloads and how much flash they really need to function at peak performance, we can allow both all-flash and hybrid nodes to co-exist within a single cluster and single namespace, for a balance of unmatched performance and economics.

Our Cascade tiering technology allows each AFA chassis to balance two different types of flash: NVMe flash cards for the upper tier, and up to twenty-four 2.5” SSDs for the lower tier. All told, you can fit up to 50 TB of usable capacity in each 2U chassis (and that’s before accounting for space efficiency from compression, etc.).

Here’s a statement from the blog post our Technical Product Manager, Forbes Guthrie, wrote about the release; it captures a key point I use when talking to others about Coho’s value:

“Coho realized early on, that when you’re building a storage system with outrageously fast flash devices, and you have hungry servers waiting with insatiable appetites; don’t set out to funnel that I/O through an obvious choke point. Storage systems that come with a pair of controllers, with no way to grow alongside your expanding storage needs, are a short-sighted design choice. With AFA storage, this is (obviously) way more critical.

All Coho DataStream systems are comprised of shared-nothing “MicroArrays”. Each 2U disk chassis contains 2 independent storage nodes; each have their own pair of 10 GbE NICs, their own CPUs and memory. As you add our disk chassis to a cluster, any type of Coho chassis, you’re adding controller power and I/O aperture.”

Well, I think that statement speaks for itself. I am really excited about what the future holds here at Coho. Rounding out our catalog with the all-flash 2000f and the entry-level hybrid 800h, alongside our 1000h, puts us in a very good place in the market right now. Add to that the fact that we use data intelligently to predict how much flash the customer needs for their workloads, along with the ability to place it intelligently within a single namespace across the different types of storage, and we have the efficiency to compete very well. To the competition I say: “Bring it on!”




Implementing Site-to-Site Replication with Coho SiteProtect

Now that I’ve given you a quick overview of the architecture of Coho SiteProtect, I’d like to provide you with the basics for implementing SiteProtect in your data center. This is the second in my series of posts on our site-to-site replication offering. As I discover the best practices for deploying SiteProtect in various infrastructures and scenarios, I’ll document those here as well, so stay tuned for those…

Without further ado, here is the step-by-step set-up procedure for SiteProtect…

Pairing the Sites

The first step in setting up remote replication is establishing a trusted relationship from the local site to the remote site. This is done from the Settings > Replication page in the Coho web UI, indicated by the gear (settings) icon (Figure 1).


Figure 1: Settings > Replication page

From here, click the “Begin replication setup” link which brings you to the configuration screen for the local site (Figure 2).


Figure 2: Settings > Replication > Local Site page

Here, you’ll specify the network settings for the site-to-site communication. It is worth noting that the replication traffic is sent on a VLAN to simplify network management for enterprise environments.

Here you can also configure bandwidth throttling for outbound traffic in case you need to limit the usage of the site-to-site interconnect. The same can be done on the remote site, which means that both incoming and outgoing throughput can be controlled. Bear in mind that by limiting the traffic, you may increase the time it takes for a workload to finish replicating; in other words, you may increase the RPO.
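As a rough planning aid (the figures below are illustrative assumptions, not Coho guidance), whether a throttled link can keep up comes down to change rate versus available bandwidth:

```python
# Rough planning sketch with assumed figures: a throttled interconnect meets
# a replication interval only if the changed data ships before the next
# snapshot is due.

def replication_minutes(changed_gb, link_mbps):
    """Time to transfer one snapshot's changed data over the interconnect."""
    return changed_gb * 8000 / link_mbps / 60   # GB -> megabits, then minutes

def meets_rpo(changed_gb, link_mbps, interval_min):
    """The transfer must finish within the snapshot interval."""
    return replication_minutes(changed_gb, link_mbps) <= interval_min

# 5 GB of change per 15-minute snapshot over a 100 Mbps throttled link:
t = replication_minutes(5, 100)   # roughly 6.7 minutes
ok = meets_rpo(5, 100, 15)        # the link keeps up with headroom to spare
```

If the math comes out the other way, you either raise the bandwidth cap, lengthen the snapshot interval, or accept a larger RPO.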

Once that’s complete, you’ll click “Next” and specify the IP and password of the remote DataStream. Click “Next” again to proceed (Figure 3).


Figure 3: Settings > Replication > Remote Credentials page

Once the wizard confirms a connection to the other side, you’ll specify the remote system’s VLAN, replication IP address, and netmask, as well as the default gateway for the other side and click “Next” (Figure 4).

Note: On this page the bandwidth limit relates to outbound traffic from the remote site; or put another way, the inbound replication traffic arriving at the local site.


Figure 4: Settings > Replication > Remote Network page

Finally, you’re brought to step 4, which is the “Summary” page and allows you to review the configuration before applying the settings. Click “Apply and Connect” to complete the wizard (Figure 5).


Figure 5: Settings > Replication > Summary page

From this point forward, you’ll be presented with the following view when you go to the Settings > Replication page. You can see here (Figure 6), the IP of the remote node and that replication is active.


Figure 6: Settings > Replication page (completed)

Configuring Workloads and Schedules

Now that the initial pairing is complete, you’ll visit the “Snapshots and Replication” page to customize which workloads are replicated as well as the snapshot & replication interval for each (Figure 7).


Figure 7: Snapshots / Replication > Overview page

Here (Figure 7), we provide an overview of the workloads. This is a dashboard which tells us the number of VMs with snapshots as well as replicated snapshots. For all of a site’s workloads to be protected, they should all have replicated snapshots, ensuring that any of those workloads can be recovered on the remote site in the event of a disaster.

We also provide a summary of the workloads covered by replication, how many bytes have been transferred, and the average replication time. These statistics provide assurance that replication is functional and also show the rate of change of the data, allowing you to determine whether your replication interval is appropriate for the bandwidth you have available. If your average replication time is greater than your snapshot interval, you can adjust the schedule accordingly.

To configure or modify workloads, proceed to the “Workloads” page (Figure 8).


Figure 8: Snapshots and Replication > Workloads page

Here (Figure 8), we denote the local vs. the remote workloads, provide a record of when the last snapshot was taken, and display the assigned schedule.

Note: VMs which have been deleted are denoted with a strike through the name.

Under “Snapshot Record”, you can click on the calendar icon to view each snapshot’s date, name and description, as well as the status of replication. In this example, we have recently enabled the workload for replication, denoted by the word “Scheduled” (Figure 9).


Figure 9: Snapshots and Replication > Workloads > Snapshot Record page

To manually protect a specific workload, click the camera icon next to that workload. This will allow you to take a manual snapshot and replicate that snapshot (Figure 10).


Figure 10: Snapshots and Replication > Workloads > Snapshot page

Most users will want to protect a number of VMs at once. The best way to do this is from the “Default Schedule” page (Figure 11).


Figure 11: Snapshots and Replication > Default Schedule page

In this example we have selected an RPO of 15 minutes by replicating the snapshot every 15 minutes. The frequency of snapshots is best determined by the needs of the application, and Coho’s automated snapshot schedule offers flexibility, from minutes to months.

Note: Quiescing snapshots puts the system in a state that maintains application consistency before taking the snapshot; however, this is only available in the daily and weekly schedules. Taking quiesced snapshots more frequently may incur significant performance penalties. These penalties are not related to the Coho storage but to how snapshots are executed within the VMware environment. A crash-consistent snapshot (no quiesce) can be taken very frequently on Coho storage without a performance penalty.


In the event of a disaster you’ll want to be able to bring up your applications in the remote site. This is done from the “Failover/Failback” view (Figure 12).


Figure 12: Snapshots and Replication > Failover/Failback page

Initially, failover and failback are disabled in order to protect you from instantiating multiple copies of the same VM. You make the decision (from either location) to put the disaster recovery plan in motion. If you’re ready to proceed, click the “Enable” button to enable failover (Figure 13).


Figure 13: Snapshots and Replication > Failover/Failback page (enabled)

You can now go to the remote DataStream and clone your replicated workloads to the remote system. Open up the web UI of the remote DataStream and, again, go to the Snapshots and Replication > Workloads page (Figure 14).


Figure 14: Snapshots and Replication > Workloads page (remote)

Click the “Remote Workloads” checkbox to filter by those workloads. These are the workloads available for failover from the primary to the disaster site. Choose the workload by clicking the calendar icon. Browse the recent snapshots and choose one to clone from, by clicking the clone icon (Figure 15).


Figure 15: Snapshots and Replication > Workloads page (failover)

Once you’ve selected the desired snapshot, enter a VM name and choose a target vSphere host. Click “Clone” to clone it and recover it to the destination site. The workload is now failed-over to continue serving data to your users. Just power it on in vCenter and you’re ready to go.


If at some point the primary site comes back online, we support failing workloads back to their original location. This is done from the Snapshots and Replication page. On the workload that you’d like to fail back (Figure 16), click the calendar icon to view the available snapshots, then click the red arrow to sync the snapshot to the original VM. Once the VM is powered on, your app will be back in the original location with all of the changed data from snapshots replicated from the remote site since the failure occurred; simple and easy, just like it should be.


Figure 16: Snapshots and Replication > Workloads page (failback)

Well, that’s it for the initial implementation. As you can see, Coho SiteProtect is easy to get set up and configured in any environment. Next, we’ll dive into some best practices for configuring SiteProtect for optimal performance in environments of various sizes and requirements.

Until then, if you’d like more info about Coho SiteProtect, click here!


