Web-scale Economics… and Innovation?!


What is Web-scale?

Many of us in the trenches of enterprise IT have probably heard the term “Web-scale” thrown around. Like many IT terms, it is equal parts marketing and technical, so it’s not well defined and, as a result, open to interpretation. My take on Web-scale is that it’s, first and foremost, a way to architect IT systems for the enterprise that incorporates the best elements of public clouds. While it is very hard to mimic the architectural scale and resiliency of the public and private clouds from AWS, Google, Facebook and others, one can easily see the benefits of distributed, shared-nothing architectures, API-driven automation and orchestration, self-healing application stacks… and in Coho’s case, closer integration of the network with the storage.

The Coho approach to Web-scale has some unique elements that separate us from the other vendors that purport to do it. Hyperconverged vendors are for the most part confined to growing all datacenter resources simultaneously. Scaling all datacenter resources at the same time doesn’t necessarily make sense, unless your environment has very uniform workloads. My guess is that if you are a typical small/medium business or enterprise, your compute, network and storage requirements don’t scale at an identical rate, so either performance gets left on the table or you end up licensing software that you don’t need in order to grow your footprint. With Coho, we allow the customer to scale the compute independently of the network and storage. As you add building blocks to a Coho scale-out cluster, you add 40Gbps (or more) of network bandwidth along with multiple TBs of PCIe NVMe flash. This is a hard requirement if you expect the cluster to exhibit linear performance scaling as you add capacity. Adding flash without adequate network bandwidth to push the bits over the wire is a battle lost before it even begins!
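To make the bottleneck argument concrete, here is a minimal back-of-the-envelope sketch. It assumes a deliberately simple model in which each node can deliver no more than the slower of its flash and its network link; the function name and the flash figure are hypothetical and purely illustrative, not measured Coho numbers.

```python
def cluster_throughput_gbps(node_count, flash_gbps_per_node, network_gbps_per_node):
    """Estimate aggregate cluster throughput under a simple bottleneck model.

    Assumption: each node delivers min(flash, network) and nodes scale
    independently (shared-nothing), so the total grows linearly with node count.
    """
    per_node = min(flash_gbps_per_node, network_gbps_per_node)
    return node_count * per_node


# Hypothetical flash throughput of 30 Gbps per node, for illustration only.
for nodes in (2, 4, 8):
    with_network = cluster_throughput_gbps(nodes, 30, 40)  # 40Gbps added per node
    starved = cluster_throughput_gbps(nodes, 30, 10)       # flash added, network not
    print(nodes, with_network, starved)
```

In this model the cluster still scales linearly when the network is undersized, but at the network's rate rather than the flash's, which is the performance left on the table.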

This brings us to the economics part of the discussion as it relates to Web-scale…

Converged (non-hyperconverged) systems that incorporate increased network capacity along with the storage, such as Coho, give customers the ability to incorporate the best elements of public clouds with the security and performance that can only be achieved with on-premises infrastructure. This simple fact has afforded us an opportunity to talk to customers in the same $/GB/mo terms that they are likely to see quoted by Amazon and others. The shift toward OPEX pricing is already top of mind for a great many CIOs, so it serves as a convenient reference point for us when we talk with customers. Even with operational costs figured into the economics, we often talk about prices that are 1/2 to 1/3 the cost of AWS. Now let’s put a qualifier here… we’re not talking about Amazon Glacier or the cheapest of the cheap that Amazon offers, but rather the AWS EFS (Elastic File System) service, which is advertised at around $0.30/GB/mo. And we do it all while preserving the jobs of the internal IT teams, keeping corporate IP (intellectual property) secure and providing better performance! Don’t even get me started on the costs associated with getting data into/out of AWS once it’s in their cloud. Ever heard of data gravity?
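Here is a quick sketch of what that $/GB/mo comparison looks like as arithmetic. The $0.30/GB/mo reference price and the 1/2 to 1/3 range come from the paragraph above; the 100 TB capacity and the function names are hypothetical, and data transfer charges are left out entirely.

```python
# Reference price quoted above: AWS EFS advertised at around $0.30/GB/mo.
EFS_PRICE_PER_GB_MONTH = 0.30


def monthly_cost(capacity_gb, price_per_gb_month):
    """Monthly storage bill for a given capacity and $/GB/mo price."""
    return capacity_gb * price_per_gb_month


def discounted_price_range(reference_price, low_fraction=1 / 3, high_fraction=1 / 2):
    """Return the (low, high) $/GB/mo range for '1/2 to 1/3 the cost'."""
    return reference_price * low_fraction, reference_price * high_fraction


capacity_gb = 100_000  # hypothetical 100 TB footprint, for illustration only
low, high = discounted_price_range(EFS_PRICE_PER_GB_MONTH)

print(f"Reference: ${monthly_cost(capacity_gb, EFS_PRICE_PER_GB_MONTH):,.0f}/mo")
print(f"At 1/3 to 1/2 of that: ${monthly_cost(capacity_gb, low):,.0f} "
      f"to ${monthly_cost(capacity_gb, high):,.0f}/mo")
```

Plug in your own capacity to see how the gap grows with footprint; the ratio stays the same, but the absolute dollars add up quickly.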

But wait, there’s more…

Since Coho is innovating by creating unique storage services directly on the array, leveraging Docker, Kubernetes, VXLAN and other cutting-edge technologies, we are able to offer alternatives to AWS without the need to move to the public cloud. This is the move toward “microservices” that you may have heard about. As a matter of fact, not only will Coho be demoing these technologies, in the form of on-the-fly transcoding, a search appliance and more, but our CTO, Andy Warfield, will also present a breakout session discussing this very topic. Why bother going to AWS for services that you can get as free upgrades with a paid support contract?

In my opinion, Coho is not only at the forefront of what Web-scale was intended to deliver, but is taking it to a whole new level. Look for us at VMworld (booth 1713) to find out more… we’re looking forward to talking with you!



Come Meet the vSamurai at VMworld


If you’ve been following me via this blog or elsewhere on this series of tubes, you may know that I have taken on more of a customer- and partner-facing role in the past couple of months. This has given me the opportunity to do more of what I love, which is getting people excited as well as evangelizing what Coho Data is all about. The storage marketplace today is very crowded, with more start-ups arriving on the scene on what seems like a daily basis. It’s hard for me to keep up with all that’s going on, let alone for the buyers of the technology themselves. It’s not going to get any easier until some major/sudden consolidation happens, but that’s mere speculation…

I’m looking forward to talking with members of the VMware and virtualization community, customers, partners, and anyone else who finds their way to VMworld this year. Thinking back to about a year-and-a-half ago, when I was making the decision to leave NetApp to join a company that is truly innovating in the storage market, I saw the potential that Coho has to offer. Now, I feel that a lot of the promise is being realized. This will be our 2nd VMworld and we’ll be upgrading from Silver sponsor last year to Platinum sponsor this year! We’re doubling down on VMware and I am truly excited about what we’ll be showing and talking about this year. I’ll be privileged to share the details with you as we get closer to the show…

Until then, if you’d like to set up a meeting to talk with me or one of our other experts, head on over here.

Or if you’d just like to sign up for a chance to win a free pass to attend the show, click below:




Why I’m Excited About The Coho DataStream 2.5 Release!


A lot of engineering work has gone into the Coho v2.5 software release. Add to that the fact that we now have 3 distinct hardware offerings, and we’ve got a pretty extensive portfolio. I’ve been involved with the testing on this release since the Alpha days, and I can honestly say it’s our best release yet. I could tell from my initial testing that the code was much more robust than some of the releases from 6 months to 1 year ago.

Here are the top 3 reasons why I’m most excited about this release:

#1 – Flashfit Analytics (Hit Ratio Curve)


We showed a technical preview of this at VMworld 2014 as well as Storage Field Day 6 and I think it’s a really unique differentiator in the market right now. Our analytics are extremely detailed and can pinpoint the exact amount of flash that will benefit workloads on a per-workload basis. We are able to see so much detail about the flash usage that we could make an educated guess about the application running in the workload. A bit more work is required before we can do this, but the fact that it’s possible says a lot about the level of detail captured here. The idea with Flashfit is that we give a customer the data to decide whether they have a sufficient amount of flash for their current working set, need to add more capacity (hybrid node) or need to add more performance (all-flash node). This will work its way into QoS and storage profiles as we move forward with development of the feature. When you combine this with the ability to choose an all-flash or hybrid node, we give the customer unparalleled economics and TCO/ROI.
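For readers who haven't run into a hit ratio curve before, here is a minimal sketch of how one can be computed from a trace of block accesses. To be clear, this is not how Flashfit is implemented; it is a textbook-style illustration that assumes an LRU cache and uses a naive stack-distance calculation, and the trace and function name are hypothetical.

```python
def lru_hit_ratio_curve(trace, cache_sizes):
    """Return {cache_size: hit_ratio} for an LRU cache over a block trace.

    Uses stack (reuse) distances: an access hits in an LRU cache of size C
    exactly when its stack distance is <= C. This naive version is
    O(len(trace) * distinct blocks), fine for a demo but not for production.
    """
    stack = []      # LRU stack, most recently used block at the end
    distances = []  # stack distance per access; None marks a first-time (cold) miss
    for block in trace:
        if block in stack:
            index = stack.index(block)
            distances.append(len(stack) - index)  # 1 means most recently used
            stack.pop(index)
        else:
            distances.append(None)
        stack.append(block)

    total = len(distances)
    if total == 0:
        return {size: 0.0 for size in cache_sizes}
    return {
        size: sum(1 for d in distances if d is not None and d <= size) / total
        for size in cache_sizes
    }


# Hypothetical toy trace of block IDs, for illustration only.
trace = ["a", "b", "a", "c", "b", "a"]
for size, ratio in lru_hit_ratio_curve(trace, [1, 2, 3]).items():
    print(f"cache size {size}: hit ratio {ratio:.2f}")
```

Reading the curve is the interesting part: where it flattens out, adding more flash stops buying you hits, which is the kind of signal that tells you whether you need more capacity or more performance.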

#2 – Data Age


The Data Age view is something that we also previewed an early version of a while back. It’s a bit more abstract, but interesting in that we are able to show a cluster-wide view of how old the data is. You’ll find that this graph gives more supporting evidence around the flash working set on the system and shows that in all but the busiest of customer environments, the amount of flash that’s accessed frequently is a mere fraction of the total flash on the system. In other words, we give you real-time supporting evidence showing that: 1) You probably don’t require an all-flash array. 2) If you decided to go with an all-flash option, you’re paying a lot of money for a very, very small portion of hot data. All of the rest would be better served by higher-density media.

#3 – Scalability Improvements

When I first started at Coho, approaching a year-and-a-half ago now, we admittedly had some challenges around scalability. This new release introduces an improved global namespace that allows for orders of magnitude more objects in the cluster and thus many, many more VMs (workloads). I’m happy to have been a small part of reporting my findings and getting this prioritized and fixed. I can honestly say that we are truly living up to the promise of a scale-out storage system.

Well, that’s it for now. I’m curious which other features of DataStream OS v2.5 you’re most excited about! Respond in the comments.



“Pets” vs. “Cattle”… In the Context of Storage?


By lamoney (http://www.flickr.com/photos/lamoney/97461242/) [CC-BY-SA-2.0 (http://creativecommons.org/licenses/by-sa/2.0)], via Wikimedia Commons

I’ve been thinking a little bit (more than usual) lately about the crossroads we are at in the IT industry today. I’ve been reflecting on some early posts that I shared way back when virtualization was the tech du jour, and on the fact that my current company, Coho Data, sits right at this crossroads. Since we talk “web-scale”, “scale-out” and a multitude of other buzzwords in today’s IT world, it’s interesting to explore some of those in the context of the cloud and distributed systems that form the new reality of enterprise IT computing.

When dealing with cloud computing, directly or otherwise, it’s likely that you fall within either the VMware camp or the OpenStack camp (or both) today. Some would say these solutions are at opposite ends of the cloud software spectrum. You may also have heard the term “Pets vs. Cattle” in reference to your servers, i.e. a Pet has a name and requires constant patching, updating and altogether expensive maintenance… whereas Cattle are nameless, and can be removed from the system and replaced with new gear that is online doing the same job without skipping a beat.

Well, what if it were possible to have a zoo and a farm all in one? And what about for storage?!

Normally when you think of storage, its persistent nature requires it to be a Pet and not Cattle, but with today’s more modern storage architectures, I’d like to propose that this isn’t necessarily the case. You can have both persistence of data and statelessness of the underlying components at the same time. Bear with me for a minute while I reason through this…

With a scale-out, shared-nothing node architecture, you have the ability to add and remove nodes on the fly without worrying about the health of your data. As you scale to a larger number of nodes, you care about each node even less. Despite the fact that you have a greater quantity of data in the system, the importance of any one individual storage node is reduced. On top of that, a well-built, self-healing, auto-scaling system can heal itself faster when there are more “cattle” on the farm, since more nodes can share the recovery work.
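A rough sketch of why more “cattle” can mean faster healing is below. It assumes an idealized model in which a failed node's data is re-replicated in parallel by all surviving nodes, each contributing a fixed rebuild bandwidth; every number and name here is hypothetical, and real systems will be limited by placement, throttling and competing workload traffic.

```python
def rebuild_time_seconds(data_per_node_gb, node_count, rebuild_gbps_per_node):
    """Idealized time to re-protect one failed node's data.

    Assumption: the surviving (node_count - 1) nodes share the rebuild evenly,
    each contributing rebuild_gbps_per_node (gigabits per second).
    """
    if node_count < 2:
        raise ValueError("need at least one surviving node to rebuild")
    survivors = node_count - 1
    aggregate_gb_per_second = survivors * rebuild_gbps_per_node / 8  # Gbit/s to GB/s
    return data_per_node_gb / aggregate_gb_per_second


# Hypothetical figures: 10 TB per node, 8 Gbps of rebuild bandwidth per node.
for nodes in (2, 6, 11):
    minutes = rebuild_time_seconds(10_000, nodes, 8) / 60
    print(f"{nodes} nodes: about {minutes:.0f} minutes to heal")
```

Under these assumptions, the amount of data to re-protect stays fixed while the helpers multiply, so each node you add shrinks the window in which the cluster is running with reduced redundancy.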

As a function of this architecture, you can also remove nodes in much the same way, allowing you to return leased equipment or install newer, denser and more performant nodes into the system, with everything working in a heterogeneous fashion and without skipping a beat. This is great from a TCO perspective as well. It’s much better than being locked in with a fixed amount of high-performance flash and capacity spinning disk for the next 3-5 year spending cycle. Extend this one step further and you can imagine automatically ordering new hardware to expand the system, adding it, then shipping the old gear back to the leasing company in a regular, predictable fashion.

One element of cloud-scale systems that allows this to happen is extensibility: being able to easily extend a system beyond its original reason for being. Typically this is enabled via APIs, and all of today’s next-generation storage systems are built from the ground up to support this type of integration. Being able to organically adjust to customers’ needs quickly by offering APIs, toolkits and frameworks is a key ingredient in delivering web-scale!

The interesting part of this whole discussion is that despite the importance of persistence in the storage world, given the right architecture we CAN indeed have the best of both worlds. Look at Coho’s scale-out enterprise storage architecture and you can see that we very much have a combination of the elements of both Pets and Cattle. We support the best of both approaches, just as any modern storage system should.

Here are some examples:

  • Pets like NIC bonding for high availability – we’re cool with that
  • Pets like to be managed carefully and thoughtfully – we build intelligence into our storage, but also give visibility to the admin
  • Cattle can be auto-scaled by just plugging in a new node and allowing the system to grow – we do this as well
  • Cattle are designed to accommodate failures – we build our failure domains across physical boundaries so that there is no single point of failure
  • Pets like to have constant uptime – refer to previous feature of cattle above; accommodating failure means the system stays online if a component fails
  • Pets like to have high availability – we do this as well, allocating a minimum of 2 physical nodes in a single, shared-nothing hardware design
  • Cattle work only when there is a shared-nothing architecture – utilizing independent nodes with object-based storage allows us to provide this as well!

At Coho we see the need for these differing approaches to computing from a storage perspective. We started out providing storage for VMware workloads and customers seem to like how we’re delivering on that so far. In addition, we see the need to support OpenStack from a storage perspective as well and are currently offering a tech preview of our OpenStack support. As a matter of fact, if you’re interested in becoming a beta participant for OpenStack, you should definitely get in contact with us.

Thanks for reading!



Horizon View 6 – Reference Architecture

Since I joined Coho (back in March of last year; time flies), I’ve been hard at work delivering our technical marketing solutions collateral, specifically with regard to VMware integrations, among many other duties. (Coho is a start-up, first and foremost.) My first order of business was to tackle a Reference Architecture of VMware Horizon View 6 on the Coho DataStream platform. This was not without its challenges, but through much blood, sweat and tears from Engineering and me, we finally have something of value!

Fast forward to now and we have another code release (our 2.4 release) under our belt, and we’re ready to share our findings and performance numbers for VMware VDI solutions on top of the Coho Data storage solution. I can’t wait to share all the advantages of Coho’s combination of speedy PCIe flash alongside our scale-out architecture, all the while leveraging some cool SDN hotness!

This is only the first in a long line of deep technical collateral coming rapidly down the pike to help our field and especially our customers to truly leverage what “web scale” really means. Looking for more? Stay tuned!


