WWT partnered with Paul Johnson, from Equinix, to author this article.  

Since the dawn of centralized data storage, IT conversations have centered on two concerns: performance and cost. More recently, two others have consistently come up: data locality and data sovereignty. We will examine recent trends in each of these areas and the compromises in one that can negatively impact the others. We will also examine strategies that can bring architectural stability (along with control and flexibility) to your organization's long-term digital infrastructure strategy.

The last decade has brought significant innovations that have minimized storage performance bottlenecks. We have seen the introduction, and continual improvement, of all-flash storage arrays. Storage density across all media has multiplied several times over, and compute parallelism has improved storage access times. The introduction of NVMe over Fabrics (NVMe-oF) protocols has revolutionized storage access methodologies. There are far too many storage innovations to list here, but the result is clear: the past decade has vastly improved the performance of applications in the data center.

Each new technology has brought an initial increase in cost, but we have seen dramatic improvements in per-unit costs (per GB of storage, per MB/s of throughput, etc.) over time. Unfortunately, this improvement in per-unit costs has been offset by ever-larger volumes of data to be stored. Most organizations have come to realize the value of the information that can be obtained from data (as demonstrated by companies like Facebook, whose value comes far more from data than from physical assets). Even so, the sheer volume of data being stored is still enough to motivate organizations to drive down the cost of data storage.

The main trade-offs

The public cloud is generally the first path organizations explore to save costs on storage. The scale of public clouds is very good at driving down per-unit costs, but this is done by leveraging over-subscription in an environment hosting thousands of different customers. Side effects of this over-subscription, when compared to onsite storage, can be inconsistent performance or performance that simply does not meet application requirements (trade-off number one). 

When examining the costs of storage in the public cloud, there are many other items to keep in mind. Public cloud was initially designed to provide on-demand compute resources that could be turned up for short periods of high demand and then turned off (i.e., rent the peak). Storage differs from compute in that demand for storage only goes in one direction: up. There is no peak to rent, just an ever-increasing monthly bill from the cloud provider (trade-off number two).

Furthermore, as previously mentioned, your data is incredibly valuable. The cloud providers clearly know this, as evidenced by their practice of letting you move data into the cloud for free, then charging on a per-GB basis to move that data out. For every GB you move into the cloud, you become more locked into that cloud provider (trade-off number three).
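To make trade-offs two and three concrete, here is a minimal sketch of the arithmetic in Python. Every number in it (starting capacity, growth rate, per-GB storage price and per-GB egress price) is a hypothetical placeholder chosen for easy math, not a quote from any cloud provider; substitute your own figures.

```python
def stored_gb_by_month(start_gb, monthly_growth, months):
    """Return the amount of data held in each month as storage keeps growing."""
    volumes = []
    stored = start_gb
    for _ in range(months):
        volumes.append(stored)
        stored *= 1 + monthly_growth  # storage demand only goes one direction: up
    return volumes


def egress_cost(gb_moved_out, egress_price_per_gb):
    """Return the one-time charge for moving data out of the cloud."""
    return gb_moved_out * egress_price_per_gb


# Hypothetical inputs, for illustration only.
START_GB = 100_000
MONTHLY_GROWTH = 0.03
PRICE_PER_GB_MONTH = 0.02
EGRESS_PRICE_PER_GB = 0.05

volumes = stored_gb_by_month(START_GB, MONTHLY_GROWTH, 36)
bills = [gb * PRICE_PER_GB_MONTH for gb in volumes]
exit_fee = egress_cost(volumes[-1], EGRESS_PRICE_PER_GB)

print(f"Month 1 bill:  ${bills[0]:,.2f}")
print(f"Month 36 bill: ${bills[-1]:,.2f}")
print(f"Fee to move everything out in month 36: ${exit_fee:,.2f}")
```

Whatever prices you plug in, the shape of the result is the same: the monthly bill never comes back down, and the fee to leave grows along with the data.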

As organizations examine the trade-offs with cost and performance of putting storage in the public cloud, it is not uncommon to hear a statement similar to: "The public cloud provider is not anywhere close to my existing data center that houses my storage today. If I move the application to the cloud to take advantage of rent-the-peak economics and keep my storage where it is, the application will break." 

We are still universally subject to one bottleneck: the speed of light. We can only move data at the speed of light, and the more distance data must travel, the more latency (or delay) we introduce. Most applications "break" (and from an end-user point of view, slow is broken) once there is too much latency between the application and its storage. This is the data locality conversation.
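A rough back-of-the-envelope calculation shows why distance matters so much. The sketch below assumes light in optical fiber travels at about two-thirds of its speed in a vacuum, and it uses hypothetical distances and a hypothetical count of 10,000 I/Os that must each complete before the next one starts.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBER_FACTOR = 0.67  # assumption: light in optical fiber travels at roughly two-thirds of c


def round_trip_ms(distance_km):
    """Best-case round-trip propagation delay over fiber, in milliseconds."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


def serial_io_seconds(io_count, distance_km):
    """Time spent waiting on the wire when each I/O must finish before the next starts."""
    return io_count * round_trip_ms(distance_km) / 1000


# Hypothetical distances between an application and its storage.
for km in (1, 100, 1_000, 4_000):
    print(f"{km:>5} km: {round_trip_ms(km):6.2f} ms per round trip, "
          f"{serial_io_seconds(10_000, km):7.2f} s for 10,000 serial I/Os")
```

These figures are a lower bound. Real networks add switching, routing and protocol overhead on top of propagation delay, so actual latency will be higher.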

An answer to your questions around data locality

How does one solve data locality constraints, you might ask? WWT is excited to introduce our partner: Equinix. Equinix allows you to put your storage as close to the cloud as possible without putting it in the cloud, thus eliminating the latency that would break the application(s). Putting storage in close proximity to the public cloud also allows organizations to take advantage of all the storage performance improvements (control), get the best benefits of using the public cloud (flexibility), avoid cloud lock-in (flexibility/control) and enable multicloud options for integrating with the hyperscalers (flexibility/control).

Answering the data locality question can also have a direct impact on (and be influenced by) data sovereignty concerns. More and more government entities are requiring that data generated by, used by or pertaining to citizens of certain locales be kept in those locales. Storage controlled by your organization, within the locale of concern and as close to the public clouds as possible, is a great way to quell the concerns of an organization's legal department while getting all the benefits already mentioned.

One last thought on costs: some may be concerned that cloud-adjacent storage does not provide the pay-by-the-drip benefits of the public cloud. In the last few years, we've seen most storage manufacturers roll out Storage as a Service (STaaS) offerings that customers can combine with Equinix services to achieve an OpEx model, but at rates lower than those of the cloud providers.

If you or your organization are having (or would like to have) conversations about data storage costs, performance, locality or sovereignty, we offer several workshops and engagements that can help you get a clear picture of your options. Engage your local account team to find out more.
