Unlock the Power of Unstructured Data with Dell EMC PowerScale
If data is the new oil, then the world is seemingly awash in untold riches. A massive amount of data -- roughly 2.5 million terabytes -- is produced every day. And while a large part can be attributed to the normal data-generating habits of consumers (e.g., sending texts, conducting Google searches, posting on social media, placing online orders), an increasing amount is generated by internet of things (IoT) sensors and other unmanaged edge devices. This evolving data complexity makes it progressively harder for organizations to extract value out of the data deluge--an effort akin to drinking out of the proverbial fire hose.
Specifically, the increasing volume of unstructured data poses continuing challenges to enterprises looking to extract value from this data "oil." It is estimated that unstructured data accounts for nearly 80 percent of an organization's data footprint. This is expected to grow year-over-year and is increasingly spread out across data centers and clouds, creating even more complexity for organizations.
The management and processing of unstructured data pose significant challenges to data-driven enterprises. Compared to structured data with its easily searchable, pre-defined data structures (e.g., tables, columns and rows), unstructured data is--as its name implies--without a specific data model, making it significantly harder to query, process and extract meaning from. Some examples include photos, videos, audio, emails and Microsoft Office documents. By some estimates, unstructured data accounts for as much as 90 percent of all data and continues to grow at a rate of 55 to 65 percent each year, driven in part by the rise of IoT.
Fortunately, innovations in infrastructure technology have kept up with these rapidly evolving enterprise data needs. Whereas analyzing structured data can be as simple as running a report within a relational database management system (RDBMS), unstructured big data requires artificial intelligence (AI) tools to extract deep, meaningful insights, coupled with a high-performance storage solution to meet availability needs. Network-attached storage (NAS) is well suited to these enterprise storage use cases.
NAS devices provide file-level data access to a heterogeneous group of clients across the network. Unstructured data is a natural fit for NAS because it typically arrives in file-based formats; NAS devices can read and write files at high speed and at scale, without the need for additional server hardware. This makes them ideal for storing unstructured big data with high-performance computing requirements, as they enable both collaboration among users and high availability (HA) for mission-critical data applications.
NAS devices also provide dedicated, centralized storage for unstructured data--this makes for easier management, no matter how large or complex the data environment becomes in the future. Organizations can seamlessly scale out capacity and performance on demand to eliminate bottlenecks and improve overall storage performance.
Formerly known as the Dell Technologies Isilon platform, PowerScale is the industry's leading line of scale-out NAS platforms for high-volume storage, backup and archiving of unstructured data. The storage solution suite offers several deployment options:
- All-flash nodes: for supporting accelerated file workloads with battle-tested performance and efficiency
- Hybrid nodes: for handling a myriad of large-scale data workload types while reducing costs
- Archive nodes: for efficiently supporting both active and cold archives
- Cloud support: for supporting cloud-based data workloads (e.g., AWS, Microsoft Azure, Google Cloud, Oracle)
From on-premises to the cloud, PowerScale supports the entire range of enterprise data workloads via a seamless scale-out NAS platform.
Aside from the name change, the rebranded PowerScale offers numerous substantial improvements over its predecessor. Isilon initially supported all-flash, hybrid and archive scale-out NAS node deployment models; along with these options, Dell PowerScale adds cloud-based object support (e.g., AWS S3) for unstructured data workloads beyond the corporate data center. With this unified file and object storage in PowerScale, Dell has effectively geared its Isilon solution to perform on PowerEdge server hardware.
It is worth noting that PowerScale offers some novel benefits while enhancing the merits of its OneFS parallel distributed networked file system. Introduced with Isilon, OneFS continues to serve as the basis for Dell PowerScale's single-file-system, single-volume architecture.
Along with OneFS, some notable PowerScale benefits include:
PowerScale makes it easy to manage various data storage resources under one namespace. Organizations can seamlessly scale out with Dell PowerScale by adding nodes -- up to 252 nodes per system -- in a matter of minutes without downtime or migration. Once a new node is added, PowerScale automatically performs rebalancing and de-duplication to reach utilization levels of up to 80 percent.
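As a rough illustration of the scale-out arithmetic above, the following Python sketch estimates usable capacity for a cluster of identical nodes. Only the 252-node limit and the "up to 80 percent" utilization figure come from the text; the function name and the per-node raw capacity are hypothetical values chosen for the example, not PowerScale specifications.

```python
MAX_NODES = 252      # per-system node limit quoted in the text
UTILIZATION = 0.80   # "up to 80 percent" utilization quoted in the text


def usable_capacity_tb(nodes: int, raw_tb_per_node: float,
                       utilization: float = UTILIZATION) -> float:
    """Estimate usable capacity (TB) for a cluster of identical nodes.

    This is back-of-the-envelope arithmetic only: it ignores protection
    overhead, mixed node types, and any data-reduction savings.
    """
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"nodes must be between 1 and {MAX_NODES}")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return nodes * raw_tb_per_node * utilization


if __name__ == "__main__":
    # raw_tb_per_node=100.0 is a made-up figure used only for illustration.
    for node_count in (4, 8, 16):
        estimate = usable_capacity_tb(node_count, raw_tb_per_node=100.0)
        print(f"{node_count} nodes: about {estimate:.0f} TB usable")
```

Because capacity in this simplified model grows linearly with node count, doubling the nodes doubles the estimate; real sizing would depend on the node models and protection settings actually deployed.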
PowerScale allows organizations to reduce costs by utilizing a policy-based approach for inactive data. Based on thresholds set by the organization, PowerScale automatically moves inactive data to more cost-effective storage instances. Additionally, SmartPools software provides native, policy-based tiering capability--effectively enabling multiple performance/protection/storage density levels to co-exist on the same file system.
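The idea of threshold-driven tiering can be sketched in a few lines of Python. This is a conceptual illustration only, not the SmartPools API: the `FileRecord` type, the tier names, and the 90-day threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file tracked by a tiering policy."""
    path: str
    last_accessed: datetime
    tier: str = "performance"


def apply_tiering_policy(files, now, inactive_after=timedelta(days=90),
                         target_tier="archive"):
    """Reassign files not accessed within `inactive_after` to `target_tier`.

    Returns the paths that were moved, in input order. Files already on
    the target tier are left untouched.
    """
    moved = []
    for record in files:
        is_inactive = now - record.last_accessed >= inactive_after
        if is_inactive and record.tier != target_tier:
            record.tier = target_tier
            moved.append(record.path)
    return moved


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    files = [
        FileRecord("/projects/2022/report.docx", now - timedelta(days=400)),
        FileRecord("/projects/current/draft.docx", now - timedelta(days=2)),
    ]
    print(apply_tiering_policy(files, now))
```

The point of the sketch is that the organization sets the threshold once and the policy, rather than an administrator, decides which data moves; both tiers remain part of the same logical file system.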
OneFS integrates with several industry-standard protocols, including Hadoop Distributed File System (HDFS). In these scenarios, organizations can take a scale-out data lake approach that makes Hadoop data accessible across applications, eliminating the need to manually move data around to support business analytics. Additionally, PowerScale in HDFS environments allows Hadoop's compute and storage to be independently scaled.
OneFS includes FlexProtect, a data protection technology that allows storage administrators to protect specific files with higher protection levels than others based on data sensitivity. OneFS also enables data-at-rest encryption (D@RE) for tightened security against potential data loss. Additionally, PowerScale integrates built-in ransomware protection with smart Airgap technology and Ransomware Defender, a real-time event processing solution that leverages user behavior analytics (UBA) to monitor for ransomware attacks on business-critical data.
Organizations should take into consideration several factors when identifying the most efficient and cost-effective storage option. In the context of PowerScale, firms stand to benefit the most when addressing the following use cases:
File sharing data is ubiquitous, comprising essentially any text, program, or directory data created by users (e.g., Excel spreadsheets, PDFs, Word documents, PowerPoint presentations). While these documents may not be exceptionally large on their own, the composite size of an organization's file sharing data can quickly add up with a large or expanding user base. Dell PowerScale is ideal for reducing data management overhead while ensuring users (e.g., employees, students, administrators) can quickly access what they need, when it is needed.
OneFS supports Isilon Swift, an object storage interface for accessing file-based data stored on a PowerScale cluster as objects. Objects typically consist of data, metadata and a unique identifier.
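To make the three-part structure of an object concrete, here is a minimal Python sketch of an in-memory object store. It illustrates the general object model (data, metadata, unique identifier) only; `StoredObject` and `ObjectStore` are hypothetical names and do not represent the Isilon Swift interface or any Dell API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StoredObject:
    """An object: payload bytes, descriptive metadata, and a unique ID."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class ObjectStore:
    """Toy in-memory store that retrieves objects by ID, not by file path."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, metadata=None) -> str:
        """Store the payload with a copy of its metadata; return the new ID."""
        obj = StoredObject(data=data, metadata=dict(metadata or {}))
        self._objects[obj.object_id] = obj
        return obj.object_id

    def get(self, object_id: str) -> StoredObject:
        """Look up an object by its identifier (raises KeyError if absent)."""
        return self._objects[object_id]


if __name__ == "__main__":
    store = ObjectStore()
    oid = store.put(b"...image bytes...", {"modality": "MRI"})
    print(oid, store.get(oid).metadata)
```

Unlike a file path, the identifier says nothing about where the data lives, which is what lets an object interface sit in front of file-based storage.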
Some examples of object data that can be stored natively on a Dell PowerScale cluster include:
- Healthcare content: Picture archiving and communication system (PACS) imaging is ideal for PowerScale, as healthcare workers require quick-loading images (e.g., X-rays, MRIs, CT scans) to evaluate and treat patients.
- Call recordings: Retailers and financial institutions regularly record customer support calls for training and quality purposes. Depending on the call length, the associated sound files can be quite large. In the event of a problem or complaint, organizations require the ability to quickly pull and review the support call in question.
- Research: Scientific/academic research produces large amounts of data--sometimes over the course of several months or years. Dell PowerScale can easily scale to make room for new data as it is collected while keeping that data readily available to active researchers.
- Social media: Platforms like Facebook, Twitter and Instagram have countless users uploading pictures, videos, and audio on a constant basis. PowerScale can easily scale out to meet expanding user demand while enabling active users to enjoy a seamless, uninterrupted user experience.
- Media and entertainment: Creating full-production videos, movies and broadcast segments requires storing substantial amounts of footage. PowerScale enables data storage ease-of-use and availability across the production workflow, enabling film crews and editing teams to work simultaneously.
- Surveillance: Security footage requires quick storage and access capabilities to support post-incident investigative and forensic activities. For example, a retailer that experiences a burglary requires immediate access to surveillance footage recorded during the incident. Other examples include video footage from police body cameras and witness/suspect interviews, aerial video footage, and in-vehicle security system monitoring.
- Data analytics: Forward-thinking enterprises leverage data analytics to identify new opportunities and remain competitive. For example, retailers may target customers with instant offers and discounts via mobile app as those customers enter their physical storefronts. This functionality requires intensive customer analytics and the underlying high-performance storage to provide the necessary real-time insights.
For remote offices and smaller organizations, Dell offers IsilonSD Edge: a software-defined storage solution for flexible, cost-efficient unstructured data management. Built specifically for supporting unstructured data needs at edge locations (e.g., remote or branch offices), IsilonSD Edge runs on customer-supplied commodity server hardware while leveraging the OneFS operating/file system.
OneFS' CloudPools feature is an extension of the previously mentioned SmartPools data tiering framework that enables data tiering to public/private cloud storage. Dell PowerScale can be used within a public cloud to run a full-cloud, scale-out NAS--while still leveraging the functionality of PowerScale. This can be a cost-effective approach for organizations wanting to use public cloud services to conduct data analytics and AI. PowerScale also supports hybrid architectures where PowerScale is implemented on-premises while data is backed up to the public cloud for disaster recovery (DR) purposes.
Organizations are often bound by stringent requirements dictating how they store and maintain user and customer data. Because it can reach utilization levels of up to 80 percent, Dell PowerScale is well suited to storing archived data that must remain readily available. For example, a healthcare organization might be required to retain patient records to maintain HIPAA compliance.
The sheer volume and breadth of today's unstructured data workloads make AI/ML-assisted processing and analysis a necessity; suffice it to say, the storage technologies that support these activities must perform and scale accordingly. Leveraging our Advanced Technology Center (ATC), our team can demonstrate PowerScale's capabilities and provide proofs of concept (POCs). To find out whether Dell Technologies PowerScale is the right fit for your organization's data storage needs, request a briefing today.