Q&A interview

Understanding data and managing its lifecycle before and after moving it to the cloud is an important part of the IT Service Management (ITSM) process. Effective data management is the key to ensuring that IT services are reliable, secure and efficient.

I recently had the opportunity to gather insights from Randy Hopkins, the global sales engineering leader at Komprise, about this critical topic. You can read his responses to key issues and concerns below:

Cloud spending is back, per the latest financial reports from AWS, Google Cloud and Microsoft Azure. Did it ever really go away?

It hasn’t gone away, but there has been a fundamental shift in how companies think about cost economics and simplicity. While small companies continue to move to the cloud at the same pace, larger organizations have been hit with sticker shock and have realized that managing cloud infrastructure is overly complex. In some cases, this has driven repatriation of workloads and data back on premises.

Enterprise IT has been struggling to manage storage costs in the cloud because its data has grown relentlessly, and the cloud is a bottomless bucket of resources. This is both a positive and a negative reality.

Conversely, in a corporate data center, if you are running out of capacity you need approval to buy more storage. The costs are always transparent; there are no surprise bills. The reason this repatriation hasn’t affected the big cloud providers’ revenues is that, increasingly, applications are moving to or being born in the cloud. AI tools and infrastructure are also becoming cloud-centric, because that is more affordable and scalable than trying to build these services and technologies internally.

While it makes sense that the global data footprint is exploding from consumer use, considering smartphones, social media and video streaming, it’s not as easy to understand why data volumes in all sectors are out of control. Can you explain?

Data growth is now driven more by applications than by human beings. These include edge and IoT apps, mobile apps, medical devices and other instruments that create data without human intervention. Consider how sensors collect data and automate processes everywhere from electric cars and utilities to building security, farming equipment, manufacturing processes and oil wells at the bottom of the ocean, and it’s easy to see how data is exploding behind the scenes in every industry. This data must be stored, at least for a time, and the cloud offers that capacity through its near-infinite ability to scale on demand.

How is cloud data management different from managing data on premises and what are the top challenges? 

Security is different, for starters, so IT needs to understand security architecture in the cloud, which varies across services and storage tiers. A significant issue still today is that traditional monitoring and management tools built for the on-prem world don’t work in the cloud, and mature tools for managing cloud data are lacking.

Data sitting on on-prem storage from vendors like EMC or NetApp comes with those vendors’ built-in monitoring tools, but in the cloud, customers need to track their data across many different technologies, tiers and even multiple clouds. Finding one tool to consistently manage data across these disparate environments is challenging. Another challenge is that pricing and performance vary widely across cloud storage tiers: there is on average a 20X price difference between cold and hot storage. Enterprise IT customers need in-depth analysis and automation to manage this environment.
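The 20X gap between tiers is easy to see with a little arithmetic. The per-GB prices below are illustrative round numbers chosen for the sketch, not quotes from any provider; real tier pricing varies by region, service and retrieval pattern:

```python
# Illustrative monthly cost comparison across cloud storage tiers.
# Prices are hypothetical round numbers, not any provider's rate card.
PRICE_PER_GB_MONTH = {
    "hot": 0.020,   # frequently accessed performance tier
    "cool": 0.010,  # infrequent-access tier
    "cold": 0.001,  # archive/deep tier (~20x cheaper than hot)
}

def monthly_cost(size_gb: float, tier: str) -> float:
    """Storage-only cost for one month; excludes egress and retrieval fees."""
    return size_gb * PRICE_PER_GB_MONTH[tier]

one_pb = 1_000_000  # 1 PB expressed in GB (decimal)
for tier in PRICE_PER_GB_MONTH:
    print(f"{tier:>4}: ${monthly_cost(one_pb, tier):>10,.2f}/month")
```

At these assumed rates, a petabyte parked on the hot tier costs $20,000 a month versus $1,000 on the archive tier, which is why tier placement dominates the cloud storage bill.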

What are 2-3 things every company should be thinking about when it comes to modernization and preparing their data for use in cloud AI tools and applications? 

Everything is storage-agnostic now. Tools need to access and manage data across different vendors and locations and deliver comprehensive visibility, so that IT can manage data for risk and cost and ensure that data is always in the right place at the right time. Now that AI is a solid value proposition for the cloud, there is even more at stake. Yet identifying, classifying and leveraging data productively can be challenging because of its sheer size in enterprises, which runs to petabytes.

The 2-3 things that storage and IT managers should be thinking about are:

  1. Ensure easy access and visibility into all data across all protocols, and leverage that visibility to classify and categorize data. This provides instant access and analytics across all data from a single pane of glass, and it makes the analytics actionable with queries to mobilize the data.
  2. Automate as much as you can. Leverage data characteristics and attributes to automatically place data in the right location or on the right cost or performance tier for its current use and requirements. Create policies as needed that drive these automations to stay compliant and adhere to business and customer requirements.
  3. Simplify searches across all file metadata from a unified Global File Index. The data management solution should allow authorized users to easily copy, move, archive, tier and report on unstructured data files.

Finally, the data should be in a location that can be leveraged quickly, without egress charges, and with fast, low latency access.
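The automation advice above can be sketched as a simple attribute-driven tiering policy. The thresholds and tier names below are illustrative assumptions for the sketch, not part of any particular product; a real policy engine would take them from business and compliance requirements:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class FileRecord:
    """Minimal file metadata a data-management tool might inventory."""
    path: str
    size_bytes: int
    last_access: datetime

def target_tier(rec: FileRecord, now: datetime) -> str:
    """Map a file to a storage tier based on how long it has sat idle.

    Thresholds are hypothetical: >1 year idle -> archive, >90 days -> cool.
    """
    idle = now - rec.last_access
    if idle > timedelta(days=365):
        return "archive"  # cold data: untouched for a year or more
    if idle > timedelta(days=90):
        return "cool"     # warm data: infrequently accessed
    return "hot"          # active data stays on performance storage
```

In use, records would come from a storage inventory scan, and the automation would act only on files whose computed target tier differs from where they currently live.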

How can companies avoid wasting money on cloud storage?  

It starts with knowing your data. From a data management perspective, this requires visibility across the storage landscape and analysis showing key statistics such as age of data, time of last access, file types and sizes, and data growth by department. This will help inform IT as to where to store data. A best practice is cold-data tiering: analysis showing data that has not been accessed for a year or longer can help you move that cold data, which can be up to 80% of all data in storage, to low-cost object storage in the cloud such as AWS Glacier. If possible, discover and delete obsolete and zombie data (such as ex-employee data) before you move any data at all. It is also critical to stay updated on the storage offerings and price-to-performance metrics of your cloud storage provider(s).

What user and application issues or considerations relate to moving data and workloads to the cloud? 

Data access needs to be transparent to the application or you’ll have turmoil from the end users. Evaluate carefully how your data migration/data management vendor tiers and archives data; solutions using proprietary stubs can break or interfere with access to data. Ideally, with transparent data tiering to the cloud, users can click on a link and access their data at the same location as before. The user shouldn’t even know their data has moved. Cloud data migrations can also wreak havoc on both IT and end users: delays and errors in moving data impede access and incur excessive costs for IT.

Large data migrations can take as long as a year, but there are new technologies that make data transfer over a WAN much more efficient; Komprise Hypertransfer was developed with this in mind. And, as mentioned earlier, optimize data movements across cloud tiers. Another consideration is that an organization may not move an application to the cloud, but just the data it needs; think carefully about network performance if that data will need fast access back to your on-premises environment.

From a data management tools perspective, which features and capabilities are most important? What should IT and storage professionals be looking for as they evaluate solutions? 

You need a single control plane to manage files where they live, regardless of protocol, vendor and location. In other words, a storage-agnostic unstructured data file management platform. This way, you can analyze and manage data with a single multi-site management console. Look for a solution that moves data in a format which is native to each cloud. This is necessary for AI and data services in the cloud to operate on the data.

Ensure that users get seamless access to their data regardless of where you move it. Visibility and analysis of data across storage is critical in today’s fast-moving world where data is always in motion. Finally, our customers specifically say that they do not want a data management solution that sits in the hot data path and creates vendor lock-in. Ensure you always have control over your data and can easily move it where you need to, without unnecessary hassle or punitive licensing fees and rehydration costs.

Finally, what advice do you have for IT and storage pros when it comes to understanding and making better use of corporate data? 

It’s time for IT to manage files and object data, not storage. All data has personality and attributes that need to be considered, versus lumping all data into one large entity that is managed the same way.

The only way to get that nuanced data management capability is through continual insight into your files and objects across all storage – from your data center to the edge to the cloud. Automate whenever you can and use analytics to drive your decisions.
