- Data decentralization will create demand for solutions in data survivability, data residency, access control, and more.
- Three major factors are driving greater data decentralization.
The decade-long push to centralize data storage in a single warehouse has come to an end, because stashing everything into a "data lake" has caused more harm than good. For certain applications, centralizing data in cloud storage services such as Amazon S3 and Snowflake works, but only up to a point. At the same time, several important factors are driving greater data decentralization. The three biggest are:
Customer data is scarce and is becoming more so:
Over the years, the major advertising platforms have shared fewer and fewer details about customer and performance data. On a recent phone call, a Procter & Gamble executive complained that even at their level of spend, Google and Facebook expect them to "rent" data rather than own it. A marketer can no longer simply build a giant data warehouse, dump everything into it, and run analytics on it. Because Google and Facebook have taken over the advertising market, marketers can access only "cohort-level" data, which makes analysis against one central data store next to impossible.
This trend is unlikely to reverse anytime soon, since moves like the banning of third-party cookies and Apple's elimination of the IDFA only consolidate more power in Google, Facebook, and Apple. Data scarcity has existed for a long time, particularly in the retail sector: retailers share as little as possible with the brands they work with, which has been a continuous point of friction.
For instance, pharmaceutical companies have no idea where their medicines are prescribed and sold. Meanwhile, vendors like CVS and Walgreens sell that prescription data to IQVIA, which then rents it back to the pharma companies. The situation is even worse in e-commerce: Amazon shares almost no data with its sellers, so these brands receive less data than they used to get from physical stores. This is why traditional brands are eager to acquire direct-to-consumer brands, which have direct access to data on end consumers.
Highly regulated data transfer:
International privacy measures like GDPR and CCPA, along with other regulations, have imposed increasingly stringent rules on how data is exchanged, both internally and between companies, and data sharing itself is under growing scrutiny. So far, companies have addressed this by requiring software vendors to take on the monetary risk of violating the regulations. Some investment firms have walked away from opportunities with software startups because they were too vulnerable on this front, and expect that in the long run some tech companies will fail due to regulatory exposure.
This may change how these companies operate. The first change would be a move from SaaS data tools to self-hosted tools running in virtual private clouds, something already happening in the finance sector. Every large organization is one security breach away from bringing everything in-house. This has created demand for a new generation of tools such as OneTrust and BigID.
Customers expect vendors to be on the same cloud:
Retailers like Macy's need a SaaS vendor to run its application and store the data in their Azure or Google VPC. In such a scenario, the SaaS vendor has to think about data partitioning, running software in multiple clouds, and sometimes in multiple zones of the same cloud.
Sometimes that data is required to stay in a specific geographic region of the cloud provider, so that integration with other data and services can happen in the same vicinity. Beyond retail, this affects advertising as well, though to a lesser extent. At a minimum, when a customer is on a given cloud, they expect the vendor to integrate with it: if a customer runs its databases in Azure, GCP, or AWS, the vendor should follow suit where data export is concerned.
Another concern is the location of the data. A big company might keep its data in the West Coast region of AWS or Snowflake while using software from Salesforce or another SaaS provider. When that SaaS provider exchanges data with yet another provider, both companies have to figure out how to move or duplicate the data from region to region.
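To make the region-to-region duplication concrete: on AWS, this kind of movement is often handled with S3 cross-region replication. The sketch below shows a minimal replication configuration (the bucket names, account ID, and IAM role ARN are hypothetical placeholders); it would be applied with `aws s3api put-bucket-replication` on a versioned source bucket.

```json
{
  "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
  "Rules": [
    {
      "ID": "replicate-us-west-to-eu",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": { "Prefix": "" },
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": {
        "Bucket": "arn:aws:s3:::example-analytics-eu-west-1"
      }
    }
  ]
}
```

Once a rule like this is in place, new objects written to the source bucket are copied asynchronously to the destination region, which is one common way vendors keep a customer's data close to where it is consumed.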
Looking ahead:
The growing movements toward data decentralization and cloud data migration create opportunities to build useful and profitable data tools, especially with the rise of the data science profession.
Going forward, companies that provide applications, data transport tools, and data itself will see rising demand. Data diversity matters more than data scale, and no one possesses a data decoder ring. The transition to decentralized data will likely create demand for data management solutions in domains such as data survivability, data residency, access control, data masking, encryption, identity management, and much more.