Data gravity refers to the tendency of a large data set to attract other information, applications, and services. The idea borrows from Newton’s law of universal gravitation, which states that the attractive force between two objects is directly proportional to the product of their masses and inversely proportional to the square of the distance between them. The more mass an object has, the greater its gravitational pull, and the more objects it can draw to itself.

In information technology (IT), this law translates to: The larger a data set is, the more it attracts other information and applications. Because of data gravity, applications, services, and even other information tend to gravitate toward the most massive data set.

Data accumulates over time, and the emergence of big data and data science is proof of this. Widespread ownership of personal computers (PCs) and access to the Internet have produced billions of rows and columns of data. On top of that, Internet of Things (IoT) devices keep adding to that volume at a rapid pace.

On top of that, the theory of data gravity holds that information behaves as if it had mass. As a data set grows, it becomes heavier and harder to move. As a result, the applications and products attracted to it have to move closer to it, both figuratively and physically, as the sections below explain.
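To make "heavier and harder to move" concrete, the following is a minimal sketch that estimates how long it would take to copy data sets of different sizes over a network link. The function name, the data set sizes, and the 10 Gbps link speed are assumptions chosen for illustration; they do not come from McCrory's post. The estimate also ignores protocol overhead and congestion, so real transfers would be slower.

```python
# Illustrative sketch only: the sizes and link speed below are assumed values.

def transfer_time_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Return the ideal time, in seconds, to copy a data set over a link.

    Ignores protocol overhead, retries, and contention, so this is a
    best-case lower bound rather than a realistic forecast.
    """
    return (size_bytes * 8) / link_bits_per_second


ASSUMED_LINK_BPS = 10e9  # assumption: a dedicated 10 gigabit-per-second link

# Assumed example sizes, using decimal units (1 GB = 10**9 bytes).
EXAMPLE_SIZES = [("1 GB", 1e9), ("1 TB", 1e12), ("1 PB", 1e15)]

for label, size_bytes in EXAMPLE_SIZES:
    seconds = transfer_time_seconds(size_bytes, ASSUMED_LINK_BPS)
    print(f"{label}: {seconds:,.1f} seconds ({seconds / 86_400:.2f} days)")
```

Under these assumptions, a gigabyte moves in under a second, while a petabyte needs more than nine days of uninterrupted transfer. That gap is why it is often easier to bring applications to a large data set than to move the data set itself.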

History of Data Gravity

Data gravity is a term introduced in a 2010 blog post by Dave McCrory, who later served as vice president of Engineering at GE Digital. In the post, McCrory pointed out that data gravity is the reason companies like Salesforce were exploring cloud projects (database[.]com, to be specific).

As more and more data moves to the cloud, tools, services, and applications increasingly become cloud-based as well. That is data gravity at work, and we are witnessing it now. In fact, Cisco predicted in 2018 that by 2021, 94% of workloads would be cloud-based.

Two years later, IDG Communications’ 2020 Cloud Computing Survey found that 92% of organizations already had at least part of their IT environment in the cloud.

How does that affect our daily lives? Just think of portable storage devices, such as Universal Serial Bus (USB) flash drives and memory cards. When was the last time you actually needed one? We’re using them less and less because our photos, files, and other data are already in the cloud.

Two Crucial Effects of Data Gravity

McCrory likened data to a planet or another object with considerable mass and, therefore, considerable gravitational force. Data gravity has two important effects:

  • Force: As the data builds mass by accumulating more information, more applications and services will be attracted to it. A perfect example is Google. The reason why it has almost all the answers is simply that it has huge amounts of data. As such, countless services and applications are now based on Google. Aside from Google-owned products, third parties go all out to ensure that their products are compatible with it.
  • Speed: As objects get closer to the source of gravitational force, they accelerate. Likewise, the closer an application is to the data mass, the faster it can process information. For instance, an application hosted in a data center in New York that has to access information from a database in Utah would have latency issues. However, if the application and the database are closer together, data processing becomes faster.
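The latency point above can be sketched with a back-of-the-envelope calculation. The example below estimates the minimum round-trip time between an application and its database from distance alone. The function names, the 3,200 km figure for New York to Utah, the 50 km "nearby" figure, and the signal speed in fiber (about 200,000 km/s, roughly two-thirds the speed of light) are all rough assumptions for illustration. Real latency is higher because of routing, switching, and processing delays.

```python
# Illustrative sketch: distances and the fiber speed are rough assumptions.

SPEED_IN_FIBER_KM_PER_S = 200_000  # assumed signal speed in optical fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on network round-trip time, in milliseconds."""
    one_way_seconds = distance_km / SPEED_IN_FIBER_KM_PER_S
    return 2 * one_way_seconds * 1000


def total_wait_seconds(distance_km: float, sequential_queries: int) -> float:
    """Time spent waiting on the network when queries run one after another."""
    return min_round_trip_ms(distance_km) * sequential_queries / 1000


FAR_KM = 3_200   # assumption: rough New York to Utah distance
NEAR_KM = 50     # assumption: application and database in the same region
QUERIES = 1_000  # assumption: a job that issues 1,000 sequential queries

for label, km in [("far", FAR_KM), ("near", NEAR_KM)]:
    print(
        f"{label}: {min_round_trip_ms(km):.2f} ms per round trip, "
        f"{total_wait_seconds(km, QUERIES):.2f} s for {QUERIES} queries"
    )
```

Even in this best case, the distant setup spends tens of milliseconds on every round trip, and that cost multiplies with each sequential query. Placing the application near the data removes most of that wait.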

Data gravity implies that whoever has the most massive amount of data holds the greatest power. Other organizations are pulled toward it so they can quickly tap into its data sets. And because of these data sets’ force, more applications and services are designed to fit them.

Data gravity is therefore a powerful thing, which should not be a surprise. After all, the moon’s gravity can actually pull the earth’s water up, resulting in high tides.