A data mesh is an approach to designing and developing data architectures that aims to remove the obstacles users face when accessing data. It does so by creating a connectivity layer that controls, manages, and supports data access.
A data mesh stitches together data stored across various devices and even organizations. At its core, it makes data highly available, easily discoverable, secure, and interoperable with the applications that need access to it. It is neither centralized nor monolithic.
The data mesh emerged in response to the architectural and organizational pain points businesses face when they try to become data-driven, compete, and drive value. It helps companies access, sift through, and use data for decision-making, planning, and campaign implementation.
But before we can dive into discussing a data mesh and how it works in detail, we need to distinguish between the two data types that it makes sense of—operational and analytical data.
2 Data Types
Organizations typically use two types of data, described in more detail below.
Operational Data
This data is stored in databases that back business capabilities served by microservices. It is transactional in nature, meaning it records events such as buying and selling. As such, it needs to stay up to date to serve the applications that run businesses that rely on exchanging goods and money.
Analytical Data
This data provides an aggregated view of the business over time. It is often used to provide retrospective or forward-looking insights for analytical reports. As such, it helps organizations study their past to increase revenue in the future.
A data mesh typically handles analytical data that is stored in a data lake or data warehouse.
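To make the distinction between the two data types concrete, here is a minimal Python sketch. The `orders` records and the `monthly_revenue` function are hypothetical names invented for illustration: the individual rows stand in for operational data, and the per-month totals derived from them stand in for analytical data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical operational records: one row per transaction, kept current
# by the application that processes orders.
orders = [
    {"order_id": 1, "day": date(2023, 1, 5), "amount": 120.0},
    {"order_id": 2, "day": date(2023, 1, 20), "amount": 80.0},
    {"order_id": 3, "day": date(2023, 2, 3), "amount": 50.0},
]


def monthly_revenue(rows):
    """Aggregate transactional rows into an analytical view keyed by month."""
    totals = defaultdict(float)
    for row in rows:
        month = row["day"].strftime("%Y-%m")
        totals[month] += row["amount"]
    return dict(totals)


revenue_by_month = monthly_revenue(orders)
print(revenue_by_month)  # {'2023-01': 200.0, '2023-02': 50.0}
```

The operational rows change with every sale, while the aggregated view is recomputed over a period of time for reporting.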
What Is a Data Lake?
A data lake refers to a single repository that stores structured and unstructured data. Users can store data as-is in a data lake and run different analytics processes and systems to guide their decision-making. In some cases, users apply machine learning (ML) to the log files and data from clickstreams, social media, and Internet-connected devices held in a data lake to identify and act on opportunities to attract and retain customers, boost productivity, proactively maintain devices, and make informed decisions.
What Is a Data Warehouse?
A data warehouse is a database optimized for analyzing relational data from transactional systems and other applications necessary to keep the business running. The data in it is structured for fast Structured Query Language (SQL) processing, whose results are typically used for operational reporting and analysis. It is cleaned, enriched, and transformed so that all users can trust it.
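As a rough illustration of the kind of SQL reporting query a warehouse serves, the sketch below uses Python's built-in `sqlite3` module. An in-memory SQLite database is only a stand-in for a real warehouse, and the `sales` table and its columns are assumptions made up for this example.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", "widget", 100.0),
        ("north", "gadget", 50.0),
        ("south", "widget", 70.0),
    ],
)

# A typical reporting query: total sales per region.
report = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(report)  # [('north', 150.0), ('south', 70.0)]
conn.close()
```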
How Does a Data Mesh Work?
A data mesh makes the data in data lakes and warehouses accessible to all users who have a hand in a business’s operations. After analysis, the data can be transported back to the data lakes and warehouses for others’ use.
Why Use a Data Mesh?
A data mesh tries to solve three challenges that come with using a centralized data lake or warehouse:
- Lack of ownership: With a centralized repository, it is often unclear who owns the data, the source team or the infrastructure team. A data mesh lets users know who does.
- Lack of quality: While the infrastructure team is responsible for maintaining the data’s quality, it may not know the data as well as the source team.
- Organizational scaling: The central team, that is, the one that runs the data lake or warehouse, becomes the bottleneck.
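One way to picture how explicit ownership and discoverability might look in practice is a small catalog of data products, sketched below in Python. The `DataProduct` and `Catalog` classes, their fields, and the sample entries are all hypothetical. They are not part of any particular data mesh product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical description of a dataset published by a source team."""
    name: str
    owner_team: str
    description: str


class Catalog:
    """Hypothetical registry that records who owns each data product."""

    def __init__(self):
        self._products = {}

    def register(self, product):
        if product.name in self._products:
            raise ValueError(f"{product.name} is already registered")
        self._products[product.name] = product

    def owner_of(self, name):
        # Answers the ownership question directly.
        return self._products[name].owner_team

    def search(self, keyword):
        # Simple discoverability: match the keyword against descriptions.
        keyword = keyword.lower()
        return sorted(
            p.name
            for p in self._products.values()
            if keyword in p.description.lower()
        )


catalog = Catalog()
catalog.register(DataProduct("orders_daily", "sales", "Daily order totals"))
catalog.register(DataProduct("tickets_weekly", "support", "Weekly ticket counts"))

print(catalog.owner_of("orders_daily"))  # sales
print(catalog.search("ticket"))          # ['tickets_weekly']
```

In this sketch, each source team registers and maintains its own entries, so no central team has to answer for data it does not know well.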
When Do You Use a Data Mesh?
A data mesh can be used in various circumstances, such as:
- Connecting cloud applications to sensitive data that resides in a customer’s on-premises or cloud environment
- Creating virtual data catalogs from several data sources that can’t be centralized
- Creating virtual data lakes or warehouses for analytics and ML training without consolidating data into a single repository
- Giving application developers and DevOps teams ways to query data from a variety of storage devices without access problems
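The idea of querying several sources without consolidating them can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not how any specific data mesh tool works: an in-memory SQLite database and a CSV string stand in for two separate storage systems, and the names `billing`, `SUPPORT_CSV`, and `customer_view` are hypothetical.

```python
import csv
import io
import sqlite3

# Source 1 (assumption): a billing database owned by one team.
billing = sqlite3.connect(":memory:")
billing.execute("CREATE TABLE invoices (customer_id INTEGER, total REAL)")
billing.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 40.0), (2, 15.5), (1, 10.0)],
)

# Source 2 (assumption): a CSV export owned by another team.
SUPPORT_CSV = "customer_id,tickets\n1,3\n2,0\n"


def invoice_total(customer_id):
    """Query the billing source in place."""
    row = billing.execute(
        "SELECT COALESCE(SUM(total), 0) FROM invoices WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


def ticket_count(customer_id):
    """Read the support source in place."""
    for row in csv.DictReader(io.StringIO(SUPPORT_CSV)):
        if int(row["customer_id"]) == customer_id:
            return int(row["tickets"])
    return 0


def customer_view(customer_id):
    """Combine results at query time; neither source is copied elsewhere."""
    return {
        "customer_id": customer_id,
        "invoice_total": invoice_total(customer_id),
        "tickets": ticket_count(customer_id),
    }


print(customer_view(1))
# {'customer_id': 1, 'invoice_total': 50.0, 'tickets': 3}
```

Each source stays where it is and keeps its own format. Only the combined answer is assembled for the caller.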
How do you know precisely if your organization needs a data mesh, though? Try answering these questions:
- Are your operational data owners, data engineers, and data consumers struggling to collaborate effectively?
- Are they having a hard time understanding one another?
- Is lack of business knowledge hampering your data engineers’ productivity?
- Is lack of business knowledge affecting your data consumers’ productivity?
- Does your company have a problem unifying data from different locations, business units, or departments?
If you answered “yes” to all five questions, then it is time for you to use a data mesh.