Erasure coding is a method of protecting data. The data is broken into fragments, expanded and encoded with redundant information, and stored across different locations or storage media. If a storage device fails or data is corrupted, the data can be reconstructed from the fragments stored elsewhere.
Think of any movie where the main character is framed for a crime and obtains evidence to exonerate himself. When he distributes different parts of the evidence and keeps these in other locations as insurance (e.g., a page of a document in a storage box per bank), he’s employing a form of erasure coding.
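The fragment-and-reconstruct idea can be sketched in a few lines of Python. This is a minimal illustration using a single XOR parity fragment, the simplest possible scheme, and not how any particular storage product implements it. The function names (xor_encode, xor_reconstruct) and the choice of four data fragments are assumptions made for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_encode(data: bytes, k: int) -> list:
    """Split data into k equal fragments (zero-padded) plus one parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    # The parity fragment is the XOR of all data fragments.
    return fragments + [reduce(xor_bytes, fragments)]


def xor_reconstruct(fragments: list, length: int) -> bytes:
    """Rebuild the original data when at most one fragment is None (lost)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one lost fragment")
    repaired = list(fragments)
    if missing:
        # XOR of all surviving fragments equals the lost one.
        survivors = [f for f in fragments if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    # Drop the parity fragment and the padding.
    return b"".join(repaired[:-1])[:length]


# Hypothetical usage: five "locations", one of which fails.
message = b"evidence that clears the hero"
stored = xor_encode(message, k=4)
stored[2] = None  # simulate a failed device
print(xor_reconstruct(stored, len(message)))
```

A single parity fragment survives only one loss. Production systems generally use codes that add several parity fragments so that multiple simultaneous failures can be tolerated.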
Companies that need a highly fault-tolerant storage environment use erasure coding for their disk array systems (storage devices with several disks), data grids (service sets that let users access, change, and transfer large amounts of data from different locations for research), distributed storage applications, object stores (device sets that manage data as objects), and archival storage (devices that contain archives).
Erasure Coding Does Not Replace Backups
While erasure coding protects data, it provides only one level of protection and can’t replace backups. It just protects against hard drive or solid-state drive (SSD) failures.
A backup is an independent copy of a file kept in a different device and/or location. An erasure-coded fragment, meanwhile, is just an encoded part of that file saved in another device and/or place, not an independent copy. Because the fragments together make up the one live copy of the data, erasure coding alone is an unfit means of protection against threats like ransomware and site failures.
Benefits of Erasure Coding
Erasure coding provides several benefits, including better storage space usage, reliability, suitability for data of any size, and flexibility. It also increases data redundancy without the overhead costs or limitations of Redundant Array of Inexpensive Disks (RAID).
Better Utilization of Storage Space
Erasure coding uses storage more efficiently than schemes that keep full copies of the data, such as mirrored RAID or replication. Users can still tolerate as many failures, but the redundant data consumes less space because it consists of encoded fragments rather than whole copies of the files.
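A quick calculation makes the space difference concrete. The sketch below compares the raw capacity needed to hold the same data under full-copy replication and under erasure coding. The 10 data + 4 parity layout, the three-copy baseline, and the 100 TB figure are illustrative assumptions, not values from any specific product.

```python
def raw_capacity_erasure(logical_tb: float, data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity needed when each stripe holds k data and m parity fragments."""
    return logical_tb * (data_fragments + parity_fragments) / data_fragments


def raw_capacity_replication(logical_tb: float, copies: int) -> float:
    """Raw capacity needed when every byte is stored as `copies` full copies."""
    return logical_tb * copies


logical_tb = 100  # assumed amount of user data, in terabytes

# Three full copies survive the loss of any two copies.
print("3-way replication:", raw_capacity_replication(logical_tb, 3), "TB")

# A 10+4 layout survives the loss of any four fragments per stripe.
print("10+4 erasure coding:", raw_capacity_erasure(logical_tb, 10, 4), "TB")
```

Under these assumptions, the erasure-coded layout tolerates more simultaneous failures than three-way replication while needing less than half the raw capacity.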
Reliability
Because the encoded fragments are stored independently of one another, a failure in one does not affect the others.
Suitability
Erasure coding can be applied to files of any size, from kilobytes to petabytes. To put that into perspective, a 3-minute song is around 3,000 kilobytes, while 1 petabyte is roughly 1 trillion kilobytes (1,099,511,627,776 kilobytes in binary units).
Flexibility
Erasure coding needs only a subset of the stored fragments to recover a file. It also allows users to replace only the failed components without taking the affected system offline.
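The "any sufficient subset" property can be demonstrated with a toy Reed-Solomon-style code. The sketch below treats k data bytes as points on a polynomial over the integers modulo 257, adds m extra points as parity, and recovers the data from any k surviving points by Lagrange interpolation. It assumes Python 3.8 or later (for the modular inverse via pow), handles one short stripe at a time, and uses hypothetical names (rs_encode, rs_decode). Real systems typically work in GF(2^8) with optimized libraries, so this is an illustration of the principle only.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange form, mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den.
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def rs_encode(data: bytes, m: int) -> list:
    """Return the k data symbols followed by m parity symbols (k = len(data))."""
    k = len(data)
    if k + m > P:
        raise ValueError("stripe too long for this toy field")
    points = list(enumerate(data))
    return list(data) + [_interpolate(points, x) for x in range(k, k + m)]


def rs_decode(survivors: dict, k: int) -> bytes:
    """Rebuild the k data bytes from any k surviving {index: symbol} pairs."""
    if len(survivors) < k:
        raise ValueError("not enough fragments to recover the data")
    points = list(survivors.items())[:k]
    return bytes(_interpolate(points, x) for x in range(k))


# Hypothetical usage: 4 data symbols + 2 parity symbols, two symbols lost.
symbols = rs_encode(b"DATA", m=2)
survivors = {i: s for i, s in enumerate(symbols) if i not in (0, 3)}
print(rs_decode(survivors, k=4))
```

Any four of the six symbols are enough here, which is why a failed component can be swapped out and rebuilt while the remaining fragments keep serving reads.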
Is Erasure Coding the Same as RAID?
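While erasure coding is often confused with RAID, the two differ. Mirrored RAID (RAID 1) keeps data intact and copies it to multiple drives, while parity RAID (RAID 5 and RAID 6) stripes data across the drives of a single array and adds one or two parity blocks, which is a limited form of erasure coding. In erasure coding as used in distributed storage, you break the data apart, expand and encode it with a configurable amount of redundancy, and store the fragments in various places, such as different drives, nodes, or sites.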
Erasure coding also has several benefits over RAID. One is that, depending on how it is configured, erasure coding can rebuild failed disks much faster than RAID.
Planning a storage strategy requires organizations to weigh several factors, chief among them protection against data loss and provision for disaster recovery.
Storage strategies can come in different forms, including data replication, RAID usage, and erasure coding. Each method has advantages and disadvantages, but ever-growing data volumes and the migration to object storage are likely to contribute to erasure coding’s momentum. Like any technology, though, erasure coding needs to adapt to industry changes.