Data de-duplication is a hot technology in IT storage because it enables companies to save a lot of money. Its benefits include a smaller data center footprint, lower backup costs, lower hardware costs, and more efficient use of storage. Data de-duplication is a technology that reduces storage needs by eliminating redundant data.


In short, data de-duplication generally operates at the block or file level and removes duplicated (copied), non-unique objects. File-level de-duplication keeps a single copy of each unique file, while block-level de-duplication looks within a file and saves a single copy of each unique block.
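To make the block-level idea concrete, here is a minimal sketch in Python. It assumes fixed-size blocks and SHA-256 fingerprints; the function names `dedupe_blocks` and `rebuild` and the tiny 4-byte block size are made up for illustration, and real products may chunk and index data quite differently.

```python
import hashlib


def dedupe_blocks(data: bytes, block_size: int = 4):
    """Split data into fixed-size blocks and keep one copy of each unique block."""
    store = {}    # fingerprint -> block contents, stored only once
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # only the first occurrence is kept
        recipe.append(fingerprint)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the stored unique blocks."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


data = b"ABCDABCDEFGHABCD"  # four 4-byte blocks, three of them identical
store, recipe = dedupe_blocks(data)
print(len(recipe), "blocks referenced,", len(store), "blocks stored")
assert rebuild(store, recipe) == data
```

The `recipe` list plays the role of the pointers that replace duplicate blocks: the data can still be rebuilt in full, but each repeated block is stored only once.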

Data De-duplication - Original Data and Duplicates Removed

Example of Data De-Duplication

Take an email system as an example. Suppose the email system contains 200 instances of the same 2 MB file attachment. If the email platform is archived or backed up without de-duplication, all 200 instances are stored, which requires 400 MB of storage space (200 × 2 MB). With data de-duplication, only one instance of the attachment is actually stored; each subsequent instance is compared against the stored copy and replaced with a reference to it, which reduces storage usage and bandwidth demand to roughly 2 MB.
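The arithmetic in the email example can be checked with a small simulation. This is only a sketch: `SingleInstanceStore` is a hypothetical class invented here, 1 MB is assumed to mean 10**6 bytes, and the small overhead of storing the references themselves is ignored.

```python
import hashlib

MB = 1_000_000  # assumption: 1 MB counted as 10**6 bytes


class SingleInstanceStore:
    """Hypothetical store that keeps one copy of each distinct attachment."""

    def __init__(self):
        self.blobs = {}         # fingerprint -> attachment bytes (stored once)
        self.references = []    # one fingerprint per attachment instance
        self.logical_bytes = 0  # space needed if every instance were stored

    def add(self, content: bytes) -> None:
        fingerprint = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(fingerprint, content)  # keep the first copy only
        self.references.append(fingerprint)
        self.logical_bytes += len(content)

    @property
    def physical_bytes(self) -> int:
        # Space actually used by the unique attachments
        return sum(len(blob) for blob in self.blobs.values())


attachment = bytes(2 * MB)  # stand-in for the 2 MB attachment
store = SingleInstanceStore()
for _ in range(200):  # 200 emails carrying the same attachment
    store.add(attachment)

print(store.logical_bytes // MB, "MB without de-duplication")
print(store.physical_bytes // MB, "MB with de-duplication")
```

Under these assumptions the store holds one 2 MB copy plus 200 lightweight references, compared with 400 MB if every instance were kept.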


How Does De-duplication Work?

In the simplest case, you can manually compare two files and delete the older one or the one that is no longer needed. Today, most data de-duplication solutions instead use standard cryptographic hashing techniques to create a compact, practically unique mathematical representation of the data in question, known as a hash. Two pieces of data that produce the same hash are treated as duplicates.

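Cryptographic standards often mentioned alongside data de-duplication (MD5 and SHA are hash functions that produce the fingerprint; DES and AES are encryption ciphers, which protect stored data but do not produce hashes):

  • Message Digest Algorithm 5 (MD5)

  • Data Encryption Standard (DES)

  • Secure Hash Algorithm (SHA)

  • Advanced Encryption Standard (AES)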
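
As a rough illustration of how the hash functions in the list above (MD5 and the SHA family) can flag duplicates, the sketch below fingerprints three byte strings with Python's standard `hashlib` module. The helper name `fingerprint` and the sample data are invented for this example, and it is not meant to reflect how any particular product computes its hashes.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data using the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


original = b"quarterly report, final version"
copy = b"quarterly report, final version"      # byte-for-byte duplicate
edited = b"quarterly report, final version 2"  # slightly different content

for name in ("md5", "sha256"):
    same = fingerprint(original, name) == fingerprint(copy, name)
    differs = fingerprint(original, name) != fingerprint(edited, name)
    print(name, "duplicate detected:", same, "| edit detected:", differs)
```

Identical content always yields the same digest, so comparing short hashes stands in for comparing the full data. MD5 is no longer considered collision-resistant, so a member of the SHA-2 family such as SHA-256 is the safer default in a sketch like this.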