Secure Block-Based Data De-Duplication System Using Hash-Code Referencing Technology
Author: BISASO NICHOLAS
Supervisor: Rahman Sanya Bs.
There has been an increase in the design and development of data backup techniques, both offline and in the cloud, especially in developed countries. In addition to the usual problems of poor infrastructure and technology, data de-duplication systems designed for the developing world must serve users whose requirements for data compression, security and usability differ from those assumed by systems designed for the developed world.
This study investigated data de-duplication across stand-alone workstations and mobile devices, using a de-duplication model and an interface that support the usage patterns and concerns of low-literacy users in developing countries. The main goal of the study was to minimize the storage capacity taken up by files that are stored multiple times and that may later be transferred over the internet, consuming bandwidth. Using block-based file chunking, a hash-code is generated for each file or data chunk and then stored as a reference. This hash-code is matched against those of all existing chunks. When a match is found, the file or chunk is confirmed to already exist, so it is backed up only once.
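The chunk-and-hash matching described above can be illustrated with a minimal sketch. It is not the DDS implementation: the fixed block size, the use of SHA-256 as the hash-code, the in-memory dictionaries, and the names `ChunkStore`, `backup`, `restore` and `stored_bytes` are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size in bytes; the study does not state one


class ChunkStore:
    """Hypothetical in-memory store that keeps each distinct block only once."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.chunks = {}  # hash-code -> block bytes (each unique block stored once)
        self.files = {}   # file name -> ordered list of hash-code references

    def backup(self, name, data):
        """Split data into blocks, store unseen blocks, return how many were new."""
        refs = []
        new_chunks = 0
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            # SHA-256 is an assumed choice of hash-code, not the study's.
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.chunks:
                # No match among existing chunks: store the block itself.
                self.chunks[digest] = block
                new_chunks += 1
            # Match or not, the file only records a reference to the block.
            refs.append(digest)
        self.files[name] = refs
        return new_chunks

    def restore(self, name):
        """Rebuild a file by following its hash-code references in order."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Total bytes physically held after de-duplication."""
        return sum(len(block) for block in self.chunks.values())
```

In this sketch, backing up a second file with identical content adds no new blocks; only a list of references is recorded, and `restore` reassembles the original bytes from those references.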
The findings of the survey of end-users were useful in understanding the current state of practice in data backup handling, understanding organizational needs and requirements, and deciding the nature of the data de-duplication system to be implemented. The design, development, implementation and evaluation of the Data De-duplication System (DDS) were achieved through a User-Centered Design (UCD) approach combined with Hash-code Referencing Technology.
The experimental results of the data de-duplication system reveal that the system is useful to users. Results also demonstrate that DDS can be extended to personal mobile phones in future for data cost reduction and secure cloud storage of personal data.