BLOCK LEVEL DATA DEDUPLICATION

  • Unique Paper ID: 187154
  • PageNo: 4599-4606
  • Abstract:
  • In today’s world of massive data generation, efficient storage utilization has become a critical challenge. Data deduplication is a storage optimization technique that eliminates redundant data by retaining only one unique copy of repeating blocks. This paper presents an implementation of block-level data deduplication in Python that divides files into smaller fixed-size or variable-size blocks, computes their hash values, and stores only unique blocks. The system uses hashing algorithms such as SHA-256 to identify duplicate blocks and a simple indexing mechanism to manage block references. The proposed system detects identical data blocks across files and reduces storage space significantly. The Python-based implementation highlights the feasibility of deduplication in research and teaching environments, providing a foundation for extending this work to cloud storage systems, backup solutions, and file versioning platforms.
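
The fixed-size variant described in the abstract can be illustrated with a minimal sketch: split each file into blocks, hash each block with SHA-256, keep one copy per distinct hash, and record an ordered list of hashes per file as the index. This is not the paper's implementation. The class name BlockStore, its method names, the in-memory dictionaries, and the 4096-byte default block size are all assumptions made here for illustration.

```python
import hashlib
from typing import Dict, List

# Assumed default block size in bytes; the abstract does not specify one.
BLOCK_SIZE = 4096


class BlockStore:
    """Hypothetical in-memory store that keeps one copy of each unique block."""

    def __init__(self, block_size: int = BLOCK_SIZE) -> None:
        self.block_size = block_size
        self.blocks: Dict[str, bytes] = {}     # SHA-256 hex digest -> block contents
        self.index: Dict[str, List[str]] = {}  # file name -> ordered list of digests

    def add_file(self, name: str, data: bytes) -> None:
        """Split data into fixed-size blocks and store only blocks not seen before."""
        digests: List[str] = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # setdefault leaves an existing entry untouched, so duplicates cost nothing.
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.index[name] = digests

    def read_file(self, name: str) -> bytes:
        """Rebuild a file by concatenating its blocks in index order."""
        return b"".join(self.blocks[d] for d in self.index[name])

    def stored_bytes(self) -> int:
        """Total size of the unique blocks actually kept."""
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = BlockStore(block_size=4)  # tiny block size so the effect is visible
    store.add_file("a.txt", b"AAAABBBBCCCC")
    store.add_file("b.txt", b"AAAABBBBDDDD")
    print(len(store.blocks))     # 4 unique blocks for 6 input blocks
    print(store.stored_bytes())  # 16 bytes kept for 24 bytes of input
```

In this sketch the two files share their first two blocks, so only four of the six blocks are stored. Fixed-size splitting stops finding duplicates once an insertion shifts the content that follows it, which is the usual motivation for the variable-size blocks the abstract also mentions.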

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{187154,
        author = {Suvarna Lahanu Ghogare},
        title = {BLOCK LEVEL DATA DEDUPLICATION},
        journal = {International Journal of Innovative Research in Technology},
        year = {2025},
        volume = {12},
        number = {6},
        pages = {4599-4606},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=187154},
        abstract = {In today’s world of massive data generation, efficient storage utilization has become a critical challenge. Data deduplication is a storage optimization technique that eliminates redundant data by retaining only one unique copy of repeating blocks. This paper presents an implementation of block-level data deduplication in Python that divides files into smaller fixed-size or variable-size blocks, computes their hash values, and stores only unique blocks. The system uses hashing algorithms such as SHA-256 to identify duplicate blocks and a simple indexing mechanism to manage block references. The proposed system detects identical data blocks across files and reduces storage space significantly. The Python-based implementation highlights the feasibility of deduplication in research and teaching environments, providing a foundation for extending this work to cloud storage systems, backup solutions, and file versioning platforms.},
        keywords = {Data Deduplication, Python, Storage Optimization, Hashing, Cloud Storage, Block-Level Deduplication},
        month = {November},
        }

Cite This Article

Ghogare, S. L. (2025). BLOCK LEVEL DATA DEDUPLICATION. International Journal of Innovative Research in Technology (IJIRT), 12(6), 4599–4606.
