Evaluation of Basic Data Compression Algorithms in a Distributed Environment
PDF

Keywords

 Distributed Computing, Compression, LZ, Shannon-Fano, RLE, Huffman, Scalability.

How to Cite

Minhaj Ahmad Khan. (2021). Evaluation of Basic Data Compression Algorithms in a Distributed Environment. Journal of Basic & Applied Sciences, 8(2), 362–365. https://doi.org/10.6000/1927-5129.2012.08.02.18

Abstract

Data compression methods aim at reducing the data size in order to make data transfer more efficient. To accomplish data compression, the basic algorithms such as Huffman, Lempel-Ziv (LZ), Shannon-Fano (SF) and Run-Length Encoding (RLE) are widely used. Most of the applications incorporate different variants of these algorithms. This paper analyzes the execution times, compression ratio and efficiency of compression methods in a client-server distributed environment. The data from a client is distributed to multiple processors/servers, subsequently compressed by the servers at remote locations, and sent back to the client. Our experimentation has been carried out using Simgrid Framework. Our results show that the LZ algorithm attains better efficiency/scalability and compression ratio, however, it works slower than other algorithms.

https://doi.org/10.6000/1927-5129.2012.08.02.18
PDF

References

Huffman DA. A Method for the Construction of Minimum-Redundancy Codes. Proceedings of the I.R.E. Sep. 1952; 1098-102.

Vitter JS. Design and Analysis of Dynamic Huffman Codes. J ACM 1987; 34(4): 825-45. http://dx.doi.org/10.1145/31846.42227

Blelloch GE. Introduction to Data Compression. Carnegie Mellon Univ., 2010.

Cormen TH, Stein C, Rivest RL, Leiserson CE. Introduction to Algorithms (2nd ed.). McGraw-Hill Higher Education 2001.

Shannon CE. A Mathematical Theory of Communication. Bell Syst Tech J 1948; 27: 379-23.

Ziv J, Lempel A. A universal Algorithm for Sequential Data Compression. IEEE Trans Inform Theory 1977; 23(3): 337-43.

Ziv J, Lempel A. Compression of Individual Sequences via Variable-Rate Coding. IEEE Trans Inform Theory 1978; 24(5): 530-36.

Leurs L. RLE Compression. http://www.prepressure.com/library/compression_algorithms/rle, 2012.

Bondi AB. Characteristics of scalability and their impact on performance. Proceedings of the 2nd international workshop on Software and performance, Ottawa Canada, 2000; pp. 195-203.

Fox G, Johnson M, Lyzenga G, Otto S, Salmon J, Walker D. Solving Problems on Concurrent Processors. Prentice-Hall, Englewood Cliffs, NJ 1988; Vol. 1.

Kumar V, Grama A, Gupta A, Karypis G. Introduction to Parallel Computing - Design and Analysis of Algorithms. The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA, 1994.

Casanova H, Legrand A, Quinson M. SimGrid: a Generic Framework for Large-Scale Distributed Experimentations. Proceedings of the 10th IEEE International Conference on Computer Modelling and Simulation (UKSIM/EUROSIM'08), 2008.

Geelnard M. Basic Compression Library. 2006; Available: http://bcl.comli.eu/.