Lempel-Ziv Welch compression
(LZW) The
algorithm used by the
Unix compress command to reduce the size of files, e.g. for archival or transmission. LZW was designed by Terry Welch in 1984 for implementation in hardware for high-performance disk controllers.
It is a variant of LZ78, one of the two
Lempel-Ziv compression schemes.
The LZW algorithm relies on reoccurrence of byte sequences (strings) in its input.
It maintains a table mapping input strings to their associated output codes.
The table initially contains mappings for all possible strings of length one. Input is taken one byte at a time to find the longest initial string present in the table.
The code for that string is output and then the string is extended with one more input byte, b.
A new entry is added to the table mapping the extended string to the next unused code (obtained by incrementing a counter).
The process repeats, starting from byte b.
The number of bits in an output code, and hence the maximum number of entries in the table is usually fixed and once this limit is reached, no more entries are added.
LZW compression and decompression are licensed under Unisys Corporation's 1984 U.S. Patent 4,558,302 and equivalent foreign patents.
This kind of patent isn't legal in most coutries of the world (including the UK) except the USA. Patents in the UK can't describe
algorithms or mathematical methods.
[A Technique for High Performance Data Compression, Terry A. Welch, IEEE Computer, 17(6), June 1984, pp. 8-19]
[J. Ziv and A. Lempel, "A Universal Algorithm for Sequential Data Compression," IEEE Transactions on Information Theory, Vol. IT-23, No. 3, May 1977, pp. 337-343].