|
LZ77 and LZ78 are the names for the two lossless data compression algorithms published in
papers by Abraham Lempel and
Jacob Ziv in 1977 and 1978. These two algorithms form the basis for most of the LZ
variations including LZW, LZSS and others. They are both dictionary coders, unlike minimum redundancy coders or run length coders. LZ77 is
the "sliding window" compression algorithm, which was later shown to be equivalent to the explicit dictionary technique first
given in LZ78.
The LZ77 algorithm works by keeping a history window of the most recently seen data and comparing the current data being
encoded with the data in the history window. What is actually placed into the compressed stream are references to the position in
the history window, and the length of the match. If a match cannot be found the character itself is simply encoded into the
stream after being flagged as a literal. As of 2004, the most popular LZ77 based compression method is called DEFLATE; it combines LZ77 with Huffman coding.
While the LZ77 algorithm works on past data, the LZ78 algorithm attempts to work on future data. It does this by forward
scanning the input buffer and matching it against a dictionary it maintains. It will scan into the buffer until it cannot find a
match in the dictionary. At this point it will output the location of the word in the dictionary, if one is available, the match
length and the character that caused a match failure. The resulting word is then added to the dictionary.
LZ78 never became as popular as LZ77 because for the first few decades after it was introduced, LZ78 was somewhat of a
patent minefield in the United
States, while LZ77 is not patented. The most popular form of LZ78 compression was the LZW
algorithm, a modification of the LZ78 algorithm made by Terry Welch, which proved to be a patent minefield.
References
- Abraham Lempel, Jacob Ziv; A Universal Algorithm for
Sequential Data Compression , IEEE
Transactions on Information Theory, May 1977.
- Abraham Lempel, Jacob Ziv; Compression
of Individual Sequences Via Variable-Rate Coding , IEEE
Transactions on Information Theory, September 1978.
|