LZW coding

Creator
Creator
Seonglae Cho
Created
Created
2025 Mar 19 11:55
Editor
Edited
Edited
2025 Mar 19 12:6

Lempel–Ziv–Welch

  • Input is a set of 8-bit codewords, Output is a string of 7-bit ASCII characters
  • Maintains a symbol table mapping input characters/strings (keys) to fixed size output codewords (values)
 

Symbol table

  • Symbol table contains 256 entries
  • Initialize with the 128 single character entries
  • Reserve codeword 80 for End Of File (EOF)
  • Using hexadecimal notation:
    • Codewords 00-7F (0111 1111) represent the 128 single characters
    • Codeword 80 (1000 0000) is EOF
    • Codewords 81-FF (1000 0001 – 1111 1111) represent strings of characters
 

Algorithm

Add an entry associating the next unused codeword (value) with the key formed by appending the lookahead character to the key
  1. Find the
    Longest prefix matching
    s in the symbol table that is a prefix of the unscanned input
  1. Write the 8-bit codeword associated with s
  1. Scan one character past s in the input
  1. Associate the next codeword value with s+c (c appended to s) in the symbol table, where c is the next character in the input.
 

LZW decoding

Only requires initial dictionary without the whole symbol table
  1. Write the current string val
  1. Read the next codeword x from the input
  1. Set s to the value associated with x in the symbol table
  1. Associate the next unassigned codeword to the value val+c in the symbol table, where c is the first character of s
  1. Set the current string val to s
  1. Loop until EOF
 
 
 
 
 

Recommendations