PROXY  WHOIS  RQUOTE  TEXTS  SOFT  FOREX  BBOARD
 Music  Philosophy  Code  Literature  Russian

= ROOT|Technical|RFC|rfc1951.txt =

page 4 of 10




         In other words, if one were to print out the compressed data as
         a sequence of bytes, starting with the first byte at the
         *right* margin and proceeding to the *left*, with the most-
         significant bit of each byte on the left as usual, one would be
         able to parse the result from right to left, with fixed-width
         elements in the correct MSB-to-LSB order and Huffman codes in
         bit-reversed order (i.e., with the first bit of the code in the
         relative LSB position).

   3.2. Compressed block format

      3.2.1. Synopsis of prefix and Huffman coding

         Prefix coding represents symbols from an a priori known
         alphabet by bit sequences (codes), one code for each symbol, in
         a manner such that different symbols may be represented by bit
         sequences of different lengths, but a parser can always parse
         an encoded string unambiguously symbol-by-symbol.

         We define a prefix code in terms of a binary tree in which the
         two edges descending from each non-leaf node are labeled 0 and
         1 and in which the leaf nodes correspond one-for-one with (are
         labeled with) the symbols of the alphabet; then the code for a
         symbol is the sequence of 0's and 1's on the edges leading from
         the root to the leaf labeled with that symbol.  For example:












 
RFC 1951      DEFLATE Compressed Data Format Specification      May 1996


                          /\              Symbol    Code
                         0  1             ------    ----
                        /    \                A      00
                       /\     B               B       1
                      0  1                    C     011
                     /    \                   D     010
                    A     /\
                         0  1
                        /    \
                       D      C

         A parser can decode the next symbol from an encoded input
         stream by walking down the tree from the root, at each step
         choosing the edge corresponding to the next input bit.

         Given an alphabet with known symbol frequencies, the Huffman
         algorithm allows the construction of an optimal prefix code
         (one which represents strings with those symbol frequencies
         using the fewest bits of any possible prefix codes for that
         alphabet).  Such a code is called a Huffman code.  (See
         reference [1] in Chapter 5, references for additional
         information on Huffman codes.)

         Note that in the "deflate" format, the Huffman codes for the
         various alphabets must not exceed certain maximum code lengths.
         This constraint complicates the algorithm for computing code
         lengths from symbol frequencies.  Again, see Chapter 5,
         references for details.

      3.2.2. Use of Huffman coding in the "deflate" format

         The Huffman codes used for each alphabet in the "deflate"
         format have two additional rules:

             * All codes of a given bit length have lexicographically
               consecutive values, in the same order as the symbols
               they represent;

             * Shorter codes lexicographically precede longer codes.













 
RFC 1951      DEFLATE Compressed Data Format Specification      May 1996


         We could recode the example above to follow this rule as
         follows, assuming that the order of the alphabet is ABCD:
=4=

1|2|3| < PREV = PAGE 4 = NEXT > |5|6|7|8|9|10

UP TO ROOT | UP TO DIR | TO FIRST PAGE

Google
 


E-mail Facebook Google Digg del.icio.us BlinkList Fark Furl Ma.gnolia Netscape NewsVine Reddit Slashdot Spurl StumbleUpon Technorati YahooMyWeb LiveJournal Blogmarks TwitThis Live News2.ru BobrDobr.ru Memori.ru MoeMesto.ru

0.0110528 wallclock secs ( 0.00 usr + 0.00 sys = 0.00 CPU)