PROXY  WHOIS  RQUOTE  TEXTS  SOFT  FOREX  BBOARD
 Music  Philosophy  Code  Literature  Russian

= ROOT|Technical|RFC|rfc1951.txt =

page 9 of 10



   be examined (not necessarily all different, of course).  First, the
   compressor examines the hash chain for XYZ.  If the chain is empty,
   the compressor simply writes out X as a literal byte and advances one
   byte in the input.  If the hash chain is not empty, indicating that
   the sequence XYZ (or, if we are unlucky, some other 3 bytes with the
   same hash function value) has occurred recently, the compressor
   compares all strings on the XYZ hash chain with the actual input data
   sequence starting at the current point, and selects the longest
   match.

   The compressor searches the hash chains starting with the most recent
   strings, to favor small distances and thus take advantage of the
   Huffman encoding.  The hash chains are singly linked. There are no
   deletions from the hash chains; the algorithm simply discards matches
   that are too old.  To avoid a worst-case situation, very long hash
   chains are arbitrarily truncated at a certain length, determined by a
   run-time parameter.

   To improve overall compression, the compressor optionally defers the
   selection of matches ("lazy matching"): after a match of length N has
   been found, the compressor searches for a longer match starting at
   the next input byte.  If it finds a longer match, it truncates the
   previous match to a length of one (thus producing a single literal
   byte) and then emits the longer match.  Otherwise, it emits the
   original match, and, as described above, advances N bytes before
   continuing.

   Run-time parameters also control this "lazy match" procedure.  If
   compression ratio is most important, the compressor attempts a
   complete second search regardless of the length of the first match.
   In the normal case, if the current match is "long enough", the
   compressor reduces the search for a longer match, thus speeding up
   the process.  If speed is most important, the compressor inserts new
   strings in the hash table only when no match was found, or when the
   match is not "too long".  This degrades the compression ratio but
   saves time since there are both fewer insertions and fewer searches.






 
RFC 1951      DEFLATE Compressed Data Format Specification      May 1996


5. References

   [1] Huffman, D. A., "A Method for the Construction of Minimum
       Redundancy Codes", Proceedings of the Institute of Radio
       Engineers, September 1952, Volume 40, Number 9, pp. 1098-1101.

   [2] Ziv J., Lempel A., "A Universal Algorithm for Sequential Data
       Compression", IEEE Transactions on Information Theory, Vol. 23,
       No. 3, pp. 337-343.

   [3] Gailly, J.-L., and Adler, M., ZLIB documentation and sources,
       available in ftp://ftp.uu.net/pub/archiving/zip/doc/

   [4] Gailly, J.-L., and Adler, M., GZIP documentation and sources,
       available as gzip-*.tar in ftp://prep.ai.mit.edu/pub/gnu/

   [5] Schwartz, E. S., and Kallick, B. "Generating a canonical prefix
       encoding." Comm. ACM, 7,3 (Mar. 1964), pp. 166-169.

   [6] Hirschberg and Lelewer, "Efficient decoding of prefix codes,"
       Comm. ACM, 33,4, April 1990, pp. 449-459.

6. Security Considerations

   Any data compression method involves the reduction of redundancy in
   the data.  Consequently, any corruption of the data is likely to have
   severe effects and be difficult to correct.  Uncompressed text, on
   the other hand, will probably still be readable despite the presence
   of some corrupted bytes.

   It is recommended that systems using this data format provide some
   means of validating the integrity of the compressed data.  See
   reference [3], for example.

7. Source code

   Source code for a C language implementation of a "deflate" compliant
   compressor and decompressor is available within the zlib package at
   ftp://ftp.uu.net/pub/archiving/zip/zlib/.

8. Acknowledgements

   Trademarks cited in this document are the property of their
   respective owners.

   Phil Katz designed the deflate format.  Jean-Loup Gailly and Mark
   Adler wrote the related software described in this specification.
   Glenn Randers-Pehrson converted this document to RFC and HTML format.




 
RFC 1951      DEFLATE Compressed Data Format Specification      May 1996
=9=

1|2|3|4|5|6|7|8| < PREV = PAGE 9 = NEXT > |10

UP TO ROOT | UP TO DIR | TO FIRST PAGE

Google
 


E-mail Facebook Google Digg del.icio.us BlinkList Fark Furl Ma.gnolia Netscape NewsVine Reddit Slashdot Spurl StumbleUpon Technorati YahooMyWeb LiveJournal Blogmarks TwitThis Live News2.ru BobrDobr.ru Memori.ru MoeMesto.ru

0.045963 wallclock secs ( 0.01 usr + 0.01 sys = 0.02 CPU)