Use of DEFLATE with NNTP (Netnews / Usenet)

Hi all,

As you may already know, TLS-level compression will be removed from the
next version of the TLS protocol (currently named TLS 1.3), whereas NNTP
has relied on this feature (see for instance the abstract of RFC 4642).
That is why we designed a new NNTP command based on DEFLATE to allow a
connection to be compressed, and we are currently in the process of
standardizing it.

You can find the current version of the draft here:

In case some of you are interested and have time to read the draft, do
not hesitate to tell me if you find something wrong or missing.

Here are the parts of the draft that are related to compression and DEFLATE:

3.  Compression Efficiency

    This section is informative, not normative.

    NNTP poses some unusual problems for a compression layer.

    Upstream traffic is fairly simple.  Most NNTP clients send the same
    few commands again and again, so any compression algorithm that can
    exploit repetition works efficiently.  The article posting and
    transfer commands (e.g., POST, IHAVE, and TAKETHIS [RFC4644]) are
    exceptions; clients that send many article posting or transfer
    commands may want to surround large multi-line data blocks with a
    dictionary flush and/or, depending on the compression algorithm, a
    change of compression level in the same way as is recommended for
    servers later in this document (Section 3.1).

    Downstream traffic has the unusual property that several kinds of
    data are sent, possibly confusing a dictionary-based compression
    algorithm.

    One type is NNTP simple responses and NNTP multi-line responses not
    related to article header/body retrieval (e.g., CAPABILITIES, GROUP,
    [RFC4644], etc.).  These are highly compressible; zlib using its least
    CPU-intensive setting compresses typical responses to 25-40% of their
    original size.

    Another type is article headers (as retrieved via the HEAD, HDR,
    OVER, or ARTICLE commands).  These are equally compressible, and
    benefit from using the same dictionary as the NNTP responses.

    A third type is article body text (as retrieved via the BODY or
    ARTICLE commands).  Text is usually fairly short and includes much
    ASCII, so the same compression dictionary will do a good job here,
    too.  When multiple messages in the same thread are read at the same
    time, quoted lines, etc. can often be compressed almost to zero.

    Finally, non-text article bodies or attachments (as retrieved via the
    BODY and ARTICLE commands) are transmitted in encoded form, usually
    Base64 [RFC4648], UUencode [IEEE.1003-2.1992], or yEnc [yEnc].

    When already compressed articles or attachments are retrieved, a
    compression algorithm may be able to compress them, but the format of
    their encoding is usually not NNTP-like, so the dictionary built
    while compressing NNTP does not help much.  The compressor has to
    adapt its dictionary from NNTP to the attachment's encoding format,
    and then back.

    When attachments are retrieved in Base64 or UUencode form, the
    Huffman coding usually compresses those to only about 75% of their
    encoded size.  8-bit compression algorithms such as DEFLATE work
    well on 8-bit file formats; however, both Base64 and UUencode
    transform a file into something resembling 6-bit bytes, hiding most
    of the 8-bit file format from the compressor.

    On the other hand, attachments encoded using a scheme that retains
    the full 8-bit spectrum, like yEnc, are much more likely to be
    incompressible.
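
To make the 75% figure concrete outside the draft text, here is a minimal
Python sketch. It assumes that uniformly random bytes are an acceptable
stand-in for an already compressed attachment, and that raw random bytes
approximate an encoding that keeps the full 8-bit range (the standard
library has no yEnc codec). The names raw_deflate_ratio() and
sample_ratios() are mine, not from the draft.

```python
import base64
import random
import zlib


def raw_deflate_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size / original size for a raw DEFLATE stream."""
    # Negative wbits selects raw DEFLATE (no zlib header or trailer).
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    out = comp.compress(data) + comp.flush()
    return len(out) / len(data)


def sample_ratios(n: int = 64 * 1024, seed: int = 0) -> dict:
    """Compare random 8-bit data with its Base64 encoding."""
    rng = random.Random(seed)
    # Assumption: random bytes model an already compressed attachment.
    binary = bytes(rng.getrandbits(8) for _ in range(n))
    encoded = base64.encodebytes(binary)  # Base64 text with line breaks
    return {
        "binary": raw_deflate_ratio(binary),
        "base64": raw_deflate_ratio(encoded),
    }


if __name__ == "__main__":
    for name, ratio in sample_ratios().items():
        print(f"{name}: {ratio:.1%} of the input size")
```

The Base64 case should land near three quarters of the encoded size,
because each output character carries only 6 bits of information, while
the 8-bit case should stay at roughly 100%.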

3.1.  DEFLATE Specificities

    When using the zlib library (see [RFC1951]), the functions
    deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to
    implement this extension.  The windowBits value MUST be in the range
    -8 to -15 for deflateInit2(), or else it will use the wrong format.
    The windowBits value SHOULD be -15 for inflateInit2(), or else it
    will not be able to decompress a stream with a larger window size.
    deflateParams() can be used to improve compression rate and resource
    use.  In order to improve compression efficiency, the Z_PARTIAL_FLUSH
    argument to deflate() should always be used to flush data.  As far as
    DEFLATE is concerned, clearing the dictionary never improves
    compression over the other flushes.  On the contrary, having the 32kB
    dictionary from previous data, no matter how unrelated, can only
    help.  If there are no matching strings in there, then it is simply
    not referenced.  Using Z_FULL_FLUSH clears the dictionary, and
    consequently always results in compression that is less effective
    than a Z_PARTIAL_FLUSH.
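
Outside the draft text, the flush behaviour described above can be
tried with Python's zlib module, which wraps the same library:
compressobj() and decompressobj() take the same windowBits convention as
deflateInit2() and inflateInit2(). This is only a sketch; chunk_sizes()
is a name I made up, and the fallback value 1 for Z_PARTIAL_FLUSH is the
value defined in zlib.h, used in case a Python build does not export the
constant.

```python
import zlib

# Z_PARTIAL_FLUSH is 1 in zlib.h; fall back to that value if this Python
# build does not export the constant.
Z_PARTIAL_FLUSH = getattr(zlib, "Z_PARTIAL_FLUSH", 1)


def chunk_sizes(messages, flush_mode, level=1):
    """Compress each message on one raw DEFLATE stream, flushing after each.

    Returns the number of compressed bytes produced per message.
    """
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)  # raw DEFLATE
    decomp = zlib.decompressobj(-15)                    # 32 kB window
    sizes = []
    for msg in messages:
        chunk = comp.compress(msg) + comp.flush(flush_mode)
        # After a flush, the receiver can decode everything sent so far.
        assert decomp.decompress(chunk) == msg
        sizes.append(len(chunk))
    return sizes


if __name__ == "__main__":
    responses = [b"211 1234 3000234 3002322 misc.test\r\n"] * 5
    print("partial flush:", chunk_sizes(responses, Z_PARTIAL_FLUSH))
    print("full flush:   ", chunk_sizes(responses, zlib.Z_FULL_FLUSH))
```

With a partial flush, every repeated response after the first can be
encoded as a back-reference into the retained dictionary; with a full
flush, each one is compressed from scratch.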

    A server can improve downstream compression and the CPU efficiency
    of both the server and the client if it adjusts the compression level
    (e.g., using the deflateParams() function in zlib) at the start and
    end of large non-text multi-line data blocks (before and after
    'content-lines' in the definition of 'multi-line-data-block' in
    [RFC3977] Section 9.8).  This avoids wasting effort on trying to
    compress incompressible attachments.  Small multi-line data blocks
    are best left alone.  A possible boundary is 5 kB.

    A very simple strategy is to change the compression level to 0 at the
    start of a multi-line data block provided the first two bytes are
    either 0x1F 0x8B (as in gzip files) or 0xFF 0xD8
    (JPEG), and to keep it at 1-5 the rest of the time.  More complex
    strategies are of course possible, and encouraged.
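
As an illustration outside the draft text, the simple strategy can be
written as a small decision function. This is a sketch under stated
assumptions: choose_level() and the constant names are hypothetical, the
5 kB boundary is the "possible boundary" mentioned above, and combining
the size check with the magic-byte check is my own reading. Python's zlib
module does not expose deflateParams(), so the function only picks the
level; a C implementation would pass the result to deflateParams().

```python
GZIP_MAGIC = b"\x1f\x8b"   # first two bytes named in the text
JPEG_MAGIC = b"\xff\xd8"   # first two bytes of a JPEG file
SIZE_BOUNDARY = 5 * 1024   # assumption: the "possible boundary" of 5 kB


def choose_level(block: bytes, default_level: int = 1) -> int:
    """Pick a compression level for one multi-line data block.

    Returns 0 (store only) for large blocks that start with a known
    incompressible signature, and default_level otherwise.
    """
    if len(block) < SIZE_BOUNDARY:
        return default_level   # small blocks are best left alone
    if block[:2] in (GZIP_MAGIC, JPEG_MAGIC):
        return 0               # do not waste CPU on incompressible data
    return default_level


if __name__ == "__main__":
    print(choose_level(b"\xff\xd8" + bytes(8000)))     # large JPEG-like block
    print(choose_level(b"Subject: hello\r\n" * 1000))  # large text block
```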

6.  Security Considerations

    Security issues are discussed throughout this document.

    In general, the security considerations of the NNTP core
    specification ([RFC3977] Section 12) and the DEFLATE compressed data
    format specification ([RFC1951] Section 6) are applicable here.

    Implementers should be aware that combining compression with
    encryption like TLS can sometimes reveal information that would not
    have been revealed without compression, as explained in Section 6 of
    [RFC3749].  As a matter of fact, adversaries that observe the length
    of the compressed data might be able to derive information about the
    corresponding uncompressed data.  The CRIME and BREACH attacks
    ([RFC7457] Section 2.6) are examples of this.
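
To show why length alone leaks information, here is a toy sketch outside
the draft text. It is not a reproduction of CRIME or BREACH; it only
assumes that a secret line and attacker-influenced text end up in the
same DEFLATE stream. The function name compressed_len() and the sample
credentials are made up for the example.

```python
import zlib


def compressed_len(secret_line: bytes, attacker_text: bytes) -> int:
    """Length of a raw DEFLATE stream containing both inputs."""
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    return len(comp.compress(secret_line + attacker_text) + comp.flush())


if __name__ == "__main__":
    secret = b"AUTHINFO PASS s3cr3t-T0k3n\r\n"  # hypothetical credential
    right = b"AUTHINFO PASS s3cr3t-T0k3n"       # guess equal to the secret
    wrong = b"AUTHINFO PASS zq8Wm1-Lp4Xy"       # same length, wrong guess
    print("right guess:", compressed_len(secret, right))
    print("wrong guess:", compressed_len(secret, wrong))
```

The right guess is encoded as a single back-reference to the secret, so
the stream is shorter, and an observer who sees only the lengths can tell
the two guesses apart.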

    In order to help mitigate leaking authentication credentials, this
    document states in Section 2.2.2 that authentication SHOULD NOT be
    attempted when a compression layer is active.  Therefore, when a
    client wants to authenticate, compress data, and negotiate a TLS
    layer (without TLS-level compression) in the same NNTP connection, it
    SHOULD use the STARTTLS, AUTHINFO, and COMPRESS commands in that
    order.  Of course, instead of using the STARTTLS command, a client
    can also use implicit TLS, that is to say it begins the TLS
    negotiation immediately upon connection on a separate port dedicated
    to NNTP over TLS.
    NNTP commands other than AUTHINFO are not believed to divulge
    confidential information as long as only public Netnews newsgroups
    and articles are accessed.  That is why this specification only adds
    a restriction to the use of AUTHINFO when a compression layer is
    active.  In case confidential articles are accessed in private
    newsgroups, special care is needed: implementations SHOULD NOT
    compress confidential data together with public data when a security
    layer is active, for the same reasons as mentioned above in this
    section.
    Additionally, implementations MAY ensure that the contents of two
    distinct confidential articles are not compressed together.  This can
    be achieved for instance with DEFLATE by clearing the compression
    dictionary each time a confidential article is sent.  More complex
    implementations are of course possible, and encouraged.
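
As an illustration outside the draft text, clearing the dictionary
around confidential articles could look like the sketch below. It uses
Python's zlib module, where flush(zlib.Z_FULL_FLUSH) resets the
compressor's history. The class name IsolatingCompressor and its send()
method are hypothetical, the fallback value 1 for Z_PARTIAL_FLUSH is the
value from zlib.h, and clearing before as well as after a confidential
article is my own choice to keep it apart from earlier public data too.

```python
import zlib

Z_PARTIAL_FLUSH = getattr(zlib, "Z_PARTIAL_FLUSH", 1)  # value 1 in zlib.h


class IsolatingCompressor:
    """Raw DEFLATE stream that keeps confidential data out of the dictionary."""

    def __init__(self, level: int = 1):
        self._comp = zlib.compressobj(level, zlib.DEFLATED, -15)
        self._dirty = False  # True if the dictionary may hold earlier data

    def send(self, payload: bytes, confidential: bool = False) -> bytes:
        out = b""
        if confidential and self._dirty:
            # Forget earlier data so the article cannot reference it.
            out += self._comp.flush(zlib.Z_FULL_FLUSH)
        out += self._comp.compress(payload)
        if confidential:
            # Forget the article so later data cannot reference it.
            out += self._comp.flush(zlib.Z_FULL_FLUSH)
            self._dirty = False
        else:
            out += self._comp.flush(Z_PARTIAL_FLUSH)
            self._dirty = True
        return out
```

Sending the same confidential article twice through send() costs about
as much the second time as the first, which is the price of not letting
one article be compressed against another.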

    Implementations SHOULD use a default configuration with disabled
    compression when a security layer is active, and MUST support an
    option to allow compression to be enabled when a security layer is
    active.  Such an option can be either with global scope or server/
    connection based.  Implementations MAY unconditionally allow
    compression to be enabled when no security layer is active.

    Future extensions to NNTP that define commands conveying confidential
    data SHOULD state that these confidential data SHOULD NOT be
    compressed together with public data when a security layer is
    active.
Thanks in advance for your helpful comments,

Julien ÉLIE

« – To the beach? But it's raining!
   – Not at all! In the south of Gaul, it rains. Here, it's just
     a little damp. Invigorating. Isn't that right, Astérix?
   – This morning, it's getting more and more invigorating! » (Astérix)
9/11/2016 8:27:33 PM
comp.compression