As you may already know, TLS-level compression will be removed from the
next version of the TLS protocol (currently named TLS 1.3), whereas NNTP
has relied on this feature (see for instance the abstract of RFC 4642).
That's why we designed a new NNTP command based on DEFLATE to allow a
connection to be compressed, and we currently are in the process of
You can find the current version of the draft here:
If some of you are interested and have time to read the draft, please do
not hesitate to tell us if you find anything wrong or missing.
Here are the parts of the draft that are related to compression and DEFLATE:
3. Compression Efficiency
This section is informative, not normative.
NNTP poses some unusual problems for a compression layer.
Upstream traffic is fairly simple. Most NNTP clients send the same
few commands again and again, so any compression algorithm that can
exploit repetition works efficiently. The article posting and
transfer commands (e.g., POST, IHAVE, and TAKETHIS [RFC4644]) are
exceptions; clients that send many article posting or transfer
commands may want to surround large multi-line data blocks with a
dictionary flush and/or, depending on the compression algorithm, a
change of compression level in the same way as is recommended for
servers later in this document (Section 3.1).
Downstream traffic has the unusual property that several kinds of
data are sent, possibly confusing a dictionary-based compression
algorithm.
One type is NNTP simple responses and NNTP multi-line responses not
related to article header/body retrieval (e.g., CAPABILITIES, GROUP,
LISTGROUP, LAST, NEXT, STAT, DATE, NEWNEWS, NEWGROUPS, LIST, CHECK
[RFC4644], etc.). These are highly compressible; zlib using its least
CPU-intensive setting compresses typical responses to 25-40% of their
original size.
Another type is article headers (as retrieved via the HEAD, HDR,
OVER, or ARTICLE commands). These are equally compressible, and
benefit from using the same dictionary as the NNTP responses.
A third type is article body text (as retrieved via the BODY or
ARTICLE commands). Text is usually fairly short and includes much
ASCII, so the same compression dictionary will do a good job here,
too. When multiple messages in the same thread are read at the same
time, quoted lines, etc. can often be compressed almost to zero.
Finally, non-text article bodies or attachments (as retrieved via the
BODY and ARTICLE commands) are transmitted in encoded form, usually
Base64 [RFC4648], UUencode [IEEE.1003-2.1992], or yEnc [yEnc].
When already compressed articles or attachments are retrieved, a
compression algorithm may be able to compress them, but the format of
their encoding is usually not NNTP-like, so the dictionary built
while compressing NNTP does not help much. The compressor has to
adapt its dictionary from NNTP to the attachment's encoding format,
and then back.
When attachments are retrieved in Base64 or UUencode form, the
Huffman coding usually compresses those to only about 75% of their
encoded size. 8-bit compression algorithms such as DEFLATE
work well on 8-bit file formats; however, both Base64 and UUencode
transform a file into something resembling 6-bit bytes, hiding most
of the 8-bit file format from the compressor.
On the other hand, attachments encoded using an encoding that
retains the full 8-bit spectrum, like yEnc, are much more likely
to be incompressible.
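The two claims above can be checked with a small experiment. What follows is only a sketch: it uses Python's standard zlib module (which wraps the zlib library), random bytes as a hypothetical stand-in for an already compressed attachment, and the raw 8-bit payload as a rough proxy for a yEnc-like encoding (yEnc itself is not implemented here).

```python
import base64
import random
import zlib


def deflate_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size / original size for a raw DEFLATE stream."""
    # wbits=-15 selects raw DEFLATE with a 32 kB window.
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    out = comp.compress(data) + comp.flush()
    return len(out) / len(data)


# Assumption: random bytes approximate an already compressed attachment.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(60_000))

# Base64 uses only 64 symbols, so Huffman coding can save about 2 bits in 8.
base64_ratio = deflate_ratio(base64.b64encode(payload))
# The full 8-bit payload leaves nothing for DEFLATE to exploit.
raw_ratio = deflate_ratio(payload)

print(f"Base64-encoded: {base64_ratio:.2f} of encoded size")
print(f"8-bit payload:  {raw_ratio:.2f} of original size")
```

The exact figures depend on the data and the zlib version, so they should be read as an order of magnitude rather than a benchmark.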
3.1. DEFLATE Specificities
When using the zlib library (see [RFC1951]), the functions
deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to
implement this extension. The windowBits value MUST be in the range
-8 to -15 for deflateInit2(), or else it will use the wrong format.
The windowBits value SHOULD be -15 for inflateInit2(), or else it
will not be able to decompress a stream with a larger window size.
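As an illustration of the windowBits rule, here is a minimal sketch using Python's zlib module, whose negative wbits argument is passed on to zlib as windowBits. The 512-byte sender window and the sample response line are arbitrary choices made for this example, not values taken from the draft.

```python
import zlib

# Negative wbits selects raw DEFLATE: no zlib header or trailer is emitted.
# The sender may pick any permitted window; -9 (512 bytes) is an arbitrary choice.
compressor = zlib.compressobj(1, zlib.DEFLATED, -9)

# The receiver uses -15 so that it can handle any window size the sender chose.
decompressor = zlib.decompressobj(-15)

line = b"211 1234 3000234 3002322 misc.test\r\n"  # hypothetical GROUP response
wire = compressor.compress(line) + compressor.flush(zlib.Z_PARTIAL_FLUSH)
received = decompressor.decompress(wire)

print(received == line)
```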
deflateParams() can be used to improve compression rate and resource
use. In order to improve compression efficiency, the Z_PARTIAL_FLUSH
argument to deflate() should always be used to flush data. As far as
DEFLATE is concerned, clearing the dictionary never improves
compression over the other flushes. On the contrary, having the 32kB
dictionary from previous data, no matter how unrelated, can only
help. If there are no matching strings in there, then it is simply
not referenced. Using Z_FULL_FLUSH clears the dictionary, and
consequently always results in compression that is less effective
than a Z_PARTIAL_FLUSH.
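The effect of the flush mode can be observed with a short sketch like the one below, again with Python's zlib module. The responses are made-up examples, and flushing after every single response is an assumption chosen to make the difference visible.

```python
import zlib


def total_size(responses, flush_mode) -> int:
    """Compress each response on one stream, flushing after each one."""
    comp = zlib.compressobj(1, zlib.DEFLATED, -15)
    total = 0
    for response in responses:
        total += len(comp.compress(response) + comp.flush(flush_mode))
    return total


# Hypothetical, very similar single-line responses.
responses = [
    b"223 %d <msg%d@example.invalid> retrieved\r\n" % (n, n)
    for n in range(3000, 3050)
]

# Z_PARTIAL_FLUSH keeps the dictionary; Z_FULL_FLUSH clears it every time.
partial_total = total_size(responses, zlib.Z_PARTIAL_FLUSH)
full_total = total_size(responses, zlib.Z_FULL_FLUSH)

print(partial_total, full_total)
```

With Z_FULL_FLUSH every response is compressed as if it were the first one on the connection, so the repetition between responses is never exploited.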
A server can improve downstream compression and the CPU efficiency
both of the server and the client if it adjusts the compression level
(e.g., using the deflateParams() function in zlib) at the start and
end of large non-text multi-line data blocks (before and after
'content-lines' in the definition of 'multi-line-data-block' in
[RFC3977] Section 9.8). This avoids wasting effort on trying to compress
incompressible attachments. Small multi-line data blocks are best
left alone. A possible boundary is 5kB.
A very simple strategy is to change the compression level to 0 at the
start of a multi-line data block provided the first two bytes are
either 0x1F 0x8B (as in gzip files, which hold DEFLATE data) or 0xFF 0xD8
(JPEG), and to keep it at 1-5 the rest of the time. More complex
strategies are of course possible, and encouraged.
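The simple strategy above could look like the following sketch. The names choose_level, SIZE_THRESHOLD and DEFAULT_LEVEL are hypothetical, combining the magic-byte test with the 5 kB boundary is an assumption of this example, and 5 kB is read here as 5 * 1024 bytes. Python's zlib module does not expose deflateParams(), so the function only computes the level that a C implementation would pass to it.

```python
# Leading bytes of formats that are already compressed.
INCOMPRESSIBLE_MAGIC = (
    b"\x1f\x8b",  # gzip
    b"\xff\xd8",  # JPEG
)
SIZE_THRESHOLD = 5 * 1024  # assumption: "5kB" interpreted as 5 * 1024 bytes
DEFAULT_LEVEL = 1          # assumption: any value in the 1-5 range would do


def choose_level(block_start: bytes, block_size: int) -> int:
    """Pick a compression level for one multi-line data block."""
    if block_size >= SIZE_THRESHOLD and block_start[:2] in INCOMPRESSIBLE_MAGIC:
        return 0  # store only: compressing would cost CPU for no gain
    return DEFAULT_LEVEL


print(choose_level(b"\xff\xd8\xff\xe0", 200_000))   # large JPEG-like block
print(choose_level(b"\x1f\x8b\x08", 100))           # small block, left alone
print(choose_level(b"From: someone", 200_000))      # large text block
```

Switching back to DEFAULT_LEVEL at the end of the block is left to the caller in this sketch.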
6. Security Considerations
Security issues are discussed throughout this document.
In general, the security considerations of the NNTP core
specification ([RFC3977] Section 12) and the DEFLATE compressed data
format specification ([RFC1951] Section 6) are applicable here.
Implementers should be aware that combining compression with
encryption like TLS can sometimes reveal information that would not
have been revealed without compression, as explained in Section 6 of
[RFC3749]. In particular, adversaries that observe the length
of the compressed data might be able to derive information about the
corresponding uncompressed data. The CRIME and the BREACH attacks
([RFC7457] Section 2.6) are examples of such attacks.
In order to help mitigate leaking authentication credentials, this
document states in Section 2.2.2 that authentication SHOULD NOT be
attempted when a compression layer is active. Therefore, when a
client wants to authenticate, compress data, and negotiate a TLS
layer (without TLS-level compression) in the same NNTP connection, it
SHOULD use the STARTTLS, AUTHINFO, and COMPRESS commands in that
order. Of course, instead of using the STARTTLS command, a client can
also use implicit TLS, that is to say it begins the TLS negotiation
immediately upon connection on a separate port dedicated to NNTP over
TLS.
NNTP commands other than AUTHINFO are not believed to divulge
confidential information as long as only public Netnews newsgroups
and articles are accessed. That is why this specification only adds
a restriction to the use of AUTHINFO when a compression layer is
active. In case confidential articles are accessed in private
newsgroups, special care is needed: implementations SHOULD NOT
compress confidential data together with public data when a security
layer is active, for the same reasons as mentioned above in this
section.
Additionally, implementations MAY ensure that the contents of two
distinct confidential articles are not compressed together. This can
be achieved for instance with DEFLATE by clearing the compression
dictionary each time a confidential article is sent. More complex
implementations are of course possible, and encouraged.
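One way to clear the dictionary around a confidential article is sketched below with Python's zlib module. The function send_article and the sample data are hypothetical. The size comparison at the end shows why isolation matters: with a shared dictionary, the size of the second copy reveals that its content was already seen.

```python
import random
import string
import zlib


def send_article(comp, article: bytes, confidential: bool) -> bytes:
    """Compress one article, isolating confidential ones from other data."""
    if not confidential:
        return comp.compress(article) + comp.flush(zlib.Z_PARTIAL_FLUSH)
    # Z_FULL_FLUSH clears the dictionary: once before, so that earlier data
    # cannot be referenced, and once after, so that later data cannot
    # reference this article.
    before = comp.flush(zlib.Z_FULL_FLUSH)
    body = comp.compress(article)
    after = comp.flush(zlib.Z_FULL_FLUSH)
    return before + body + after


# Hypothetical confidential content, sent twice on the same connection.
rng = random.Random(1)
alphabet = string.ascii_letters.encode()
secret = bytes(rng.choice(alphabet) for _ in range(2000))

shared = zlib.compressobj(1, zlib.DEFLATED, -15)
shared_sizes = [len(send_article(shared, secret, False)) for _ in range(2)]

isolated = zlib.compressobj(1, zlib.DEFLATED, -15)
isolated_sizes = [len(send_article(isolated, secret, True)) for _ in range(2)]

print(shared_sizes, isolated_sizes)
```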
Implementations SHOULD use a default configuration with disabled
compression when a security layer is active, and MUST support an
option to allow compression to be enabled when a security layer is
active. Such an option can be either with global scope or server/
connection based. Implementations MAY unconditionally allow
compression to be enabled when no security layer is active.
Future extensions to NNTP that define commands conveying confidential
data SHOULD state that these confidential data SHOULD NOT
be compressed together with public data when a security layer is
active.
Thanks in advance for your helpful comments,
"To the beach? But it's raining!"
"Not at all! In the south of Gaul, it rains. Here, it's just
a little damp. Invigorating. Isn't that right, Astérix?"
"This morning, it's getting more and more invigorating!" (Astérix)