TOC 
Independent SubmissionK. Murchison
Internet-DraftCarnegie Mellon University
Intended status: Standards TrackJanuary 21, 2010
Expires: July 25, 2010 


Network News Transfer Protocol (NNTP) Extension for Compression
draft-murchison-nntp-compress-00.txt

Abstract

This memo defines an extension to the Network News Transport Protocol (NNTP) to allow a connection to be effectively and efficiently compressed.

Issues to be addressed

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as “work in progress.”

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on July 25, 2010.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.



Table of Contents

1.  Introduction
    1.1.  Conventions Used in this Document
2.  The COMPRESS Extension
    2.1.  Advertising the COMPRESS Extension
    2.2.  COMPRESS Command
        2.2.1.  Usage
        2.2.2.  Description
        2.2.3.  Examples
3.  Compression Efficiency
4.  Augmented BNF Syntax for the COMPRESS Extension
    4.1.  Commands
    4.2.  Capability entries
5.  Summary of Response Codes
6.  Security Considerations
7.  IANA Considerations
8.  References
    8.1.  Normative References
    8.2.  Informative References
Appendix A.  Acknowledgements
§  Author's Address




 TOC 

1.  Introduction

The goal of COMPRESS is to reduce the bandwidth usage of NNTP.

Compared to PPP compression [RFC1962] (Rand, D. and K. Fox, “The PPP Compression Control Protocol (CCP),” June 1996.) and modem-based compression ([MNP] (Held, G., “The Complete Modem Reference,” May 1994.) and [V42bis] (International Telecommunications Union, “Data compression procedures for data circuit-terminating equipment (DCE) using error correction procedures,” January 1990.)), COMPRESS offers greater compression efficiency. COMPRESS can be used together with Transport Security Layer (TLS) [RFC5246] (Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” August 2008.), Simple Authentication and Security layer (SASL) encryption [RFC4422] (Melnikov, A. and K. Zeilenga, “Simple Authentication and Security Layer (SASL),” June 2006.), Virtual Private Networks (VPNs), etc.

Compared to TLS compression [RFC3749] (Hollenbeck, S., “Transport Layer Security Protocol Compression Methods,” May 2004.), COMPRESS has the following advantages:

and the following disadvantages:

In order to increase interoperability, it is desirable to have as few different compression algorithms as possible, so this document specifies only one. The DEFLATE algorithm (defined in [RFC1951] (Deutsch, P., “DEFLATE Compressed Data Format Specification version 1.3,” May 1996.)) is standard, widely available and fairly efficient, so it is the only algorithm defined by this document.

In order to increase interoperability, NNTP servers that advertise this extension SHOULD also support the TLS DEFLATE compression mechanism as defined in [RFC3749] (Hollenbeck, S., “Transport Layer Security Protocol Compression Methods,” May 2004.). NNTP clients MAY use either COMPRESS or TLS compression, however, if the client and server support both, it is RECOMMENDED that the client choose TLS compression.



 TOC 

1.1.  Conventions Used in this Document

The notational conventions used in this document are the same as those in [RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.) and any term not defined in this document has the same meaning as in that one.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119] (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.).

In the examples, commands from the client are indicated with [C], and responses from the server are indicated with [S].



 TOC 

2.  The COMPRESS Extension

The COMPRESS extension is used to enable compression of an NNTP connection.

This extension provides a new COMPRESS command and has capability label COMPRESS.



 TOC 

2.1.  Advertising the COMPRESS Extension

A server supporting the COMPRESS command as defined in this document will advertise the "COMPRESS" capability label in response to the CAPABILITIES command ([RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.) Section 5.2). This capability MAY be advertised both before and after any use of the MODE READER command ([RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.) section 5.3), with the same semantics.

The COMPRESS capability label contains a whitespace-separated list of available compression algorithms. This document defines one compression algorithm: DEFLATE. At least one compression algorithm MUST be supported in order to advertise the COMPRESS extension.

Future extensions may add additional compression algorithms to this capability. Unrecognized algorithms MUST be ignored by the client.

Example:

[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] READER
[S] IHAVE
[S] COMPRESS DEFLATE
[S] LIST ACTIVE NEWSGROUPS
[S] .


 TOC 

2.2.  COMPRESS Command



 TOC 

2.2.1.  Usage

This command MUST NOT be pipelined.

Syntax
   COMPRESS algorithm

Responses
   291 Compression active
   403 Unable to activate compression
   502 Command unavailable [1]

[1] If a compression layer is already active, COMPRESS is not a valid
    command (see Section 2.2).

Parameters
   algorithm = Name of compression algorithm: "DEFLATE"


 TOC 

2.2.2.  Description

The COMPRESS command instructs the server to use the named compression algorithm ("DEFLATE" is the only one defined) for all commands and/or responses after COMPRESS.

The client MUST NOT send any further commands until it has seen the result of COMPRESS.

If the server is unable to activate compression for any reason (e.g., a server configuration or resource problem), the server MUST reject the COMPRESS command with a 403 response. Otherwise, the server issues a 291 response and the compression layer takes effect for both client and server immediately following the CRLF of the success reply.

Both the client and the server MUST know if there is a compression layer active. A client MUST NOT attempt to activate compression (via either the COMPRESS or STARTTLS [RFC4642] (Murchison, K., Vinocur, J., and C. Newman, “Using Transport Layer Security (TLS) with Network News Transfer Protocol (NNTP),” October 2006.) commands) if a compression layer is already active. A server MUST NOT return the COMPRESS or STARTTLS capability labels in response to a CAPABILITIES command received after a compression layer is active, and a server MUST reply with a 502 response code if a COMPRESS or STARTTLS command is received while a compression layer is already active.

For DEFLATE (as for many other compression mechanisms), the compressor can trade speed against quality. When decompressing there isn't much of a tradeoff. Consequently, the client and server are both free to pick the best reasonable rate of compression for the data they send.

When COMPRESS is combined with TLS [RFC5246] (Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” August 2008.) or SASL [RFC4422] (Melnikov, A. and K. Zeilenga, “Simple Authentication and Security Layer (SASL),” June 2006.) security layers, the sending order of the three extensions MUST be first COMPRESS, then SASL, and finally TLS. That is, before data is transmitted it is first compressed. Second, if a SASL security layer has been negotiated, the compressed data is then signed and/or encrypted accordingly. Third, if a TLS security layer has been negotiated, the data from the previous step is signed and/or encrypted accordingly. When receiving data, the processing order MUST be reversed. This ensures that before sending, data is compressed before it is encrypted, independent of the order in which the client issues COMPRESS, AUTHINFO SASL, and STARTTLS.



 TOC 

2.2.3.  Examples

Example of layering TLS and NNTP compression:

[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] STARTTLS
[S] AUTHINFO
[S] COMPRESS DEFLATE
[S] .
[C] STARTTLS
[S] 382 Continue with TLS negotiation
[TLS negotiation without compression occurs here]
[Following successful negotiation, all traffic is encrypted]
[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] AUTHINFO USER
[S] COMPRESS DEFLATE
[S] .
[C] AUTHINFO USER fred
[S] 381 Enter passphrase
[C] AUTHINFO PASS flintstone
[S] 281 Authentication accepted
[C] COMPRESS DEFLATE
[S] 291 Compression active
[From this point on, all traffic is compresssed before being encrypted]

Example of a server failing to activate compression:

[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] COMPRESS DEFLATE
[S] .
[C] COMPRESS DEFLATE
[S] 403 Unable to activate compression

Examples of a server refusing to compress twice:

[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] STARTTLS
[S] COMPRESS DEFLATE
[S] .
[C] STARTTLS
[S] 382 Continue with TLS negotiation
[TLS negotiation with compression occurs here]
[Following successful negotiation, all traffic is protected by TLS]
[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] .
[C] COMPRESS DEFLATE
[S] 502 Compression already active via TLS
[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] STARTTLS
[S] COMPRESS DEFLATE
[S] .
[C] COMPRESS DEFLATE
[S] 291 Compression active
[C] CAPABILITIES
[S] 101 Capability list:
[S] VERSION 2
[S] .
[C] STARTTLS
[S] 502 DEFLATE compression already active


 TOC 

3.  Compression Efficiency

This section is informative, not normative.

NNTP poses some unusual problems for a compression layer.

Upstream is fairly simple. Most NNTP clients send the same few commands again and again, so any compression algorithm that can exploit repetition works efficiently. The POST and IHAVE commands are an exception; clients that send many POST/IHAVE commands may want to surround large multi-line data blocks with flushes in the same way as is recommended for servers later in this section.

Downstream has the unusual property that several kinds of data are sent, confusing all dictionary-based compression algorithms.

One type is NNTP responses. These are highly compressible; zlib using its least CPU-intensive setting compresses typical responses to 25-40% of their original size.

Another type is article headers. These are equally compressible, and benefit from using the same dictionary as the NNTP responses.

A third type is article body text. Text is usually fairly short and includes much ASCII, so the same compression dictionary will do a good job here, too. When multiple messages in the same thread are read at the same time, quoted lines etc. can often be compressed almost to zero.

Finally, attachments (non-text article bodies) are transmitted, either in binary form or encoded with base-64.

When attachments are retrieved in binary form, DEFLATE may be able to compress them, but the format of the attachment is usually not NNTP-like, so the dictionary built while compressing NNTP does not help. The compressor has to adapt its dictionary from NNTP to the attachment's format, and then back. A few file formats aren't compressible at all using deflate, e.g., .gz, .zip, and .jpg files.

When attachments are retrieved in base-64 form, the same problems apply, but the base-64 encoding adds another problem. 8-bit compression algorithms such as deflate work well on 8-bit file formats, however base-64 turns a file into something resembling 6-bit bytes, hiding most of the 8-bit file format from the compressor.

When using the zlib library (see [RFC1951] (Deutsch, P., “DEFLATE Compressed Data Format Specification version 1.3,” May 1996.)), the functions deflateInit2(), deflate(), inflateInit2(), and inflate() suffice to implement this extension. The windowBits value must be in the range -8 to -15, or else deflateInit2() uses the wrong format. deflateParams() can be used to improve compression rate and resource use. The Z_FULL_FLUSH argument to deflate() can be used to clear the dictionary (the receiving peer does not need to do anything).

A server can improve downstream compression if it hints to the compressor that the data type is about to change strongly, e.g., by sending a Z_FULL_FLUSH at the start and end of large non-text multi-line data blocks (before and after 'content-lines' in the definition of 'multi-line-data-block' in [RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.) Section 9.8). Small multi-line data blocks are best left alone. A possible boundary is 5k.

A server can improve the CPU efficiency both of the server and the client if it adjusts the compression level (e.g., using the deflateParams() function in zlib) at these points, to avoid trying to compress incompressible attachments. A very simple strategy is to change the level to 0 at the start of a multi-line data block provided the first two bytes are either 0x1F 0x8B (as in deflate-compressed files) or 0xFF 0xD8 (JPEG), and to keep it at 1-5 the rest of the time. More complex strategies are possible.



 TOC 

4.  Augmented BNF Syntax for the COMPRESS Extension

This section describes the syntax of the COMPRESS extension using ABNF [RFC5234] (Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” January 2008.). It extends the syntax in Section 9 of [RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.), and non-terminals not defined in this document are defined there. The [RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.) ABNF should be imported first before attempting to validate these rules.



 TOC 

4.1.  Commands

This syntax extends the non-terminal "command", which represents an NNTP command.

command =/ compress-command

compress-command = "COMPRESS" WS compress-alg

compress-alg = "DEFLATE"


 TOC 

4.2.  Capability entries

This syntax extends the non-terminal "capability-entry", which represents a capability that may be advertised by the server.

capability-entry =/ compress-capability

compress-capability = "COMPRESS" *(WS compress-alg)


 TOC 

5.  Summary of Response Codes

This section contains a list of each new response code defined in this document and indicates whether it is multi-line, which commands can generate it, what arguments it has, and what its meaning is.

Response code 291
   Generated by: COMPRESS
   Meaning: Compression layer activated


 TOC 

6.  Security Considerations

As for TLS compression [RFC3749] (Hollenbeck, S., “Transport Layer Security Protocol Compression Methods,” May 2004.).



 TOC 

7.  IANA Considerations

This section gives a formal definition of the COMPRESS extension as required by Section 3.3.3 of [RFC3977] (Feather, C., “Network News Transfer Protocol (NNTP),” October 2006.) for the IANA registry.



 TOC 

8.  References



 TOC 

8.1. Normative References

[RFC1951] Deutsch, P., “DEFLATE Compressed Data Format Specification version 1.3,” RFC 1951, May 1996 (TXT, PS, PDF).
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997 (TXT, HTML, XML).
[RFC3977] Feather, C., “Network News Transfer Protocol (NNTP),” RFC 3977, October 2006 (TXT).
[RFC4642] Murchison, K., Vinocur, J., and C. Newman, “Using Transport Layer Security (TLS) with Network News Transfer Protocol (NNTP),” RFC 4642, October 2006 (TXT).
[RFC5234] Crocker, D. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF,” STD 68, RFC 5234, January 2008 (TXT).


 TOC 

8.2. Informative References

[MNP] Held, G., “The Complete Modem Reference,” Second Edition, Wiley Professional Computing, May 1994.
[RFC1962] Rand, D. and K. Fox, “The PPP Compression Control Protocol (CCP),” RFC 1962, June 1996 (TXT).
[RFC3749] Hollenbeck, S., “Transport Layer Security Protocol Compression Methods,” RFC 3749, May 2004 (TXT).
[RFC4422] Melnikov, A. and K. Zeilenga, “Simple Authentication and Security Layer (SASL),” RFC 4422, June 2006 (TXT).
[RFC4643] Vinocur, J. and K. Murchison, “Network News Transfer Protocol (NNTP) Extension for Authentication,” RFC 4643, October 2006 (TXT).
[RFC4978] Gulbrandsen, A., “The IMAP COMPRESS Extension,” RFC 4978, August 2007 (TXT).
[RFC5246] Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2,” RFC 5246, August 2008 (TXT).
[V42bis] International Telecommunications Union, “Data compression procedures for data circuit-terminating equipment (DCE) using error correction procedures,” ITU-T Recommendation V.42bis, January 1990.


 TOC 

Appendix A.  Acknowledgements

This document draws heavily on ideas in [RFC4978] (Gulbrandsen, A., “The IMAP COMPRESS Extension,” August 2007.) by Arnt Gulbrandsen and a large portion of this text was borrowed from that specification.



 TOC 

Author's Address

  Kenneth Murchison
  Carnegie Mellon University
  5000 Forbes Avenue
  Cyert Hall 285
  Pittsburgh, PA 15213
  US
Phone:  +1 412 268 2638
Email:  murch@andrew.cmu.edu