Network Working Group D. Meyer
Internet-Draft Universitaet Bremen TZI
Intended status: Standards Track P. Saint-Andre
Expires: September 9, 2009 Cisco
March 8, 2009
Extensible Messaging and Presence Protocol (XMPP) End-to-End Encryption
Using Transport Layer Security ("XTLS")
draft-meyer-xmpp-e2e-encryption-01
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on September 9, 2009.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents in effect on the date of
publication of this document (http://trustee.ietf.org/license-info).
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document.
Abstract
This document specifies "XTLS", a protocol for end-to-end encryption
Meyer & Saint-Andre Expires September 9, 2009 [Page 1]
Internet-Draft XTLS March 2009
of Extensible Messaging and Presence Protocol (XMPP) traffic via an
application-level usage of Transport Layer Security (TLS). XTLS
treats the end-to-end exchange of XML stanzas as a virtual transport
and uses TLS to secure that transport, thus enabling XMPP entities to
communicate in a way that is designed to prevent eavesdropping,
tampering, and forgery of XML stanzas. The protocol can be used for
secure end-to-end messaging as well as any other application such as
file transfer.
Table of Contents
1. Introduction
2. Scope
3. Threat Analysis
4. Requirements
5. Approach
6. XTLS Protocol Flow
7. End-to-End Streams over XTLS Protocol Flow
8. Bootstrapping Trust on First Communication
8.1. Exchanging Certificates
8.2. Verification of Non-Human Parties
9. Session Termination
10. Determining Support
11. Security Considerations
11.1. Mandatory-to-Implement Technologies
11.2. Certificates
11.3. Denial of Service
12. IANA Considerations
13. References
13.1. Normative References
13.2. Informative References
Appendix A. XML Schema
Appendix B. Copying Conditions
Authors' Addresses
1. Introduction
End-to-end encryption of traffic sent over the Extensible Messaging
and Presence Protocol (XMPP) is a desirable goal. Since 1999, the
Jabber/XMPP developer community has experimented with several such
technologies, including OpenPGP [XEP-0027], S/MIME [RFC3923], and
encrypted sessions or "ESessions" [XEP-0218]. For various reasons,
these technologies have not been widely implemented and deployed.
When the XMPP Standards Foundation asked various Internet security
experts to complete a security review of encrypted sessions, the
reviewers recommended exploring the possibility of instead using
Transport Layer Security [TLS] as the base technology for end-to-end
encryption in XMPP. That possibility is explored in this document.
TLS is the most widely implemented protocol for securing network
traffic. In addition to applications in the email infrastructure,
the World Wide Web [HTTP-TLS], and datagram transport for multimedia
session negotiation [DTLS], TLS is used in XMPP to secure TCP
connections from client to server and from server to server, as
specified in [rfc3920bis]. Therefore TLS is already familiar to XMPP
developers.
This specification, called "XTLS", defines a method whereby any XMPP
entity that supports the XMPP Jingle negotiation framework [XEP-0166]
can use TLS semantics for end-to-end encryption, whether the
application data is sent over a streaming transport (like TCP) or a
datagram transport (like UDP). The basic use case is to tunnel XMPP
stanzas between two IM users for end-to-end secure chat using end-to-
end XML streams. However, XTLS is not limited to encryption of one-
to-one text chat, since it can be used between two XMPP clients for
encryption of any XMPP payloads, between an XMPP client and a remote
XMPP service (i.e., a service with which a client does not have a
direct XML stream, such as a [XEP-0045] chatroom), or between two
remote XMPP services. Furthermore, XTLS can be used for encrypted
file transfer using [XEP-0234], for encrypted voice or video sessions
using [XEP-0167] and [DTLS-SRTP], and other applications.
Note: The following capitalized keywords are to be interpreted as
described in [TERMS]: "MUST", "SHALL", "REQUIRED"; "MUST NOT", "SHALL
NOT"; "SHOULD", "RECOMMENDED"; "SHOULD NOT", "NOT RECOMMENDED";
"MAY", "OPTIONAL".
2. Scope
The XMPP communication exchanges of interest here exist in the
context of a one-to-one communication "session" between two entities,
where the information exchanged takes the form of XMPP stanzas.
However, several other kinds of XMPP exchanges exist outside the
context of one-to-one communication sessions:
o Many-to-many sessions, such as a text conference in a chatroom as
specified in [XEP-0045].
o One-to-many broadcast of information, such as undirected presence
stanzas sent from one user to many contacts as described in
[rfc3921bis] and data syndication implemented using the XMPP
publish-subscribe technology described in [XEP-0060].
o One-to-one communications that are stored for later delivery
rather than delivered immediately, such as the so-called "offline
messages" described in [XEP-0160].
Ideally, any technology for end-to-end encryption in XMPP could be
extended to cover all the scenarios above as well as one-to-one
communication sessions. However, many-to-many sessions, one-to-many
broadcast, and offline messages are out of scope for this
specification.
3. Threat Analysis
XMPP technologies are typically deployed using a client-server
architecture. As a result, XMPP endpoints (often but not always
controlled by human users) need to communicate through one or more
servers. For example, the user juliet@capulet.lit connects to the
capulet.lit server and the user romeo@montague.lit connects to the
montague.lit server, but in order for Juliet to send a message to
Romeo the message will be routed over her client-to-server connection
with capulet.lit, over a server-to-server connection between
capulet.lit and montague.lit, and over Romeo's client-to-server
connection with montague.lit. Although [rfc3920bis] requires support
for Transport Layer Security [TLS] to make it possible to encrypt all
of these connections, when XMPP is deployed any of these connections
might be unencrypted. Furthermore, even if the server-to-server
connection is encrypted and both of the client-to-server connections
are encrypted, the message would still be in the clear while
processed by both the capulet.lit and montague.lit servers.
In this specification we primarily address communications security
("commsec") between two parties, especially confidentiality, data
integrity, and peer entity authentication. Communications security
can be subject to a variety of attacks, which [RFC3552] divides into
passive and active categories. In a passive attack,
information is leaked (e.g., a passive attacker could read all of the
messages that Juliet sends to Romeo). In an active attack, the
attacker can add, modify, or delete messages between the parties,
thus disrupting communications.
Traditionally, it seems that XMPP users have been concerned more
about passive attacks (such as eavesdropping) than about active
attacks (such as man-in-the-middle), perhaps because they have
thought that their communications are "just chat", because they have
had no expectation that endpoints could be authenticated, or because
they have believed that hijacked communications would be detected
socially (e.g., because the other party did not have an authentic
"voice" in a text conversation). However, both forms of attack are
of concern in this protocol.
In particular, we consider the following types of attacks and
attackers:
o One type of passive attack might involve monitoring all the
conversations of a given party. To help prevent this, it is
helpful for the party to ensure that its connection with its
server is protected using TLS. However, in this case the
eavesdropper could monitor outbound traffic from the party's
server, either to other connected clients or to other servers,
since that traffic might be unencrypted. In addition, the
eavesdropper could attack the party's server so that it gains
access to all traffic within the server, or masquerade as the
party's server so that the party is fooled into connecting to the
attacker rather than directly to the party's server.
o Another type of passive attack might involve monitoring of a
single conversation between two particular parties. In this case
the eavesdropper could monitor communications over the server-to-
server connection between the parties' servers, or over the
client-to-server connection between either party and that party's
server.
o One type of active attack would involve modification of the XML
stanzas used to advertise support for the protocol "building
blocks" that make it possible to negotiate a secure session; as a
result, other parties would be led to believe that the party does
not have the ability to negotiate a secure session and therefore
would not attempt such a negotiation.
o Another type of active attack would involve modification or
outright deletion of the XML stanzas used to negotiate a secure
session (such as those described in this document), with the
result that the parties would think the negotiation has failed for
legitimate reasons such as incompatibilities between the parties'
clients.
o A more sophisticated active attack would involve a cryptanalytic
attack on the keying material or other credentials used to
establish trust between the parties, such as an ephemeral password
exchanged during an initial certificate exchange if Secure Remote
Password [TLS-SRP] is used.
Other attacks are possible, and the foregoing list is best considered
incomplete at this time.
4. Requirements
(This section borrows some text from [XEP-0210].)
This document stipulates the following requirements for end-to-end
encryption of XMPP communications. It is possible that some of those
requirements can be met only with particular TLS cipher suites, or
cannot be met at all without defining extensions to TLS itself; a
full gap analysis has not yet been completed.
o Confidentiality. The one-to-one XML stanzas exchanged between two
entities must not be understandable to any other entity that might
intercept the communications. The encrypted stanzas are to be
understood by an intermediate server only to the extent required
to route them.
o Integrity. The two parties to an encrypted communication session
must be sure that no other entity is able to change the content of
the XML stanzas they exchange, or remove or insert stanzas into
the session undetected.
o Replay protection. The two parties to an encrypted communication
session must be able to identify and reject any communications
that are copies of their previous communications resent by another
entity.
o Perfect forward secrecy. The content of an encrypted
communication should not be revealed even if long-lived keys are
compromised in the future (e.g., if one of the parties loses their
device). For long-lived sessions it should be possible to
periodically change the decryption keys.
o PKI independence. The protocol must not rely on any public key
infrastructure (PKI), certification authority, web of trust, or
any other trust model that is external to the trust established
between the two parties. However, if external authentication or
trust models are available then the two parties must be able to
use them as a way of enhancing any trust that exists between them.
o Authentication. Each party to a conversation must know that the
other party is who they want to communicate with.
o Generality. The solution must be generally applicable to the full
content of any XML stanza type sent between two entities (i.e.,
message, presence, and IQ stanzas).
o Implementability. The only good security technology is an
implemented security technology. The solution must be one that
XMPP client developers can implement in a relatively
straightforward and interoperable fashion, preferably by re-using
existing building blocks such as Transport Layer Security XML
streams, Jingle [XEP-0166], and in-band bytestreams [XEP-0047].
o Usability. The requirement of usability takes implementability
one step further by stipulating that the solution must be one that
organizations may deploy and humans may use with 100% transparency
(with the ease-of-use of secure web browsing via HTTPS).
Experience has shown that solutions requiring a full public key
infrastructure do not get widely deployed and solutions requiring
any user action are not widely used. If, however, the parties are
prepared to verify the integrity of their copies of each other's
keys (thus enabling them to discover targeted active attacks),
then the actions necessary for them to verify key integrity must
be minimal (requiring no more effort than a one-time out-of-band
verification of a string of up to 6 alphanumeric characters).
o Efficiency. Cryptographic operations are highly CPU intensive,
particularly public key and Diffie-Hellman operations.
Cryptographic data structures can be relatively large, especially
public keys and certificates. Network round trips can introduce
unacceptable delays, especially over high-latency wireless
connections. The solution must perform efficiently even when CPU
and network bandwidth are constrained. The number of round trips
required to set up an encrypted session should be minimized.
o Flexibility. The solution must be compatible with a variety of
existing and future cryptographic algorithms and identity
certification schemes, including X.509 and PGP.
5. Approach
In broad outline, XTLS takes the following approach to end-to-end
encryption of XMPP traffic:
1. We assume that all XMPP entities will have X.509 certificates.
Realistically these certificates are likely to be self-signed and
automatically generated by an XMPP client; however, CA-issued
certificates are encouraged to overcome problems with self-signed
certificates.
2. We use the XMPP Jingle extensions as the negotiation framework
(see [XEP-0166]).
3. We define a <security/> element that can be included in any
Jingle negotiation, and a new "security-info" Jingle action for
sending security-related information.
4. When an entity wishes to encrypt its communications with a second
entity, it sends a Jingle session-initiate request that specifies
the desired application type, a possible transport, the sender's
X.509 fingerprint, and optionally hints about the sender's
supported TLS methods.
5. If both parties support XTLS, the first data sent over the
negotiated transport is TLS handshake data, not application data.
Once the TLS handshake has finished, the parties can then send
application data over the now-encrypted transport.
6. The simplest scenario is end-to-end encryption of traditional
XMPP text chat using end-to-end XML streams, in-band bytestreams
(see [XEP-0047]), and previously-accepted X.509 certificates.
7. On first use of end-to-end encryption between two entities, it is
encouraged to use secure remote passwords rather than leap-of-
faith to bootstrap the subsequent use of the client-generated
X.509 certificates.
8. More complex scenarios are theoretically supported (e.g.,
encrypted file transfer using SOCKS5 bytestreams and encrypted
voice chat using DTLS-SRTP) but have not yet been fully defined.
9. XTLS theoretically can be used to establish a TLS-encrypted
streaming transport or a DTLS-encrypted datagram transport, but
integration with DTLS [DTLS] has not yet been prototyped so use
with streaming transports is the more stable scenario.
We expand on this approach in the following section.
6. XTLS Protocol Flow
The basic flow for an XTLS session is as follows, where traffic
represented by single dashes (---) is sent over the XMPP signalling
channel and traffic represented by double lines (===) is sent over
the negotiated transport.
Initiator Responder
| |
| session-initiate |
| (with security info) |
|--------------------------->|
| ack |
|<---------------------------|
| session-accept |
|<---------------------------|
| ack |
|--------------------------->|
| open transport |
|<==========================>|
| TLS ClientHello |
|===========================>|
| TLS ServerHello, [...] |
|<===========================|
| TLS [...], Finished |
|===========================>|
| TLS [...], Finished |
|<===========================|
| application data |
|<==========================>|
| session-terminate |
|<---------------------------|
| ack |
|--------------------------->|
| |
To simplify the description we assume here that the parties already
trust each other's certificates. See discussion under Section 8 for
information about bootstrapping of certificate trust on the first
communication.
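The ordering constraints in the flow above can be sketched as a small
state machine. This is an illustrative sketch only, not part of the
protocol: the state names, the event names, and the XtlsSession class
are hypothetical, and the two TLS Finished messages are collapsed into
a single "finished" event for brevity.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    INITIATE_SENT = auto()
    ACCEPTED = auto()
    TRANSPORT_OPEN = auto()
    TLS_HANDSHAKE = auto()
    SECURED = auto()
    TERMINATED = auto()


# Hypothetical event names; each maps to one arrow group in the diagram.
TRANSITIONS = {
    (State.IDLE, "session-initiate"): State.INITIATE_SENT,
    (State.INITIATE_SENT, "session-accept"): State.ACCEPTED,
    (State.ACCEPTED, "transport-open"): State.TRANSPORT_OPEN,
    (State.TRANSPORT_OPEN, "client-hello"): State.TLS_HANDSHAKE,
    (State.TLS_HANDSHAKE, "finished"): State.SECURED,
}


class XtlsSession:
    """Tracks where a session is in the flow (illustrative only)."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def on_event(self, event: str) -> State:
        # A session-terminate can arrive at any point in the flow.
        if event == "session-terminate":
            self.state = State.TERMINATED
            return self.state
        nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            raise ValueError(f"unexpected {event!r} in state {self.state.name}")
        self.state = nxt
        return nxt

    def may_send_application_data(self) -> bool:
        # Application data flows only after the TLS handshake has finished.
        return self.state is State.SECURED
```

The point of the sketch is the final check: application data is held
back until the handshake has completed, regardless of when the
underlying transport opened.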
First the initiator sends a Jingle session-initiate request (here the
simple case of an end-to-end text chat session using in-band
bytestreams [XEP-0047]). This request includes a <security/> element
that contains the fingerprint of the certificate that the initiator
will use during the TLS negotiation and a list of TLS methods the
initiator supports (here X.509 certificate-based authentication and
TLS-SRP). Note that this information is exchanged over the insecure
server-based connection. The purpose of the exchange is to determine
which TLS method should be used in the TLS handshake; for example, if
a client cannot verify the fingerprint of the peer it MAY omit the
X.509 method. If both clients can verify the fingerprint of the
other, it is likely that X.509 certificate-based authentication will
succeed (unless the data is altered); if one client cannot verify the
fingerprint, the client MAY prompt the user for a password for
TLS-SRP based authentication (see Section 8 for details).
action='session-initiate'
initiator='romeo@montague.lit/orchard'
sid='a73sjjvkla37jfea'>
RomeoX509CertSHA1Hash
The responder immediately acknowledges receipt of the session-
initiate by sending an IQ stanza of type "result" (not shown here).
Depending on the application type, a user agent controlled by a human
user might need to wait for the user to affirm a desire to proceed
with the session before continuing. When the user agent has received
such affirmation (or if the user agent can automatically proceed for
any reason, e.g. because no human intervention is expected or because
a human user has configured the user agent to automatically accept
sessions with a given entity), it returns a Jingle session-accept
message. This message will typically contain the offered application
type, transport method, and a <security/> element that includes the
fingerprint of the responder's X.509 certificate as well as the
responder's supported TLS methods.
action='session-accept'
initiator='romeo@montague.lit/orchard'
sid='a73sjjvkla37jfea'>
JulietX509CertSHA1Hash
The following rules apply to the responder's handling of the session-
initiate message:
1. If the responder does not support Jingle-XTLS it will silently
ignore the <security/> element in the offer and therefore will
return a session-accept message without a <security/> element.
2. If the responder supports Jingle-XTLS it SHOULD return a session-
accept message that contains a <security/> element.
3. If the responder thinks it will be able to verify the initiator's
certificate, it MUST include the fingerprint for the responder's
certificate in the <security/> element of the session-accept
message. This is the "happy path" and will occur when the
parties have already verified each other's certificates.
4. If the responder thinks it will not be able to verify the
initiator's certificate, it MAY omit the fingerprint for the
responder's certificate in the <security/> element of the
session-accept message. This indicates that certificate-based
authentication is not possible. In this case the responder
SHOULD signal that it wishes to use some other authentication
method, such as secure remote passwords (see discussion under
Section 8).
5. If the responding client cannot verify the initiator's
certificate, it SHOULD ask the responding user if a password was
exchanged between the parties that can be used for TLS-SRP. If
this is not the case, setting up a mutually-authenticated link
will fail and the responder MAY terminate the session.
Alternatively it could send its own fingerprint knowing it cannot
authenticate the initiator, in which case the responder has to
trust that there is no man-in-the-middle (see discussion under
Section 8).
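The responder rules above can be summarised in a short decision
function. This is a sketch under stated assumptions, not normative
text: the function name, its parameters, the dictionary layout, and
the method name "srp" are hypothetical (only "x509" appears as a
method name in this document), and the leap-of-faith alternative in
rule 5 is modelled as an explicit opt-in flag.

```python
from typing import Dict


def plan_session_accept(
    supports_xtls: bool,
    can_verify_initiator: bool,
    has_shared_password: bool,
    own_fingerprint: str,
    allow_leap_of_faith: bool = False,
) -> Dict[str, object]:
    """Decide what the responder sends back (illustrative only)."""
    if not supports_xtls:
        # Rule 1: the security information in the offer is ignored.
        return {"action": "session-accept", "security": None}
    if can_verify_initiator:
        # Rules 2 and 3: the "happy path" with certificate-based auth.
        return {
            "action": "session-accept",
            "security": {"fingerprint": own_fingerprint, "methods": ["x509"]},
        }
    if has_shared_password:
        # Rules 4 and 5: omit the fingerprint and signal TLS-SRP instead.
        return {
            "action": "session-accept",
            "security": {"fingerprint": None, "methods": ["srp"]},
        }
    if allow_leap_of_faith:
        # Rule 5 alternative: send the fingerprint without being able to
        # authenticate the initiator, accepting the man-in-the-middle risk.
        return {
            "action": "session-accept",
            "security": {"fingerprint": own_fingerprint, "methods": ["x509"]},
        }
    # Rule 5: no mutually-authenticated link is possible.
    return {"action": "session-terminate", "security": None}
```

Whether the user has exchanged a password is something a client would
normally have to ask; here it is simply passed in as a boolean.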
When the responder sends the session-accept message, the initiator
acknowledges receipt by sending an IQ stanza of type "result" (not
shown here).
The following rules apply to the initiator's handling of the session-
accept message:
1. If the initiator receives a session-accept without a <security/>
element, setting up a secure transport layer has failed. The
initiator MAY terminate the session at this point or instead
proceed without securing the transport. The client SHOULD ask
the initiating user how to proceed. This depends on the Jingle
application and the initiator's preferences: it makes no sense to
use end-to-end XML streams without encryption, but the initiator
might continue a file transfer without encryption.
2. If the initiating client cannot verify the responder's
certificate it SHOULD ask the initiating user if a password was
exchanged between the parties that can be used for TLS-SRP. If
this is not the case, setting up a mutually-authenticated link
will fail and the initiator MAY terminate the session or proceed
with leap-of-faith (see discussion under Section 8).
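The initiator's side of the decision can be sketched in the same
way. Again this is only an illustration: the function name, the
parameters, and the returned labels are hypothetical, and the
question put to the user is reduced to boolean inputs.

```python
def initiator_next_step(
    accept_has_security: bool,
    can_verify_responder: bool,
    has_shared_password: bool,
    encryption_required: bool,
    allow_leap_of_faith: bool = False,
) -> str:
    """Return a label for what the initiator does next (illustrative only)."""
    if not accept_has_security:
        # Rule 1: no secure transport is possible. An end-to-end XML
        # stream makes no sense unencrypted; a file transfer might go on.
        return "terminate" if encryption_required else "ask-user"
    if can_verify_responder:
        return "x509"
    if has_shared_password:
        # Rule 2: fall back to TLS-SRP with the exchanged password.
        return "srp"
    # Rule 2: neither certificate nor password is available.
    return "leap-of-faith" if allow_leap_of_faith else "terminate"
```

For example, a file-transfer client that receives a session-accept
without security information would get "ask-user", whereas a client
setting up an end-to-end XML stream would get "terminate".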
The initiator can now determine if X.509 certificate based
authentication will work or if TLS-SRP will be used. It sends an
additional security-info message to the responder to signal its
choice. This step is not really necessary because the responder will
see the initiator's choice in the first message of the TLS handshake,
but it can help an implementation to set up its TLS library properly.
Because in this section we assume that the parties already have
validated each other's certificates, the security method signalled
here is "x509".
action='security-info'
initiator='romeo@montague.lit/orchard'
sid='a73sjjvkla37jfea'>
The responder acknowledges receipt by sending an IQ stanza of type
"result" (not shown here).
Parallel to the security-info exchange, the clients negotiate a
transport for the Jingle session (here the transport is an in-band
bytestream as defined in [XEP-0047], for which the Jingle negotiation
process is specified in [XEP-0261]; however other transports could be
used, for example SOCKS5 bytestreams as defined in [XEP-0065] and
negotiated for Jingle as specified in [XEP-0260]). Because the
parties wish to establish end-to-end encryption, they do not send
application data over the transport until the transport has been
secured. Therefore the first data that they exchange over the
transport consists of the standard four-way TLS handshake, encoded in
accordance with the negotiated transport method.
Note: Each transport MUST define a specific time when both clients
know that the transport is secured. When XTLS is not used, the
Jingle implementation would signal to the using application that
the transport is open when the session-accept is sent or received,
or when connectivity checks determine media can flow over one of
the transport candidates. When XTLS is used, the Jingle
implementation starts a TLS handshake on the transport and signals
to the using application that the transport is open only after the
TLS handshake has finished successfully.
During the TLS handshake, the responder MUST take the role of the TLS
server and the initiator MUST take the role of the TLS client.
Because the transport is an in-band bytestream, the TLS handshake
data is prepared as described in [XEP-0047] (i.e., Base64-encoded).
First the initiator (acting as the TLS client) constructs a TLS
ClientHello, encodes it according to IBB, and sends it to the
responder.
Base64-encoded-TLS-data
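How a TLS library can produce handshake data for a transport that is
not a socket can be sketched with the memory-BIO interface of
Python's standard ssl module. The sketch covers only the first flight
(the ClientHello) and only certificate-based TLS; it does not cover
TLS-SRP. The function names are hypothetical, and certificate
checking is disabled purely to keep the example short.

```python
import base64
import ssl


def first_handshake_flight() -> bytes:
    """Return the raw TLS bytes (the ClientHello) the initiator sends first."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Assumption for this sketch only: built-in verification is switched
    # off. A real client would instead check the peer certificate against
    # the fingerprint it already trusts.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    incoming = ssl.MemoryBIO()  # bytes received from the peer are written here
    outgoing = ssl.MemoryBIO()  # bytes to send to the peer are read from here
    tls = ctx.wrap_bio(incoming, outgoing, server_side=False)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the handshake is now waiting for the ServerHello.
        pass
    return outgoing.read()


def ibb_payload(tls_bytes: bytes) -> str:
    """Base64-encode TLS bytes for use as in-band bytestream character data."""
    return base64.b64encode(tls_bytes).decode("ascii")
```

The responder's reply would be decoded, written to the incoming
buffer, and do_handshake() called again; this repeats until the call
returns without raising, at which point the transport can be reported
as secured.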
The responder (acting as the TLS server) then acknowledges receipt by
sending an IQ stanza of type "result" (not shown here).
The responder then constructs an appropriate TLS message or messages,
such as a ServerHello and a CertificateRequest.
Note: The responder MUST send a CertificateRequest to the
initiator.
Base64-encoded-TLS-data
(Because in-band bytestreams are bidirectional and this data is sent
from the responder to the initiator, the IBB 'seq' attribute has a
value of zero, not 1.)
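The per-direction sequence numbering can be sketched as follows.
Each party keeps its own counter, so the first packet in either
direction carries a 'seq' of zero. The class name and dictionary
layout are hypothetical; the 4096-byte default block size, the
application of the block size before Base64 encoding, and the 16-bit
wrap-around are assumptions about the in-band bytestream behaviour
defined in [XEP-0047], not statements taken from this document.

```python
import base64
from typing import Dict, Iterator


class IbbSender:
    """One direction of an in-band bytestream (illustrative only)."""

    def __init__(self, sid: str, block_size: int = 4096) -> None:
        self.sid = sid
        self.block_size = block_size
        self._seq = 0  # each direction starts counting at zero

    def blocks(self, tls_bytes: bytes) -> Iterator[Dict[str, object]]:
        """Split TLS output into Base64-encoded, sequence-numbered blocks."""
        for offset in range(0, len(tls_bytes), self.block_size):
            chunk = tls_bytes[offset:offset + self.block_size]
            yield {
                "sid": self.sid,
                "seq": self._seq,
                "data": base64.b64encode(chunk).decode("ascii"),
            }
            # Assumed 16-bit counter that wraps back to zero.
            self._seq = (self._seq + 1) % 65536
```

An initiator and a responder would each hold one IbbSender for the
same session ID, which is why both the ClientHello and the first
responder message are numbered zero.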
The initiator then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
After some number of TLS messages, the initiator eventually sends a
TLS Finished message to the responder.
Base64-encoded-TLS-data
The responder then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
The responder then also sends a TLS Finished message.
Base64-encoded-TLS-data
The initiator then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
If the TLS negotiation has finished successfully, then the Jingle
implementation shall signal to the using application that the
transport has been secured and is ready to be used. The parties can
then begin to exchange application data over the encrypted transport.
7. End-to-End Streams over XTLS Protocol Flow
For end-to-end encryption of XMPP traffic, the application data is an
end-to-end XML stream. After the XTLS session is set up, the peers
open an XML stream to exchange messages. The XML streams are sent
through the XTLS connection. In this example the streams are sent
over TLS over IBB.
First the initiator constructs an initial stream header.
Note: In accordance with [rfc3920bis], the initial stream header
SHOULD include the 'to' and 'from' attributes, which SHOULD specify
the full JIDs of the clients. The initiator SHOULD include the
version='1.0' flag as shown in the previous example.
The initiator then passes the stream header through the TLS layer,
encodes the resulting TLS data in IBB, and sends it to the responder.
Base64-TLS-data-of-the-stream-header
The responder then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
The responder then constructs a response stream header back to the
initiator.
The responder then sends the response stream header over the TLS link
to the initiator.
Base64-TLS-data-of-the-response-stream-header
The initiator then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
Once the streams are established over the bytestreams, either entity
can then send XMPP message, presence, and IQ stanzas, with or without
'to' and 'from' addresses.
For example, the initiator could construct an XMPP message.
M'lady, I would be pleased to make your acquaintance.
The initiator then sends the message over the XTLS connection to the
responder.
Base64-TLS-data
The responder then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
The responder could then construct a reply.
Art thou not Romeo, and a Montague?
The responder then sends the reply over the XTLS connection to the
initiator.
Base64-TLS-data
The initiator then acknowledges receipt by sending an IQ stanza of
type "result" (not shown here).
To close the end-to-end XML stream, either party (here the responder)
constructs a closing element.
The client sends the closing element to the peer over the XTLS
connection.
Base64-TLS-data
The peer then acknowledges receipt by sending an IQ stanza of type
"result" (not shown here).
However, even after the application-level XML stream is terminated,
the negotiated Jingle transport (here in-band bytestream) continues
and could be re-used. To completely terminate the Jingle session,
the terminating party would then also send a Jingle session-terminate
message.
The other party then acknowledges the Jingle session-terminate by
sending an IQ stanza of type "result" (not shown here).
8. Bootstrapping Trust on First Communication
When two parties first attempt to use XTLS, their certificates might
not be accepted (e.g., because they are self-signed or issued by
unknown certification authorities). Therefore each party needs to
accept the other's certificate for use in future communication
sessions. There are several ways to do so:
o Leap of faith. The recipient can hope that there is no man-in-
the-middle during the first communication session. If the
certificate does not change in future sessions, the recipient at
least knows that it is talking with the same entity it talked with
during the first session. However, that entity might be a man-in-
the-middle rather than the assumed communication partner.
Therefore, leap of faith is discouraged.
o Check fingerprints. The parties could validate the certificate
fingerprints via some trusted means outside the XMPP band, such as
in person, via encrypted email, or over the phone. This is not
user-friendly because certificate fingerprints consist of long
strings of letters and numbers. As a result, few humans routinely
check certificate fingerprints in protocols such as Secure Shell
(ssh).
o One-time password. The parties can exchange a user-friendly
password known only to themselves and verify it out of band before
the TLS handshake finishes. For this purpose, implementations are
REQUIRED to support at least one TLS cipher suite that uses Secure
Remote Password (SRP) as defined in [TLS-SRP].
o Channel binding. A future version of this specification might
describe how to use an appropriate Simple Authentication and
Security Layer (SASL) mechanism, such as [SCRAM], to authenticate
the XTLS channel after the TLS handshake finishes, using the
concept of channel bindings (see [RFC5056]).
If the parties use a password or SASL channel binding to bootstrap
trust, the process needs to be completed only once. After the
clients have authenticated with the shared secret, they can exchange
their certificates for future communication.
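The pinning behaviour implied by the leap-of-faith and fingerprint
options above can be sketched as follows. This is a minimal
illustration only: the PinStore class, its method names, and the
choice of SHA-256 are assumptions made for this example and are not
defined by this specification.

```python
import hashlib


def fingerprint(cert_der: bytes) -> str:
    """Return a colon-separated SHA-256 fingerprint of a DER certificate."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


class PinStore:
    """Hypothetical in-memory store mapping a peer JID to a pinned fingerprint."""

    def __init__(self):
        self._pins = {}

    def check(self, jid: str, cert_der: bytes) -> str:
        """Return 'new', 'match', or 'mismatch' for the presented certificate."""
        pinned = self._pins.get(jid)
        if pinned is None:
            # No pin yet: trust has to be bootstrapped by some other method.
            return "new"
        return "match" if pinned == fingerprint(cert_der) else "mismatch"

    def pin(self, jid: str, cert_der: bytes) -> None:
        """Remember the certificate once trust has been bootstrapped."""
        self._pins[jid] = fingerprint(cert_der)
```

A client following this sketch would treat "mismatch" as a failure
and would treat "new" as a prompt to use one of the bootstrapping
methods listed above rather than accepting the certificate silently.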
8.1. Exchanging Certificates
To retrieve the certificate of the peer for future communications, a
client SHOULD request the certificate according to [XEP-0189] over
the secure connection. This works only if XTLS was used to set up an
end-to-end secure XML stream; certificates cannot be exchanged in
this way if XTLS was used for another purpose, such as file transfer.
A client MUST NOT request the certificate over the insecure stream
that runs via its connection to the XMPP server.
The peer MUST return its own client certificate. If the user has
different clients with different client certificates and one user
certificate, the user certificate SHOULD also be returned. The user
certificate allows the requesting client to verify the user's other
client certificates using the public key retrieval described in
[XEP-0189].
MIICCTCCAXKgAwIBAgIJALhU0Id6xxwQMA0GCSqGSIb3DQEBBQUAMA4xDDAKBgNV
BAMTA2ZvbzAeFw0wNzEyMjgyMDA1MTRaFw0wODEyMjcyMDA1MTRaMA4xDDAKBgNV
BAMTA2ZvbzCBnzANBgkqhkiG9w0BAQEFAAOBjQAwgYkCgYEA0DPcfeJzKWLGE22p
RMINLKr+CxqozF14DqkXkLUwGzTqYRi49yK6aebZ9ssFspTTjqa2uNpw1U32748t
qU6bpACWHbcC+eZ/hm5KymXBhL3Vjfb/dW0xrtxjI9JRFgrgWAyxndlNZUpN2s3D
hKDfVgpPSx/Zp8d/ubbARxqZZZkCAwEAAaNvMG0wHQYDVR0OBBYEFJWwFqmSRGcx
YXmQfdF+XBWkeML4MD4GA1UdIwQ3MDWAFJWwFqmSRGcxYXmQfdF+XBWkeML4oRKk
EDAOMQwwCgYDVQQDEwNmb2+CCQC4VNCHesccEDAMBgNVHRMEBTADAQH/MA0GCSqG
SIb3DQEBBQUAA4GBAIhlUeGZ0d0msNVxYWAXg2lRsJt9INHJQTCJMmoUeTtaRjyp
ffJtuopguNNBDn+MjrEp2/+zLNMahDYLXaTVmBf6zvY0hzB9Ih0kNTh23Fb5j+yK
QChPXQUo0EGCaODWhfhKRNdseUozfNWOz9iTgMGw8eYNLllQRL//iAOfOr/8
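A client that receives Base64-encoded certificate data such as the
above might decode it and apply a basic sanity check before further
processing. The sketch below is illustrative only; the function name
is hypothetical, and the check merely confirms that the data begins
like a DER structure, which is no substitute for full X.509
validation.

```python
import base64
import binascii


def decode_certificate(b64_text: str) -> bytes:
    """Decode Base64 certificate text (possibly line-wrapped) to DER bytes."""
    compact = "".join(b64_text.split())  # drop line breaks and spaces
    try:
        der = base64.b64decode(compact, validate=True)
    except binascii.Error as exc:
        raise ValueError("certificate is not valid Base64") from exc
    # A DER-encoded X.509 certificate starts with a SEQUENCE tag (0x30).
    if not der or der[0] != 0x30:
        raise ValueError("decoded data does not look like a DER certificate")
    return der
```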
8.2. Verification of Non-Human Parties
If one of the parties is a "bot" (e.g., an automated service or a
device such as a set-top box), the password exchange is more
complicated. If the user has access to both clients at the same
time, the process is similar to Bluetooth pairing. One of the
following scenarios might apply:
o The bot can be controlled via a remote control input device. The
human user can enter the same password or "PIN" on both the bot
and the XMPP client.
o If the bot has no user input but does have a small display, it
could display a random password. The human user can then enter
the provided password on the XMPP client.
o The bot might not have enough buttons for input and might have no
output device. In that case the password is fixed. As with
Bluetooth pairing of simple devices such as a headset, the password
will be written in the manual or printed on the device. For
security reasons the device SHOULD NOT use password-based
authentication without any user input. Many Bluetooth devices
have at least one button to set the device into pairing mode.
o A bot may be associated with a web service and could display a
random password when the user has logged in to the web site using
HTTPS. This assumes that an attacker cannot both control the web
server and perform a man-in-the-middle attack at the XMPP level at
the same time. If the web service knows the user's GPG key (e.g.,
Launchpad), it could send the password in an encrypted email.
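For the scenarios above in which a bot displays a random password, a
generator along the following lines could be used. This is a sketch
under stated assumptions: the function name, the six-digit default,
and the four-digit minimum are choices made for this example and are
not requirements of this specification.

```python
import secrets


def generate_pin(digits: int = 6) -> str:
    """Return a random numeric one-time password of the given length."""
    if digits < 4:
        # Assumed lower bound for this sketch; very short PINs are easy to guess.
        raise ValueError("PIN too short to resist guessing")
    # secrets draws from the operating system's cryptographic random source.
    return "".join(secrets.choice("0123456789") for _ in range(digits))
```

A numeric PIN is used here because it can be entered with a simple
remote control; a device with a richer input method could use a
larger alphabet.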
A user might have different X.509 certificates for each device.
[XEP-0189] can be used to manage the user's certificates. A client
SHOULD check the peer's PubSub node for certificates. This makes it
possible to use the password method only once between two users even
if one or both users switch clients. A user can also communicate
with a friend's bots: the two users first open a secure link between
their chat clients with a password and exchange their user
certificates. After that, each device of one user can verify all
devices of the other without needing a password.
The retrieved certificate from the PubSub node may be signed by a CA
the client can verify. In that case the client MAY skip the password
authentication and rely on the X.509 certificate chain. The client
SHOULD ask the user if the certificate should be accepted or if a
password exchange is desired.
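The choice between these verification paths can be summarized in a
small decision function. The sketch below is one possible reading of
the text above, not a normative algorithm; the enumeration and
parameter names are hypothetical.

```python
from enum import Enum


class Verification(Enum):
    ACCEPT = "accept"        # certificate is already known and trusted
    ASK_USER = "ask-user"    # CA-verifiable: let the user decide
    PASSWORD = "password"    # fall back to a one-time password exchange


def choose_verification(known_certificate: bool,
                        ca_verifiable: bool) -> Verification:
    """Pick a verification path for the peer's certificate."""
    if known_certificate:
        # Previously exchanged, or verified via the user certificate.
        return Verification.ACCEPT
    if ca_verifiable:
        # The client MAY rely on the chain but SHOULD ask the user first.
        return Verification.ASK_USER
    return Verification.PASSWORD
```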
9. Session Termination
If either client cannot verify the certificate of the peer or
receives an invalid message on the TLS layer, it MUST terminate the
Jingle session immediately by sending a Jingle session-terminate
message that includes a Jingle reason of .
The other party then acknowledges the session-terminate by sending an
IQ stanza of type "result" (not shown here), and the Jingle session
is finished.
10. Determining Support
If an entity wishes to request the use of XTLS, it SHOULD first
determine whether the intended responder supports the protocol. This
can be done directly via [XEP-0030] or indirectly via [XEP-0115].
If an entity supports XTLS, it MUST report that by including a
service discovery feature of "urn:xmpp:jingle:security:xtls:0" in
response to disco#info requests.
Both service discovery and entity capabilities information could be
corrupted or intercepted; for details, see Section 11.3.
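The direct check via [XEP-0030] amounts to looking for the XTLS
feature in a disco#info result. The following sketch shows one way
to do that; the function name is hypothetical, and the sample stanza
(including its addresses, its "id" value, and the other feature it
lists) is invented for illustration.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
XTLS_FEATURE = "urn:xmpp:jingle:security:xtls:0"

# Invented sample response, for illustration only.
SAMPLE = """<iq type='result' id='disco1'
    from='juliet@capulet.lit/balcony' to='romeo@montague.lit/orchard'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    <feature var='urn:xmpp:jingle:1'/>
    <feature var='urn:xmpp:jingle:security:xtls:0'/>
  </query>
</iq>"""


def supports_xtls(iq_xml: str) -> bool:
    """Return True if a disco#info result advertises the XTLS feature."""
    iq = ET.fromstring(iq_xml)
    if iq.get("type") != "result":
        return False
    query = iq.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return False
    features = {f.get("var")
                for f in query.findall(f"{{{DISCO_INFO_NS}}}feature")}
    return XTLS_FEATURE in features
```

Because this information is not protected, a negative answer does
not prove that the peer lacks XTLS support.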
11. Security Considerations
This entire document addresses security. Particular security-related
issues are discussed in the following sections.
11.1. Mandatory-to-Implement Technologies
An implementation MUST at a minimum support the "srp" and "x509"
methods. A future version of this specification will document
mandatory-to-implement TLS cipher suites.
11.2. Certificates
As noted, XTLS can be used between XMPP clients, between an XMPP
client and a remote XMPP service (i.e., a service with which a client
does not have a direct XML stream), or between remote XMPP services.
Therefore, a party to an XTLS bytestream will present either a client
certificate or a server certificate as appropriate. Such
certificates MUST be generated and validated in accordance with the
certificate guidelines provided in [rfc3920bis].
A future version of this specification might provide additional
guidelines regarding certificate validation in the context of client-
to-client encryption.
11.3. Denial of Service
Currently XMPP stanzas such as Jingle negotiation messages and
service discovery exchanges are not encrypted or signed. As a
result, it is possible for an attacker to intercept these stanzas and
modify them, thus convincing one party that the other party does not
support XTLS and therefore denying the parties an opportunity to use
XTLS.
This is a more general problem with XMPP technologies and needs to be
addressed at the core XMPP layer.
12. IANA Considerations
It might be helpful to create a registry of TLS methods that can be
used in the context of XTLS (e.g., "openpgp" for use of [RFC5081],
"srp" for use of [TLS-SRP], and "x509" for use of [TLS] with
certificates). The registry could be maintained by the IANA or by
the XMPP Registrar. A future version of this specification will
provide more detailed information about the registration
requirements.
13. References
13.1. Normative References
[rfc3920bis]
Saint-Andre, P., "Extensible Messaging and Presence
Protocol (XMPP): Core", draft-saintandre-rfc3920bis-09
(work in progress), March 2009.
[TERMS] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[TLS] Dierks, T. and E. Rescorla, "The Transport Layer Security
(TLS) Protocol Version 1.2", RFC 5246, August 2008.
[XEP-0047]
Karneges, J., "In-Band Bytestreams (IBB)", XSF XEP 0047,
November 2006.
[XEP-0166]
Ludwig, S., Beda, J., Saint-Andre, P., McQueen, R., Egan,
S., and J. Hildebrand, "Jingle", XSF XEP 0166,
December 2008.
13.2. Informative References
[DTLS] Rescorla, E. and N. Modadugu, "Datagram Transport Layer
Security", RFC 4347, April 2006.
[DTLS-SRTP]
McGrew, D. and E. Rescorla, "Datagram Transport Layer
Security (DTLS) Extension to Establish Keys for Secure
Real-time Transport Protocol (SRTP)",
draft-ietf-avt-dtls-srtp-07 (work in progress),
February 2009.
[HTTP-TLS]
Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.
[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
Text on Security Considerations", BCP 72, RFC 3552,
July 2003.
[rfc3921bis]
Saint-Andre, P., "Extensible Messaging and Presence
Protocol (XMPP): Instant Messaging and Presence",
draft-saintandre-rfc3921bis-07 (work in progress),
October 2008.
[RFC3923] Saint-Andre, P., "End-to-End Signing and Object Encryption
for the Extensible Messaging and Presence Protocol
(XMPP)", RFC 3923, October 2004.
[RFC5056] Williams, N., "On the Use of Channel Bindings to Secure
Channels", RFC 5056, November 2007.
[RFC5081] Mavrogiannopoulos, N., "Using OpenPGP Keys for Transport
Layer Security (TLS) Authentication", RFC 5081,
November 2007.
[TLS-SRP] Taylor, D., Wu, T., Mavrogiannopoulos, N., and T. Perrin,
"Using the Secure Remote Password (SRP) Protocol for TLS
Authentication", RFC 5054, November 2007.
[SCRAM] Menon-Sen, A., Melnikov, A., and C. Newman, "Salted
Challenge Response (SCRAM) SASL Mechanism",
draft-newman-auth-scram-10 (work in progress),
February 2009.
[XEP-0027]
Muldowney, T., "Current Jabber OpenPGP Usage", XSF
XEP 0027, November 2006.
[XEP-0030]
Hildebrand, J., Millard, P., Eatmon, R., and P. Saint-
Andre, "Service Discovery", XSF XEP 0030, June 2008.
[XEP-0045]
Saint-Andre, P., "Multi-User Chat", XSF XEP 0045,
July 2008.
[XEP-0060]
Millard, P., Saint-Andre, P., and R. Meijer, "Publish-
Subscribe", XSF XEP 0060, September 2008.
[XEP-0065]
Smith, D., Miller, M., and P. Saint-Andre, "SOCKS5
Bytestreams", XSF XEP 0065, May 2007.
[XEP-0115]
Hildebrand, J., Saint-Andre, P., Troncon, R., and J.
Konieczny, "Entity Capabilities", XSF XEP 0115,
February 2008.
[XEP-0160]
Saint-Andre, P., "Best Practices for Handling Offline
Messages", XSF XEP 0160, January 2006.
[XEP-0167]
Ludwig, S., Saint-Andre, P., Egan, S., McQueen, R., and D.
Cionoiu, "Jingle RTP Sessions", XSF XEP 0167,
December 2008.
[XEP-0189]
Paterson, I., Saint-Andre, P., and D. Meyer, "Public Key
Publishing", XSF XEP 0189, March 2009.
[XEP-0210]
Paterson, I., "Requirements for Encrypted Sessions", XSF
XEP 0210, May 2007.
[XEP-0218]
Saint-Andre, P. and I. Paterson, "Bootstrapping
Implementation of Encrypted Sessions", XSF XEP 0218,
May 2007.
[XEP-0234]
Saint-Andre, P., "Jingle File Transfer", XSF XEP 0234,
February 2009.
[XEP-0260]
Saint-Andre, P. and D. Meyer, "Jingle SOCKS5 Bytestreams
Transport Method", XSF XEP 0260, February 2009.
[XEP-0261]
Saint-Andre, P., "Jingle In-Band Bytestreams Transport",
XSF XEP 0261, February 2009.
Appendix A. XML Schema
The XML schema will be provided in a later version of this document.
Appendix B. Copying Conditions
Regarding this entire document or any portion of it, the authors make
no guarantees and are not responsible for any damage resulting from
its use. The authors grant irrevocable permission to anyone to use,
modify, and distribute it in any way that does not diminish the
rights of anyone else to use, modify, and distribute it, provided
that redistributed derivative works do not contain misleading author
or version information. Derivative works need not be licensed under
similar terms.
Authors' Addresses
Dirk Meyer
Universitaet Bremen TZI
Email: dmeyer@tzi.de
Peter Saint-Andre
Cisco
Email: psaintan@cisco.com