IP telephony security overview
CESNET
technical report number 35/2006
Miroslav Vozňák, Jan Růžička
4.12.2006
1 Abstract
The paper provides a basic overview of IP telephony security and focuses in particular on standardised protocols. Its first part explains the authentication mechanisms of the SIP and H.323 protocols; the second part deals with attacks, interdomain trust and DNS.
Keywords: security, SIP, H.323, authentication, attacks
2 Introduction
Most VoIP security issues result from the fact that many VoIP systems are built on complex foundations: existing IP networks, standard elements and well-known operating systems. The risk therefore includes all known problems in the field of IP and operating systems. A person able to attack any part of the Internet is also able to attack VoIP. IP telephony belongs to the applications that can be attacked quite easily, because the availability of the service depends on the availability and quality of the IP infrastructure. As regards VoIP, we will deal with issues concerning the security of signalling and of media transmission.
We should certainly consider a standard DoS attack. The attack does not even need to target the central elements of the VoIP system to deny the VoIP service. Tapped calls represent a huge risk. If one manages to capture RTP (Real-time Transport Protocol) packets en route, which can be done for example with Ethereal, it is easy to save them in the WAV format and replay the conversation (if the G.711 codec is used). SRTP (Secure Real-time Transport Protocol), specified in RFC 3711, enables encrypted transmission and was introduced to solve these problems. However, it is not yet commonly implemented in clients. Phil Zimmermann's ZRTP [ZJC06] represents another interesting alternative for protection against man-in-the-middle attacks. The signalling data also carry sensitive information, from which it is possible to read not only who called and how long the call lasted, but also where the particular person logged in. Using the gathered data, the attacker can try to change or tear down the session or steal the registration.
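To illustrate why unencrypted G.711 media is so exposed, the following minimal Python sketch strips the RTP header from captured packets, decodes the mu-law (PCMU) payload and writes a WAV file. It assumes the packets have already been captured and put in order; the function names are hypothetical, and packet loss, reordering and A-law payloads are not handled.

```python
import struct
import wave


def ulaw_to_pcm16(byte_value: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit linear sample."""
    u = ~byte_value & 0xFF            # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def rtp_payload(packet: bytes) -> bytes:
    """Return the payload of an RTP version 2 packet (RFC 3550 layout)."""
    first = packet[0]
    if first >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    offset = 12 + 4 * (first & 0x0F)  # fixed header plus CSRC entries
    if first & 0x10:                  # X bit: skip the header extension
        ext_words = struct.unpack_from("!H", packet, offset + 2)[0]
        offset += 4 + 4 * ext_words
    end = len(packet)
    if first & 0x20:                  # P bit: last byte holds the padding count
        end -= packet[-1]
    return packet[offset:end]


def write_wav(packets, out) -> None:
    """Write the decoded audio of ordered PCMU packets as 8 kHz mono WAV.

    `out` may be a file name or a writable binary file object.
    """
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        for packet in packets:
            payload = rtp_payload(packet)
            samples = [ulaw_to_pcm16(b) for b in payload]
            wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

With SRTP the payload is encrypted, so the same decoding step yields only noise unless the session keys are known.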
Further in this paper we discuss the issues relating to the most popular VoIP signalling protocols ITU-T H.323 and IETF SIP.
3 Authentication in H.323
H.323 [H.323] is a multimedia communication protocol for packet networks, published by ITU-T in 1996. It covers a wide range of standards. It is a robust system that uses ASN.1 syntax with PER (Packed Encoding Rules) encoding to represent the transported information. H.323 uses several layers of signalling: H.225.0 Q.931 signalling for call control, RAS (Registration, Admission, Status) for managing calls via a GK (Gatekeeper), and H.245 for media signalling. The media are transmitted via RTP.
The H.235 standard is the one that matters for authentication. It first appeared with the second version of H.323 (H.323v2) in 1998. It defines three authentication approaches: passwords with symmetric encryption, passwords with hashing, and public-key mechanisms. The information is delivered in the CryptoToken field of RAS or H.225.0 Q.931 messages. The following listing shows an example of the CryptoToken field of an RRQ (Registration Request) message.
cryptoTokens: 2 items
  Item 0
    Item: cryptoEPPwdHash (0)
      cryptoEPPwdHash
        alias: h323-ID (1)
          h323-ID: 950012315
        timeStamp: Feb 26, 2006 15:36:41.000000000
        token
          algorithmOID: 1.2.840.113549.2.5 (md5)
          paramS
          hash: 8BB5DFAE1F23EA0AA5C7E73C23B18639
4 Authentication in SIP
SIP (Session Initiation Protocol) [Ros02] is a protocol for the establishment, modification and tear-down of sessions. The core is described in RFC 3261. Various extensions, such as instant messaging (IM) and presence, are described in separate RFCs. SIP is modelled on HTTP and thus, unlike H.323, is text-oriented. The textual nature makes the protocol more legible and extensible, but it also makes it easier for a potential observer or attacker to misuse.
Due to its HTTP background, authentication in SIP uses HTTP Digest Access Authentication. The older version of the standard (RFC 2543) also allowed HTTP Basic authentication, but RFC 3261 deprecates it, so it can be neither required nor accepted.
Within the communication framework we also distinguish authentication between users (User-to-User) and between a proxy server and a user (Proxy-to-User). The former is mostly used in the registration process: the registration server is the final recipient of the request, and therefore the User-to-User method is applied. The communication is carried out as shown in the following listing. If the request does not carry the necessary credentials, the target (here the registration server) replies with 401 Unauthorized and a WWW-Authenticate header containing the challenge. The registering client then repeats the request with authentication data corresponding to the challenge in the Authorization header.
SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 1.2.3.4:49252;branch=z9hG4bK.6afb7404;rport=49253
From: sip:user@cesnet.cz;tag=6c2c90b8
To: sip:user@cesnet.cz;tag=c10ed4fff3e6fb17efd0bfbdcce87ce2.c76e
Call-ID: 1814859960@1.2.3.4
CSeq: 1 REGISTER
WWW-Authenticate: Digest realm="cesnet.cz", nonce="43eeaeb76e6eec559d737d4f4018dc659c5d282a"
Server: Sip EXpress router (0.9.5-pre1 (i386/linux))
Content-Length: 0

REGISTER sip:cesnet.cz SIP/2.0
Authorization: Digest username="user", uri="sip:cesnet.cz", algorithm=MD5, realm="cesnet.cz", nonce="43eeaeb76e6eec559d737d4f4018dc659c5d282a", response="9e83c39e8a7262901
Via: SIP/2.0/UDP 1.2.3.4:49252;branch=z9hG4bK.32f02bf2;rport
From: sip:user@cesnet.cz;tag=6c2c90b8
To: sip:user@cesnet.cz
Call-ID: 1814859960@1.2.3.4
CSeq: 2 REGISTER
Content-Length: 0
Max-Forwards: 70
Expires: 15
Contact: sip:user@1.2.3.4:49252
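The response value in the Authorization header is derived from the challenge according to HTTP Digest (RFC 2617). The following Python sketch shows the calculation for the MD5 algorithm, both without qop (as in the REGISTER above) and with qop, nc and cnonce (as used when the server offers qop=auth). The function names are hypothetical, and the password in the usage example is an assumption, since the listing does not reveal it.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the lowercase hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri, nonce,
                    qop=None, nc=None, cnonce=None):
    """Compute the Digest 'response' parameter (RFC 2617, algorithm=MD5)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")   # secret shared with the server
    ha2 = md5_hex(f"{method}:{uri}")                  # binds the method and request URI
    if qop is None:
        return md5_hex(f"{ha1}:{nonce}:{ha2}")
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


# Usage with the values from the listing and an assumed password "secret":
example = digest_response(
    "user", "cesnet.cz", "secret", "REGISTER", "sip:cesnet.cz",
    "43eeaeb76e6eec559d737d4f4018dc659c5d282a",
)
```

The password never travels over the network, but anyone who captures the challenge and the response can test candidate passwords offline, so weak passwords remain a risk.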
If a proxy server needs to authenticate the user before processing the request, it replies with 407 Proxy Authentication Required, and the Proxy-Authenticate header contains the challenge. The client then repeats the request with the corresponding data in the Proxy-Authorization header.
INVITE sip:mamut@iptel.org SIP/2.0
Max-Forwards: 10
Record-Route: <sip:5.6.7.8;ftag=5DAA94E7;lr=on>
Via: SIP/2.0/UDP 5.6.7.8;branch=z9hG4bK0a5d.90580ee2.0
Via: SIP/2.0/UDP 1.2.3.4:5062;branch=z9hG4bK2E1FD348
CSeq: 262 INVITE
To: <sip:mamut@iptel.org>
Proxy-Authorization: Digest username="bbb", realm="ces.net", nonce="43788e90381194d66364fced4dc7097828391e81", uri="sip:mamut@iptel.org", cnonce="abcdefghi", nc=00000001, response="ed4adec8
Content-Type: application/sdp
From: "Franta Vomacka" <sip:bbb@ces.net>;tag=5DAA94E7
Call-ID: 379332994@1.2.3.4
Subject: sip:bbb@ces.net
Content-Length: 234
User-Agent: kphone/4.2
Contact: "Franta Vomacka" <sip:bbb@1.2.3.4:5062;transport=udp>

v=0
o=username 0 0 IN IP4 1.2.3.4
s=The Funky Flow
c=IN IP4 1.2.3.4
t=0 0
m=audio 33728 RTP/AVP 0 97
a=rtpmap:0 PCMU/8000
a=rtpmap:97 iLBC/8000
The protocol enforces no strict relation between the originator in the From header and the user in the Authorization header of the same message. The authenticating entity should therefore keep a list of URIs (one or more) associated with each set of authentication data used in Authorization headers. This is necessary to prevent identity theft, which could occur if a correctly authenticated user inserts someone else's identity into the From or To header.
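A minimal Python sketch of such a check follows. The table ALLOWED_URIS, the function names and the simplified header parsing are assumptions made for illustration; a real server would use a full SIP parser and its own user database.

```python
# Hypothetical mapping: authenticated user name -> URIs the user may claim.
ALLOWED_URIS = {
    "user": {"sip:user@cesnet.cz"},
    "bbb": {"sip:bbb@ces.net"},
}


def addr_spec(header_value: str) -> str:
    """Extract the bare URI from a From/To header value.

    Drops the display name, header parameters (such as tag) and URI
    parameters, and lowercases the host part. Simplified parsing only.
    """
    value = header_value.strip()
    if "<" in value and ">" in value:
        value = value[value.index("<") + 1:value.index(">")]
    uri = value.split(";", 1)[0].strip()
    scheme_user, sep, host = uri.rpartition("@")
    return f"{scheme_user}@{host.lower()}" if sep else uri.lower()


def from_is_permitted(auth_username: str, from_header: str,
                      allowed=ALLOWED_URIS) -> bool:
    """Return True if the authenticated user may use the identity in From."""
    return addr_spec(from_header) in allowed.get(auth_username, set())
```

For the INVITE shown earlier, from_is_permitted("bbb", '"Franta Vomacka" <sip:bbb@ces.net>;tag=5DAA94E7') returns True, while the same credentials combined with a From header of sip:user@cesnet.cz would be rejected.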
4.1 Integrity and confidentiality
Authentication itself does not protect the message from modification. How should we deal with the integrity and confidentiality of the message? One way is to use S/MIME in the message body. The content of the message body, described by Content-Type, need not be only SDP. The body can also carry a signed digest of message headers, so that the recipient can verify whether the message was modified in transit. Some parts of the message, for instance the Request-URI, Route and Via, change along the way, so only persistent headers (To, From, CSeq, Call-ID, Contact) should be taken into consideration. However, a problem can arise with the Contact header if a client behind NAT sends a private address, which is then rewritten by a helping intermediary (home proxy).
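The idea of covering only the persistent headers can be sketched in Python as follows. This is not S/MIME: it only computes a SHA-256 digest over a simple canonical form of the selected headers, and the signing step is omitted. The function name and the canonical form are assumptions, and header folding, compact header names and repeated headers are not handled.

```python
import hashlib

# Headers that should survive transport unchanged, as listed in the text.
PERSISTENT_HEADERS = ("to", "from", "cseq", "call-id", "contact")


def persistent_header_digest(message: str) -> str:
    """Return a SHA-256 hex digest over the persistent headers of a SIP message."""
    head = message.split("\r\n\r\n", 1)[0]
    selected = {}
    for line in head.split("\r\n")[1:]:          # skip the start line
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in PERSISTENT_HEADERS:
            selected[key] = value.strip()
    # Fixed header order, so the digest does not depend on header ordering.
    canonical = "\r\n".join(
        f"{key}:{selected[key]}" for key in PERSISTENT_HEADERS if key in selected
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A proxy that adds a Via header leaves the digest unchanged, whereas a rewritten From or Contact header changes it, which is exactly the NAT problem with Contact mentioned above.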
Confidentiality can also be provided by S/MIME. The SDP in the message body can be encrypted to prevent an eavesdropper from seeing or changing the IP addresses, ports and codecs that will be used for the communication. It is also possible to tunnel the whole SIP message in the body and hide the identity of the caller. This approach brings end-to-end integrity and confidentiality, but it can limit the functionality of some elements, such as session border controllers.
Another solution is to use an encrypted communication channel, either IPsec or TLS (Transport Layer Security). The certificates of both sides should preferably be verified during connection negotiation, so that the secured channel really protects the signalling messages. TLS and IPsec do not encrypt the whole path between the endpoints, only the connection to the next hop (hop-by-hop). TLS and S/MIME can be combined.
In practice, at least four entities are usually involved in the communication: the clients of the caller and of the called party, and the proxies of both participants.
For security reasons, at least the first hop (from the caller's client to its home proxy) should use an encrypted channel. Once the secured channel is established, it should be used to protect all messages against unauthorized changes that could lead to a breach or misuse of the policies by which the home proxy asserts the domain identity or other user-related headers.
5 Interdomain trust
Currently, IP telephony is still deployed in the form of closed islands. Full utilisation of the potential of IP telephony requires interdomain communication, which naturally raises certain issues. Building a closed island, with mobile or remote clients connected using VPN, is not difficult. The communication between domains can also be confined to pre-built IPsec channels, but the amount of administration required from both partners is relatively high and the solution scales poorly. If we face possibly hundreds of subjects to interconnect, it is necessary to choose another solution. The problem has two parts: finding the target and establishing communication. Both steps need to be trustworthy.
One possible solution is to build a hierarchy of elements that may, but do not have to, use the same communication protocol as the desired communication. A practical example is GDS (Global Dialling Scheme), a hierarchy of H.323 gatekeepers (Figure) that enables communication across a wide global research and education community. There are several so-called global gatekeepers, to which the national ones are linked. Gatekeepers of particular institutions are attached below the national gatekeepers. The system is tied directly to the H.323 protocol and cannot be used directly for SIP. The trust fabric is not strong, as it is based only on configured links between the elements. As in any hierarchical system, the system depends on the availability of upper-level elements. A failure of one of them can seriously hinder the functionality of the whole sub-tree.
A similar hierarchical approach can be seen in Eduroam, for which a hierarchy of RADIUS servers was built. In the Czech Republic, IPsec tunnels are used between the national and institutional levels, and the same approach could be used to secure the gatekeeper hierarchy. Configuring and monitoring IPsec tunnels brings another level of complexity into the solution, but at least the signalling is secured, even if only on a hop-by-hop basis.
Another alternative involves a distributed AAI (authentication and authorization infrastructure). In IP telephony, it can work as follows. The subjects agree upon mutual trust, i.e. on a trustworthy method and form of delivering an authenticated identity. The client logs in to its home server and makes a call to another institution. The request passes through the home server, where it is authenticated and the authentication headers of the home domain are replaced with a defined external identity, which states that the caller has been authenticated and has the permission of the home institution to establish such a connection. The recipient of such a request verifies the validity and trustworthiness of the asserted identity and can then permit the connection. The system can, of course, be made more sophisticated by applying local policies, and it can take into account additional attributes, such as the caller's affiliation within the institution (student, teacher).
As regards SIP, the above-mentioned approach can be found in the sip-identity model [PJ06], which uses a PKI and the signing of selected headers. Another mechanism, using SAML assertions, is described in sip-saml [Tsc06].
6 Attacks
The danger of tapping and tampering with message bodies is present in IP telephony as in any communication. Tampering can, for example, cause a subscriber's calls to be redirected, or a call in progress to be modified or torn down. There are also attacks that limit the use of the service and bother the user, such as SPIT (Spam over IP Telephony). The demands on the resources of an IP telephony spammer are higher than in the case of e-mail spam: the sender needs incomparably higher bandwidth and stronger hardware. These demands, and possibly the lower number of potential recipients in comparison with e-mail, are the most probable reasons why we are not yet affected by SPIT. Protection against this form of communication is harder in IP telephony because it is real-time communication, so a call cannot be placed in a queue and analysed for a longer time. Authentication and authorization could serve as protection. However, the availability of the service at the interdomain level could be hindered by the use of unsuitable AA methods.
A DoS (Denial of Service) attack in IP telephony takes effect faster, and it is therefore more difficult to protect against. Making all phones in a company ring requires a significantly lower volume of messages than overloading an e-mail server. The effect of a permanently ringing phone and the impossibility of making a call is much more annoying than a half-hour delay of an e-mail. The defence could be based on limiting the frequency of messages from a particular source, but the efficiency of this method is reduced in the face of distributed attacks.
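Limiting the message frequency per source can be sketched in Python with a sliding window. The class name, the limits and the use of the source IP address as the key are assumptions for illustration; a production proxy would also need to expire idle sources and cope with spoofed addresses.

```python
from collections import defaultdict, deque


class SourceRateLimiter:
    """Allow at most max_messages per source within window_seconds."""

    def __init__(self, max_messages: int, window_seconds: float):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self._seen = defaultdict(deque)   # source -> timestamps of accepted messages

    def allow(self, source: str, now: float) -> bool:
        """Return True if a message from source at time now should be processed."""
        timestamps = self._seen[source]
        # Forget accepted messages that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_messages:
            return False
        timestamps.append(now)
        return True


# Usage: accept at most 2 requests per second from each source address.
limiter = SourceRateLimiter(max_messages=2, window_seconds=1.0)
decisions = [limiter.allow("1.2.3.4", t) for t in (0.0, 0.1, 0.2)]
```

Because the counters are kept per source, an attack distributed over many addresses stays below each individual limit, which is the weakness noted above.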
7 DNS
Like most of today's IP systems, IP telephony depends more and more on DNS. Beyond the plain translation of a name to the corresponding IP address, IP telephony uses more advanced records, such as ENUM and SRV records. SRV records are used to locate the servers that serve a given domain. The trustworthiness of such records is very important. For example, if one manages to fake an SRV record, requests will be routed to a totally different device and the whole domain will become unavailable. The same applies to the A records of servers, but the example is meant to illustrate the existence of additional sensitive points. Some adverse effects of redirection can be limited, for example, by using mutual authentication at the start of the TLS connection between the signalling entities.
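To make the role of SRV records concrete, the following Python sketch builds the owner name a client would query for a SIP domain and orders a set of answers by priority. No DNS query is performed; the record type, the function names and the sample values are assumptions, and the weighted random selection that RFC 2782 prescribes within one priority is replaced here by a simple deterministic ordering.

```python
from typing import List, NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_owner_name(domain: str, transport: str = "udp", secure: bool = False) -> str:
    """Return the SRV owner name for a SIP domain, e.g. _sip._udp.cesnet.cz."""
    service = "sips" if secure else "sip"
    return f"_{service}._{transport}.{domain.rstrip('.')}"


def order_candidates(records: List[SrvRecord]) -> List[SrvRecord]:
    """Order servers by ascending priority, then by descending weight.

    Simplification: RFC 2782 asks for a weighted random choice among
    records that share the same priority.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))


# Hypothetical answer set for _sip._udp.cesnet.cz:
answers = [
    SrvRecord(20, 0, 5060, "backup.cesnet.cz"),
    SrvRecord(10, 50, 5060, "sip1.cesnet.cz"),
]
first_choice = order_candidates(answers)[0].target
```

Whoever controls the answer to this query controls where the signalling for the whole domain is sent, which is why the peer should additionally be verified by its TLS certificate.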
ENUM enables the translation of a phone number into URIs, i.e. identifiers of services such as IP telephony, e-mail or the web. The system can significantly simplify finding the way to the called person and the administration of routing data. Its simplicity is, however, also its weakness, which stems from the way DNS is built and used. The digits of a phone number form a tree, like ordinary records, but with each label always containing only a single digit. This significantly simplifies walking the whole tree and finding VoIP-enabled destinations. Again, this puts emphasis on protection in the routing elements of IP telephony networks, for example by using interdomain authentication.
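The mapping from a phone number to an ENUM domain name, and the reason the tree is easy to walk, can be sketched in Python as follows. The function names and the sample number are made up for illustration, and no DNS queries are issued.

```python
def enum_domain(number: str, suffix: str = "e164.arpa") -> str:
    """Convert an E.164 number into its ENUM domain name.

    All non-digit characters are dropped, the digits are reversed and
    each digit becomes one DNS label under the suffix.
    """
    digits = [c for c in number if c in "0123456789"]
    if not digits:
        raise ValueError("number contains no digits")
    return ".".join(reversed(digits)) + "." + suffix


def child_names(domain: str) -> list:
    """List every possible child of an ENUM node: one label per digit."""
    return [f"{digit}.{domain}" for digit in "0123456789"]


name = enum_domain("+420 123 456 789")
# name == "9.8.7.6.5.4.3.2.1.0.2.4.e164.arpa"
```

Since every node has at most ten possible children, an attacker can enumerate a numbering block by querying child_names() level by level, without guessing names as would be necessary in an ordinary DNS zone.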
DNSSEC is supposed to protect against faked DNS records in general. Nevertheless, widespread use of DNSSEC is not in sight. We should be aware of the potential threat and use additional methods, such as TLS and certificates, to assure the identity of the peer.
8 Conclusion
We have discussed authentication methods and their limits and summarized possible attacks on VoIP and on the underlying systems used by VoIP. We have offered several possible solutions and recommendations for securing VoIP communications.
References
| [ZJC06] | Zimmermann P., Johnston A. (Ed.), Callas J.: ZRTP: Extensions to RTP for Diffie-Hellman Key Agreement for SRTP. Internet draft draft-zimmermann-avt-zrtp-02.txt, IETF, October 2006. Work in progress. |
| [H.323] | ITU-T: Recommendation H.323. ITU-T, July 2003. |
| [Ros02] | Rosenberg J. et al.: SIP: Session Initiation Protocol. RFC 3261, IETF, June 2002. Available online. |
| [PJ06] | Peterson J., Jennings C.: Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP). RFC 4474, IETF, August 2006. Available online. |
| [Tsc06] | Tschofenig H. et al.: SIP SAML Profile and Binding. Internet draft draft-ietf-sip-saml-01.txt, IETF, October 2006. Work in progress. |