
JOURNAL OF COMPUTING, VOLUME 4, ISSUE 10, OCTOBER 2012, ISSN (Online) 2151-9617 https://sites.google.com/site/journalofcomputing WWW.JOURNALOFCOMPUTING.ORG

SPDY: A new Protocol for the Web and its impact on Traffic Classification - An Overview
Michael Finsterbusch, Chris Richter, Klaus Hängen and Jean-Alexander Müller
Abstract: This paper gives a short overview of the SPDY protocol. We point to the protocol design, protocol constraints and the implications for the TLS protocol. The impact of SPDY on different kinds of deep packet inspection and the impact on the TLS extension Next Protocol Negotiation are discussed. Furthermore, we tested the 100,000 most popular web sites to determine how many of them support SPDY. We could see a widespread increase in the number of domains and web servers that support the SPDY protocol.
Keywords: Internet; Protocol architecture; Traffic analysis; Web server

1 INTRODUCTION

The new protocol SPDY [1] is designed to reduce the latency of web pages. (SPDY is not an acronym; the name is derived from the English word "speedy" and is pronounced the same way.) The first draft of the protocol was published in 2009 by Google on their development platform [2]. In February 2012, version 3 of the SPDY protocol was submitted as an Internet-Draft to the HTTPbis working group [3] of the Internet Engineering Task Force (IETF). The HTTPbis working group started the development process for the future version of the Hypertext Transfer Protocol (HTTP), HTTP/2.0, in January 2012. SPDY fulfils all requirements for the future HTTP that were set by the HTTPbis working group. Hence, the SPDY protocol, or intrinsic parts of it, will become HTTP/2.0 or be part of it. It is therefore useful to inspect the SPDY protocol, because a change in the HTTP protocol [4] affects more than 50% of Internet traffic [5], [6].

Due to significant changes in frame format, traits and behaviour in the transition from HTTP to SPDY, new mechanisms for detecting SPDY must be added to network devices with application-protocol-specific behaviour, such as firewalls, traffic management systems, and intrusion detection/prevention systems. For traffic classification, deep packet inspection (DPI) techniques can be used. In general, there are four methods for deep packet inspection. These methods are based on transport protocol port numbers, protocol decoding, pattern matching or heuristics [7]. All of these methods are used to classify network traffic as HTTP, or usually simply as web traffic. Due to the new protocol design, some of these methods could become useless for SPDY classification. These consequences for deep packet inspection are discussed in this paper.

The paper is structured as follows: Section 2 describes the new design of the SPDY protocol. In Section 3, the changes in the Transport Layer Security (TLS) protocol are outlined for application protocol multiplexing, which is required for HTTP and SPDY interoperability. Section 4 describes how the statistical data for Section 5 was collected, and Section 5 shows how widely the SPDY protocol is currently deployed and used. Finally, Section 6 concludes the paper and summarises our findings.

M. Finsterbusch, C. Richter and K. Hängen are with the Faculty of Computer Science, Mathematics and Natural Sciences, HTWK Leipzig, University of Applied Science, Germany. J.-A. Müller is with the Dept. of Communication and Computer Science, Hochschule für Telekommunikation Leipzig, Germany.

2 SPDY
2.1 Overview

The goal of the SPDY protocol design is to speed up the transfer of and access to web pages. Furthermore, SPDY attempts to preserve the existing semantics of HTTP in order to avoid changes to or rewriting of existing web applications. The high latency of web pages results from the HTTP protocol design. The main disadvantages of HTTP that SPDY addresses are:

• HTTP uses a text-based protocol header, which needs costly text parsing.
• HTTP is stateless; therefore, most request/response messages contain much redundant information (user agent, server, supported encodings and languages, etc.).
• For concurrency, HTTP uses multiple TCP connections, which results in additional delay for TCP connection establishment and TCP slow-start.
• A typical HTTP header can reach a size of 700 to 800 bytes [2], because of cookies or gabby (i.e., wordy) web browsers or web servers.

SPDY provides an additional session layer for HTTP to multiplex all request/response messages belonging to one session into one SPDY session, and transmits all data over only one TCP connection. SPDY uses a new binary protocol header for faster packet processing and compresses the HTTP header with zlib [8]. Optionally, SPDY can also compress the whole payload. To provide security, SPDY always uses Transport Layer Security (TLS) [9]. Additionally, it offers an improvement called server push, which the web server can use to send data belonging to a request without an additional client request. This feature can reduce the latency of loading all contents of a web page and also supports web applications that need a server push function. Today, the server push function is implemented by client-side server polling with AJAX (Asynchronous JavaScript and XML) [10], which causes additional traffic.

Any HTTP client request (or server push) starts a new SPDY stream. All streams are multiplexed across one TCP connection. A SPDY protocol frame consists of a stream header containing information for stream management (stream id, flags, type, length), followed by the HTTP data. The HTTP header lines are encoded as length-value tuples that separate the header name and header value into the so-called Name/Value header block for faster HTTP header parsing and processing. The Name/Value header block is always compressed.

2.2 Consequences for DPI

At first view, it would still seem possible to use port-based classification to detect SPDY, or at least to classify SPDY as well as HTTPS (HTTP with SSL/TLS) as web traffic, because SPDY uses TCP port 443, as does HTTPS. However, with the TLS extension Next Protocol Negotiation (NPN), there is the problem that other protocols could also use SSL/TLS over TCP port 443 (see Section 3). Protocol decoding and the use of patterns will also fail for SPDY classification because of the obligatory use of encryption with TLS. These methods have no access to the plain SPDY frames, which is necessary to observe the frame structure, follow the session and search for patterns.

Heuristic-based classification seems to be the only feasible method to identify the SPDY protocol. Heuristics use statistical parameters like frame size, inter-arrival time, and the duration of a session or connection, as well as the symmetry or asymmetry of these statistical parameters for the client-server interconnection [7], [11]. However, the heuristics for SPDY will differ significantly from those of HTTP. Due to compression of HTTP headers and payload, the frame sizes and frame count per request/response will be reduced. The duration of a session and of a connection will become much longer. Additionally, the packet sequences between client and server can vary between different SPDY sessions, SPDY servers and web applications, because the server push function can result in a very asymmetric packet flow. Furthermore, any SPDY session starts with a TLS handshake, which has its own heuristic fingerprint. This could also complicate the protocol prediction and obfuscate or confuse the heuristics.

3 TLS NEXT PROTOCOL NEGOTIATION
3.1 Overview

Due to a lack of backward compatibility from SPDY to HTTP, an extension of the Transport Layer Security (TLS) [9] protocol, which is used by SPDY and HTTP to cryptographically protect their communication, is needed for HTTP and SPDY coexistence. The Next Protocol Negotiation (NPN) extension for TLS [12] is a simple mechanism that allows the application layer to negotiate which protocol should be performed over the secure connection.

A TLS client can announce its NPN capability in its ClientHello message. The server then adds the NPN extension to its ServerHello as well. The content of the server's NPN extension is a list of available protocols. This list is a concatenation of the available protocol names in the so-called Pascal string format, which uses a leading byte to store the length of the string, e.g., 0x06 "spdy/2" 0x06 "spdy/3" 0x08 "http/1.1". All content of the ServerHello message is unencrypted. The client announces its protocol selection with the NextProtocol message after its ChangeCipherSpec message and before the Finished message, which means that the client's selection is encrypted and not accessible for protocol observation.

3.2 Consequences for DPI

Due to encryption, protocol classification of TLS-protected traffic is not possible with absolute certainty. It will always depend on calculated predictions and assumptions. Therefore, heuristics are often used for determining the kind of TLS-protected application. However, TLS-NPN provides a new approach for DPI. If the NPN extension is used, protocol decoding and pattern-based DPI methods can be used to find the protocols offered for negotiation. It is thereby possible to limit the feasible protocols or applications, e.g., to SPDY or HTTP for web applications. With this limitation, additional dedicated protocol validation methods can be applied to find out which protocol is used inside TLS, if necessary. This would improve DPI for TLS-protected protocols.

However, if TLS-NPN becomes more popular due to SPDY's increased usage, system administrators may decide to use only one TCP port to provide all TLS-protected protocols. Then, protocol multiplexing would no longer be done by the TCP port but with TLS-NPN. This could make some tasks, such as the design of firewalls and load balancers or the protection of unsecured applications, simpler for administrators. In such a case, many applications (e.g., web servers, e-mail systems, TLS-VPN, administrative access) could be used with only one TLS entry point supporting NPN. In this case, the benefits of TLS-NPN for DPI would be defeated. Additionally, there is the problem that the TLS-NPN draft also allows a client to select an unadvertised protocol: "There may be cases where the client knows, via other means, that a server supports an unadvertised protocol. In these cases the client can simply select that protocol." [12].

3.3 Discussions on TLS-NPN

In January 2010, TLS-NPN was first suggested as a draft to the IETF TLS working group. Since then, NPN has been the subject of controversial discussion [13]. A key point of discussion has been whether NPN is needed at all, because application-layer protocol multiplexing is usually done by transport-layer port numbers. The first drafts of NPN did not describe the extension sufficiently for implementation, and numbers for the NPN extension have not yet been assigned by IANA (Internet Assigned Numbers Authority). Finally, it seemed that the NPN extension would not be added to the TLS specification, because draft 02 expired in October 2011 without an updating draft. All previous drafts had been updated by a new draft immediately after their expiry date. In April 2012, a new draft [12] was submitted, which fulfils most requirements of the TLS working group. Now, there are no formal or technical discussions about the need for NPN in the TLS working group. Hence, a new approach for Private Use extensions is being discussed to make the integration of future extensions easier and to avoid problems with interoperability.

One point of critique was that Google used the not yet standardized NPN extension for SPDY on all their web servers in the public Internet. Some members of the TLS working group raised concerns that the market power of Google could result in two distinct standards, one of the IETF and one of Google, because some implementations based on Google's TLS-NPN implementation already exist.

Due to SPDY and other extensions suggested for TLS that also involve unencrypted information exchange, a new idea is currently being discussed in the working group. To avoid weak points in TLS caused by more unencrypted information, a post-Auth/post-Finished phase could be added to exchange the extension data in encrypted form after the authentication phase [13]. Hence, the discussions on NPN could result in a more flexible, more easily extensible and more secure TLS protocol.
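As a rough illustration of Sections 3.1 and 3.2, the following Python sketch decodes a protocol list in the length-prefixed format described there and uses it to narrow down the candidate protocols of a TLS connection. It assumes that the payload of the NPN extension has already been extracted from an unencrypted ServerHello. The function names and the grouping into coarse protocol classes are hypothetical and are not taken from the NPN draft [12] or from any DPI product.

```python
from typing import List, Set


def decode_npn_protocol_list(payload: bytes) -> List[str]:
    """Split a length-prefixed ("Pascal string") protocol list into names."""
    protocols = []
    pos = 0
    while pos < len(payload):
        length = payload[pos]              # leading byte stores the name length
        pos += 1
        name = payload[pos:pos + length]
        if length == 0 or len(name) != length:
            raise ValueError("malformed or truncated protocol list")
        protocols.append(name.decode("ascii"))
        pos += length
    return protocols


def encode_npn_protocol_list(protocols: List[str]) -> bytes:
    """Inverse of decode_npn_protocol_list, useful for building test input."""
    out = bytearray()
    for name in protocols:
        raw = name.encode("ascii")
        if not 1 <= len(raw) <= 255:
            raise ValueError("protocol name must be 1 to 255 bytes long")
        out.append(len(raw))
        out += raw
    return bytes(out)


def candidate_protocols(payload: bytes) -> Set[str]:
    """Hypothetical DPI helper: map advertised names to coarse classes."""
    classes = set()
    for name in decode_npn_protocol_list(payload):
        if name.startswith("spdy/"):
            classes.add("SPDY")
        elif name.startswith("http/"):
            classes.add("HTTP")
        else:
            classes.add("other")
    return classes


if __name__ == "__main__":
    example = b"\x06spdy/2\x06spdy/3\x08http/1.1"
    print(decode_npn_protocol_list(example))
    print(sorted(candidate_protocols(example)))
```

Because the client's selection is encrypted and the client may select an unadvertised protocol, such a result can only restrict the set of likely protocols. It cannot confirm which protocol is actually in use.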

4 TOOL

For test purposes and to gather statistics about SPDY usage (see Section 5), we wrote the tool spdy_ping. It is a command-line tool to check SPDY support and latency of web servers. It uses the SPDY ping frame, which is designed for measuring a minimal round-trip time. The ping frame therefore consists only of the common SPDY header (8 bytes with version number, type and length field) and a 4-byte ping identifier to differentiate individual ping messages [1]. The tool supports different SPDY server implementations with and without TLS-NPN and, for debugging purposes, also SPDY without TLS encryption. For SPDY servers that negotiate SPDY by TLS-NPN, a plain SPDY ping message is sent. If TLS does not support NPN, or if TLS is not used at all, four additional bytes are added to the ping message. If TLS-NPN is not supported, or if SPDY is not negotiated, it is possible that the SPDY ping message is sent to an HTTP server, so we add the HTTP end-of-header mark (an empty line: Carriage Return + Line Feed) to avoid waiting until the server times out. If an HTTP server receives such a message, it answers immediately with an HTTP 400 Bad Request message. The tool spdy_ping is available online at [14].
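The ping frame described above can be sketched in Python as follows. The sketch assumes the control frame layout of the SPDY draft [1] (one control bit plus a 15-bit version, a 16-bit type, 8 flag bits and a 24-bit length), that the PING control frame has type 6, and that the four additional bytes are the sequence CR LF CR LF. The function names are illustrative and are not taken from the source code of the authors' tool.

```python
import struct

PING_TYPE = 6                      # assumed type code of the PING control frame
HTTP_END_OF_HEADER = b"\r\n\r\n"   # assumed form of the four additional bytes


def build_ping_frame(ping_id: int, version: int = 3,
                     http_fallback: bool = False) -> bytes:
    """Build a 12-byte PING frame: 8-byte common header plus 4-byte identifier."""
    if not 0 <= version < (1 << 15):
        raise ValueError("version must fit into 15 bits")
    if not 0 <= ping_id < (1 << 32):
        raise ValueError("ping id must fit into 32 bits")
    control_and_version = 0x8000 | version     # highest bit marks a control frame
    flags = 0
    length = 4                                 # payload is the 4-byte ping id
    flags_and_length = (flags << 24) | length
    frame = struct.pack("!HHII", control_and_version, PING_TYPE,
                        flags_and_length, ping_id)
    if http_fallback:
        # Lets a plain HTTP server reject the request at once instead of
        # waiting for more header lines.
        frame += HTTP_END_OF_HEADER
    return frame


def parse_ping_frame(frame: bytes) -> dict:
    """Decode the first 12 bytes of a PING frame into its fields."""
    if len(frame) < 12:
        raise ValueError("a PING frame has at least 12 bytes")
    control_and_version, frame_type, flags_and_length, ping_id = struct.unpack(
        "!HHII", frame[:12])
    if not control_and_version & 0x8000 or frame_type != PING_TYPE:
        raise ValueError("not a PING control frame")
    return {
        "version": control_and_version & 0x7FFF,
        "length": flags_and_length & 0xFFFFFF,
        "ping_id": ping_id,
    }


if __name__ == "__main__":
    print(build_ping_frame(1).hex())
    print(parse_ping_frame(build_ping_frame(7, version=2)))
```

A round-trip time measurement would record the time between sending such a frame and receiving the reply that carries the same identifier.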

5 STATISTICS

To observe the spread of the SPDY protocol, we used the tool described in Section 4 to test the most frequently visited web sites for SPDY support. A list of the top one million most frequently visited web sites is provided by Alexa Internet [15] and is updated daily. We used this list to test the 100,000 most frequently visited web sites. We ran consecutive automated tests to observe the increasing usage of the SPDY protocol from April 2012 to September 2012. Table 1 shows the test results. The tests show that currently less than 1% of the 100,000 most frequently visited web sites are reachable via SPDY.

Many web sites are hosted by a web hosting service, which means that not every web site has its own web server. Therefore, we checked who had registered the host names of the web sites. By querying whois¹ databases, we were able to find out which web hosting service provider or organisation is responsible for each individual web site. We used the netname attribute from whois as the organisation name. Table 1 shows which providers host web servers with SPDY support. Additionally, Table 1 contains the number of domains served by the individual organisations. In fact, most SPDY servers are hosted by Google. Only about 1% of the web sites available using SPDY are not hosted by Google.

However, we can see that the number of web sites that can be accessed using SPDY reached its peak in June 2012. Afterwards, the number of SPDY domains decreased slightly. The number of web hosting service providers or organisations also peaked in June 2012. Thereafter, the number of organisations hosting web servers with SPDY support roughly halved by September 2012. This development may have several reasons. SPDY is currently under development; web hosting service providers may therefore test SPDY only for a period of time. Or some domains that support SPDY simply dropped out of the top 100,000 most frequently visited web sites.

The absolute number of domains or web hosting services that support SPDY is only one side of the coin. The rank of the individual web sites must also be considered, because a web site with a high rank generates more traffic than sites with a lower rank. Figure 1 shows a histogram of the distribution of SPDY use over the 100,000 most frequently visited web sites. The histogram aggregates 1000 ranks into one point in the diagram. As we can see, about 7% of the 1000 most frequently visited web sites support SPDY. In the range from rank 1000 to 100,000, SPDY support is on average between 0.5% and 1%. Among the top 100 most frequently visited web sites, SPDY support is over 20%.

1. whois is a protocol for querying databases that store the registered users or assignees of an Internet resource, such as a domain name, an IP address block, or an autonomous system.

6 CONCLUSION AND FURTHER WORK

The launch of the SPDY protocol has not been and will not be noticeable for most web users, web application engineers and network operators. However, it will be noticeable for all who provide products like firewalls, intrusion detection systems, intrusion prevention systems, or other content-driven network devices. Currently, the usage of SPDY is negligible, but this could increase rapidly in the future. The widely used web browsers Mozilla Firefox [16] and Google Chrome [2] have SPDY support, so many users can use SPDY without being aware of it. SPDY support in web servers is currently not as good as in web browsers. For the most used web server, Apache [17], SPDY support is available as an optional module [18], which must be installed and configured separately. Other web servers like nginx [19] do not currently provide SPDY; SPDY support is planned for future versions, and a SPDY patch is available but still under development. If SPDY becomes a proposed standard or increases in popularity, web servers could support it in their default installation and default configuration. Then, the usage of SPDY will increase significantly.

Although only about 1% of the web sites support SPDY, over 20% of the top 100 web sites and about 7% of the top 1000 web sites do. A major part of global Internet traffic is generated by the top 100 web sites. For example, the video streaming platform YouTube (rank 3 since 2011 [15]) was responsible for over 20% of mobile traffic in 2011 [20], and it supports SPDY. This means that much web content is already reachable via SPDY even though the total number of web sites with SPDY support is low.

These facts show that it is necessary to consider SPDY in content-specific network management systems such as firewalls, network intrusion detection systems or network traffic management. The detection, filtering, management and traffic engineering of web traffic will become more difficult with SPDY. Most common DPI techniques will fail on SPDY. Additionally, due to TLS-NPN, protocol multiplexing by transport protocol port numbers is no longer necessary. All applications or protocols which use TLS can share the same port. This enables fast and secure firewall configurations, TLS load balancing, etc., but makes protocol or application recognition more difficult.

In future we will continue to follow the SPDY standardisation process and its increasingly widespread usage. Furthermore, we want to investigate heuristics for SPDY classification. To this end, we will have to compare SPDY and HTTPS traffic to evaluate whether heuristic-based protocol prediction is possible. In past work [11] we determined that the protocol asymmetry of communication from client to server and from server to client can increase the protocol classification accuracy. Thus, the separation of SPDY and HTTPS may be possible due to the more asymmetric behaviour of SPDY (server push). Other distinctions could also result from SPDY's header compression and stream multiplexing.

TABLE 1: Use of SPDY on top 100,000 most frequently visited web sites.

Measurement Date    Domains    Organisations
11. Apr 2012        490        5
11. May 2012        495        9
11. Jun 2012        868        32
11. Jul 2012        805        29
11. Aug 2012        815        24
11. Sep 2012        811        17

Organisations (whois netname) listed in Table 1: AMAZON-EU-AWS, APPLIEDINNOVATIONS, BLACK-LOTUS-COMMUNICATIONS, BIZLAND-FC01, BODIS-COM, CALPOP-NETWORK, CLOUDFLARENET, CNC-BJ-IDC, DIMECNET, DIMENOC, DREAMHOST-BLK7, GO-DADDY-COM-LLC, GOOGLE, GOOGLECN, HETZNER-RZ*, HGBLOCK-*, HOSTZILLA-LTD, Hurricane Electric, HSI-1, iLOL, Internap Network Services Corporation, LIQUIDWEB-9, MEDIATEMPLE-106, MumbaiPool, NETBLK-SPHNET, NETBLK-THEPLANET-BLK-17, OC3-NETWORKS2, OVH, Racksrv, RIPE-*, SAKURA-OSAKA, SchillingAviation, SCHLUND-SHARED, SE-HD, SINGLEHOP, SIXCORE, SOFTLAYER-4-*, STADT-KARLSRUHE, STOLICZKUPL-ALEKSANDRALAZAR-PIOTR-SARNACKI-SC, TPSUZ-NET, TWITTER-NETWORK, UNICOM-SC, WIREDTREE.

[Per-organisation domain counts for each measurement date: column alignment not recoverable from the source.]

[Figure 1: histogram. x-axis: rank in list of most visited web sites (0 to 100,000); y-axis: # SPDY domains (0 to 80); one series per measurement date (04/11/2012, 05/11/2012, 06/11/2012, 07/11/2012, 08/11/2012, 09/11/2012) and a reference line at one percent.]

Fig. 1: Distribution of SPDY support on top 100,000 most frequently visited web sites.
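The aggregation behind Fig. 1, where 1000 ranks are summarised into one point, can be sketched in Python as follows. The example assumes a list of 1-based ranks of the web sites that answered via SPDY. The function names and the sample ranks are made up for illustration and are not measurement data from this paper.

```python
from collections import Counter
from typing import Iterable, List


def spdy_histogram(spdy_ranks: Iterable[int], total_sites: int = 100_000,
                   bin_size: int = 1000) -> List[int]:
    """Count SPDY-enabled sites per block of bin_size consecutive ranks."""
    n_bins = -(-total_sites // bin_size)        # ceiling division
    counts = Counter()
    for rank in spdy_ranks:
        if not 1 <= rank <= total_sites:
            raise ValueError("rank outside the tested range: %d" % rank)
        counts[(rank - 1) // bin_size] += 1     # ranks 1..1000 fall into bin 0
    return [counts[i] for i in range(n_bins)]


def support_percent(spdy_ranks: Iterable[int], first_rank: int,
                    last_rank: int) -> float:
    """Share of sites with SPDY support within an inclusive rank range."""
    hits = sum(1 for rank in set(spdy_ranks) if first_rank <= rank <= last_rank)
    return 100.0 * hits / (last_rank - first_rank + 1)


if __name__ == "__main__":
    # Hypothetical ranks of SPDY-enabled sites, for demonstration only.
    sample = [1, 2, 3, 999, 1000, 1001, 50_000, 100_000]
    histogram = spdy_histogram(sample)
    print(histogram[0], histogram[1], histogram[49], histogram[99])
    print(support_percent(sample, 1, 1000))
```

With a bin size of 1000, a count of 10 in one bin corresponds to the one-percent reference line of the figure.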

ACKNOWLEDGEMENTS
We thank the anonymous reviewers for their insightful comments. This work is supported by the European Regional Development Fund (ERDF) and the Free State of Saxony.

REFERENCES
[1] M. Belshe and R. Peon, "SPDY Protocol," Working Draft, IETF Secretariat, Internet-Draft draft-mbelshe-httpbis-spdy-00.txt, Feb. 2012.
[2] The Chromium Projects. [Online]. Available: http://dev.chromium.org/spdy
[3] Hypertext Transfer Protocol Bis - Active Workgroup. [Online]. Available: http://tools.ietf.org/wg/httpbis/
[4] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee, "Hypertext Transfer Protocol - HTTP/1.1," RFC 2616 (Draft Standard), Internet Engineering Task Force, Jun. 1999, updated by RFCs 2817, 5785.
[5] C. Labovitz, S. Iekel-Johnson, D. McPherson, J. Oberheide, and F. Jahanian, "Internet inter-domain traffic," SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. , Aug. 2010. [Online]. Available: http://dl.acm.org/citation.cfm?id=2043164.1851194
[6] G. Maier, A. Feldmann, V. Paxson, and M. Allman, "On dominant characteristics of residential broadband internet traffic," in Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference, ser. IMC '09. New York, NY, USA: ACM, 2009, pp. 90-102. [Online]. Available: http://doi.acm.org/10.1145/1644893.1644904
[7] M. Finsterbusch, C. Richter, and J.-A. Müller, "Parameter Estimation for Heuristic Based Internet Traffic Classification," in ICIMP 2012: The Seventh International Conference on Internet Monitoring and Protection. Stuttgart, Germany: IARIA, 2012, ISBN: 978-1-61208-201-1.
[8] P. Deutsch and J.-L. Gailly, "ZLIB Compressed Data Format Specification version 3.3," RFC 1950 (Informational), Internet Engineering Task Force, May 1996.
[9] T. Dierks and E. Rescorla, "The Transport Layer Security (TLS) Protocol Version 1.2," RFC 5246 (Proposed Standard), Internet Engineering Task Force, Aug. 2008, updated by RFCs 5746, 5878.
[10] OpenAjax Alliance. [Online]. Available: http://www.openajax.org
[11] M. Finsterbusch, C. Richter, and J.-A. Müller, "Impact of Asymmetry of Internet Traffic for Heuristic Based Classification," 2012.
[12] A. Langley, "Transport Layer Security (TLS) Next Protocol Negotiation Extension," Working Draft, IETF Secretariat, Internet-Draft draft-agl-tls-nextprotoneg-03, Apr. 2012. [Online]. Available: https://tools.ietf.org/html/draft-agl-tls-nextprotoneg-03
[13] Transport Layer Security working group of the IETF Discussion Archive, IETF Secretariat. [Online]. Available: http://www.ietf.org/mail-archive/web/tls/current/maillist.html
[14] SPDY Ping, ping-like tool to test SPDY servers. [Online]. Available: http://sourceforge.net/projects/spdyping/
[15] Alexa Internet, Alexa Internet, Inc., Top 1m sites at http://s3.amazonaws.com/alexa-static/top-1m.csv.zip. [Online]. Available: http://www.alexa.com/topsites
[16] Mozilla Firefox. [Online]. Available: http://www.mozilla.org
[17] The Apache HTTP Server Project. [Online]. Available: http://httpd.apache.org/
[18] mod_spdy. [Online]. Available: https://developers.google.com/speed/spdy/mod_spdy/
[19] nginx. [Online]. Available: http://nginx.org/
[20] Allot Communications, "Allot MobileTrends," Tech. Rep. H1, 2011. [Online]. Available: www.allot.com

M.Eng. Michael Finsterbusch studied computer science in Leipzig at Telekom University of Applied Sciences (HfTL) and received his Master's degree in 2009 for research on WLAN mobility. His current research covers network traffic management, Deep Packet Inspection, Quality of Service and network protocols.

M.Eng. Chris Richter studied computer science in Leipzig at Telekom University of Applied Sciences (HfTL) and received his Master's degree in 2010 for research on home automation. His current projects cover network traffic management, Deep Packet Inspection and Machine Learning.

Prof. Dr. Klaus Hängen studied Physics at Leipzig University (Germany) and received his doctoral degree in 1983. Since 2000 he has been a professor of Computer Science at HTWK Leipzig and is head of many research projects covering Multimedia Communication Networks.

Prof. Dr. Jean-Alexander Müller studied computer science at Leipzig University (Germany), received his doctoral degree in 2004 and became professor of Computer Science at HTW Dresden. He has been working at HfTL since 2009 and became dean of the faculty in 2011. His current research interests are Communication Networks and Quality of Service.

2012 Journal of Computing Press, NY, USA, ISSN 2151-9617
