Active network performance monitoring
CESNET
technical report number 8/2007
Sven Ubik
15 October 2007
1 Abstract
In this report we describe our experience with active performance monitoring in the CESNET network. We describe which performance characteristics can be measured by active monitoring, which tools are recommended and what experience we have gained in setting up active monitoring. We use a novel pragmatic approach to the presentation of loss and reordering and to UDP throughput measurements.
Keywords: network performance, active monitoring
2 Active monitoring
In active monitoring we send test packets into the network, capture these packets after they have passed through the network and analyse how these test packets were affected in terms of volume, time and error characteristics. Active monitoring can be considered a probe into the network.
On the other hand, in passive monitoring we do not send test packets. Instead, we observe and analyze real network traffic. Passive monitoring can be considered a watch on the network.
Characteristics obtained from active monitoring are in principle only truly applicable to the test packets. It is not certain whether comparable characteristics are experienced by real network traffic. Some characteristics, such as packet loss rate, are known to differ significantly as experienced by test packets when compared to real network traffic (we will explain the case of packet loss later). With active monitoring we also cannot say what real traffic is present in the network. With passive monitoring we can in principle obtain any information about real network traffic. Therefore, passive monitoring is by nature more powerful than active monitoring.
Nevertheless, active monitoring is less expensive than passive monitoring, because it does not require powerful hardware to process large volumes of traffic in real time, and it can provide certain useful network characteristics.
In this report we concentrate on performance monitoring, rather than operational monitoring. That is, we are interested in characteristics affecting the quality of data transfers, such as delay, packet loss, reordering and throughput, rather than indicators of running status, such as reachability of network nodes.
The rest of this report is organized as follows. Section 3 describes what network performance characteristics can be obtained from active monitoring. Section 4 provides more details about delay, loss and reordering monitoring. Finally, Section 5 describes our experience with throughput monitoring.
3 Performance characteristics
Important network performance characteristics that can be obtained from active monitoring are delay, packet loss, reordering and throughput. These characteristics are useful for two purposes. First, to assess the expected quality of data transfers. Second, as indicators of a good health of the network - we should look for unusual values and sudden changes in continuous monitoring. In the following paragraphs we describe more details about these characteristics.
3.1 Delay
Delay can be measured as round-trip or one-way. Round-trip
delay is most commonly measured by the well-known ping tool,
which sends ICMP or UDP packets to a remote host, which returns a
response in ICMP packets. ICMP support is a standard part of any
network equipment with IP connectivity. Therefore, we do not need to
install any special software on remote hosts.
For performance debugging it is more useful to know separately the one-way delay from the source to the destination and the one-way delay in the opposite direction. One-way delay monitoring requires precise time synchronization between the source and destination hosts and cooperating software on the destination host.
A formal definition of one-way delay is provided in [rfc-ippm] as the time period between the moment the first bit of a packet is sent by the source host and the moment the last bit of the packet is received by the destination host. It is technically difficult to obtain exact timestamps referring to these moments. Therefore, practical measurements are approximations that use timestamps taken close to the sending or receiving of a packet on a given host.
Accuracy of one-way delay monitoring is affected mainly by the following two factors; their usual magnitude is indicated in parentheses:
-
Relative accuracy of time synchronisation between monitoring stations (tens of microseconds with GPS, 1-3 milliseconds with NTP without GPS)
-
Delay between the moment a packet is sent or received and the moment its timestamp is assigned, due to hardware and software processing in end hosts (tens of microseconds)
Typical one-way delay over an inter-city link is 1-5 milliseconds. Therefore, GPS-based synchronisation is required, and when it is used the measurement accuracy is on the order of tens of microseconds.
However, we found that on a highly loaded machine, the second factor can be on the order of milliseconds. This can be tested by sending packets at precise time intervals using a hardware packet generator or a hardware monitoring card (such as DAG) and receiving these packets on a monitoring station. If the magnitude of this fluctuation reaches a significant percentage of the typical one-way delay that we want to measure, we must use hardware monitoring cards that assign timestamps in hardware. In our case, we checked with a hardware generator that regular Ethernet cards and timestamps assigned by the operating system were acceptable.
The difference in delay between two consecutive packets is called IP packet delay variation (IPDV) or jitter [rfc-ipdv]. It is an important statistical characteristic for certain network applications, such as audio and video transmissions, and it is often considered a separate network performance characteristic.
Delay is the performance characteristic that is most conveniently measured by active monitoring. The implementation is much easier than with passive monitoring, and values measured by test packets are well applicable to real traffic on a given network path.
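As an illustration of the definition above, the following minimal sketch computes IPDV as the difference between the delays of consecutive packets. The function name and the sample delays are hypothetical and serve only to show the arithmetic.

```python
from typing import List


def ipdv(delays_ms: List[float]) -> List[float]:
    """Return the delay difference for each pair of consecutive packets."""
    return [later - earlier for earlier, later in zip(delays_ms, delays_ms[1:])]


# Hypothetical one-way delays in milliseconds, for illustration only.
sample_delays_ms = [2.0, 2.5, 2.25, 4.0]
print(ipdv(sample_delays_ms))  # [0.5, -0.25, 1.75]
```

A positive value means that the later packet was delayed more than the one before it.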
3.2 Loss
Packet loss can be measured as a singleton metric - a packet is either delivered or lost. More conveniently, it is usually expressed statistically, as a percentage of packets lost from the total number of packets sent over a certain time period [rfc-loss]. It is also interesting to measure the number of consecutive packets lost (loss period) or the number of packets sent between two detected losses (loss distance) [rfc-loss-dynamics], which can affect some applications.
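The statistical metrics named above can be sketched from a per-packet delivery record. This is only an illustration under stated assumptions: the function name and the sample record are hypothetical, and loss distance is computed here following the wording of this report (packets delivered between two loss periods), which may differ in detail from the formal definition in the cited RFC.

```python
from itertools import groupby
from typing import List, Tuple


def loss_statistics(delivered: List[bool]) -> Tuple[float, List[int], List[int]]:
    """Return (loss percentage, loss period lengths, loss distances).

    delivered[i] is True when test packet i arrived and False when it was lost.
    """
    # Collapse the record into runs of equal outcome: (delivered?, run length).
    runs = [(ok, len(list(group))) for ok, group in groupby(delivered)]
    loss_percent = 100.0 * delivered.count(False) / len(delivered) if delivered else 0.0
    periods = [length for ok, length in runs if not ok]
    # Delivered runs that lie strictly between two loss periods.
    distances = [
        length
        for index, (ok, length) in enumerate(runs)
        if ok and 0 < index < len(runs) - 1
    ]
    return loss_percent, periods, distances


# Hypothetical record of ten test packets, three of them lost.
record = [True, True, False, False, True, True, True, False, True, True]
print(loss_statistics(record))  # (30.0, [2, 1], [3])
```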
Active monitoring of packet loss by test packets is easy, but the values obtained in this way only apply to test packets and are known not to correspond to packet loss experienced by real network traffic.
A short summary of the problem follows. Suppose that we send 10 test packets per second between two end points. It would take almost 3 hours to detect a packet loss rate of 10^-5 and more than a day for 10^-6. Such packet loss rates still significantly affect TCP throughput over fast long-distance networks. These calculations hold for evenly distributed packet loss. When bursts of packet loss occur, which is a common case, it can take even longer to measure the packet loss rate with reasonable accuracy. We can look at the problem from another perspective. If we send a burst of 10 test packets, manage to catch a loss period and lose 5 out of 10 packets, for what time period was this 50% packet loss rate valid? A consequence is that packet loss experienced by test packets, no matter how many we send and in what patterns, is not related to the actual behaviour experienced by real network traffic.
Therefore, active packet loss monitoring should be considered only as an indicator of good network health, rather than an accurate measurement of the packet loss that user traffic will experience. Measurement with test packets should normally indicate near zero packet loss. Steady loss of test packets usually indicates a network problem.
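The detection times quoted above can be checked with a short calculation. The sketch assumes evenly distributed loss, so that on average one packet in 1/rate is lost; the function name is hypothetical.

```python
def expected_seconds_to_first_loss(loss_rate: float, packets_per_second: float) -> float:
    """Average time until one test packet is lost, assuming evenly distributed loss."""
    return 1.0 / (loss_rate * packets_per_second)


for rate in (1e-5, 1e-6):
    hours = expected_seconds_to_first_loss(rate, 10.0) / 3600.0
    print(f"loss rate {rate:g}: about {hours:.1f} hours")
```

At 10 test packets per second this gives roughly 2.8 hours for a loss rate of 10^-5 and roughly 27.8 hours for 10^-6. Observing a single lost packet is only a lower bound; estimating the rate with reasonable accuracy would take considerably longer.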
3.3 Reordering
Similarly to packet loss, packet reordering can be measured as a singleton metric (a pair of packets is either reordered or not) and statistically, as the percentage of reordered packets from the total number of packets sent. Additionally, there are several metrics to quantify the dimensions of reordering in time and space [rfc-reordering].
Packet reordering has similar properties to packet loss regarding its monitoring. While packet loss experienced by real network traffic is often higher than what we can detect by test packets, due to temporary congestion caused by real network traffic, it may be even harder to apply reordering detected by test packets to real network traffic. The cause of packet reordering may be different from the cause of packet loss. Reordering generally happens as a consequence of different timing in parallel processing. Some routers are known to cause reordering in periods of high load. Therefore, reordering detected by active monitoring is also useful mainly as an indicator of a possible network problem.
3.4 Throughput
When observing network load, we can define several terms:
- Installed capacity is the maximum volume of data that can theoretically be transferred over a network in a unit of time. It is a property of the physical network medium.
- Used capacity is the currently occupied part of the installed capacity. It can be expressed as a percentage of the installed capacity, in which case we call it utilization.
- Available capacity is the currently unoccupied part of the installed capacity. It is the complement of used capacity to installed capacity.
- Throughput, sometimes called bulk transfer capacity or goodput, denotes the volume of additional data that can be transferred over a network that already carries some data.
The term bandwidth is sometimes used interchangeably with capacity; capacity is the preferred term according to [ietf-capacity]. Installed capacity is specified at the physical layer, including inter-packet gaps. Used capacity and available capacity are usually given at the network layer, that is, in bits of IP packets including IP headers. Throughput is usually given at the session layer, that is, in bits successfully transferred by the transport protocol.
Throughput is different from available capacity, not only because it is computed at a different layer, but mainly because it depends on the transport protocols used by existing traffic and by added traffic. Most traffic is currently carried by TCP, which is an elastic protocol that reacts to congestion by reducing the volume of data sent into the network. The throughput of a TCP connection added to a network whose current traffic consists mostly of TCP is usually higher than the available capacity, because the added traffic stresses existing traffic, which then reduces its sending rate.
All these metrics can be considered either for individual links in a network or for the whole path through the network. In the latter case we are interested in the maximum value over all links for used capacity and minimum value for other metrics.
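The relation between the per-link terms and the path-level values can be sketched as follows. This is a simplified illustration: the class and function names and the link values are hypothetical, and the sketch ignores the fact that installed and used capacity are normally specified at different layers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Link:
    installed_bps: int  # installed capacity of the link, bits per second
    used_bps: int       # currently used capacity, bits per second

    @property
    def available_bps(self) -> int:
        # Available capacity is the complement of used capacity.
        return self.installed_bps - self.used_bps

    @property
    def utilization_percent(self) -> float:
        return 100.0 * self.used_bps / self.installed_bps


def path_metrics(links: List[Link]) -> Dict[str, int]:
    """Maximum over all links for used capacity, minimum for the other metrics."""
    return {
        "installed_bps": min(link.installed_bps for link in links),
        "used_bps": max(link.used_bps for link in links),
        "available_bps": min(link.available_bps for link in links),
    }


# Hypothetical two-link path: a 10 Gb/s backbone link and a 1 Gb/s access link.
path = [Link(10_000_000_000, 4_000_000_000), Link(1_000_000_000, 200_000_000)]
print(path_metrics(path))
```

In this example the path is limited by the 1 Gb/s link, although the backbone link carries more traffic.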
Available capacity cannot be monitored directly. We can monitor used capacity by reading router interface byte counters by SNMP. We can also obtain used capacity from packet capture on a network line. Throughput can only be measured by active monitoring.
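Deriving used capacity from interface byte counters amounts to dividing the counter difference by the polling interval. The sketch below assumes two readings of the same counter obtained by any SNMP client (the retrieval itself is not shown), and that the counter wrapped at most once between the readings. The function name is hypothetical.

```python
def used_capacity_bps(first_octets: int, second_octets: int,
                      interval_seconds: float, counter_bits: int = 64) -> float:
    """Average bit rate between two readings of an interface octet counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction gives the right difference across a single wrap.
    delta_octets = (second_octets - first_octets) % modulus
    return 8 * delta_octets / interval_seconds


# 32-bit counter that wrapped between two readings taken one second apart.
print(used_capacity_bps(2**32 - 100, 400, 1.0, counter_bits=32))  # 4000.0
```

On fast links a 32-bit counter can wrap more than once within a polling interval, in which case the result is too low; a 64-bit counter or a shorter interval avoids this.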
While monitoring of used capacity is non-intrusive, does not depend on network protocols and can run continuously, it is also useful to make throughput measurements as a practical verification that a certain volume of data can really be transferred over the network. Current backbone network lines often have an installed capacity of 10 Gb/s or more. When we measure throughput with monitoring stations equipped with Gigabit Ethernet network adapters, performance is usually limited by the monitoring stations themselves. The value of such measurements is in providing a baseline of good network health, similarly to active loss measurements. If measured throughput is very low or drops suddenly, it is usually an indicator of a network problem.
4 One-way delay, loss and reordering monitoring
We have decided to use the RUDE/CRUDE tool for active
one-way delay, loss and reordering monitoring. rude is a
packet generator that can send multiple UDP streams of packets of
specified sizes and rates. Each packet carries a sender timestamp
expressed in seconds and microseconds. crude is a packet
receiver that can log a short record about each received packet to a
log file, including sender and receiver timestamps.
The reason we chose RUDE/CRUDE is its flexibility in stream configuration and its very low overhead, which allows packets to be sent with precise timing and at high rates (even though we currently use quite low rates).
The original RUDE/CRUDE only works for unicast IPv4 packets. We added support for IPv6 and multicast. We also added the ability to specify very low packet rates of less than one packet per second and the ability to send packets in bursts of specified sizes.
We monitor one-way delay and packet loss from the central monitoring station located in CESNET premises in Prague to monitoring stations located in CESNET PoPs in different cities and conversely from these remote monitoring stations back to the central monitoring station. One test UDP packet is sent every 10 seconds in each direction. Deployment of monitoring stations is illustrated in Figure. The same monitoring stations are used also for active throughput monitoring (see next section) and for passive monitoring [abw].
We have developed a set of scripts that compute one-way delay, loss and reordering, store these values in an RRD database and provide results in a graphical form based on user requests. A web-based user interface is illustrated in Figure.
An example of a delay graph is shown in Figure. Average delay computed over specified time steps during a specified time range is shown as positive values in red for one direction and as negative values in green for the other direction. The reason we use different colors is that in case of poor time synchronization, values may be shifted from the positive to the negative part of the graph or vice versa. The colors make it easy to notice such a problem.
The same graph also shows average and maximum delay over coarser specified time steps for comparison. These additional two values are depicted by yellow and blue lines respectively.
An example of a loss graph is shown in Figure. We also indicate packet reordering in the same type of graph. We adopted a pragmatic approach to storing and presenting packet loss and reordering. As we described in Section 3.2, active loss monitoring is useful only as an indicator of possible network problems, rather than as a precise characterisation of packet loss experienced by real traffic. The same is true for packet reordering. Therefore, we just store each detected loss event in the RRD database as the number of consecutive test packets lost and present this number in the graph. If reordering of test packets is detected, we store the difference between the received and the expected sequence number and present this reordering size in the graph. We instruct the rrdgraph utility to plot maximum values during a specified time range with a specified time step. Loss is shown in red and reordering is shown in green. Whatever time range and time step you choose, you will always see the same magnitude of loss or reordering (they are not averages), provided that these were detected during the time range covered by the graph. You can find more precisely the times when loss or reordering were detected by generating another graph with a different time range or time step. So far we have not detected any reordering of test packets in our network. It would indeed be unlikely given the low rate of test packets. We plan to investigate possibilities of detecting reordering in user traffic.
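The storage rule described above (a loss event as the number of consecutive test packets lost, a reordering size as the difference between the expected and the received sequence number) can be sketched as follows. This is not the code of the scripts mentioned in this report, only one possible reading of the rule. It assumes that sequence numbers start at 0 and that the number of packets sent is known; the function name and the sample data are hypothetical.

```python
from typing import Iterable, List, Tuple


def loss_and_reordering(received: Iterable[int], sent_count: int) -> Tuple[List[int], List[int]]:
    """Return (loss events, reordering sizes) for sequence numbers 0..sent_count-1.

    A loss event is the number of consecutive sequence numbers that never arrived.
    A reordering size is the expected minus the received sequence number of a late packet.
    """
    arrivals = list(received)

    # Reordering: a packet is late if it arrives after a higher sequence number.
    reorder_sizes = []
    expected = 0
    for seq in arrivals:
        if seq < expected:
            reorder_sizes.append(expected - seq)
        else:
            expected = seq + 1

    # Loss: group sequence numbers that never arrived into consecutive runs.
    seen = set(arrivals)
    loss_events = []
    run = 0
    for seq in range(sent_count):
        if seq in seen:
            if run:
                loss_events.append(run)
            run = 0
        else:
            run += 1
    if run:
        loss_events.append(run)
    return loss_events, reorder_sizes


# Packet 4 arrives after packet 5; packets 3, 7 and 8 never arrive.
print(loss_and_reordering([0, 1, 2, 5, 4, 6, 9], sent_count=10))  # ([1, 2], [2])
```

Because each stored value is an event magnitude, plotting the maximum per time step keeps that magnitude visible at any graph resolution.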
5 Throughput monitoring
For throughput measurements, we have decided to use the well-known and time-proven tool iperf. It is a classic stress-type throughput measurement tool, which tries to send as much data as possible over a network path between a sender and a receiver, which are both implemented in the same tool.
We decided not to use any of the lightweight capacity estimation tools, such as pathrate or pathload, because they tend to be imprecise and unreliable in Gigabit-speed networks.
We wanted to measure throughput separately for IPv4 and IPv6. We have found that IPv6 support does not work properly in the iperf version distributed as a tarball. The latest version from CVS (version 2.0.2, as of the time of this writing) works correctly.
A few practical findings about iperf use:
-
Iperf can do both forward and backward throughput measurement. Backward measurement can be done in parallel with forward measurement (requested by the -d argument) or after forward measurement (requested by the -r argument). The latter option provides better results, because running forward and backward tests in parallel puts a higher load on measurement stations and the achieved throughput is lower.
-
If backward measurement is requested, we need to use two separate iperf servers for IPv4 and IPv6. If only forward measurement is requested, you can use just one iperf server configured to accept IPv6 connections (requested by the -V argument), which also accepts IPv4 connections (this does not work for backward measurements).
-
Test connections must be able to pass through all firewalls along the network path and on monitoring stations. For instance, if you use iptables on monitoring stations, you need to add the following commands to the iptables configuration:

    # IPv4 iperf
    # Set subnet to IP addresses of iperf clients allowed to connect to this
    # iperf server. Set port to where this iperf server is running.
    $IPTABLES -N S_IPERF
    $IPTABLES -F S_IPERF
    $IPTABLES -A S_IPERF -s <subnet> -j ACCEPT
    $IPTABLES -A INPUT -m state --state NEW -p udp --dport <port> -j S_IPERF
    $IPTABLES -A INPUT -m state --state NEW -p tcp --dport <port> -j S_IPERF

    # IPv6 iperf
    # We need to open also source ports, because iptables do not track IPv6
    # connections. Set subnet and port (can be different from IPv4).
    $IPTABLES -N S_IPERF
    $IPTABLES -F S_IPERF
    $IPTABLES -A S_IPERF -s <subnet> -j ACCEPT
    $IPTABLES -A INPUT -p udp --dport <port> -j S_IPERF
    $IPTABLES -A INPUT -p tcp --dport <port> -j S_IPERF
    $IPTABLES -A INPUT -p udp --sport <port> -j S_IPERF
    $IPTABLES -A INPUT -p tcp --sport <port> -j S_IPERF

-
For an IPv6 UDP test, you need to specify the message size on the client side, and for the backward test also on the server side, using the -l argument such that the datagrams fit into the MTU of the network path and of the monitoring stations. Otherwise, fragmentation will occur and IPv6 iptables on the receiving monitoring station will block the fragments. The maximum message size for MTU=1500 is 1452 bytes (8 bytes are needed for the UDP header and 40 bytes for the IPv6 header). When this problem happens, you will see a message like this in the syslog of the receiving monitoring station:

    Jun 8 14:39:07 perfmonc kernel: UNDEFINED INPUT:IN=eth0 OUT= MAC=00:30:48:82:2b:0e:00:15:fa:87:31:00:86:dd SRC=2001:0718:0001:000c:0230:48ff:fe53:03aa DST=2001:0718:0001:0101:0230:48ff:fe82:2b0e LEN=78 TC=0 HOPLIMIT=63 FLOWLBL=0 FRAG:1448 ID:0001991f PROTO=UDP

-
We needed to patch the iperf code in a few places so that in case of various connection problems it always exits with an error status, rather than hanging indefinitely, which would prevent its use in scripts that call iperf many times with various arguments.
-
We found that a measurement must last at least 2 seconds for the lower throughput during connection startup not to affect the average throughput of the whole measurement. Therefore, we make 2-second measurements for TCP and for each rate of a UDP test.
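The message size rule and the 2-second duration from the list above can be combined when a script builds the iperf command line. The sketch below is only an illustration: the options -V and -l are those named in the list, while -c, -u, -b and -t are client options as commonly documented for iperf version 2 and should be checked against the installed version. The function names and the host name are hypothetical.

```python
from typing import List

IPV6_HEADER_BYTES = 40
UDP_HEADER_BYTES = 8


def max_udp_payload_ipv6(mtu: int) -> int:
    """Largest UDP message that fits into one IPv6 packet of the given MTU."""
    return mtu - IPV6_HEADER_BYTES - UDP_HEADER_BYTES


def iperf_udp6_client_args(server: str, rate: str, mtu: int = 1500,
                           seconds: int = 2) -> List[str]:
    """Build an argument list for an IPv6 UDP test (assumed iperf 2 options)."""
    return [
        "iperf", "-V",                         # IPv6, as in the list above
        "-c", server,                          # assumed: client mode, server host
        "-u", "-b", rate,                      # assumed: UDP at the given rate
        "-l", str(max_udp_payload_ipv6(mtu)),  # message size avoiding fragmentation
        "-t", str(seconds),                    # assumed: test duration in seconds
    ]


print(max_udp_payload_ipv6(1500))  # 1452
print(" ".join(iperf_udp6_client_args("perfmon.example.net", "100M")))
```

When such a command is run from a script, passing a timeout to subprocess.run is one way to guard against a test that does not terminate.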
We monitor throughput from the central monitoring station located in CESNET premises in Prague to monitoring stations located in CESNET PoPs in different cities and conversely from these remote monitoring stations back to the central monitoring station. We use the same monitoring stations as for delay monitoring and passive monitoring.
We have developed a set of scripts that perform regular throughput measurements, store results in an RRD database and provide results in a graphical form based on user requests. Throughput is monitored separately for IPv4 and IPv6 and separately for TCP and UDP. When monitoring UDP throughput, we send a stream of UDP packets at a stepwise increasing rate and note two values (separately for each direction): the maximum rate with zero packet loss and the maximum rate with packet loss lower than a specified limit. When the packet loss rate exceeds the specified limit, we stop the measurement. We currently use a limit of 5%. The measured values suggest the performance that can be expected from practical UDP-based applications without excessively stressing existing network traffic. Configuration of monitoring stations follows standard recommendations regarding socket buffers and other end-host tuning for high performance [end-host-tuning].
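The stepwise UDP measurement described above can be sketched as a simple loop. This is not the code of the scripts mentioned in this report. The measurement itself is abstracted into a callable that returns the observed loss percentage for a given rate; the function names and the simulated loss profile are hypothetical.

```python
from typing import Callable, Iterable, Optional, Tuple


def stepwise_udp_throughput(
    rates_mbps: Iterable[float],
    measure_loss_percent: Callable[[float], float],
    loss_limit_percent: float = 5.0,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (maximum rate with zero loss, maximum rate with loss below the limit).

    rates_mbps is expected in increasing order. The loop stops at the first
    rate whose measured loss exceeds the limit.
    """
    max_zero_loss = None
    max_below_limit = None
    for rate in rates_mbps:
        loss = measure_loss_percent(rate)
        if loss > loss_limit_percent:
            break
        if loss == 0:
            max_zero_loss = rate
        if loss < loss_limit_percent:
            max_below_limit = rate
    return max_zero_loss, max_below_limit


def fake_loss_percent(rate_mbps: float) -> float:
    """Simulated path: no loss up to 300 Mb/s, 2% up to 500 Mb/s, 10% above."""
    if rate_mbps <= 300:
        return 0.0
    return 2.0 if rate_mbps <= 500 else 10.0


print(stepwise_udp_throughput(range(100, 900, 100), fake_loss_percent))  # (300, 500)
```

In a real deployment the callable would run one short UDP test at the given rate and parse the reported loss.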
A web-based user interface is similar to delay monitoring and is illustrated in Figure.
An example graph of throughput monitoring is shown in Figure. Green color (light grey) indicates UDP throughput with zero packet loss, red color (dark grey) indicates additional UDP throughput with packet loss under 5% and blue lines indicate TCP throughput. The user can select the time range and the time step used to compute average values. Only one time step is used in one graph. The indicated throughput is shown as measured by iperf, that is, in bits of TCP or UDP payload.
We studied how stress-type throughput tests affect other types of monitoring done on the same monitoring stations. Each complete throughput test between two monitoring stations (including IPv4 and IPv6, UDP and TCP in both directions) transfers approximately 4 gigabytes of data. This volume of data is visible in SNMP link monitoring for links that are normally lightly loaded. It is invisible for links that are normally highly loaded (such as the Prague - Brno link), because it is below normal fluctuations of link load. We found that throughput tests do not cause visible effects on passive monitoring results obtained on the same monitoring stations. However, we found that throughput tests cause significant fluctuations in delay measured actively on the same monitoring stations. For instance, Figure shows delay peaks around 8 AM when a throughput test was performed. Another fluctuation on the left is unrelated to the throughput test. We will modify our scripts so that delay measurements are temporarily suspended during throughput measurements.
6 Conclusion
We have described how active monitoring can be used for delay, loss, reordering and throughput measurements. We summarized our experience with certain monitoring tools and we presented how the monitored characteristics can be conveniently presented.
References
| [rfc-ippm] | G. Almes, S. Kalidindi, M. Zekauskas. A One-way Delay Metric for IPPM, RFC-2679, IETF, September 1999. |
| [rfc-ipdv] | C. Demichelis, P. Chimento. IP Packet Delay Variation Metric for IP Performance Metrics (IPPM), RFC-3393, IETF, November 2002. |
| [rfc-loss] | G. Almes, S. Kalidindi, M. Zekauskas. A One-way Packet Loss Metric for IPPM, RFC-2680, IETF, September 1999. |
| [rfc-loss-dynamics] | R. Koodli, R. Ravikanth. One-way Loss Pattern Sample Metrics, RFC-3357, IETF, August 2002. |
| [rfc-reordering] | A. Morton, L. Ciavattone, G. Ramachandran, S. Shalunov, J. Perser. Packet Reordering Metrics, RFC-4737, IETF, November 2006. |
| [ietf-capacity] | P. Chimento, J. Ishac. Defining Network Capacity, IETF Draft <draft-ietf-ippm-bw-capacity-05>, May 2007. Work in progress. |
| [abw] | Sven Ubik, Demetres Antoniades, Arne Øslebø. ABW: Short-Timescale Passive Bandwidth Monitoring, CESNET Technical Report 3/2007. |
| [tbwtools] | Sven Ubik, Václav Řehák, Lukáš Baxa. Tbwtools: Debugging TCP Performance, CESNET Technical Report 6/2006. |
| [end-host-tuning] | Brian L. Tierney. TCP Tuning Techniques for High-Speed Wide-Area Networks, NFNN2, Edinburgh, June 2005. |