Advanced Traceroute
CESNET
technical report number 28/2007
Also available in PDF, PostScript, and XML formats.
Sven Ubik, Kamil Žáček
30.11.2007
1 Abstract
Many performance problems in the current Internet are caused by misconfiguration, faulty components, or exceeded link or node capacity at some specific point in a network path. In this report we describe an intelligent traceroute that traverses an end-to-end path much like a standard traceroute. In addition, it provides performance information about the links and nodes in the path, collected from measurement points of the perfSONAR monitoring framework. It also correlates forward and backward traces and identifies asymmetries.
Keywords: Traceroute, Performance monitoring, Web services
2 Introduction
Many performance problems in the current Internet are caused by misconfiguration, faulty components, or exceeded link or node capacity at some specific point in a network path between communicating end points. When resolving performance problems in the PERT (Performance Enhancement and Response Team) activity, we found that it would be very useful to have a tool that helps find the points in a network path where performance problems originate.
We call such a tool a "smart traceroute", because it should traverse a network path much like the standard traceroute program, but it should provide more information about the links and nodes along the path to help debug performance problems.
This report summarizes the first phase of our contribution to providing such a tool. We expect to be able to create a more advanced tool with emerging technologies in a future phase.
3 Problem statement and requirements
For each hop along a network path, the standard traceroute program reports reachability and RTT (Round Trip Time). This information has limited usefulness for debugging performance problems. Moreover, both characteristics are sometimes inaccurate. Some hops (routers or switches) do not respond to traceroute queries because the response is disabled on the hop, even though the hop is reachable and forwards traffic. Where such responses are permitted, they are usually generated by a lower-priority process. As a result, the RTT fluctuates in ways that may not reflect the actual load on the links in the measured network path.
The current technologies for providing on-demand dedicated circuits or VPNs (Virtual Private Networks) often use L2 or L1 connections on the whole or part of the end-to-end network path. The hops in L2 or L1 connections are not visible to the standard traceroute program.
We identify requirements for the smart traceroute as follows:
- to identify all hops along an end-to-end network path, regardless of the technology used to connect these hops (L1, L2 or L3)
- to operate in both directions and identify asymmetric parts of the network path
- to provide characteristics that can be used to locate performance-limiting points
- to allow extension for measurement, storage and presentation of new network characteristics
Considering a typical process to debug end-to-end performance problems, the network characteristics that should be provided by a smart traceroute for each link or hop in a network path should ideally include the following:
- installed capacity
- available capacity
- packet error rate (including reason)
- queue occupancy and queue utilization on the hop before the link
While installed capacity is constant, the other characteristics vary in time and therefore need to be complemented by information about the time period over which they were measured and the statistics used to compute them (average, maximum, etc.). Packet loss is an important characteristic, but it is even more useful to know why packets are being dropped: either a packet error of some type detected on a hop interface, or queue overflow in a hop. Queue occupancy is the number of packets held in a queue, while queue utilization is the percentage of the queue capacity that is occupied.
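As an illustration only, the following Python sketch shows how a time-varying characteristic such as queue utilization could be reported together with its measurement period and the statistics used. The names `Summary`, `queue_utilization` and `summarize`, the sample values and the queue capacity of 200 packets are assumptions made for this example; they are not part of the tool described in this report.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Summary:
    """A time-varying characteristic with its measurement period (hypothetical type)."""
    start: float    # timestamp of the first sample
    end: float      # timestamp of the last sample
    average: float
    maximum: float


def queue_utilization(occupancy_packets, capacity_packets):
    """Queue utilization as a percentage of the queue capacity."""
    if capacity_packets <= 0:
        raise ValueError("queue capacity must be positive")
    return 100.0 * occupancy_packets / capacity_packets


def summarize(samples):
    """Summarize a non-empty list of (timestamp, value) samples."""
    if not samples:
        raise ValueError("at least one sample is required")
    times = [t for t, _ in samples]
    values = [v for _, v in samples]
    return Summary(min(times), max(times), mean(values), max(values))


# Assumed occupancy samples (timestamp in seconds, packets in the queue).
occupancy = [(0.0, 10), (1.0, 30), (2.0, 50)]
utilization = [(t, queue_utilization(n, 200)) for t, n in occupancy]
summary = summarize(utilization)
print(summary)
```

Keeping the period and the statistic next to the value makes it possible to tell a short peak from a sustained load when comparing hops.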
We reviewed some of the currently available implementations of traceroute-like programs that represent various measurement and presentation techniques. Details about each reviewed program can be found in [Ple06].
4 Application design
Meeting the requirements stated above requires support from the network infrastructure. None of the given characteristics can be measured for individual links or hops from end hosts alone. An obvious candidate for communication with hops is the SNMP protocol, but only part of the required characteristics can be obtained from the data objects in the MIB (Management Information Base) of current routers. Moreover, even if the MIB is extended or another protocol is used, it is common practice to restrict access to routers by management protocols to a few IP addresses where local network management stations are running. We need to access this information from the PC of an unprivileged user, for hops in both local and remote networks along the network path.
Therefore it is necessary to use proxy devices that mediate access to the hops along a network path. One technology based exactly on this paradigm is the perfSONAR framework, currently being developed for performance monitoring in the European GN2 network. In perfSONAR, components called Measurement Points (MP) and Measurement Archives (MA) are designed to provide data about specific network resources. An MA includes internal storage of historical data, while an MP provides only on-the-fly data. There are different types of MPs and MAs providing different network characteristics. Each MP or MA accepts requests as XML messages, with syntax and semantics specific to the given type of MP or MA, and sends responses back as XML messages as well. Each MP or MA registers its location (URL) with the Lookup Service (LS), which is another perfSONAR component. We can send queries to the LS to ask for the locations of registered MPs and MAs. A typical installation has one MP or MA of each type in every separately administered network. Therefore, we need to contact the MPs or MAs of all networks traversed by the network path being analysed.
For the purposes of the smart traceroute, two existing types of MP appear to be particularly useful: an SNMP MP, which uses the SNMP protocol to retrieve data from routers, and a Telnet/SSH MP, which uses the telnet or SSH protocol to connect to routers, executes commands there and sends their results back. The two MPs use a predefined set of SNMP queries and command-line commands, respectively, to retrieve information from routers. It is not possible to send arbitrary SNMP requests or commands to routers. However, the set of permitted SNMP requests and commands can be configured in the MP configuration.
We decided to divide the smart traceroute into several steps, which can be implemented as separate programs or as phases of one program:
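The registration and lookup roles described above can be modelled in a few lines. The sketch below is a minimal in-memory stand-in, not the perfSONAR Lookup Service itself, which exchanges XML messages. The class name, method names, service type strings and URLs are all assumptions made for illustration.

```python
from collections import defaultdict


class LookupService:
    """Toy in-memory model of the LS role (hypothetical, not the perfSONAR API)."""

    def __init__(self):
        # Maps an interface address to the services registered for it.
        self._by_interface = defaultdict(list)

    def register(self, service_type, url, interfaces):
        """An MP or MA announces its URL and the interfaces it can report on."""
        for address in interfaces:
            self._by_interface[address].append((service_type, url))

    def lookup(self, address):
        """Return (service_type, url) pairs registered for an interface address."""
        return list(self._by_interface.get(address, []))


ls = LookupService()
# Example registrations; the URLs and addresses are placeholders.
ls.register("SNMP MP", "http://mp.example.net/snmp", ["192.0.2.1", "192.0.2.5"])
ls.register("Telnet/SSH MP", "http://mp.example.net/ssh", ["192.0.2.1"])

for service_type, url in ls.lookup("192.0.2.1"):
    print(service_type, url)
```

An address with no registered service yields an empty list, which corresponds to a hop for which no MP or MA can provide information.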
- Standard traceroute: This phase works as a standard traceroute and traverses the network path from the source to the destination. The destination host can optionally run a server that performs a traceroute in the backward direction on a request sent from the source host. The result is stored in a text file.
- Traceroute XML converter: This phase takes the text output of a standard traceroute as input and converts it into XML according to the draft proposal.
- MP lookup: This phase queries the Lookup Service for the URLs of the MPs or MAs that can provide information about the routers whose interface addresses were obtained in the first phase.
- MP/MA queries: This phase queries the MPs or MAs to obtain performance characteristics of the links and hops along the network path.
The application must be able to work with partial information, when some of the hops along a path were unreachable or no MP or MA provides information about them. In such cases at least the available information should be presented.
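The conversion phase, including the handling of hops that did not respond, can be sketched as follows. This is a minimal sketch under stated assumptions: it expects the common Unix traceroute text layout, and the element and attribute names (`traceroute`, `hop`, `rtt`, `responded`) are hypothetical and do not follow the schema of the Internet draft.

```python
import re
import xml.etree.ElementTree as ET

# A hop line starts with the hop number; the header line does not.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
ADDR_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")
RTT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*ms")

# Assumed sample output; hop 2 did not answer any probe.
SAMPLE = """traceroute to host.example.net (192.0.2.10), 30 hops max, 60 byte packets
 1  gw.example.net (192.0.2.1)  0.512 ms  0.498 ms  0.471 ms
 2  * * *
 3  host.example.net (192.0.2.10)  1.824 ms  1.790 ms  1.802 ms
"""


def traceroute_to_xml(text):
    """Convert traceroute text output into an XML element tree."""
    root = ET.Element("traceroute")
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header or blank line
        number, rest = match.groups()
        hop = ET.SubElement(root, "hop", number=number)
        address = ADDR_RE.search(rest)
        if address is None:
            # Keep the hop in the result so that partial paths remain usable.
            hop.set("responded", "false")
            continue
        hop.set("responded", "true")
        hop.set("address", address.group(1))
        for rtt in RTT_RE.findall(rest):
            ET.SubElement(hop, "rtt", unit="ms").text = rtt
    return root


root = traceroute_to_xml(SAMPLE)
print(ET.tostring(root, encoding="unicode"))
```

Unresponsive hops are recorded rather than dropped, so later phases can still present the information that is available for the remaining hops.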
5 JTraceRoute - a prototype implementation
We created a prototype of the smart traceroute that implements the design decisions described in Section. We implemented the prototype in Java so that it can be embedded in a web browser through the Web Start technology, and we call it JTraceRoute.
JTraceRoute can operate in both directions, identify asymmetric parts of a network path and present the network path graphically. It can then contact the Lookup Service (LS) to ask where the MPs and MAs for individual nodes in the network path are located. JTraceRoute then presents the user with a choice of XML queries that can be sent to these MPs and MAs. Query templates are included in the configuration, and IP addresses or other fields are filled in according to the interface address being investigated. The user can edit the XML queries. Responses retrieved from MPs and MAs are again presented as XML messages.
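Filling a query template with the address of the investigated interface can be sketched as below. JTraceRoute itself is written in Java; Python is used here only for brevity. The XML layout is a made-up placeholder and not the perfSONAR message schema, and `QUERY_TEMPLATE` and `build_query` are hypothetical names.

```python
import ipaddress
import xml.etree.ElementTree as ET
from string import Template

# Placeholder template; a real template would follow the schema of the target MP.
QUERY_TEMPLATE = Template(
    '<request type="SimpleDataRequest">\n'
    '  <interface address="$address"/>\n'
    '  <object name="$object_name"/>\n'
    '</request>'
)


def build_query(address, object_name="ifInOctets"):
    """Insert an interface address into the template and return the XML text."""
    ipaddress.ip_address(address)  # raises ValueError for a malformed address
    xml_text = QUERY_TEMPLATE.substitute(address=address, object_name=object_name)
    ET.fromstring(xml_text)  # raises ParseError if the result is not well-formed
    return xml_text


query = build_query("192.0.2.1")
print(query)
```

Validating the address before substitution and parsing the result afterwards catches a broken query before it is sent, while still leaving the text editable by the user.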
Only a subset of characteristics proposed in Section is provided by current MPs and MAs. We do not interpret responses from MPs and MAs in any way. They are presented to the user as they are.
Data storage
The need for a standardized way to store traceroute results was recognized by the IPPM (IP Performance Metrics) group of the IETF (Internet Engineering Task Force). As a result, an Internet draft is now being prepared [Nic07] that proposes a normalized format for storing the results of the standard traceroute.
Within the IPPM community, we discussed extending this draft to cover additional characteristics, possibly in an extensible way so that more characteristics can easily be added in the future. It was concluded that the current draft should be reserved for the standard traceroute only and that a new draft should be proposed, if needed, to cover additional characteristics. We are currently working on the proposal for this new Internet draft.
The tool described in this report uses our temporary extension of the format specified in the current Internet Draft. We use it as a prototype to experiment with and assess whether it is the right way to go.
6 Example of use
The following steps illustrate typical use of JTraceRoute:
- A user clicks on the JTraceRoute tool to start it. The traceroute window appears as shown in Figure. In this window the user can enter the hostname or IP address of a destination node, the traceroute method (currently only one method is supported, the standard traceroute), whether a bidirectional traceroute should be done, the directory where the resulting XML file should be stored and the maximum number of hops. The user can then start the actual test by clicking the "Trace" button. The test progress can be observed in the lower part of the window.
- After the test finishes, the result of the standard traceroute is stored in an XML file. The user can click on the "View results" button to visualise the network path as illustrated in Figure. If a bidirectional test was done, both directions are shown. It is not trivial to correlate nodes in the two directions precisely, because we get different IP addresses belonging to different interfaces on routers and we do not have the network masks of these addresses. JTraceRoute uses a simple heuristic that assumes that the two ends of one point-to-point link usually have close IP addresses, because subnets with long masks are used on such links. The tool then displays the two directions of the same link in parallel positions. Assumed asymmetries are also displayed accordingly, as shown in Figure. Each interface has its IP address and measured RTT displayed.
  A scroll-down list of previous measurements in the traceroute window can be used to easily select these measurements and to display their results in separate windows. As the results are stored in uniquely named XML files, the user can also just start JTraceRoute and visualise tests done in previous sessions without doing a new test.
- The user can click on any of the interfaces in the visualization window, which opens an interface information window, as shown in Figure. The user can perform two actions here. The first option is to run an external application with the IP address of the interface as its argument, see Figure. A list of commands specified in the configuration is available in a scroll-down menu. By default, the ping, nslookup and traceroute commands are available.
  The second option is to run a perfSONAR query for the given interface. In this case, the user should first click on the "Query LS" button to ask the Lookup Service for the types and addresses of the Measurement Points that have information about the given interface, that is, those registered at the Lookup Service as serving the given interface, see Figure. The user can then select one of the Measurement Points found (see Figure) and one of the predefined XML queries for the selected Measurement Point (see Figure) from scroll-down lists. When the user clicks on the "Load" button, a predefined query is loaded from the configuration into an editable window. The interface IP address is automatically inserted in the proper place. The user can edit the XML query and click on the "Run" button to send it to the Measurement Point. The response is also displayed as an XML message in the lower part of the window.
  For example, Figure illustrates the result of the "Metadata Key Request" sent to the SNMP Measurement Point, which retrieves information about all types of SNMP queries supported by the Measurement Point. Figure shows the results of the "Simple Data Request" sent to the SNMP Measurement Point, when we asked for the "ifInOctets" object. Finally, Figure shows the result of the "IP_SHOW_ROUTE" request sent to the Telnet/SSH Measurement Point, which used an SSH connection to the router to execute the "show ip route" command for the given interface.
- The user can repeat the previous step with different XML queries, different Measurement Points and different interfaces to obtain information about links and nodes in a network path.
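The address-proximity heuristic used to correlate the two directions in the second step can be sketched as follows. The report does not specify how "close" addresses are detected, so the rule used here, that two addresses belong to the same link when they fall into the same /30 subnet, is an assumption, as are the function names and sample addresses.

```python
import ipaddress


def same_link(a, b, prefix_len=30):
    """Assume two addresses are ends of one link if they share a long-mask subnet."""
    network = ipaddress.ip_network(f"{a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(b) in network


def correlate(forward, backward, prefix_len=30):
    """Pair forward-trace addresses with backward-trace addresses on the same link.

    Returns (pairs, unmatched): pairs holds (forward, backward-or-None) tuples,
    unmatched holds backward addresses with no forward counterpart.
    """
    remaining = list(backward)
    pairs = []
    for fwd in forward:
        match = next((b for b in remaining if same_link(fwd, b, prefix_len)), None)
        if match is not None:
            remaining.remove(match)
        pairs.append((fwd, match))
    return pairs, remaining


# Assumed interface addresses seen in each direction.
forward = ["192.0.2.1", "192.0.2.5", "198.51.100.9"]
backward = ["198.51.100.77", "192.0.2.6", "192.0.2.2"]

pairs, unmatched = correlate(forward, backward)
for fwd, bwd in pairs:
    print(fwd, "<->", bwd if bwd else "no counterpart (assumed asymmetry)")
```

Forward hops without a counterpart and leftover backward hops are the candidates for asymmetric parts of the path. A heuristic of this kind can both miss real pairs and produce false ones, for example when links use shorter masks.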
7 Conclusion and future directions
We have proposed an extensible traceroute tool that provides detailed information about links and nodes in a network path in order to assist in locating limiting points. We proposed how performance information about the components of a network path can be stored in an extensible way in XML files. We have implemented a prototype version of the designed tool. We are currently not able to detect nodes operating below L3. Each node needs to be queried by requests explicitly sent by the user, and the number of nodes that are actually accessible to the tool is very limited. In the future, we plan to interact with the IPPM group within the IETF to standardize a data storage format for an extensible traceroute, and we plan to add the possibility for the tool to query all nodes in a path automatically. We will also cooperate with the JRA1 activity in the GN2 project, which develops the perfSONAR framework, in deploying Measurement Points and making them accessible to our tool.
References
| [Ple06] | Pleva L. Advanced Traceroute, BS Thesis, 2006. |
| [Nic07] | Niccolini S. et al. Information Model and XML Data Model for Traceroute Measurements. Internet draft draft-ietf-ippm-storetraceroutes-07.txt, IETF, December 2007. Work in progress. |
![[Figure]](capture-traceroute.png)
![[Figure]](capture-perfsonar.png)
![[Figure]](capture-perfsonar-mps.png)
![[Figure]](capture-perfsonar-queries.png)
![[Figure]](capture-perfsonar-snmp-queries-desc.png)
![[Figure]](capture-perfsonar-snmp-result-desc.png)
![[Figure]](capture-perfsonar-ssh-result-desc.png)