Cable Modem Troubleshooting Tips


 



Traceroute

The traceroute tool is a valuable aid to network troubleshooting. It is also the most commonly misinterpreted diagnostic tool, capable of raising false alarms when nothing is wrong, and equally capable of showing no problems when quite a lot is wrong.

Before using traceroute, ensure that your network connection is idle (no download or upload), and that there is no background activity which might put traffic on your network connection. Any network traffic on a rate-capped service such as a cable modem will cause spuriously slow timings in traceroute.

To use the traceroute tool in Windows, open a permanent command-line window. In Windows 9x/ME, make the window 50 lines high by giving the command mode con co80,50. In Windows 2K/XP, stretch the window by dragging its bottom margin. Now give the command tracert followed by the DNS name or IP number of the host to which you wish to discover the route. Here is an example:

C:\>tracert www.uu.net

Tracing route to www.uu.net [208.243.117.123]
over a maximum of 30 hops:

  1    14 ms   <10 ms    14 ms  172.19.83.254
  2    14 ms   <10 ms   <10 ms  ren-cam2-a-fa41.inet.ntl.com [62.252.131.69]
  3   426 ms   494 ms    55 ms  ren-core-a-pos1000.inet.ntl.com [62.252.128.149]
  4     *       27 ms    28 ms  win-bb-a-atm100-808.inet.ntl.com [62.253.187.130]
  5   110 ms   110 ms   124 ms  mae-east-gw1-atm410-1.inet.ntl.com [194.168.118.238]
  6   151 ms   138 ms   137 ms  902.Serial3-1-1.GW1.TCO1.ALTER.NET [157.130.33.57]
  7   124 ms   124 ms   123 ms  717.GW1.TCO3.customer.ALTER.NET [157.130.32.206]
  8   123 ms   138 ms   151 ms  uu123.web.uu.net [208.243.117.123]

Trace complete.

Each line is labelled with a hop number. Three probes were sent for each hop, and the round trip time (RTT), there and back, for each probe is given in milliseconds (thousandths of a second). If a probe is sent and no reply is received, the RTT is replaced with an asterisk. The IP number of the replying router is given on the extreme right, and if that IP number could be resolved to a DNS name, that is shown too. The final line is for the host given in the command.

Apple Mac OS 8.x/9.x users can perform traceroutes with utilities such as WhatRoute (on the Mac OS 9.1 CDROM), Interarchy or IPNetMonitor.

Under Linux and Mac OS X, the command is spelt traceroute.


How traceroute works

All IP packets are initially sent with a Time-To-Live (TTL) field set to a suitable number. Each time an IP packet passes through an IP router, its TTL is reduced by one. When the TTL reaches zero, the packet is discarded. This strategy is to prevent IP packets circulating endlessly on the internet in the event of a router misconfiguration leading to a circular route.

Each traceroute query is sent with a very small TTL. When the query's TTL reaches zero, the router that discards it returns an ICMP TTL-exceeded warning packet to the sending address, where the traceroute program is waiting to time its return. For the first hop, the TTL is initialised to one, and three queries are sent and timed. Then the initial TTL is increased by one, three more queries are sent and timed, and so on. This repeats until the IP number that replies to the query does so with an echo instead of an ICMP TTL-exceeded, which indicates that it is the sought host rather than an intermediate router.
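
The procedure just described can be sketched as a toy simulation in Python. No real packets are sent: the route in PATH and the function names are invented for illustration, and only one probe per hop is simulated rather than three.

```python
# Toy simulation of the traceroute procedure; no real packets are sent.
# PATH is a made-up route: three routers followed by the target host.
PATH = ["10.0.0.1", "192.0.2.1", "198.51.100.1", "203.0.113.9"]


def send_probe(path, ttl):
    """Return (reply_type, address) for one probe sent with the given TTL."""
    for hop_index, address in enumerate(path, start=1):
        if hop_index == len(path):
            # The probe reached the sought host, which answers with an echo.
            return ("echo-reply", address)
        ttl -= 1  # each router reduces the TTL by one
        if ttl == 0:
            # The router discards the probe and reports TTL-exceeded.
            return ("ttl-exceeded", address)
    raise ValueError("path must not be empty")


def traceroute(path, max_hops=30):
    """Raise the initial TTL by one per hop until the target answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        reply_type, address = send_probe(path, ttl)
        hops.append((ttl, address))
        if reply_type == "echo-reply":
            break
    return hops


for ttl, address in traceroute(PATH):
    print(ttl, address)
```

Each line printed corresponds to one hop line in a real traceroute, with the last line being the target host.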


How traceroute can mislead

  1. Windows PCs have poor clocks, and are incapable of measuring with a resolution of a millisecond. Study of the RTTs above will show they are strongly quantised to be close to multiples of 6.8 ms. This means that slight differences in RTTs can either (a) show an apparent 7 ms jump, or (b) show no change at all, but nothing in between. Windows 2000 appears to have an even worse resolution of 10 ms. Windows XP seems better than all previous versions.
  2. The route displayed is only the outward route taken by the probes. It is impossible to infer by what route the replies came back. It is possible for the return route to be different to the outward route, and involve a different number of hops. It is possible for every router shown to have a different return path to the sender. It is even possible for each of the three replies from one single router to take a different return path to the sender. Those different return routes will add their own different contributions to the RTTs shown. This can lead to sudden steps in RTT up or down compared with the previous hop or the next hop, or to inconsistencies between the three RTT figures for one router.
  3. Modern routers give priority to real data packets, and very low priority to returning TTL-exceeded ICMP packets. So RTTs on intermediate routers can be anomalously high, such as those shown on hop 3 above. Routers can drop these packets altogether, as in hop 4. In both cases, reasonable RTTs and lack of packet loss on all the subsequent queries which must have passed through these routers in order to produce results for later hops show that the routers at hops 3 and 4 are in reasonable working order. The RTTs on the final line will be more reliable, because they are echoes from the traced host, rather than a router. The RTT on the final line is the only one that matters when checking latency times to game servers etc.
  4. The speed of light in fibre or copper cables adds significantly to the RTTs on long hops. This is called propagation delay. Using the most optimistic assumptions, every 100 miles of cable will add at least 1 ms to the RTT, and in practice much more, as the signal has to pass through repeater amplifiers on the way, each delaying the signal. In practice, a hop which crosses the Atlantic will cost at least 80 ms in RTT. Thus, in the above example, hop 4 is in Winnersh UK, and hop 5 is in Washington DC, USA, so a sudden increase in RTT is normal.
  5. Large RTTs do not necessarily mean poor performance. The TCP pacing algorithms are quite capable of maintaining optimal traffic volume over very long paths, provided your TCP Receive Window (RWIN) is large enough (see Networking tweaks). The Receive Window in bytes must be at least the product of your rated download speed in bytes per second (512 kbps = 64000 bytes per second) and the RTT on the path in seconds. If the TCP Receive Window is even a little too small, throughput falls catastrophically, because the TCP pacing algorithm breaks down. In the above example, to get good FTP downloads from www.uu.net (with a worst-case RTT of 151 ms), you would need a Receive Window of at least 64000*0.151 = 9664 bytes. This is larger than the default Windows 9x setting, so you will get poor speeds on an untweaked system. Windows ME/2000 and Macs will be fine. If you want to maintain download performance while simultaneously uploading, you need to add to the RTT the likely delay that an ACK packet suffers while waiting for an upload packet to be transmitted (1518 bytes at 128 kbps takes about 95 ms, say 100 ms to be safe).
  6. Small RTTs do not necessarily mean good performance. It is possible for a path to give low RTTs even when it is close to saturation, or suffering packet loss.
  7. Some routers by design do not respond to traceroute queries. These will give a row of three asterisks and no address, but the trace might recover at a later point and start giving results again. In particular, it will eventually show the target host.
  8. Some corporates block all traceroute queries at their firewall. In this case, the traceroute will give rows of three asterisks for ever, and never reach the target host.
  9. Never blame a router for something which can be adequately explained by congestion on the cable to the router from the previous hop.
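
The Receive Window arithmetic in point 5 can be checked with a few lines of Python. This is only a sketch of the calculation given there. The function names are invented for illustration, and the speeds (512 kbps down, 128 kbps up) are the ones assumed in the text.

```python
def min_receive_window(download_bits_per_sec, rtt_seconds):
    """Smallest useful RWIN in bytes: download speed in bytes/s times RTT."""
    bytes_per_sec = download_bits_per_sec / 8
    return bytes_per_sec * rtt_seconds


def serialisation_delay(packet_bytes, upload_bits_per_sec):
    """Seconds needed to transmit one packet on the upstream link."""
    return packet_bytes * 8 / upload_bits_per_sec


# Download only: 512 kbps with a worst-case RTT of 151 ms.
print(round(min_receive_window(512_000, 0.151)))  # 9664

# A full-size 1518-byte upload packet at 128 kbps holds up an ACK for ~95 ms.
print(round(serialisation_delay(1518, 128_000) * 1000))  # 95

# Downloading while uploading: add 100 ms to the RTT to be safe.
print(round(min_receive_window(512_000, 0.151 + 0.100)))  # 16064
```

The last figure shows why a Receive Window that is adequate for a pure download can become too small as soon as an upload is running at the same time.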

Given all these provisos, you might wonder what use traceroute is. Well, it does reliably discover the outward route. But you should be extremely careful about inferring network problems from traceroute reports. Certainly, on its own, traceroute is not a reliable measure of packet loss, as it tries only three times to each hop. However, if indications of packet loss (asterisks) start at a particular hop and continue for every hop thereafter, then this is indicative of packet loss starting at the first affected hop, or the cable leading to it from the previous (good) hop.
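
The rule in the last sentence can be expressed as a short Python sketch. The data layout (a hop number with a list of RTTs, using None for an asterisk) and the function name are assumptions made for illustration. The first sample is taken from hops 3 to 6 of the trace to www.uu.net above, and the second is invented.

```python
def first_sustained_loss_hop(hops):
    """Return the hop where loss starts and continues on every later hop.

    hops is a list of (hop_number, [rtt_or_None, ...]); None is an asterisk.
    Returns None if any loss seen is followed by a later clean hop.
    """
    start = None
    for hop_number, rtts in hops:
        if any(rtt is None for rtt in rtts):
            if start is None:
                start = hop_number
        else:
            # A clean later hop shows the earlier asterisks were harmless.
            start = None
    return start


# Hops 3 to 6 of the www.uu.net trace: one asterisk at hop 4 only.
isolated = [
    (3, [426, 494, 55]),
    (4, [None, 27, 28]),
    (5, [110, 110, 124]),
    (6, [151, 138, 137]),
]

# Invented trace in which loss begins at hop 2 and never clears.
sustained = [
    (1, [14, 10, 14]),
    (2, [14, None, 12]),
    (3, [None, 40, None]),
    (4, [None, None, 35]),
]

print(first_sustained_loss_hop(isolated))   # None
print(first_sustained_loss_hop(sustained))  # 2
```

Even a positive result here is only a hint, since three probes per hop is a very small sample.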


Misleading traceroutes to certain NTL servers

Traceroutes to certain NTL servers always look wrong when there is nothing in fact wrong. For instance:

C:\WINDOWS>tracert www.ntl.com

Tracing route to www.ntl.com [212.250.5.110]
over a maximum of 30 hops:

  1    18 ms    19 ms    72 ms  172.16.231.254
  2    17 ms    15 ms    14 ms  cam-cam1-a-fa00.inet.ntl.com [62.253.129.1]
  3    21 ms    13 ms    15 ms  cam-core-a-pos200.inet.ntl.com [62.253.128.133]
  4    27 ms    24 ms    25 ms  lng-bb-a-atm100-808.inet.ntl.com [62.253.187.42]
  5    21 ms    28 ms    24 ms  gfd-bb-a-so-110-0.inet.ntl.com [62.253.188.246]
  6    27 ms    46 ms    24 ms  gfd-dc-a-ge400.inet.ntl.com [62.253.188.154]
  7    23 ms    29 ms    62 ms  gfd-alder-fa10.inet.ntl.com [194.168.2.1]
  8    21 ms    23 ms    20 ms  62.252.0.98
  9     *        *        *     Request timed out.
 10     *        *        *     Request timed out.
 11     *        *        *     Request timed out.
 12     *        *        *     Request timed out.
 13     *        *        *     Request timed out.
 14     *        *        *     Request timed out.
 15     *        *        *     Request timed out.
 16     *        *        *     Request timed out.
 17    17 ms     9 ms    24 ms  www.ntl.com [212.250.5.110]

Trace complete.

The server at www.ntl.com has a defect in its TCP/IP implementation in that it sets the TTL of an ICMP echo to be the same as the residual TTL of the ICMP query. This means that the ICMP echo will be dropped by some router in the return path, until such time as the originating traceroute program has increased the TTL on the query packet to account for the number of hops in the return path as well as the outward path. This fools the traceroute program into inventing hops 9 to 16, which really don't exist: www.ntl.com is in fact directly connected to the router in hop 8. It is characteristic of this defect that one sees the same number of missing hops (8 in this case) as the number of genuine hops before the gap, assuming the return path is the same as the outward one. There is no evidence of any network problem in the above traceroute.
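
The effect of this defect can be reproduced with a small Python model. It is a sketch under the assumptions stated above: every router reduces the TTL by one and discards the packet at zero, and the defective host copies the residual TTL of the query into its reply. The function names are invented for illustration.

```python
def reply_survives(query_ttl, outward_routers, return_routers):
    """True if the echo from a defective host gets back to the sender."""
    residual = query_ttl - outward_routers
    if residual < 1:
        # The query expired at an intermediate router: a normal hop line.
        return False
    # The defective host sends its echo with TTL equal to the residual TTL,
    # and the echo must still be above zero after every return router.
    return residual > return_routers


def displayed_hop_of_target(outward_routers, return_routers, max_hops=30):
    """Hop number at which traceroute finally shows the target host."""
    for ttl in range(outward_routers + 1, max_hops + 1):
        if reply_survives(ttl, outward_routers, return_routers):
            return ttl
    return None


# www.ntl.com: 8 routers out and, assuming a symmetric route, 8 back.
shown = displayed_hop_of_target(8, 8)
print(shown)             # 17
print(shown - (8 + 1))   # 8 phantom hops (9 to 16)
```

With a healthy host the target would appear at hop 9, directly after the router in hop 8.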

Some Blueyonder servers, such as smtp.blueyonder.co.uk, exhibit the same defect.


Speeding up traceroutes in Windows

In the example traceroute above, the local UBR's private IP address in hop 1 will be slow to appear. Windows does a reverse DNS lookup on the IP number but gets no reply from the DNS system, because addresses in the ranges 10.xxx.xxx.xxx, 172.16.0.0 - 172.31.255.255 and 192.168.xxx.xxx are private IP addresses (rather than public internet IP addresses) and are not registered in the DNS system. There will be similarly slow responses for all hops with IP addresses in private ranges.

You can provide a substitute quick look-up for such addresses as follows.

First discover the private IP address of your local UBR: see Finding the UBR address. In this example we shall assume it is 172.19.83.254.

In Windows 9x/ME, copy the file C:\WINDOWS\HOSTS.SAM to C:\WINDOWS\HOSTS. It is difficult to create files with no (hidden) extension after the filename using the Windows graphical interface, so this is best done in an MS-DOS window by giving the commands

    cd \WINDOWS
    copy HOSTS.SAM HOSTS

In Windows 9x/ME, open the file C:\WINDOWS\HOSTS in a plain-text editor.

In Windows 2000, open the file c:\winnt\system32\drivers\etc\hosts in a plain-text editor.

A suitable plain-text editor is the command-line editor edit. Do not use word-processors such as WordPad or Word. Notepad would be fine provided you know how to defeat its tendency to add a hidden .txt extension to filenames when it saves (the trick, in the Save As dialog, is to quote the filename in double-quotes, e.g. "HOSTS").

The last line of the file will look like:

127.0.0.1       localhost

Add a line after it, so that they look like:

127.0.0.1       localhost
172.19.83.254   My-UBR

where the 172 address you actually use is the one you discovered above. Save the changed file. Make quite sure that it has been saved as a file called hosts or HOSTS with no .txt extension, even when you look at it with the dir command. Restart Windows, and try a tracert. You should find that the first hop of a traceroute now looks like:

 1    14 ms   <10 ms    14 ms  My-UBR [172.19.83.254]

and appears almost instantly. If other hops in the traceroute are also private addresses (in the ranges 192.168.xxx.xxx, 172.16.xxx.xxx to 172.31.xxx.xxx, or 10.xxx.xxx.xxx) you can repeat this process for all such addresses to speed up traceroute results, using invented names for them.
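
If a traceroute contains several private addresses, picking them out by eye is tedious. The following Python sketch, using the standard ipaddress module, tests each hop address against the three private ranges and prints candidate lines for the hosts file. The names of the form hopN-private are invented placeholders, and the two hop addresses are taken from the first example trace.

```python
import ipaddress

# The three private address ranges mentioned in the text.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),       # 10.xxx.xxx.xxx
    ipaddress.ip_network("172.16.0.0/12"),    # 172.16.0.0 - 172.31.255.255
    ipaddress.ip_network("192.168.0.0/16"),   # 192.168.xxx.xxx
]


def is_private(address):
    """True if the dotted-quad address lies in one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


def hosts_lines(hop_addresses):
    """Build hosts-file lines, with invented names, for private hops only."""
    lines = []
    for number, address in enumerate(hop_addresses, start=1):
        if is_private(address):
            lines.append(f"{address:<15} hop{number}-private")
    return lines


# Hops 1 and 2 of the first example trace.
for line in hosts_lines(["172.19.83.254", "62.252.131.69"]):
    print(line)
```

Only the first address produces a line, because 62.252.131.69 is a public address that already resolves through DNS.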


Getting reverse traceroutes

Because a traceroute reveals only the outward path, leaving the return path unknown, investigation of networking problems between your PC and a specific host ideally requires a traceroute from that host back to your own PC. For normal users, this is not in general possible. The next best thing is to send your web browser to www.traceroute.org, choose a site close (in terms of internet topology) to the site under investigation, and run a traceroute back to your PC.


Partial reverse routes using ping

It is possible to discover the last 9 routers that packets passed through on their way back to you using the record route feature of the ping command. The options required to select record route vary according to operating system. Here is an example under Windows:

C:\>ping -r 9 -a www.dslreports.com

Pinging dslreports.com [209.123.109.175] with 32 bytes of data:

Reply from 209.123.109.175: bytes=32 time=109ms TTL=245
    Route: cmbg-cmbg-ubr-2-ge20.inet.ntl.com [80.1.202.134] ->
           cmbg-t2cam1-b-ge-wan31.inet.ntl.com [80.1.201.154] ->
           cam-t2core-b-pos31.inet.ntl.com [62.253.188.198] ->
           nth-bb-b.inet.ntl.com [194.145.149.116] ->
           nth-bb-a.inet.ntl.com [194.145.149.115] ->
           gfd-bb-b.inet.ntl.com [194.145.149.4] ->
           gfd-bb-a.inet.ntl.com [194.145.149.3] ->
           linx-gw1.router.ntli.net [195.66.224.22] ->
           a1-0-118.core1.ltn.nac.net [209.123.11.218]
Reply from 209.123.109.175: bytes=32 time=103ms TTL=245
    Route: cmbg-cmbg-ubr-2-ge10.inet.ntl.com [80.1.202.6] ->
           cmbg-t2cam1-a-ge-wan31.inet.ntl.com [80.1.201.26] ->
           cam-t2core-a-pos31.inet.ntl.com [62.253.188.194] ->
           pop-bb-a.inet.ntl.com [194.145.149.105] ->
           pop-bb-b.inet.ntl.com [194.145.149.106] ->
           linx-gw1.router.ntli.net [195.66.224.22] ->
           a1-0-118.core1.ltn.nac.net [209.123.11.218] ->
           vlan3.msfc1.oct.nac.net [209.123.109.129] ->
           www.dslreports.com [209.123.109.175]
Reply from 209.123.109.175: bytes=32 time=113ms TTL=245
    Route: cmbg-cmbg-ubr-2-ge20.inet.ntl.com [80.1.202.134] ->
           cmbg-t2cam1-b-ge-wan31.inet.ntl.com [80.1.201.154] ->
           cam-t2core-b-pos31.inet.ntl.com [62.253.188.198] ->
           nth-bb-b.inet.ntl.com [194.145.149.116] ->
           lee-bb-a.inet.ntl.com [194.145.149.35] ->
           pop-bb-b.inet.ntl.com [194.145.149.106] ->
           linx-gw1.router.ntli.net [195.66.224.22] ->
           a1-0-118.core1.ltn.nac.net [209.123.11.218] ->
           vlan3.msfc1.oct.nac.net [209.123.109.129]

Note that in the above example, three successive ping replies each took a different return route back to me in my ISP's network. This demonstrates how futile it can be trying to make sense of traceroute timings, because return routes can be so different for successive packets.

For Record Route to work, the remote host pinged must support the setting of the Record Route flag in the ping reply: not all hosts support this.
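
The limit of 9 routers mentioned earlier is not arbitrary. As far as I understand the IPv4 header format, a header can be at most 60 bytes long, of which 20 bytes are fixed, and the Record Route option spends 3 bytes on its own type, length and pointer fields before the list of 4-byte addresses begins. The arithmetic, with constant names invented for illustration, is:

```python
MAX_HEADER_BYTES = 15 * 4    # header length field: at most 15 words of 4 bytes
FIXED_HEADER_BYTES = 20      # the part of the IPv4 header that is always present
OPTION_OVERHEAD_BYTES = 3    # Record Route option: type, length, pointer
ADDRESS_BYTES = 4            # one IPv4 address per recorded router


def max_recorded_routers():
    """Number of router addresses that fit in the Record Route option."""
    option_space = MAX_HEADER_BYTES - FIXED_HEADER_BYTES
    return (option_space - OPTION_OVERHEAD_BYTES) // ADDRESS_BYTES


print(max_recorded_routers())  # 9
```

Once the nine slots are full, later routers simply forward the packet without recording themselves.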

Under Unix-derived systems, the option to request a return route is -R rather than -r, but there might be requirements for additional simultaneous options: check man ping. In some distributions (including Mac OS X), the command ping -R is broken and always gives Invalid argument.

Mac OS X users can request a Record Route using the Trace Route tool of the IPNetMonitorX utility: use the pull down option UDP Trace/ICMP Trace/Record Route.

