
Load Balancing with Multiple T1s and cisco Routers.

Tim Pozar - Network Architect

Mon Apr 10 10:39:25 PDT 2000 -
See bottom of page for new Cisco IOS commands for load balancing! - Tim

Internet Archive is presently connected to the rest of the world via two T1 (aka DS1) lines, each nominally capable of supporting 1.5 Mb/Sec flowing in and out simultaneously, for a total of a little more than 3 Mb/Sec in each direction. Getting two T1 lines running so they effectively add together as a 3 Mbit pipe was a study in cisco router technology.

Cisco routers use two primary methods for determining which interface to send a packet out of: either "process" switching or "fast" switching. When the router is configured for "process switching", incoming packets are queued up to the processor and each one is looked up in the routing table to determine where it needs to go.

If the router is configured for "fast switching", the destination IP address of a packet is examined as it comes in, and the packet is shoved out the interface that best matches that destination in the route cache. The route cache effectively "bolts in" each destination to a particular interface. This reduces processor time and moves packets through the router more quickly.

Both of these methods work fine when there is only one direct path to a destination, and "fast switching" actually wins out when there is only one path. When there are two or more parallel connections, though, we encounter some problems. Let's look at the differences and some configuration files.
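
The choice between the two is made per interface. Fast switching is on by default; the "no ip route-cache" interface command (which shows up again below on our ISP's router) turns it off so that the interface's traffic is process switched. As a quick sketch, on an interface named Serial0:

!
! fast switching is the default (equivalent to "ip route-cache");
! to force process switching, turn the route cache off:
interface Serial0
 no ip route-cache
!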

When configuring two or more parallel links, the router needs to see all of the interfaces in the same "net" to treat them equally. Here is a snippet from a cisco configuration file showing this:

interface Serial0
 description The first serial line to our provider...
 ip address 140.174.188.18 255.255.255.240
 encapsulation hdlc
!
interface Serial1
 description The second serial line to our provider...
 ip address 140.174.188.20 255.255.255.240
 encapsulation hdlc
!
! default to the outside world...
ip route 0.0.0.0 0.0.0.0 140.174.188.17
ip route 140.174.0.0 255.255.0.0 140.174.188.17
!
Note that both serial interfaces have a netmask of 255.255.255.240 and both IP addresses fall in this "net". The router will treat these serial lines as going to the same place. We can put up to 7 lines in this netmask, with each line taking two IP numbers; in this "net" we just need to avoid 140.174.188.16 (the network address) and 140.174.188.31 (the broadcast address). The default route points to an IP address assigned to the other end of one of the links.
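
As a sketch of how that 255.255.255.240 (/28) block breaks down, based on the addresses used on our router above and on our ISP's router below:

!
! 140.174.188.16/28
!  .16         network address (unusable)
!  .17 / .18   first T1  (ISP side / our side)
!  .19 / .20   second T1 (ISP side / our side)
!  .21 - .30   room for up to five more paired links
!  .31         broadcast address (unusable)
!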

As it is configured above, cisco routers will default to fast switching and a route cache will be created. You can check it with the "show ip cache" command. You will see something like:

gw>show ip cache
IP routing cache 1475 entries, 213200 bytes
Minimum invalidation interval 2 seconds, maximum interval 5 seconds,
   quiet interval 3 seconds, threshold 0 requests
Invalidation rate 0 in last second, 0 in last 3 seconds
Last full cache invalidation occurred 0:00:00 ago

Prefix/Length       Age       Interface       Next Hop
1.0.0.0/8           13:36:07  Serial1         140.174.188.17
10.0.0.0/8          0:09:13   Serial0         140.174.188.17
13.0.0.0/8          0:29:43   Serial1         140.174.188.17
[...]
44.0.0.0/8          1:15:28   Serial0         140.174.188.17
128.6.0.0/16        1:03:39   Serial0         140.174.188.17
128.9.0.0/16        0:20:21   Serial1         140.174.188.17
128.32.0.0/16       0:18:24   Serial1         140.174.188.17
[...]
206.13.122.0/24     1:05:44   Serial1         140.174.188.17
206.14.154.26/32    0:01:48   Ethernet0       206.14.154.26
206.14.154.27/32    0:04:29   Ethernet0       206.14.154.27
206.14.154.180/32   0:12:18   Ethernet1       206.14.154.180
206.14.154.182/32   0:13:19   Ethernet1       206.14.154.182
206.14.165.0/24     0:13:38   Serial1         140.174.188.17
[...]

With two serial ports, the router has split the various networks between the two ports. The cache also shows host addresses on the ethernet.

Packets going from the Internet Archive out to the net are reasonably distributed between the two serial lines, but what if our provider used route caching towards us? With it turned on for us we would see something like:

gw0-sf-tlg>show ip cache
IP routing cache 22174 entries, 3567320 bytes
   8958580 adds, 8936406 invalidates, 23907 refcounts
Minimum invalidation interval 2 seconds, maximum interval 5 seconds,
   quiet interval 3 seconds, threshold 0 requests
Invalidation rate 0 in last second, 0 in last 3 seconds
Last full cache invalidation occurred 01:23:20 ago

Prefix/Length       Age       Interface       Next Hop
4.0.0.0/8           01:13:53  Hssi3/0         206.86.228.89
[...]
140.174.38.11/32    00:59:31  Ethernet0/1     140.174.125.2
[...]
206.14.154.160/32   00:02:43  Serial10/0      206.14.154.18
206.14.154.164/32   00:03:13  Serial10/2      206.14.154.18
206.14.154.180/32   00:28:52  Serial10/2      206.14.154.18
206.14.154.25/32    00:11:00  Serial10/0      206.14.154.18

So each machine is associated with an individual interface. This means that all of the traffic for a machine will go out only one interface until the cache entry times out, at which point the traffic may or may not switch to the other T1. Traffic for a single machine would therefore be limited to the speed of a single serial interface. In the case of the Internet Archive, this is not what we are looking for, as we need the bandwidth of multiple T1s. With the command "no ip route-cache" applied on these interfaces we can turn this "feature" off. Below is a sample configuration for these interfaces on our ISP's router.

!
interface Serial10/0
 description archive A2.2
 encapsulation hdlc
 ip address 140.174.188.19 255.255.255.240
 no ip route-cache
!
interface Serial10/2
 description archive A1.2
 ip address 140.174.188.17 255.255.255.240
 encapsulation hdlc
 no ip route-cache
!
Now when we do a "show ip cache" for our network on our ISP's router we get:
gw0-sf-tlg>show ip cache 206.14.154.0 255.255.255.0
IP routing cache 20132 entries, 3234340 bytes
   9081327 adds, 9061195 invalidates, 21355 refcounts
Minimum invalidation interval 2 seconds, maximum interval 5 seconds,
   quiet interval 3 seconds, threshold 0 requests
Invalidation rate 0 in last second, 0 in last 3 seconds
Last full cache invalidation occurred 03:49:09 ago

Prefix/Length       Age       Interface       Next Hop

gw0-sf-tlg>
Nothing... But if we do a "show ip route" a couple of times we get:

gw0-sf-tlg>show ip route 206.14.154.0 255.255.255.0
Routing entry for 206.14.154.0/24
  Known via "static", distance 1, metric 0
  Redistributing via ospf 2, rip
  Advertised by rip route-map IGP-TO-EGP
  Routing Descriptor Blocks:
    140.174.188.20
      Route metric is 0, traffic share count is 1
  * 140.174.188.18
      Route metric is 0, traffic share count is 1

gw0-sf-tlg>show ip route 206.14.154.0 255.255.255.0
Routing entry for 206.14.154.0/24
  Known via "static", distance 1, metric 0
  Redistributing via ospf 2, rip
  Advertised by rip route-map IGP-TO-EGP
  Routing Descriptor Blocks:
  * 140.174.188.20
      Route metric is 0, traffic share count is 1
    140.174.188.18
      Route metric is 0, traffic share count is 1

The "*" indicates the preferred path at that moment. The two times we checked, it moved from "10/2" to "10/0". As the router is configured now, a packet comes in and the router puts it on whichever of the two interfaces going towards the Internet Archive has the shorter queue. The advantage is that we now get full use of both serial lines for each machine. As mentioned earlier, the downside is some additional processor load on the router. Since this is only two interfaces on a 7513, we expect that the load is very small.
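
For reference, the two parallel paths in the output above come from a pair of equal static routes on our ISP's router pointing at our network, along these lines (a sketch reconstructed from the "show ip route" output, not their actual configuration file):

!
ip route 206.14.154.0 255.255.255.0 140.174.188.18
ip route 206.14.154.0 255.255.255.0 140.174.188.20
!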

If you go back to the graph that shows the load on our serial lines, you can see the difference between route-cached and non-route-cached traffic. The incoming traffic from our ISP is not route cached, so each packet will take either the first or second line depending on which queue is shorter. With this configuration the incoming lines pretty much track each other.

Since we are using route caching on our outgoing lines and we are crawling many sites at the same time, the outgoing traffic is about equal, but you will see some differences when more traffic goes to one site. Since the outgoing traffic's bandwidth is so low and we are not hitting the limit of our lines, it is not a concern to turn off route caching at this point.
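
If we did decide to take route caching off our own outgoing interfaces later, the change would simply be the same "no ip route-cache" command our ISP used, applied to our two serial interfaces. A sketch:

!
interface Serial0
 no ip route-cache
!
interface Serial1
 no ip route-cache
!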

Below is a graph of the traffic on the serial lines when we turned up the second line.

[Graph: 2nd T1 traffic]
Note that one line or the other is taking "the load" of the crawl from our machine. After a while the route cache entry will flip over to the other serial line. Once route caching was turned off on our ISP's router, the load is spread between the two lines.

More details regarding routing over parallel paths can be found in these documents from cisco:
"Routing Theory for IP Over Equal Paths" - http://cio.cisco.com/warp/public/105/27.html
"Load Balance on Parallel Lines" - http://cio.cisco.com/warp/public/105/33.html

If you are interested in getting more details about this page, send mail to pozar@lns.com.



Mon Apr 10 10:39:25 PDT 2000 -
Cisco has a new command called:
ip load-sharing per-packet
Their documentation describes it under the heading "Configure Per-Packet Load Balancing".

As process switching ("no ip route-cache") can cause some CPU overhead on routers, this may be a better alternative.

Keep in mind you may need to enable "Cisco Express Forwarding" before this command will work. Please see the documentation from cisco.
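
A minimal sketch of what that configuration might look like, assuming an IOS image with CEF support (enable CEF globally, then set per-packet load sharing on each serial interface):

!
ip cef
!
interface Serial0
 ip load-sharing per-packet
!
interface Serial1
 ip load-sharing per-packet
!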

