Iproute2
Advanced Routing and iproute2 - the ip routing utility
iproute2.txt research document, details may be sketchy and incomplete. v0.02
General Notes and Information
ifconfig and route reach the kernel's networking code through the older ioctl interface. The 'ip' tool talks to the kernel over netlink and is the key to the advanced routing features that ifconfig and route cannot reach.
[ Usage examples of the ip tool ]
ip link list : shows links including MAC but not IP address
ip address show : more info including IP and queueing discipline
ip route show : equivalent to route -n except uses iproute2
Address Resolution Protocol (ARP) resolves the hardware address of another machine on the same local network. ARP determines where a machine is on the LAN (really, it determines the MAC of that machine).
synonymous terminology: MAC address = location = hardware address
IP address is Layer 3 networking, Media Access Control (MAC) is Layer 2.
An IP address says nothing about the location of a machine; ARP does. Machines on the Internet have DNS names which resolve to IP addresses, not to be confused with knowing the location (MAC address).
When computer A wants to find computer B on the same LAN, computer A sends an ARP broadcast asking who has the IP (layer 3) of computer B. Computer B, with the matching IP, answers with its MAC address. Frames between A and B are then addressed at layer 2 by MAC. The ARP entry for computer B remains in the ARP cache of computer A for a limited duration.
When computer A wants to reach computer C, which is on another network across the Internet, 'A' knows the subnet is different from its own and therefore hands the packet to its gateway (the router). 'A' only needs the location (MAC) of the gateway, which it learns by ARP just as for computer B; it never learns the MAC of computer C, because ARP does not cross routers. Through Internet routing the packet travels from router to router until it reaches the remote router which is the gateway for computer C. If the remote router does not have computer C in its ARP cache, it does an ARP broadcast on the LAN of computer C. Computer C answers telling its router its MAC address, and the router delivers the packet. Communication is layer 2 from each machine to its local router and layer 3 between routers.
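The choice computer A makes (ARP for the target itself, or ARP for the gateway) comes down to comparing network parts. Here is a rough shell sketch of that comparison. The function names are made up for illustration, and in real life the kernel makes this decision from its routing table, not a script.

```shell
#!/bin/sh
# Hypothetical helpers, not part of iproute2.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The subshell body ( ... ) keeps the IFS change local to the function.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print "direct" if source and destination share a network under the
# given netmask (A would ARP for the destination itself), otherwise
# print "gateway" (A would ARP for its router and hand the packet over).
# $1 = source IP, $2 = destination IP, $3 = netmask
next_hop_kind() {
    src=$(ip_to_int "$1")
    dst=$(ip_to_int "$2")
    mask=$(ip_to_int "$3")
    if [ $(( src & mask )) -eq $(( dst & mask )) ]; then
        echo direct
    else
        echo gateway
    fi
}

next_hop_kind 192.168.0.10 192.168.0.20 255.255.255.0   # same LAN
next_hop_kind 192.168.0.10 64.21.10.250 255.255.255.0   # other network
```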
synonymous terminology: arp cache = neighbor cache = neigh
[ Usage examples of ip relating to ARP ]
ip neigh show : view current ARP cache table
ip neigh delete X.X.X.X dev ethX : delete IP X.X.X.X from ARP cache
Of the three routing tables which are part of iproute2, the traditional 'route' command only modifies the main table. The 'ip' tool can modify all three tables but modifies the main table by default.
ip route ls : shows only the main route table
ip rule list : display current route rules and priority
ip route list table local : shows necessary stuff in the local table
ip route list table main : same as 'ip route ls'
ip route flush cache : clear all route cache, do after modification
You can create your own route tables. To create a custom table:
echo 200 tablename >> /etc/iproute2/rt_tables
(above: creates a route table in rt_tables)
ip rule add from X.X.X.X table tablename
(above: source IP or computer on LAN)
ip route add default via Y.Y.Y.Y dev ethX table tablename
(above: assigns route for computer on LAN)
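Running the echo line twice leaves a duplicate entry in rt_tables. A small sketch of a guard against that follows. The helper name add_rt_table is made up here, and the file path is a parameter so the idea can be tried on a scratch file without root.

```shell
#!/bin/sh
# Hypothetical helper: register a routing table name only if it is missing.
# $1 = table number, $2 = table name, $3 = path to the rt_tables file
# (normally /etc/iproute2/rt_tables).
add_rt_table() {
    # Ignore comment lines, then look for the name in the second column.
    if awk -v name="$2" '$1 !~ /^#/ && $2 == name { found = 1 } END { exit !found }' "$3"; then
        echo "table $2 already present in $3"
    else
        echo "$1 $2" >> "$3"
        echo "added table $2 as $1 to $3"
    fi
}

# Try it on a scratch file instead of the real rt_tables.
scratch=$(mktemp)
add_rt_table 200 tablename "$scratch"
add_rt_table 200 tablename "$scratch"   # second call changes nothing
rm -f "$scratch"
```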
Example: Two Internet Providers - Multihoming
A business may have two Internet providers. To set up routing for two ISPs on Linux, consider this example a generic guide.
192.168.0.1 = IP of Internal network (irrelevant) on eth1
64.21.10.250 = IP of first ISP on eth0 gw 64.21.10.1 netmask 255.255.255.0
128.42.20.250 = IP of second ISP on eth2 gw 128.42.20.1 netmask 255.255.255.0
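1. create two tables and set up routing (T1 and T2 must first be added to /etc/iproute2/rt_tables, as in the custom table example above; the route destination is the provider network, not the netmask)
ip route add 64.21.10.0/24 dev eth0 src 64.21.10.250 table T1
ip route add default via 64.21.10.1 table T1
ip route add 128.42.20.0/24 dev eth2 src 128.42.20.250 table T2
ip route add default via 128.42.20.1 table T2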
2. set up main routing table
ip route add 64.21.10.0/24 dev eth0 src 64.21.10.250
ip route add 128.42.20.0/24 dev eth2 src 128.42.20.250
3. set the preference for the default route
ip route add default via 64.21.10.1
4. routing rules for interfaces
ip rule add from 64.21.10.250 table T1 ip rule add from 128.42.20.250 table T2
5. load balancing between the two providers (use this in place of the single default route from step 3)
ip route add default scope global nexthop via 64.21.10.1 dev eth0 weight 1 nexthop via 128.42.20.1 dev eth2 weight 1
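The whole multihoming setup is the same few commands repeated per provider, so it can be written once with variables. This is only a sketch: the variable and function names are made up, the /24 networks are derived from the addresses and 255.255.255.0 netmask listed above, and the commands are printed rather than run so they can be reviewed first.

```shell
#!/bin/sh
# Variable names are hypothetical; values come from the example above.
IF1=eth0; IP1=64.21.10.250;  NET1=64.21.10.0/24;  GW1=64.21.10.1;  TBL1=T1
IF2=eth2; IP2=128.42.20.250; NET2=128.42.20.0/24; GW2=128.42.20.1; TBL2=T2

# Print (not run) the commands for one provider.
# $1 = interface, $2 = our IP, $3 = provider network, $4 = gateway, $5 = table
provider_cmds() {
    echo "ip route add $3 dev $1 src $2 table $5"
    echo "ip route add default via $4 table $5"
    echo "ip route add $3 dev $1 src $2"
    echo "ip rule add from $2 table $5"
}

multihome_cmds() {
    provider_cmds "$IF1" "$IP1" "$NET1" "$GW1" "$TBL1"
    provider_cmds "$IF2" "$IP2" "$NET2" "$GW2" "$TBL2"
    # One multipath default route with equal weights.
    echo "ip route add default scope global nexthop via $GW1 dev $IF1 weight 1 nexthop via $GW2 dev $IF2 weight 1"
}

multihome_cmds
```

Once the printed commands look right, the output can be piped to sh as root.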
Thu Oct 23 11:48:27 CDT 2003
ADSL Rate Limit Experiment
# All rates are in Kbits, so in order to get bytes divide by 8
# e.g. 25Kbps == 3.125KB/s
#
# eth1 ==> is local network e.g. 192.168.0.0/24
# eth0 ==> is Internet network e.g. Your ISP
DNLD=128Kbit
DWEIGHT=15Kbit # DOWNLOAD Weight Factor ~ 1/10 of DOWNLOAD Limit
UPLD=25Kbit # UPLOAD Limit
UWEIGHT=2Kbit # UPLOAD Weight Factor
tc qdisc del dev eth0 root
tc qdisc del dev eth1 root
#tc qdisc add dev eth1 root handle 1: cbq avpkt 1500 bandwidth 100mbit
#tc class add dev eth1 parent 1: classid 1:1 cbq rate 256kbit allot 1000 prio 5 bounded isolated
#tc filter add dev eth1 parent 1: protocol ip prio 16 u32 match ip dst 192.168.30.83 flowid 1:1
#tc qdisc add dev eth0 root handle 10: cbq bandwidth 10Mbit avpkt 1000 mpu 64
#tc class add dev eth0 parent 10:0 classid 10:1 cbq rate $UPLD weight $UWEIGHT allot 1514 prio 1 avpkt 1000 bounded
#tc filter add dev eth0 parent 10:0 protocol ip handle 3 fw flowid 10:1
#tc qdisc add dev eth1 root handle 11: cbq bandwidth 100Mbit avpkt 1000 mpu 64
#tc class add dev eth1 parent 11:0 classid 11:1 cbq rate $DNLD weight $DWEIGHT allot 1514 prio 1 avpkt 1000 bounded
#tc filter add dev eth1 parent 11:0 protocol ip handle 4 fw flowid 11:1
echo "1"
tc qdisc add dev eth0 root handle 10: cbq bandwidth 10Mbit avpkt 1000 mpu 64
tc class add dev eth0 parent 10:0 classid 10:1 cbq rate $UPLD weight $UWEIGHT allot 1514 prio 1 avpkt 1000 bounded
tc filter add dev eth0 parent 10:0 protocol ip prio 16 u32 match ip dst 192.168.30.83 flowid 10:1
echo "2"
tc qdisc add dev eth1 root handle 11: cbq bandwidth 100Mbit avpkt 1000 mpu 64
tc class add dev eth1 parent 11:0 classid 11:1 cbq rate $DNLD weight $DWEIGHT allot 1514 prio 1 avpkt 1000 bounded
tc filter add dev eth1 parent 11:0 protocol ip prio 16 u32 match ip dst 192.168.30.83 flowid 11:1
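The two rules of thumb in the comments (divide Kbits by 8 to get KB/s, weight of about one tenth of the limit) are easy to get wrong by hand, so here is a small sketch that does the arithmetic. The helper names are made up, and rounding the weight down is an assumption; the scripts only say "~ 1/10".

```shell
#!/bin/sh
# Hypothetical helpers for the arithmetic in the script comments.

# Kilobits per second -> kilobytes per second (divide by 8).
kbit_to_kbyte() {
    awk -v k="$1" 'BEGIN { printf "%.3f\n", k / 8 }'
}

# Weight of roughly one tenth of the rate, rounded down, never below 1.
weight_for() {
    w=$(( $1 / 10 ))
    if [ "$w" -lt 1 ]; then
        w=1
    fi
    echo "$w"
}

echo "25Kbit = $(kbit_to_kbyte 25) KB/s, weight $(weight_for 25)Kbit"
echo "128Kbit = $(kbit_to_kbyte 128) KB/s, weight $(weight_for 128)Kbit"
```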
Bandwidth Limiting with IP Masquerade
Introduction
This document is meant for IP Masquerade users who want to limit specific hosts' bandwidth. The example used throughout the document is an ADSL line (640Kbits download / 160Kbits upload) where the DHCP hosts of the subnet are bandwidth limited and also forced through a caching proxy.
My router specs are as follows:
- Distribution:
- Debian 2.2 (stable)
- Technically, it's Debian 3.0 (unstable), but only the sysinit scripts, bind, sendmail, dhcp, nfs, samba and X are kept at the bleeding edge. (All the major 'Services').
- Hardware:
- CPU: 466MHz Celeron
- RAM: 192MB
- DISK: 40GB 7200 RPM Maxtor ATA100, 2GB IBM 5400 RPM ATA33
- Services:
- DHCP Server
- DNS Server
- Mail Server
- IP Masquerade (port forwarding also)
- Caching Proxy (Squid)
- Samba server
- NFS Server
Requirements
The requirements here will only pertain to setting up bandwidth limiting and NOT IP Masquerading, DHCP Server, NFS Server, etc.
Kernel Requirements:
All IPTABLES support needed for IP Masquerade plus
CONFIG_IP_NF_CONNTRACK
CONFIG_IP_NF_TARGET_MARK --> This is for marking packets. We are going to mark the packets we want limited with this.
QoS and/or fair queueing
CONFIG_NET_SCH_CBQ
CONFIG_NET_CLS_FW
Now I usually compile all the IPTABLES and QoS and/or fair queueing stuff as modules, but lsmod only shows those above as in use.
Software Requirements:
- IPTABLES --> http://www.iptables.org/
- iproute2 --> ftp://ftp.inr.ac.ru/ip-routing/ --> Most distributions come with this installed by default
Setting Bandwidth
I set my bandwidth in a shell script so that it is set at boot time and is easy to stop and start.
#!/bin/bash
#
# All rates are in Kbits, so in order to get bytes divide by 8
# e.g. 25Kbps == 3.125KB/s
#
TC=/sbin/tc
DNLD=150Kbit # DOWNLOAD Limit
DWEIGHT=15Kbit # DOWNLOAD Weight Factor ~ 1/10 of DOWNLOAD Limit
UPLD=25Kbit # UPLOAD Limit
UWEIGHT=2Kbit # UPLOAD Weight Factor

tc_start() {
    $TC qdisc add dev eth0 root handle 11: cbq bandwidth 100Mbit avpkt 1000 mpu 64
    $TC class add dev eth0 parent 11:0 classid 11:1 cbq rate $DNLD weight $DWEIGHT allot 1514 prio 1 avpkt 1000 bounded
    $TC filter add dev eth0 parent 11:0 protocol ip handle 4 fw flowid 11:1
    $TC qdisc add dev eth1 root handle 10: cbq bandwidth 10Mbit avpkt 1000 mpu 64
    $TC class add dev eth1 parent 10:0 classid 10:1 cbq rate $UPLD weight $UWEIGHT allot 1514 prio 1 avpkt 1000 bounded
    $TC filter add dev eth1 parent 10:0 protocol ip handle 3 fw flowid 10:1
}

tc_stop() {
    $TC qdisc del dev eth0 root
    $TC qdisc del dev eth1 root
}

tc_restart() {
    tc_stop
    sleep 1
    tc_start
}

tc_show() {
    echo ""
    echo "eth0:"
    $TC qdisc show dev eth0
    $TC class show dev eth0
    $TC filter show dev eth0
    echo ""
    echo "eth1:"
    $TC qdisc show dev eth1
    $TC class show dev eth1
    $TC filter show dev eth1
    echo ""
}

case "$1" in
start)
    echo -n "Starting bandwidth shaping: "
    tc_start
    echo "done"
    ;;
stop)
    echo -n "Stopping bandwidth shaping: "
    tc_stop
    echo "done"
    ;;
restart)
    echo -n "Restarting bandwidth shaping: "
    tc_restart
    echo "done"
    ;;
show)
    # call the function; a trailing "()" here would be a syntax error
    tc_show
    ;;
*)
    echo "Usage: /etc/init.d/tc.sh {start|stop|restart|show}"
    ;;
esac
exit 0
Now let's go through that.
At the top are some variables which make it easier to change the entire script with one change.
THIS SCRIPT IS ONLY ** ONE ** RULESET.
You can add as many rules as you like. I don't run an ISP, I only needed to restrict a few people.
This script assumes two things.
- eth0 ==> is local network e.g. 192.168.0.0/24
- eth1 ==> is Internet network e.g. Your ISP
Stop, Restart, and Show are self-explanatory. I will not explain them.
Start:
$TC qdisc add dev eth0 root handle 11: cbq bandwidth 100Mbit avpkt 1000 mpu 64
$TC class add dev eth0 parent 11:0 classid 11:1 cbq rate $DNLD weight $DWEIGHT allot 1514 prio 1 avpkt 1000 bounded
$TC filter add dev eth0 parent 11:0 protocol ip handle 4 fw flowid 11:1
$TC qdisc add dev eth1 root handle 10: cbq bandwidth 10Mbit avpkt 1000 mpu 64
$TC class add dev eth1 parent 10:0 classid 10:1 cbq rate $UPLD weight $UWEIGHT allot 1514 prio 1 avpkt 1000 bounded
$TC filter add dev eth1 parent 10:0 protocol ip handle 3 fw flowid 10:1
First, the top three lines are for the download bandwidth.
The first line creates a "PARENT" qdisc:
--> $TC qdisc add dev eth0 root handle 11: cbq bandwidth 100Mbit avpkt 1000 mpu 64
See man tc for more information, but basically it's the device definition: a 100Mbit Netgear card.
The next line creates the child class, which will hold the actual download limit.
--> $TC class add dev eth0 parent 11:0 classid 11:1 cbq rate $DNLD weight $DWEIGHT allot 1514 prio 1 avpkt 1000 bounded
See man tc and friends again if you need more detail than that. Personally, I don't know what it all means or what the optimal settings are.
I just used the suggested man page defaults.
The last line, and a VERY important one, adds a fw filter with handle 4 that points at the child class 11:1.
The handle is the firewall mark to match. Later we will use iptables to MARK the download packets with a 4 mark, and that's how the child class will know which packets to limit and which to leave alone.
The bottom three lines are for the upload bandwidth.
It's the same as the download, but the device has changed from eth0 to eth1, the variables have changed from $DNLD to $UPLD etc, and the handle has changed from 4 to 3.
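Since the download and upload halves differ only in device, handle, rates and mark, the three lines can be produced by one function. A minimal sketch, assuming a made-up helper name shape_cmds; it prints the tc commands instead of running them, so no root is needed to look at the result.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the three tc commands for one direction.
# $1 = device, $2 = qdisc handle number, $3 = link bandwidth,
# $4 = rate limit, $5 = weight, $6 = firewall mark to match
shape_cmds() {
    echo "tc qdisc add dev $1 root handle $2: cbq bandwidth $3 avpkt 1000 mpu 64"
    echo "tc class add dev $1 parent $2:0 classid $2:1 cbq rate $4 weight $5 allot 1514 prio 1 avpkt 1000 bounded"
    echo "tc filter add dev $1 parent $2:0 protocol ip handle $6 fw flowid $2:1"
}

shape_cmds eth0 11 100Mbit 150Kbit 15Kbit 4   # download side
shape_cmds eth1 10 10Mbit 25Kbit 2Kbit 3      # upload side
```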
Marking Packets for Limiting
OK, now you are ready to mark packets as they come through in order to limit them.
I just added the next few lines to my rc.firewall.
# Mark packets to route
# Upload marking
$IPTABLES -t mangle -A FORWARD -s 192.168.0.128/29 -j MARK --set-mark 3
$IPTABLES -t mangle -A FORWARD -s 192.168.0.6 -j MARK --set-mark 3
# Download marking
$IPTABLES -t mangle -A POSTROUTING -s ! 192.168.0.0/24 -d 192.168.0.128/29 -j MARK --set-mark 4
$IPTABLES -t mangle -A POSTROUTING -s ! 192.168.0.0/24 -d 192.168.0.6 -j MARK --set-mark 4
Here I mark the following hosts:
192.168.0.6
192.168.0.128/29 =>
192.168.0.128
192.168.0.129
192.168.0.130
192.168.0.131
192.168.0.132
192.168.0.133
192.168.0.134
192.168.0.135
I hope no one needs help with Subnet masks. :)
Downloads are marked 4 and uploads are marked 3.
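Each limited host or subnet needs the same pair of rules, one per direction. A sketch of generating them follows. The helper name mark_cmds is made up, the rules are printed rather than applied, and a literal iptables stands in for the $IPTABLES variable. It keeps the "-s !" form used in my rc.firewall; newer iptables releases prefer the "!" in front of "-s", so adjust for your version.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the two marking rules for one
# host or subnet. $1 = host or subnet to limit, $2 = LAN network.
# Marks follow the text: 3 = upload, 4 = download.
mark_cmds() {
    echo "iptables -t mangle -A FORWARD -s $1 -j MARK --set-mark 3"
    echo "iptables -t mangle -A POSTROUTING -s ! $2 -d $1 -j MARK --set-mark 4"
}

for target in 192.168.0.128/29 192.168.0.6; do
    mark_cmds "$target" 192.168.0.0/24
done
```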
/29 means the first 29 of the 32 bits are the network part, i.e. 11111111.11111111.11111111.11111000 is the subnet mask ==> 255.255.255.248. That leaves 3 host bits (00000111), or 8 addresses: 192.168.0.128 + 0 through 7. SIMPLE.
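For anyone who does want help with subnet masks, the same arithmetic can be done in shell. This is only a sketch with made-up helper names; list_block assumes a prefix of /24 or longer so that only the last octet changes.

```shell
#!/bin/sh
# Hypothetical helpers for the prefix arithmetic.

# Prefix length -> dotted netmask (4294967295 is 32 one-bits).
prefix_to_mask() {
    m=$(( (4294967295 << (32 - $1)) & 4294967295 ))
    echo "$(( (m >> 24) & 255 )).$(( (m >> 16) & 255 )).$(( (m >> 8) & 255 )).$(( m & 255 ))"
}

# Prefix length -> number of addresses in the block.
addresses_in() {
    echo $(( 1 << (32 - $1) ))
}

# List every address of a block.
# $1 = first three octets, $2 = first value of the last octet, $3 = prefix
list_block() {
    i=0
    n=$(addresses_in "$3")
    while [ "$i" -lt "$n" ]; do
        echo "$1.$(( $2 + i ))"
        i=$(( i + 1 ))
    done
}

prefix_to_mask 29
addresses_in 29
list_block 192.168.0 128 29
```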
Bandwidth Monitoring
Freshmeat.
The following tools are recommended:
- bwm ==> very simple, ncurses based, for quick and easy overall network summary.
- iptraf ==> very robust, ncurses based, my favorite, without kernel patch it lets you monitor specific hosts based on MAC addresses
- connmon ==> ncurses and gtk interfaces. With kernel patch you can monitor individual ip bandwidths
Conclusion
It took me a long time to figure this crap out.
I learned plenty from the cbq-init script from freshmeat. thanx.
Email if you have any more questions, fixes, complaints.
Written by Joe Roback <joe@roback.cc>