<!--
   EMULAB-COPYRIGHT
   Copyright (c) 2000-2003 University of Utah and the Flux Group.
   All rights reserved.
  -->
<center>
<h2>End Node Traffic Shaping and Multiplexed Links</h2>
</center>

<h3>Per-Link Delays:</h3>
In order to conserve nodes, it is possible (when using FreeBSD) to
specify that instead of doing traffic shaping on separate delay nodes
(which eats up a node for every two links), it be done on the nodes
that are actually generating the traffic. Just like normal delay
nodes, end node (sometimes called per-link) traffic shaping uses IPFW
to direct traffic into the proper Dummynet pipe. On each node in a
duplex link or lan, a set of ipfw rules and Dummynet pipes is
set up. As traffic enters or leaves your node, ipfw looks at the packet
and stuffs it into the proper Dummynet pipe. At the proper time,
Dummynet takes the packet and sends it on its way. To specify this in
your NS file, simply set up a normal link or lan, and then mark it as
wanting to use end node traffic shaping. For example:
23
24
25
26
    <code><pre>
    set link0 [$ns duplex-link $nodeA $nodeD 50Mb 0ms DropTail]
    set lan0  [$ns make-lan "nodeA nodeB nodeC" 0Mb 100ms]

    tb-set-endnodeshaping $link0 1
    tb-set-endnodeshaping $lan0 1		</code></pre>

Please be aware, though, that the kernel is different from the standard
kernel in a couple of ways.
<ul>
<li> The kernel runs at a 1000HZ clock rate instead of 100HZ. That is,
the timer interrupts 1000 times per second instead of 100. This finer
granularity allows Dummynet to do a better job of scheduling packets.

<li> IPFW and Dummynet are compiled into the kernel, which affects the
network stack; all incoming and outgoing packets are sent into ipfw to
be matched on.

<li> Flow-based IP forwarding is turned off. This is also known as
IP <em>fast forwarding</em> in the FreeBSD kernel. Note that
regular IP packet forwarding is still enabled. 
</ul>
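<br>
<br>
To see why the higher clock rate matters for shaping, note that a delay can
only be realized in whole timer ticks. The following is an illustrative
sketch of that arithmetic (not Emulab or kernel code):
    <code><pre>
# Illustrative sketch: how the kernel timer tick quantizes a requested
# Dummynet delay. Not Emulab code; just arithmetic on the two clock rates.

def quantized_delay_ms(requested_ms, hz):
    """Round a requested delay up to the next timer tick."""
    tick_ms = 1000.0 / hz                 # one tick, in milliseconds
    ticks = -(-requested_ms // tick_ms)   # ceiling division
    return ticks * tick_ms

# A 5ms delay on a 100HZ kernel can only be realized as 10ms...
print(quantized_delay_ms(5, 100))    # -> 10.0
# ...while a 1000HZ kernel can schedule it exactly.
print(quantized_delay_ms(5, 1000))   # -> 5.0	</code></pre>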

If you would like to use end node traffic shaping globally, without
having to specify per link or lan, then you can put this in your NS
file:
    <code><pre>
    tb-use-endnodeshaping   1 		</code></pre>

If you would like your links to be unshaped initially, but want to be
able to control the shaping parameters (increase delay, decrease
bandwidth, etc.) after the experiment is swapped in, you can put this
in your NS file:
    <code><pre>
    tb-force-endnodeshaping   1 	</code></pre>

<h3>Multiplexed Links:</h3>
Another feature we have added is <em>multiplexed</em> (sometimes called
<em>emulated</em>) links. An emulated link is one that can be multiplexed
over a physical link along with other links. Say your
experimental nodes have just 1 physical interface (call it fxp0), but
you want to create two duplex links on it:
    <code><pre>
    set link0 [$ns duplex-link $nodeA $nodeB 50Mb 0ms DropTail]
    set link1 [$ns duplex-link $nodeA $nodeC 50Mb 0ms DropTail]

    tb-set-multiplexed $link0 1
    tb-set-multiplexed $link1 1		</code></pre>
Without multiplexed links, your experiment would not be mappable since
there are no nodes that can support the two duplex links that nodeA
requires; there is only one physical interface. Using multiplexed links,
however, the testbed software will assign both links on NodeA to one
physical interface. That is because each duplex link is only 50Mbs,
while the physical link (fxp0) is 100Mbs. Of course, if your
application actually tried to use more than 50Mbs on each multiplexed link,
there would be a problem; a flow using more than its share on link0
would cause packets on link1 to be dropped when they otherwise would
not be. (<b>At this time, you cannot specify that a lan use multiplexed
links.</b>)

<br>
<br>
To prevent this problem, a multiplexed link is automatically set up to use
per-link delays (discussed above). Each of the links in the above
example would get a set of Dummynet pipes restricting their bandwidth
to 50Mbs. Each link is forced to behave just as it would if the actual
link bandwidth were 50Mbs. This keeps the aggregate bandwidth to that
which can be supported by the underlying physical link (on fxp0,
100Mbs). Of course, the same caveats mentioned above for per-link
delays apply when using multiplexed links.

<br>
<br>
As a concrete example on Emulab.Net, consider the following NS file
which creates a router and attaches it to 12 other nodes:
    <code><pre>
    set maxnodes 12

    set router [$ns node]

    for {set i 1} {$i <= $maxnodes} {incr i} {
        set node($i) [$ns node]

        set link($i) [$ns duplex-link $node($i) $router 30Mb 10ms DropTail]
        tb-set-multiplexed $link($i) 1
    }
    # Turn on routing.
    $ns rtproto Static 					</code></pre>

Since each node on Emulab.Net has four 100Mbs interfaces, the above
mapping would not be possible without the use of multiplexed links.
However, since each link is defined to use 30Mbs, by using multiplexed
links, the 12 links can be shared over the four physical
interfaces, without oversubscribing the 400Mbs aggregate bandwidth
available to the node that is assigned to the router. <em> Note:
while it may sound a little like channel bonding, it is not!</em>
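<br>
<br>
The arithmetic behind this mapping can be checked directly. A quick
sketch, using the numbers from the example above:
    <code><pre>
# Feasibility check for the example above: can 12 x 30Mbs multiplexed
# links be packed onto four 100Mbs physical interfaces?

num_links = 12
link_bw_mbs = 30
num_interfaces = 4
interface_bw_mbs = 100

aggregate_demand = num_links * link_bw_mbs              # 360 Mbs
aggregate_capacity = num_interfaces * interface_bw_mbs  # 400 Mbs

print(aggregate_demand, aggregate_capacity, aggregate_demand <= aggregate_capacity)

# Each single interface can also carry 3 such links (3 * 30 <= 100),
# so a per-interface packing exists as well, not just an aggregate one.	</code></pre>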

<h3>Technical Discussion:</h3>

First, let's look at what happens with per-link delays on a duplex
link. In this case, an ipfw pipe is setup on each node. The rule for
the pipe looks like:
    <code><pre>
    ipfw add pipe 10 ip from any to any out xmit fxp0 </code></pre>

which says that any packet going out on fxp0 should be stuffed into
pipe 10. Consider the case of a ping packet that traverses a duplex
link from NodeA to NodeB. Once the proper interface is chosen (based
on routing or the fact that the destination is directly connected),
the packet is handed off to ipfw, which determines that the interface
(fxp0) matches the rule specified above. The packet is then stuffed
into the corresponding Dummynet pipe, to emerge sometime later (based
on the traffic shaping parameters) and be placed on the wire. The
packet then arrives at NodeB. A ping reply packet is created and
addressed to NodeA, placed into the proper Dummynet pipe, and arrives
at NodeA. As you can see, each packet traversed exactly one Dummynet
pipe (or put another way, the entire ping/reply sequence traversed two
pipes).
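<br>
<br>
The shaping a pipe applies can be modeled abstractly: a packet emerges
after queueing behind earlier packets at the pipe's bandwidth, plus the
configured delay. A minimal sketch of that model (hypothetical names;
not the actual Dummynet implementation):
    <code><pre>
# Minimal model of a traffic-shaping pipe: packets leave no faster than
# the configured bandwidth allows, plus a fixed propagation delay.
# Hypothetical sketch -- not the actual Dummynet code.

class Pipe:
    def __init__(self, bandwidth_bps, delay_s):
        self.bandwidth_bps = bandwidth_bps
        self.delay_s = delay_s
        self.link_free_at = 0.0   # time the "wire" is next available

    def send(self, now_s, size_bytes):
        """Return the time at which the packet emerges from the pipe."""
        start = max(now_s, self.link_free_at)
        xmit = size_bytes * 8 / self.bandwidth_bps
        self.link_free_at = start + xmit
        return self.link_free_at + self.delay_s

# Two back-to-back 1500-byte packets through a 50Mbs, 10ms pipe:
pipe = Pipe(50_000_000, 0.010)
t1 = pipe.send(0.0, 1500)   # first packet: transmit time + delay
t2 = pipe.send(0.0, 1500)   # second packet queues behind the first
print(t1, t2)	</code></pre>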

<br>
<br>

Constructing delayed lans is more complicated than duplex links
because of the desire to allow each node in a lan to see different
delays when talking to any other node in the lan. That is, the delay
when traversing from NodeA to NodeB is different than when traversing
from NodeA to NodeC. Further, the return delays might be specified
completely differently so that the return trips take a different
amount of time. More information on why we allow this can be found <a
href=../tutorial/docwrapper.php3?docname=behindthescenes.html>here.</a>
To support this, it is necessary to insert two delay pipes for each
node. One pipe is for traffic leaving the node for the lan, and the
other pipe is for traffic entering the node from the lan. The reader
might ask why we do not create N pipes on each node, one for each possible
destination address in the lan, so that each packet traverses only one
pipe. The reason is that a node on a lan has only one connection to
it, and multiple pipes would not respect the aggregate bandwidth cap
specified. The rule for the second pipe looks like:
    <code><pre>
    ipfw add pipe 15 ip from any to any in recv fxp0 </code></pre>

which says that any packet received on fxp0 should be stuffed into
pipe 15. The packet is later handed up to the application, or
forwarded on to the next hop, if appropriate.
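<br>
<br>
The reasoning against per-destination pipes can be seen numerically.
A sketch with hypothetical numbers (a 100Mbs lan connection and three
peers), not actual ipfw configuration:
    <code><pre>
# Why one shared pipe per direction rather than one pipe per destination:
# a lan node has a single connection, and pipes cap flows independently.
# Hypothetical arithmetic, not actual ipfw configuration.

lan_bw_mbs = 100
peers = ["nodeB", "nodeC", "nodeD"]

# One pipe per destination: each flow is capped at the lan bandwidth,
# but the aggregate the node could push is not.
per_dest_aggregate = len(peers) * lan_bw_mbs   # 300 Mbs -- exceeds the cap

# One shared pipe for all traffic leaving (or entering) the node:
shared_aggregate = lan_bw_mbs                  # 100 Mbs -- respects the cap

print(per_dest_aggregate, shared_aggregate)	</code></pre>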

<br>
<br>
The addition of multiplexed links complicates things further. To multiplex
several different links on a physical interface, one must use either
encapsulation (ipinip, vlan, etc.) or IP interface aliases. We chose IP
aliases because they do not affect the MTU size. The downside of IP
aliases is that it is difficult (if not impossible) to determine what
flow a packet is part of, and thus what ipfw pipe to stuff the packet
into. In other words, the rules used above:
    <code><pre>
    ipfw add ... out xmit fxp0
    ipfw add ... in recv fxp0 </code></pre>

do not work because there are now multiple flows multiplexed onto the
interface (multiple IPs) and so there is no way to distinguish which
flow a given packet belongs to. Consider a duplex link in which we use the first rule above.
If the packet is not addressed to a direct neighbor, the routing
code lookup will return a nexthop, which <b>does</b> indicate the
flow, but because the rule is based simply on the interface (fxp0), all
flows match! Unfortunately, ipfw does not provide an interface for
matching on the nexthop address, but seeing as we are kernel hackers,
this is easy to deal with by adding new syntax to ipfw to allow
matching on nexthop:
    <code><pre>
    ipfw add ... out xmit fxp0 nexthop 192.168.2.3:255.255.255.0 </code></pre>

Now, no matter how the user alters the routing table, packets will be
stuffed into the proper pipe since the nexthop indicates which
directly connected virtual link the packet was sent over. The use of a
mask allows for matching when directly connected to a lan (a simplification).
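<br>
<br>
The nexthop match is a comparison under a mask: the packet's nexthop
matches the rule if it falls inside the rule's address/mask. Using
Python's ipaddress module for illustration only (the real matching
happens inside the modified ipfw), the check might look like:
    <code><pre>
# Illustrative sketch of the nexthop match against the rule's
# address/mask. Not the in-kernel ipfw code.
import ipaddress

def nexthop_matches(nexthop, rule_addr, rule_mask):
    net = ipaddress.ip_network(f"{rule_addr}/{rule_mask}", strict=False)
    return ipaddress.ip_address(nexthop) in net

# The rule from the example: nexthop 192.168.2.3:255.255.255.0
print(nexthop_matches("192.168.2.3", "192.168.2.3", "255.255.255.0"))   # True
print(nexthop_matches("192.168.2.77", "192.168.2.3", "255.255.255.0"))  # True (same lan)
print(nexthop_matches("192.168.3.1", "192.168.2.3", "255.255.255.0"))   # False	</code></pre>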

<br>
<br>
Multiplexed lans present even worse problems because of the need to figure
out which flow an incoming packet is part of. When a packet arrives at
an interface, there is nothing in the packet to indicate which IP
alias the packet was intended for (or which it came from) when the
packet is not destined for the local node (is being forwarded).