From ricci@cs.utah.edu Mon Oct 27 17:25:49 2003
Date: Mon, 27 Oct 2003 17:25:49 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Bob Braden <braden@ISI.EDU>
Cc: testbed-ops@emulab.net, deter-isi@ISI.EDU, lepreau@cs.utah.edu
Subject: Re: Hardware configuration for Emulab clone
Message-ID: <20031027172549.X95279@cs.utah.edu>
Mail-Followup-To: Bob Braden <braden@ISI.EDU>, testbed-ops@emulab.net,
	deter-isi@ISI.EDU, lepreau@cs.utah.edu
References: <200310272219.OAA28834@gra.isi.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <200310272219.OAA28834@gra.isi.edu>; from braden@ISI.EDU on Mon, Oct 27, 2003 at 02:19:38PM -0800
Status: RO
Content-Length: 5681
Lines: 113

You may already know many of these things.

Thus spake Bob Braden on Mon, Oct 27, 2003 at 02:19:38PM -0800:
> 2) DETER will purchase 4 additional 1000bT Ethernet interfaces for each
> node.  Ideally, the 2 64bit/33MHz PCI slots of the hosts will be
> populated with dual 1000bT interface cards.

Unless these two PCI slots are on independent busses, you can probably
expect to drive no more than two gigabit interfaces full speed, and only
at half-duplex. The theoretical PCI bandwidth on 64/33 PCI is, I
believe, not much more than 2Gbps. But, of course, you'll probably have
trouble generating more than 2Gbps of traffic on a PC anyway.
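
For reference, the back-of-the-envelope arithmetic behind that number
(nominal figures only, ignoring bus arbitration overhead):

    # Theoretical ceiling of a shared 64-bit/33MHz PCI bus.
    bus_width_bits = 64
    clock_hz = 33e6
    peak_gbps = bus_width_bits * clock_hz / 1e9
    print(f"peak PCI bandwidth: {peak_gbps:.1f} Gbps")   # ~2.1 Gbps

    # A full-duplex GigE port can ask for up to 2 Gbps across the bus
    # (1 Gbps in + 1 Gbps out), so one 64/33 bus covers roughly one GigE
    # port at full duplex, or about two at half duplex.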
 
> 3) DETER will purchase a Cisco 6513 switch with a supervisor blade and
> 6 blades of 48 1000bT ports.  That will support 4 x 64 Ethernet ports
> to the nodes.

This is a very complicated issue. My views on this come from our
perspective on Emulab, in which we want to guarantee (or at least be
pretty darn sure) that there won't be any artifacts due to switch
limitations. I don't know if your goals are as stringent, and of course
there are always budget limitations... (Reading Steve Schwab's comments
farther down, it looks like you guys also want guaranteed bandwidth.)

Are the 48-port gigabit modules you're looking at WS-X6548-GE-TX?  This is
the only 48-port GigE module I'm aware of from Cisco. If it's something
else, from the WS-X7 series, for example, a whole different set of
issues apply.

From my understanding of things (which came from reading some Cisco
documents, and from talking to an ex-Cisco engineer), this module is
very oversubscribed. It has a single 8Gbps (full-duplex) connection to
the switching fabric. I was told by the Cisco engineer that these
modules are 8x oversubscribed, though the math doesn't quite add up on
that (48 ports into an 8Gbps line would seem to be 6x oversubscribed.)
So, there may be some other bottleneck in it.
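
Taking the 8Gbps fabric connection at face value, the arithmetic behind
that 6x figure is just:

    # WS-X6548-GE-TX: 48 GigE ports behind one 8 Gbps fabric connection.
    ports, port_gbps, fabric_gbps = 48, 1, 8
    print(f"{ports * port_gbps / fabric_gbps:.0f}x oversubscribed")   # 6x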

The documentation I have about the architecture of the 65xx series
claims that 'SFM Single-Attached Fabric-Enabled Cards' (which I think
all WS-X65 modules are) have a 16Gbps bus internally. That means
you're not going to get more than 8 full-duplex, full-speed gigabit
flows out of them. If you were told that they have 80Gbps backplanes, I
can't say for sure that's wrong, but I would certainly double-check that
number. The white paper I'm referring to is online at:
http://www.cisco.com/en/US/customer/products/hw/switches/ps708/products_white_paper09186a0080092389.shtml
... in particular, I believe Figure 6 and the text below it are relevant
to the 48-port GigE modules.

So, it seems that you're not going to be able to use all this equipment
at full speed. If you want to save some cash on bandwidth you won't be
able to use, you might consider switching some of your GigE equipment to
100Mbps Ethernet.

Our conclusion was that the WS-X6516-GE-TX modules were the most
economical choice to get close to guaranteed bandwidth, though not _too_
close - they have 16 GigE ports, so they're 2x oversubscribed.

Another possibility would be to build in some links that don't go
through a switch at all - just connect up some of the nodes directly.
There's an obvious loss of flexibility, though it's clearly more
economical. Our software theoretically supports this, though we haven't
tried anything like it recently.
 
> 4) The control plane on the Emulab cluster will be offloaded to cheaper
> unmanaged switch ports.  The PXE boot-capable 10/100 interface of each
> node will be connected with the boot server machine using multiple 48
> port 1U switches on a separate LAN .  Examples of the switch would be a
> 3Com 2800 series unmanaged switch.  For the first 64 machines of the
> cluster, DETER would purchase 2 such switches.

It's pretty important that this set of switches supports multicast. Many
unmanaged switches simply treat multicast like broadcast. This could be
pretty disastrous when loading disk images, which consumes a whole lot
of bandwidth. Check to see if these switches support IGMP snooping to
create multicast groups.
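
To put rough numbers on why this matters (hypothetical image size and
node count, just for illustration):

    # Loading a disk image onto many nodes at once.
    image_mb, nodes = 500, 64           # hypothetical values
    print(image_mb * nodes)             # 32000 MB sent repeatedly via unicast
    print(image_mb)                     # 500 MB sent once via multicast

    # The catch: if the switch floods multicast like broadcast, that one
    # stream shows up on every port, not just on the nodes being loaded,
    # which is why IGMP snooping is worth checking for.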
 
> 5) DETER will purchase remote power strips and console terminal muxes.
> The DETER project would appreciate suggestions from the ISD staff for
> which equipment models to buy.

We use Cyclades serial expander boxes in one of our servers - by putting
them in a PC, we get very good control over who is allowed to access
which ones, when. We use Cyclom Ze boxes:
http://www.cyclades.com/products/8/z_series
... which let you get 128 serial lines into one PC.

We use two types of power controllers - 8-port APC Ethernet-connected
controllers, and 20-port serial controllers from BayTech. Since you'll
have serial lines, we recommend the BayTechs, because they are cheaper
per-port. The ones we have are RPC-27s:
http://www.baytech.net/cgi-private/prodlist?show=RPC27
 
> 	Our first idea was to use the same 6513 chassis and add 5
> 	blades of 48 1000bT port to it.  This would provide complete
> 	symmetry among all the 128 niodes.  However, there is some
> 	doubt about the difficulty of wiring 4 x 128 ports to one
> 	6513.  It may therefore be better to purchase a second 6513
> 	chassis for Phase 1b.

We've managed to fill up a couple 6509s with 48-port modules. Not easy,
but we managed it.
   
> ??Is there any limitation on Emulab support of the planned 6513 switch
> configuration??

Nope, our software should support it just fine.

-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From ricci@cs.utah.edu Tue Oct 28 10:46:59 2003
Date: Tue, 28 Oct 2003 10:46:59 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Bob Lindell <bob@jensar.us>
Cc: Bob Braden <braden@ISI.EDU>, testbed-ops@emulab.net, deter-isi@ISI.EDU,
	lepreau@cs.utah.edu
Subject: Re: [Deter-isi] Re: Hardware configuration for Emulab clone
Message-ID: <20031028104659.C95279@cs.utah.edu>
Mail-Followup-To: Bob Lindell <bob@jensar.us>, Bob Braden <braden@ISI.EDU>,
	testbed-ops@emulab.net, deter-isi@ISI.EDU, lepreau@cs.utah.edu
References: <20031027172549.X95279@cs.utah.edu> <A29CA1D8-0910-11D8-BB39-000393DC7572@jensar.us>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <A29CA1D8-0910-11D8-BB39-000393DC7572@jensar.us>; from bob@jensar.us on Mon, Oct 27, 2003 at 10:33:25PM -0800
Status: RO
Content-Length: 2195
Lines: 37

Thus spake Bob Lindell on Mon, Oct 27, 2003 at 10:33:25PM -0800:
> WS-X6748-GE-TX		Cat6500 48-port 10/100/1000 GE Mod: fabric enabled, RJ-45

Hmm, it looks to me like this module was not available at the time I was
investigating GigE on Ciscos. It's got a different architecture than the
one I was assuming you were talking about, so many of the things I said
yesterday don't apply. I'm a bit confused though, because the data
sheets do list it as having two 20Gbps connections to the switch fabric.
But, the architecture white papers about the 6500 series clearly label
the switch fabric connectors as being 8Gbps each (with some slots having
dual connectors.) So, hopefully this means that they are able to drive
those busses at a higher rate than originally spec'ed, and that the
whitepaper is just out of date. But, it could also mean that the 20Gbps
numbers are just marketing - it could mean, for example, that the
internal buses have 40Gbps of total bandwidth, but that the module only
gets 15Gbps (full duplex) to the fabric module. If you can get your
salesperson to put you in touch with an engineer, that would probably be
the best way to find out what the truth about this matter is. If you
find anything out, we'd definitely be interested to hear it, because we
might consider these newer modules for our own gigabit expansion.

From the fact that your specs now list a 6509 instead of a 6513, I'm
guessing you already know this, but the 6513 can only handle 5 modules
with dual switch fabric interfaces. Essentially, the maximum number of
fabric connections is 18 - so the 6509s have two to every slot, while the
6513s have 5 slots with dual connectors, and 8 with a single connector.
So, if you plan to fill a switch with these dual-ported modules, you can
get better density in a 6509. If you were going to put in some 10/100
modules with a single fabric connection, you could still do this in a
6513.
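
If it helps, the slot arithmetic works out like this (a quick sketch
using the 18-channel figure above; port counts ignore supervisor slots):

    CHANNELS = 18   # total fabric channels available across the slots

    # 6509: 9 slots, dual fabric connectors on every slot.
    assert 9 * 2 == CHANNELS

    # 6513: 5 slots with dual connectors, 8 slots with a single connector.
    assert 5 * 2 + 8 * 1 == CHANNELS

    # So dual-connector 48-port modules fit any 6509 slot, but only 5 of
    # the 6513's 13 slots at full fabric width.
    print("6509:", 9 * 48, "GigE ports")   # 432
    print("6513:", 5 * 48, "GigE ports")   # 240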

-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From ricci@cs.utah.edu Wed Oct 29 13:31:57 2003
Date: Wed, 29 Oct 2003 13:31:57 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Stephen_Schwab@NAI.com, John Mehringer <mehringe@isi.edu>
Cc: braden@ISI.EDU, testbed-ops@emulab.net, deter-isi@ISI.EDU,
	lepreau@cs.utah.edu
Subject: Re: [Deter-isi] Re: Hardware configuration for Emulab clone
Message-ID: <20031029133157.R51103@cs.utah.edu>
Mail-Followup-To: Stephen_Schwab@NAI.com, John Mehringer <mehringe@isi.edu>,
	braden@ISI.EDU, testbed-ops@emulab.net, deter-isi@ISI.EDU,
	lepreau@cs.utah.edu
References: <613FA566484CA74288931B35D971C77E13429A@losexmb1.corp.nai.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <613FA566484CA74288931B35D971C77E13429A@losexmb1.corp.nai.org>; from Stephen_Schwab@NAI.com on Wed, Oct 29, 2003 at 11:27:45AM -0800
Status: RO
Content-Length: 1644
Lines: 30

Thus spake Stephen_Schwab@NAI.com on Wed, Oct 29, 2003 at 11:27:45AM -0800:
> Could we just add two blades (possible cheap ones) to our 6509 and use
> those, with VLAN support, for our split-out control nets.  That way we
> would also get the multicast support we need at boot time?

I think this is probably not a good idea. Routing needs to be done
between these segments. So, this would mean enabling some layer 3 and
above features on the experimental network switches. This has the
potential to interfere with experimental net traffic in unexpected ways
- as an example, we found out that our switches were checking TCP
checksums and discarding packets with bad ones. This, despite the fact
that we had no layer 4 services enabled at all on the switch. Turning on
layer 3 services on the experimental net is probably just asking for
trouble. I would think that it's also a security risk - bugs in the IOS
that runs on the MSFC card when doing routing in a Cat6k are now exposed
to the experimental net, so it could be possible to exploit one and find
a way out.

As for the idea of buying multiple unmanaged switches, the problem with
unmanaged switches is that you're going to want to be able to cut off
access to the outside world for nodes on which you're going to be trying
out worms, etc. An unmanaged switch isn't going to give you the ability
to do this.

-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From mailnull@bas.flux.utah.edu Wed Oct 29 13:50:27 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TKoRLj096989
	for <testbed-ops-hidden@bas.flux.utah.edu>; Wed, 29 Oct 2003 13:50:27 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id h9TKoRbB096988
	for testbed-ops-hidden; Wed, 29 Oct 2003 13:50:27 -0700 (MST)
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TKoRLj096984
	for <testbed-ops@[155.98.60.2]>; Wed, 29 Oct 2003 13:50:27 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TKoQPe006870
	for <testbed-ops@flux.utah.edu>; Wed, 29 Oct 2003 13:50:26 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from ops.emulab.net (ops.emulab.net [155.101.129.74])
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id NAA05684
	for <testbed-ops@fast.flux.utah.edu>; Wed, 29 Oct 2003 13:50:21 -0700 (MST)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by ops.emulab.net (8.12.9/8.12.6) with ESMTP id h9TKnpbD056579
	for <testbed-ops@emulab.net>; Wed, 29 Oct 2003 13:49:51 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (lepreau@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id NAA05680;
	Wed, 29 Oct 2003 13:49:24 -0700 (MST)
Message-Id: <200310292049.NAA05680@fast.cs.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Stephen_Schwab@NAI.com, John Mehringer <mehringe@isi.edu>, braden@isi.edu,
   testbed-ops@emulab.net, deter-isi@isi.edu
Subject: Re: [Deter-isi] Re: Hardware configuration for Emulab clone
In-Reply-To: <20031029133157.R51103@cs.utah.edu>; from Robert P Ricci on Wed, 29 Oct 2003 13:31:57 MST
Date: Wed, 29 Oct 2003 13:49:24 MST
X-Spam-Status: No, hits=-8 required=5 tests=ACADEMICS,GLOB_WHITELIST version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
X-Status: A
Content-Length: 902
Lines: 18

Here's another datapoint from Kentucky's experience:
After we and they struggled for days or weeks to use some cheap 29xx
(?) router for this purpose, always running into unexplained glitches,
we suggested they toss it and use a PC running FreeBSD as a router, at
least to get going.  Worked great.

I don't think it's going to be fast or secure enough for you long
term, but it will get you off the ground.  However, I would want to
hear Rob's comments.  I'm sure there are small Cisco or other vendors'
routers that would work... but which ones?

Aside: I noticed in your equip list you had "MSFC memory".  Not sure
that is correct, as an MSFC is the daughter card that is required to
turn a 65xx switch into a router.

We keep MSFC's out of our switches, partly to save money, but partly
to make triple sure that some higher layer stuff doesn't get turned
on by accident.  These Ciscos are complex.

From ricci@cs.utah.edu Wed Oct 29 14:14:21 2003
Date: Wed, 29 Oct 2003 14:14:21 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Jay Lepreau <lepreau@cs.utah.edu>
Cc: Stephen_Schwab@NAI.com, John Mehringer <mehringe@isi.edu>,
	braden@isi.edu, testbed-ops@emulab.net, deter-isi@isi.edu
Subject: Re: [Deter-isi] Re: Hardware configuration for Emulab clone
Message-ID: <20031029141421.U51103@cs.utah.edu>
Mail-Followup-To: Jay Lepreau <lepreau@cs.utah.edu>, Stephen_Schwab@NAI.com,
	John Mehringer <mehringe@isi.edu>, braden@isi.edu,
	testbed-ops@emulab.net, deter-isi@isi.edu
References: <20031029133157.R51103@cs.utah.edu> <200310292049.NAA05680@fast.cs.utah.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <200310292049.NAA05680@fast.cs.utah.edu>; from lepreau@cs.utah.edu on Wed, Oct 29, 2003 at 01:49:24PM -0700
Status: RO
Content-Length: 1661
Lines: 31

Thus spake Jay Lepreau on Wed, Oct 29, 2003 at 01:49:24PM -0700:
> I don't think it's going to be fast or secure enough for you long
> term, but it will get you off the ground.  However, I would want to
> hear Rob's comments.  I'm sure there are small Cisco or other vendoes'
> routers that would work... but which ones?

I think you want a router with at least 4 ports - one to connect to the
outside world, one to connect to the private VLAN, one to connect to the
public VLAN, and one to the nodes' control network interfaces. You
_could_ combine the private and public VLANs, but, as outlined in the
document I sent, this makes boss (which needs to be fairly secure, since
it's the source of all configuration information and commands) more
open to attack from ops, a machine on which we traditionally give all
users shells. Since you're building a security testbed, I would think
you'd want to keep the infrastructure as safe from attack as possible,
and not take this shortcut.

To actually get any security out of this arrangement, you'll need a
router that can do firewalling. I believe all Cisco IOS routers can do
this, but my experience with the router side of Cisco is very limited,
so you'd have to check with a sales rep about this.

Yeah, if you have to be budget-conscious, a PC could do this job. As you
suggest, I would only view this as a temporary thing, though.

-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From lepreau@fast.cs.utah.edu Mon Oct 27 23:52:02 2003
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S6q2Lj023116
	for <ricci@[155.98.60.2]>; Mon, 27 Oct 2003 23:52:02 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S6pwPe092690
	for <ricci@flux.utah.edu>; Mon, 27 Oct 2003 23:51:59 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (lepreau@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id XAA14364;
	Mon, 27 Oct 2003 23:51:48 -0700 (MST)
Message-Id: <200310280651.XAA14364@fast.cs.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Bob Braden <braden@ISI.EDU>, deter-isi@ISI.EDU
cc: ricci@flux.utah.edu, testbed-ops@emulab.net
Subject: Re: Hardware configuration for Emulab clone
In-Reply-To: <20031027172549.X95279@cs.utah.edu>; from Robert P Ricci on Mon, 27 Oct 2003 17:25:49 MST
Date: Mon, 27 Oct 2003 23:51:48 MST
X-Spam-Status: No, hits=-8 required=5 tests=ACADEMICS,GLOB_WHITELIST version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
Content-Length: 1548
Lines: 32

Bob:
	> 5) DETER will purchase remote power strips and console terminal muxes.
	> The DETER project would appreciate suggestions from the ISD staff for
	> which equipment models to buy.


	We use Cyclades serial expander boxes in one of our servers - by putting
	them in a PC, we get very good control over who is allowed to access
	which ones, when. We use Cyclom Ze boxes:
	http://www.cyclades.com/products/8/z_series
	... which let you get 128 serial lines into one PC.

Our software supports multiple terminal servers.  We actually run
serial lines in two servers now, since we have >128 hosts.

	We use two types of power controllers - 8-port APC Ethernet-connected
	controllers, and 20-port serial controllers from BayTech. Since you'll
	have serial lines, we recommend the BayTechs, because they are cheaper
	per-port. The ones we have are RPC-27s:
	http://www.baytech.net/cgi-private/prodlist?show=RPC27

I dis-recommend anything except the above two, although probably
others from the same vendors would be ok.  That is because this type
of device can be idiosyncratic and cost you and us time.  In
particular, the RPCs have little operating systems inside them with
idiosyncrasies and we had to evolve our software to cope.
Eg, we had to batch power requests because they have N second dead
times after processing a command.  Don't want to go through the
same trial and error with another vendor/device.
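
To give a flavor of the batching (a hypothetical sketch, not our real
code; the command syntax and the 5-second dead time are made up):

    import time

    DEAD_TIME_SECS = 5   # made-up value; real dead time is controller-specific

    def power_cycle(controller, outlets):
        # Fold all outlets for one controller into a single command
        # instead of issuing one command per outlet, then sit out the
        # dead time before talking to that controller again.
        cmd = "reboot " + ",".join(str(o) for o in sorted(outlets))
        controller.send(cmd)          # hypothetical serial-line interface
        time.sleep(DEAD_TIME_SECS)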

All our hardware is listed on our site, with URLs to the vendor's pages.
http://www.emulab.net/docwrapper.php3?docname=hardware.html

From mailnull@bas.flux.utah.edu Tue Oct 28 00:26:14 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S7QELj023947
	for <testbed-ops-hidden@bas.flux.utah.edu>; Tue, 28 Oct 2003 00:26:14 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id h9S7QEVF023946
	for testbed-ops-hidden; Tue, 28 Oct 2003 00:26:14 -0700 (MST)
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S7QELj023942
	for <testbed-ops@[155.98.60.2]>; Tue, 28 Oct 2003 00:26:14 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S7QBPe092882
	for <testbed-ops@flux.utah.edu>; Tue, 28 Oct 2003 00:26:11 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from ops.emulab.net (ops.emulab.net [155.101.129.74])
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id AAA14522
	for <testbed-ops@fast.flux.utah.edu>; Tue, 28 Oct 2003 00:26:05 -0700 (MST)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by ops.emulab.net (8.12.9/8.12.6) with ESMTP id h9S7PZbD021125
	for <testbed-ops@emulab.net>; Tue, 28 Oct 2003 00:25:35 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (lepreau@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id AAA14518;
	Tue, 28 Oct 2003 00:25:24 -0700 (MST)
Message-Id: <200310280725.AAA14518@fast.cs.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Bob Braden <braden@ISI.EDU>
Cc: testbed-ops@emulab.net, deter-isi@ISI.EDU
Subject: Re: Hardware configuration for Emulab clone
In-Reply-To: <200310272219.OAA28834@gra.isi.edu>; from Bob Braden on Mon, 27 Oct 2003 14:19:38 PST
Date: Tue, 28 Oct 2003 00:25:24 MST
X-Spam-Status: No, hits=-10 required=5 tests=ACADEMICS,GEEKWORDS1,GLOB_WHITELIST version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
Content-Length: 1522
Lines: 31


> A candidate for the Boot Server/Data Logger Equipment would be:
> ...

The boot server is our so-called "boss" (as in "master") server, and
you should make sure all the devices on it will work with FreeBSD.  A
port of the Emulab servers to Linux could be done... but it won't be
by us.  It would greatly complicate maintenance and upgrades and QA.

We also have a so-called "users" server (for user login accounts and
terminal service) and a logically separate fileserver (but that has
always been the same as the "users" machine, so there would probably
be small glitches in splitting that off).  "users" is also a FreeBSD
machine; porting it to Linux would probably be much easier than boss,
and users would find it friendlier.

Arguments can be made both ways about the security of having
logins on a persistent server.  But Emulab currently needs it,
including for a few non-login related things.

> An understanding of the needed modifications to Emulab software will
> become more evident as the project progresses.  For example, it is very
> plausible that Emulab will need to be modified to allow the ability to
> mirror traffic from a given link(s) in the emulated topology to a given
> piece of monitoring equipment that can perform protocol analysis or
> data logging at link rate.

In fact, that's a good example.  When people need that, we provide
it manually.  Would be nice to provide more generally, but there
hasn't been sufficient demand.  OTOH, what is easy to use often
determines what gets used.

From mailnull@bas.flux.utah.edu Tue Oct 28 00:45:41 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S7jeLj024385
	for <testbed-ops-hidden@bas.flux.utah.edu>; Tue, 28 Oct 2003 00:45:40 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id h9S7jen4024384
	for testbed-ops-hidden; Tue, 28 Oct 2003 00:45:40 -0700 (MST)
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S7jeLj024380
	for <testbed-ops@[155.98.60.2]>; Tue, 28 Oct 2003 00:45:40 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9S7jePe092959
	for <testbed-ops@flux.utah.edu>; Tue, 28 Oct 2003 00:45:40 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from ops.emulab.net (ops.emulab.net [155.101.129.74])
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id AAA14610
	for <testbed-ops@fast.flux.utah.edu>; Tue, 28 Oct 2003 00:45:34 -0700 (MST)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by ops.emulab.net (8.12.9/8.12.6) with ESMTP id h9S7j4bD021384
	for <testbed-ops@emulab.net>; Tue, 28 Oct 2003 00:45:04 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (lepreau@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id AAA14588;
	Tue, 28 Oct 2003 00:44:51 -0700 (MST)
Message-Id: <200310280744.AAA14588@fast.cs.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Bob Braden <braden@ISI.EDU>, bob@jensar.us, Stephen_Schwab@NAI.com
Cc: testbed-ops@emulab.net, deter-isi@ISI.EDU
Subject: Re: Hardware configuration for Emulab clone
In-Reply-To: <200310272219.OAA28834@gra.isi.edu>; from Bob Braden on Mon, 27 Oct 2003 14:19:38 PST
Date: Tue, 28 Oct 2003 00:44:51 MST
X-Spam-Status: No, hits=-10 required=5 tests=ACADEMICS,GEEKWORDS2,GLOB_WHITELIST version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
Content-Length: 1745
Lines: 46



>	[which blades to get]
>	...
>	This would provide complete
>	symmetry among all the 128 niodes.
>	...
> 	We are generally trying to obtain as much homogeneity as
>	possible, but in the near term we won't need the maximum
>	capacity so we can compromise to save money.

As I said in our phone call, strong homogeneity of nodes wrt their
links (link symmetry) is not generally needed, as Emulab abstracts over
that, and experimenters don't specify large completely uniform topologies.
They do care that nodes themselves (eg CPUs) be homogeneous.
The only downside of modest link asymmetry is that the mapper will
take a little longer to run, and it will be harder to "approximate
the mapping in your head," which is sometimes handy.

For Dummynet, you probably do want an even number of links of the same
speed on each node.

Steve S:
>	In any
>	event, any time our topology carries enough traffic to saturate
>	the VLANs on the switch, the illusion of multiple simulated
>	networks is going to break.  Over-provisioning the switch is
>	one way to avoid having to worry about how this affects the
>	correctness of our experiments.  But if we have to worry about
>	this, then so be it.]

We've talked about changing the switch model fed to our resource
mapper to be hierarchical, ie adding a "blade" with higher intra-blade
BW than inter-blade.  I would think this wouldn't be hard, but I think
Rob said it could be.  If that were done, then we could accurately and
conservatively allocate resources.
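
A minimal sketch of the conservative check such a model would enable
(hypothetical interface and numbers):

    def fits_on_blade(inter_blade_flows_gbps, uplink_gbps):
        # Conservative: only traffic that leaves the blade counts against
        # the blade's fabric uplink; intra-blade traffic stays local.
        return sum(inter_blade_flows_gbps) <= uplink_gbps

    # e.g. ten 1 Gbps links leaving a blade with a 40 Gbps uplink:
    print(fits_on_blade([1.0] * 10, uplink_gbps=40))   # True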

However, Cisco BW probably depends on packet size.

Bob Lindell:
> Either way, 48 GE ports is 48Gb/s FD.  That will slightly over  
> subscribe the blade to backplane interface.


What blade to backplane BW have you been told?
How sure are you?

From mailnull@bas.flux.utah.edu Tue Oct 28 22:56:31 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9T5uULj074631
	for <testbed-ops-hidden@bas.flux.utah.edu>; Tue, 28 Oct 2003 22:56:30 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id h9T5uUAL074630
	for testbed-ops-hidden; Tue, 28 Oct 2003 22:56:30 -0700 (MST)
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9T5uULj074626
	for <testbed-ops@[155.98.60.2]>; Tue, 28 Oct 2003 22:56:30 -0700 (MST)
	(envelope-from Stephen_Schwab@NAI.com)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9T5uQPe002020
	for <testbed-ops@flux.utah.edu>; Tue, 28 Oct 2003 22:56:26 -0700 (MST)
	(envelope-from Stephen_Schwab@NAI.com)
Received: from ops.emulab.net (ops.emulab.net [155.101.129.74])
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id WAA27611
	for <testbed-ops@fast.flux.utah.edu>; Tue, 28 Oct 2003 22:56:21 -0700 (MST)
From: Stephen_Schwab@NAI.com
Received: from RelayDAL.nai.com (relaydal.nai.com [205.227.136.197])
	by ops.emulab.net (8.12.9/8.12.6) with ESMTP id h9T5tobD042474
	for <testbed-ops@emulab.net>; Tue, 28 Oct 2003 22:55:50 -0700 (MST)
	(envelope-from Stephen_Schwab@NAI.com)
Received: from dalexwsout2.na.nai.com (dalexwsout2.na.nai.com [161.69.212.93] (may be forged))
	by RelayDAL.nai.com (Switch-2.2.8/Switch-2.2.6) with SMTP id h9T5qMV15609;
	Tue, 28 Oct 2003 23:52:22 -0600 (CST)
Received: from mail.na.nai.com(161.69.111.81) by dalexwsout2.na.nai.com via csmap 
	 id 278c5c60_09d4_11d8_880c_00304811fc74_7761;
	Tue, 28 Oct 2003 23:53:00 -0600 (CST)
Received: from losexmb1.corp.nai.org ([161.69.83.203]) by DALEXBR1.corp.nai.org with Microsoft SMTPSVC(5.0.2195.5329);
	 Tue, 28 Oct 2003 23:55:38 -0600
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Subject: shedding some light on the new Cisco 720Gb/s switch fabric
X-MimeOLE: Produced By Microsoft Exchange V6.0.6487.1
Date: Tue, 28 Oct 2003 21:55:37 -0800
Message-ID: <613FA566484CA74288931B35D971C77E13428C@losexmb1.corp.nai.org>
Thread-Topic: shedding some light on the new Cisco 720Gb/s switch fabric
Thread-Index: AcOd4UaDpbU7QPVxR5SqgTKHrLS9EQ==
To: <deter-isi@isi.edu>, <testbed-ops@emulab.net>
X-OriginalArrivalTime: 29 Oct 2003 05:55:38.0460 (UTC) FILETIME=[473899C0:01C39DE1]
X-Spam-Status: No, hits=-2.715 required=5 tests=ACADEMICS,NO_REAL_NAME version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by bas.flux.utah.edu id h9T5uULj074626
Status: RO
X-Status: A
Content-Length: 2269
Lines: 27

Hi,

I think I see the confusion -- it appears that Cisco dropped a new switch fabric into the 6500s by putting the switch fabric on the supervisor module.

If you search through this web page:

http://www.cisco.com/en/US/products/hw/switches/ps708/products_data_sheet09186a00800ff916.html

you can find a reference buried where it describes the switch fabrics.

I can't quite see how they wire this beast -- perhaps they physically re-cable the slot connectors from the internal 256 Gb/s switch fabric to the 720 Gb/s switch fabric on the supervisor module.

There is a reference somewhere else to the auto-sensing/auto-switching capabilities of the 720 Gb/s switch fabric -- so if you happen to plug in older 16 or 8 Gb/s blades, the switch fabric will still talk to them.

The WS-X6748-GE-TX blades are definitely designed to talk to the 720 Gb/s switch fabric.  The way we will use them, it is unlikely that more than 24 Gb/s will ever be sourced or sinked on a blade, so 40 Gb/s will be enough headroom.

But there is a gotcha: we didn't plan to order any WS-F6700-DFC3A daughter cards!

The way Cisco gets the packet processing rate up is to decentralize the forwarding onto daughter cards -- each of these dCEF (distributed Cisco Express Forwarding) cards is co-located on a blade, and the supervisor module downloads forwarding rules to all the dCEFs.  It is really unclear to me what happens if you try to forward all those packets from 6 blades through the single Supervisor 720's CEF engine (the MSFC3 PFC3A daughter card).  The performance is listed as 400 Mpps with dCEF, but the table doesn't have any numbers for centralized CEF.

However, I think we should just go ahead and get 6 of the WS-X6748-GE-TX blades and try out CEF.  That will give us 288 10/100/1000 ports, allowing us to support up to 72 machines.  If we find we are over-subscribing something, we can decide whether to redistribute across more 6509s, or add the dCEF modules.  

The alternative is to just use 100BaseT in the first 64 PCs, and plan to upgrade to gigabit later, on the assumption that the price will drop.  If we did that, we could just buy the 256 Gb/s switch fabric also -- in fact, we would just be buying the 6509 configurations that Utah's emulab uses. 

--Steve



From ricci@cs.utah.edu Wed Oct 29 10:59:55 2003
Date: Wed, 29 Oct 2003 10:59:55 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Stephen_Schwab@NAI.com
Cc: deter-isi@isi.edu, testbed-ops@emulab.net
Subject: Re: shedding some light on the new Cisco 720Gb/s switch fabric
Message-ID: <20031029105955.F51103@cs.utah.edu>
Mail-Followup-To: Stephen_Schwab@NAI.com, deter-isi@isi.edu,
	testbed-ops@emulab.net
References: <613FA566484CA74288931B35D971C77E13428C@losexmb1.corp.nai.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <613FA566484CA74288931B35D971C77E13428C@losexmb1.corp.nai.org>; from Stephen_Schwab@NAI.com on Tue, Oct 28, 2003 at 09:55:37PM -0800
Status: RO
Content-Length: 3472
Lines: 71

Thus spake Stephen_Schwab@NAI.com on Tue, Oct 28, 2003 at 09:55:37PM -0800:
> I think I see the confusion -- it appears that Cisco dropped a new
> switch fabric into the 6500s by putting the switch fabric on the
> supervisor module.

This is how the 'older' (CEF256) fabric modules work too - we have a
6513 with fabric-enabled cards, and it also has a fabric module. It
seems that one of the main things they've done with the Sup720 is put
the fabric module into the supervisor. This is very nice, since in our
6513, these are separate modules, taking up two slots.  

I finally found that whitepaper I've been talking about - it's at:
http://www.cisco.com/en/US/products/hw/switches/ps708/products_white_paper09186a0080092389.shtml

If you look at Figure 3 (more readable in the PDF version linked from
the top of the page), it clearly labels the crossbar connectors as being
8Gbps. But, the more I'm seeing, the more I'm convinced that this
whitepaper is just out of date, and the new modules really can drive
those connectors at a higher rate.

> There is a reference somewhere else to the auto-sensing/auto-switching
> cababilities of the 720 Gb/s switch fabric -- so if you happen to plug
> in older 16 or 8 Gb/s blades, the switch fabric will still talk to
> them.

Good to hear it! The older fabric modules, we found out the hard way,
don't interoperate with other types of modules.
 
> The WS-X6748-GE-TX blades are definitely designed to talk to the 720
> Gb/s switch fabric.  The way we will use them, it is unlikely that
> more than 24 Gb/s will ever be sourced or sinked on a blade, so 40
> Gb/s will be enough headroom.

Yeah, I agree, that sounds like plenty.
 
> However, I think we should just go ahead and get 6 of the
> WS-X6748-GE-TX blades and try out CEF.  That will give us 288
> 10/100/1000 ports, allowing us to support up to 72 machines.  If we
> find we are over-subscribing something, we can decide whether to
> redistribute across more 6509s, or add the dCEF modules.  

Adding the dCEF modules would probably be preferable - since
interconnecting the 6509s will cost quite a bit (presumably, you're
going to want to connect the switches with links at least an order of
magnitude faster than what's on the PCs). Might as well have fewer
switches, that you can fill up all the way.
 
> The alternative is to just use 100BaseT in the first 64 PCs, and plan
> to upgrade to gigabit later, on the assumption that the price will
> drop.  If we did that, we could just buy the 256 Gb/s switch fabric
> also -- in fact, we would just be buying the 6509 configurations that
> Utah's emulab uses. 

Our experience so far has been that Cisco prices don't drop - looks to
us like their model is to just leave prices alone, and price new modules
when they come out. Which leads to some weird pricing anomalies - IIRC,
last time I looked, single-port 10Gbit modules were more expensive than
the newer 4-port 10Gbit modules.

Also, for most of our switches, we actually don't use the fabric at all
- we're just using the 32Gbps backplane bus, which is fine for a switch
full of 10/100 ports. But obviously fabric is the way to go from here on
out, with gigabit and such coming.


-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From mailnull@bas.flux.utah.edu Wed Oct 29 14:01:24 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TL1OLj097498
	for <testbed-ops-hidden@bas.flux.utah.edu>; Wed, 29 Oct 2003 14:01:24 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id h9TL1OFd097497
	for testbed-ops-hidden; Wed, 29 Oct 2003 14:01:24 -0700 (MST)
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TL1OLj097493
	for <testbed-ops@[155.98.60.2]>; Wed, 29 Oct 2003 14:01:24 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TL1NPe006939
	for <testbed-ops@flux.utah.edu>; Wed, 29 Oct 2003 14:01:23 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from ops.emulab.net (ops.emulab.net [155.101.129.74])
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id OAA05815
	for <testbed-ops@fast.flux.utah.edu>; Wed, 29 Oct 2003 14:01:18 -0700 (MST)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by ops.emulab.net (8.12.9/8.12.6) with ESMTP id h9TL0mbD056856
	for <testbed-ops@emulab.net>; Wed, 29 Oct 2003 14:00:48 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (lepreau@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id OAA05793;
	Wed, 29 Oct 2003 14:00:36 -0700 (MST)
Message-Id: <200310292100.OAA05793@fast.cs.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Stephen_Schwab@NAI.com
Cc: deter-isi@isi.edu, testbed-ops@emulab.net
Subject: Re: shedding some light on the new Cisco 720Gb/s switch fabric
In-Reply-To: <613FA566484CA74288931B35D971C77E13428C@losexmb1.corp.nai.org>; from Stephen_Schwab@NAI.com on Tue, 28 Oct 2003 21:55:37 PST
Date: Wed, 29 Oct 2003 14:00:36 MST
X-Spam-Status: No, hits=-8 required=5 tests=ACADEMICS,GLOB_WHITELIST version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
Content-Length: 947
Lines: 22


> However, I think we should just go ahead and get 6 of the
> WS-X6748-GE-TX blades and try out CEF.  That will give us 288
> 10/100/1000 ports, allowing us to support up to 72 machines.  If we
> find we are over-subscribing something, we can decide whether to
> redistribute across more 6509s, or add the dCEF modules.

> The alternative is to just use 100BaseT in the first 64 PCs, and plan
> to upgrade to gigabit later, on the assumption that the price will
> drop.  If we did that, we could just buy the 256 Gb/s switch fabric
> also -- in fact, we would just be buying the 6509 configurations that
> Utah's emulab uses.

A reasonable and probably better alternative is to use 2 Gbit and 2
100Mbit on each machine.  Your PC won't drive 4 Gbit interfaces anyway.

Or perhaps:
    16 nodes with 4G lines
    32 nodes with 2G + 2 100Mbit

That is the sort of expansion we were going to do.
Emulab will handle the resource assignment just fine.

From mailnull@bas.flux.utah.edu Wed Oct 29 16:28:09 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TNS9Lj002647
	for <testbed-ops-hidden@bas.flux.utah.edu>; Wed, 29 Oct 2003 16:28:09 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id h9TNS9v1002645
	for testbed-ops-hidden; Wed, 29 Oct 2003 16:28:09 -0700 (MST)
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TNS9Lj002641
	for <testbed-ops@[155.98.60.2]>; Wed, 29 Oct 2003 16:28:09 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id h9TNS5Pe008007
	for <testbed-ops@flux.utah.edu>; Wed, 29 Oct 2003 16:28:05 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from ops.emulab.net (ops.emulab.net [155.101.129.74])
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id QAA06698
	for <testbed-ops@fast.flux.utah.edu>; Wed, 29 Oct 2003 16:27:59 -0700 (MST)
Received: from fast.cs.utah.edu (fast.cs.utah.edu [155.99.212.1])
	by ops.emulab.net (8.12.9/8.12.6) with ESMTP id h9TNRTbD060624
	for <testbed-ops@emulab.net>; Wed, 29 Oct 2003 16:27:29 -0700 (MST)
	(envelope-from lepreau@fast.cs.utah.edu)
Received: from fast.cs.utah.edu (lepreau@localhost)
	by fast.cs.utah.edu (8.9.1/8.9.1) with ESMTP id QAA06685;
	Wed, 29 Oct 2003 16:27:17 -0700 (MST)
Message-Id: <200310292327.QAA06685@fast.cs.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Stephen_Schwab@NAI.com, deter-isi@isi.edu, testbed-ops@emulab.net
Subject: Re: shedding some light on the new Cisco 720Gb/s switch fabric
In-Reply-To: <200310292100.OAA05793@fast.cs.utah.edu>; from Jay Lepreau on Wed, 29 Oct 2003 14:00:36 MST
Date: Wed, 29 Oct 2003 16:27:17 MST
X-Spam-Status: No, hits=-7 required=5 tests=GEEKWORDS1,GLOB_WHITELIST version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
Content-Length: 533
Lines: 14

I had said:
> Or perhaps:
>    16 nodes with 4G lines
>    32 nodes with 2G + 2 100Mbit

I was thinking you had 48 total.  Do some obvious adaptation
for the 64 you do have.

One thing you will discover is that people do lots of LAN experiments,
which will only use one interface on a node.  I suspect that will
extend to some degree to your testbed, too.  For one thing, that is a
reasonable way to model the Internet, with different latency/bw
characteristics on each node's interface to the LAN (modeling a
last mile bottleneck).

From ricci@cs.utah.edu Wed Oct 29 11:01:42 2003
Date: Wed, 29 Oct 2003 11:01:42 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Stephen_Schwab@NAI.com
Cc: deter-isi@isi.edu, testbed-ops@emulab.net
Subject: Re: NAMs for the 6509s -- required or optional?
Message-ID: <20031029110142.G51103@cs.utah.edu>
Mail-Followup-To: Stephen_Schwab@NAI.com, deter-isi@isi.edu,
	testbed-ops@emulab.net
References: <613FA566484CA74288931B35D971C77E13428D@losexmb1.corp.nai.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <613FA566484CA74288931B35D971C77E13428D@losexmb1.corp.nai.org>; from Stephen_Schwab@NAI.com on Tue, Oct 28, 2003 at 09:57:27PM -0800
Status: RO
Content-Length: 688
Lines: 15

Thus spake Stephen_Schwab@NAI.com on Tue, Oct 28, 2003 at 09:57:27PM -0800:
> We don't have any NAMs (Network Analysis Modules) in our
> configuration.  Do we need the NAMs for anything?

They are certainly not required. We suspect that they could be
tremendously useful for experimenters. But, we have yet to use ours at
all (lack of time, not lack of interest). So, clearly, you can get by
okay without them.

-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From braden@ISI.EDU Wed Nov 19 14:10:00 2003
Received: from slow.flux.utah.edu (slow.flux.utah.edu [155.98.63.200])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id hAJLA0LQ064563
	for <ricci@[155.98.60.2]>; Wed, 19 Nov 2003 14:10:00 -0700 (MST)
	(envelope-from braden@ISI.EDU)
Received: from mail-svr1.cs.utah.edu (brahma.cs.utah.edu [155.99.198.200])
	by slow.flux.utah.edu (8.12.9/8.12.5) with ESMTP id hAJL9ZPe091915
	for <ricci@flux.utah.edu>; Wed, 19 Nov 2003 14:09:35 -0700 (MST)
	(envelope-from braden@ISI.EDU)
Received: by mail-svr1.cs.utah.edu (Postfix)
	id 99190346F3; Wed, 19 Nov 2003 14:09:30 -0700 (MST)
Delivered-To: ricci@cs.utah.edu
Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161])
	by mail-svr1.cs.utah.edu (Postfix) with ESMTP
	id A8F30346ED; Wed, 19 Nov 2003 14:09:29 -0700 (MST)
Received: from gra.isi.edu (gra.isi.edu [128.9.160.133])
	by boreas.isi.edu (8.11.6p2+0917/8.11.2) with ESMTP id hAJL9La28057;
	Wed, 19 Nov 2003 13:09:21 -0800 (PST)
From: Bob Braden <braden@ISI.EDU>
Received: (from braden@localhost)
	by gra.isi.edu (8.9.3/8.8.6) id NAA04791;
	Wed, 19 Nov 2003 13:09:21 -0800 (PST)
Date: Wed, 19 Nov 2003 13:09:21 -0800 (PST)
Message-Id: <200311192109.NAA04791@gra.isi.edu>
To: lepreau@cs.utah.edu, ricci@cs.utah.edu, deter-isi@ISI.EDU
Subject: Performance figures
Cc: braden@ISI.EDU
X-Sun-Charset: US-ASCII
X-Spam-Status: No, hits=0 required=5 tests= version=FluxMilter1.2
X-Scanned-By: MIMEDefang 2.26 (www . roaringpenguin . com / mimedefang)
Status: RO
Content-Length: 733
Lines: 28



I have been trying to digest the performance figures that we have
been bandying about.  Comments/corrections appreciated.

                        PC              1000bT            Cisco Line Card
                                                          (48 ports @ 1000bT)
                  ____________________________________________________________

Max bit rate            ~< 0.5 Gbps     ~< 1 Gbps         40 Gbps FD

Max pkts per sec        ~< 1 Mpps       ~< 2 Mpps         30 Mpps * (FD??)
(pps)



*Note: Rises to 48 Mpps with dCEF (distributed Cisco Express Forwarding)
cards; cost is $27K for 6 line cards.

According to these figures, we might be over-subscribed if we spread
each PC across 4 line cards.  OTOH, if we plug each PC into a single
line card, the head room is about a factor of 2 - 3.
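
One way to sanity-check the single-line-card case (assuming all 4 of a
PC's ports land on the same 48-port card, so 12 PCs share a card):

    pc_gbps, pc_mpps = 0.5, 1.0            # what one PC can drive
    card_gbps, card_mpps = 40.0, 30.0      # line card, without dCEF
    pcs_per_card = 48 // 4                 # 12

    print("bit-rate headroom:", card_gbps / (pcs_per_card * pc_gbps))    # ~6.7x
    print("pkt-rate headroom:", card_mpps / (pcs_per_card * pc_mpps))    # 2.5x

So by these figures the packet rate, not the bit rate, would be the
binding constraint.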

Does this make sense???

Bob


From mailnull@bas.flux.utah.edu Wed Nov 19 15:13:30 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id hAJMDULQ067272
	for <testbed-ops-hidden@bas.flux.utah.edu>; Wed, 19 Nov 2003 15:13:30 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id hAJMDUPm067271
	for testbed-ops-hidden; Wed, 19 Nov 2003 15:13:30 -0700 (MST)
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id hAJMDULQ067267;
	Wed, 19 Nov 2003 15:13:30 -0700 (MST)
	(envelope-from lepreau@bas.flux.utah.edu)
Message-Id: <200311192213.hAJMDULQ067267@bas.flux.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Bob Braden <braden@ISI.EDU>
Cc: ricci@cs.utah.edu, deter-isi@ISI.EDU, testbed-ops@flux.utah.edu
Subject: Re: Performance figures
In-Reply-To: <200311192109.NAA04791@gra.isi.edu>; from Bob Braden on Wed, 19 Nov 2003 13:09:21 PST
Date: Wed, 19 Nov 2003 15:13:30 -0700
Sender: lepreau@flux.utah.edu
Status: RO
X-Status: A
Content-Length: 1541
Lines: 48

Added testbed-ops so you get more, and more informed, people.
Some quick remarks, not complete.  I have not digested your numbers.

	---------------
	From: Bob Braden <braden@ISI.EDU>
	Date: Wed, 19 Nov 2003 13:09:21 -0800 (PST)
	To: lepreau@cs.utah.edu, ricci@cs.utah.edu, deter-isi@ISI.EDU
	Subject: Performance figures
	Cc: braden@ISI.EDU


	I have been trying to digest the performance figures that we have
	been bandying about.  Comments/corrections appreciated.

				PC		1000 bT           Cisco Line Card
								   (48 ports@1000 bT)
			  _____________________________________________________________

	Max bit rate		~< 0.5 Gbps      ~< 1 Gbps          40 Gbps FD


	Max pkts per            ~< 1 Mpps	~< 2 Mpps	   30 Mpps * (FD??)
	sec (pps)

PCs can fwd a lot more than 1Mpps in polling mode if pkts are short.

	*Note: Rises to 48 Mpps with dCEF (distributed Cisco Express Forwarding
	Cards; cost $27K for 6 line cards.

	According to these figures, we might be over-subscribed if we spread
Which device is oversubscribed?
	each PC across 4 line cards.  OTOH, if we plug each PC into a single
	line card, the head room is about a factor of 2 - 3.
You can't set up a very interesting topology with just one link per machine!
Without our virtual network stuff, that is.
But there are reasons people often want a "real" dedicated network link.


	Does this make sense???

	Bob
	------------

Get PCI-X busses on your PCs.  You need that if you run Gbit.

Lots of expts won't be running Gbit!!  Most of the Internet
is *lots* slower than that.


From ricci@cs.utah.edu Wed Nov 19 15:43:57 2003
Date: Wed, 19 Nov 2003 15:43:57 -0700
From: Robert P Ricci <ricci@cs.utah.edu>
To: Bob Braden <braden@ISI.EDU>
Cc: deter-isi@ISI.EDU, testbed-ops@flux.utah.edu,
	Jay Lepreau <lepreau@cs.utah.edu>
Subject: Re: Performance figures
Message-ID: <20031119154357.I534@cs.utah.edu>
Mail-Followup-To: Bob Braden <braden@ISI.EDU>, deter-isi@ISI.EDU,
	testbed-ops@flux.utah.edu, Jay Lepreau <lepreau@cs.utah.edu>
References: <200311192109.NAA04791@gra.isi.edu> <200311192213.hAJMDULQ067267@bas.flux.utah.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <200311192213.hAJMDULQ067267@bas.flux.utah.edu>; from lepreau@cs.utah.edu on Wed, Nov 19, 2003 at 03:13:30PM -0700
Status: RO
Content-Length: 1150
Lines: 26

Note: I have not verified your numbers.

Thus spake Jay Lepreau on Wed, Nov 19, 2003 at 03:13:30PM -0700:
> 	According to these figures, we might be over-subscribed if we spread
> 	each PC across 4 line cards.  OTOH, if we plug each PC into a single
> 	line card, the head room is about a factor of 2 - 3.
> 
> 	Does this make sense???

So, it sounds like you avoid oversubscription, according to your numbers,
in the latter case because, though it has 4 1000Mbps interfaces, a PC
cannot saturate all of them? To me, this would suggest that you could
save a lot by not putting more 1000Mbps interfaces on a PC than it can
handle, and giving some of them a mix of 1000Mbps and 100Mbps
interfaces.

But yes, if your PCs will be your limiting factor in traffic generation
(which is certainly believable), this seems to me like a reasonable way
to avoid overloading the switch.

-- 
/-----------------------------------------------------------
| Robert P Ricci <ricci@cs.utah.edu> | <ricci@flux.utah.edu>
| Research Associate, University of Utah Flux Group
| www.flux.utah.edu | www.emulab.net
\-----------------------------------------------------------

From mailnull@bas.flux.utah.edu Wed Nov 19 15:53:22 2003
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id hAJMrMLQ069501
	for <testbed-ops-hidden@bas.flux.utah.edu>; Wed, 19 Nov 2003 15:53:22 -0700 (MST)
	(envelope-from mailnull@bas.flux.utah.edu)
Received: (from mailnull@localhost)
	by bas.flux.utah.edu (8.12.9/8.12.5/Submit) id hAJMrMaH069500
	for testbed-ops-hidden; Wed, 19 Nov 2003 15:53:22 -0700 (MST)
Received: from bas.flux.utah.edu (localhost [127.0.0.1])
	by bas.flux.utah.edu (8.12.9/8.12.5) with ESMTP id hAJMrMLQ069496;
	Wed, 19 Nov 2003 15:53:22 -0700 (MST)
	(envelope-from lepreau@bas.flux.utah.edu)
Message-Id: <200311192253.hAJMrMLQ069496@bas.flux.utah.edu>
From: Jay Lepreau <lepreau@cs.utah.edu>
To: Robert P Ricci <ricci@cs.utah.edu>
Cc: Bob Braden <braden@ISI.EDU>, deter-isi@ISI.EDU, testbed-ops@flux.utah.edu
Subject: Re: Performance figures
In-Reply-To: <20031119154357.I534@cs.utah.edu>; from Robert P Ricci on Wed, 19 Nov 2003 15:43:57 MST
Date: Wed, 19 Nov 2003 15:53:22 -0700
Sender: lepreau@flux.utah.edu
Status: RO
Content-Length: 459
Lines: 10

> To me, this would suggest that you could save a lot by not putting more
> 1000Mbps interfaces on a PC than it can handle, and giving some of
> them a mix of 1000Mbps and 100Mbps interfaces.

Yes. I think I've recommended before that ISI do 2 Gbit and 2 100Mbit
lines on each machine.

Note that Gbit and FE NICs are not much different in price; it's the switch
ports that really cost.  So buy all Gbit NICs and run some at 100.
Gives you later flexibility.