Commit f4122169 authored by Mike Hibler

This is so far out of date, that it has no reason to exist

parent bd793129
[This file is absolutely not up to date and is no longer maintained.]
*** cvsupd. - Sep 17 2001 - Fixed by Mike on Sep 24 2001
The old version of cvsupd had the billennium bug: once the number of
seconds since the epoch exceeded 1 billion, cvsupd broke. Upgrading
was a disaster on Linux. It appears the new version was trashing the
boot block on Linux, so nodes were not booting after a cvsupd run.
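As a rough illustration of the rollover behind this bug (not the actual
cvsupd code, and how exactly cvsupd failed is not recorded here), the
sketch below shows that the Unix timestamp grew from nine to ten decimal
digits when it reached 1 billion. The fixed-width check is a
hypothetical example of the kind of assumption that stops holding at
that point.

```python
from datetime import datetime, timezone

BILLION = 1_000_000_000


def epoch_to_utc(seconds: int) -> str:
    """Format seconds since the Unix epoch as an ISO-8601 UTC string."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def fits_fixed_width(seconds: int, width: int = 9) -> bool:
    """Hypothetical check: does the decimal timestamp fit in `width` digits?"""
    return len(str(seconds)) <= width


if __name__ == "__main__":
    # Compare the last nine-digit timestamp with the first ten-digit one.
    for ts in (BILLION - 1, BILLION):
        print(ts, epoch_to_utc(ts), fits_fixed_width(ts))
```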
*** mountd/exports - Sep 17 2001 - Needs to be fixed
Reported by Matt on Sep 17 2001, but actually a known bug with the
exports_setup script and the current mountd/kernel impl, which wipes
out all mounts before installing the new set. This causes transient
failures in NFS access from the testbed nodes since the mounts become
momentarily invalid.
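One possible direction, sketched here under assumptions, is to compute
the difference between the current and desired export sets so that
unchanged entries are never withdrawn. Whether mountd or the kernel
interface can apply such a delta is not established in this entry; the
function name and the entry strings are hypothetical.

```python
from typing import Set, Tuple


def plan_export_update(current: Set[str], desired: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_add, to_remove); entries present in both sets are left alone."""
    to_add = desired - current
    to_remove = current - desired
    return to_add, to_remove


if __name__ == "__main__":
    # Hypothetical export entries, for illustration only.
    current = {"/users/alice pc1", "/proj/foo pc1", "/proj/foo pc2"}
    desired = {"/users/alice pc1", "/proj/foo pc2", "/proj/bar pc3"}
    to_add, to_remove = plan_export_update(current, desired)
    print("add:", sorted(to_add))
    print("remove:", sorted(to_remove))
```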
*** Batch Mode Nits - Sep 18 2001 - Fixed by LBS on Sep 25 2001
Reported by Mike.
Nit: experiment create date in Experiment Info "header" is not set.
Is this field meaningless or just not filled in correctly?
Nit: there are two "header" tables shown
I assume this is because the batch code prints out a header and then calls
the regular experiment info script to do the rest. Anyway, the second table
is missing lots of date info as well.
Nit: web page for expr takes forever to show the nodes that were allocated.
For a regular experiment, the allocated nodes show up almost immediately,
presumably as soon as assign is done. For a batch experiment, it seems to
take minutes. I can go out and look at assign.log and see that nodes have
been assigned almost right away, they just don't wind up in the report
(the DB?) for a while.
*** Shell experiments - Sep 18 2001 - Needs to be fixed
Reported by Rob.
Perhaps a feature instead of a bug. Shell experiments remain in the
"new" state, and are thus not cleaned up (nodes freed) when the
experiment is terminated. I view that as intended behaviour, but it's
easy to change, so we should.
*** Broken lilo.conf - Sep 17 2001 - Partially fixed by LBS on Sep 24 2001
Reported by Matt (and then Rob).
According to Matt, our lilo.conf file is broken. In the next version
of the disk image, we should fix it. Matt sent a fixed version to
testbed-ops on Sep. 17 that we could use as a base.
Partially fixed means I hardwired some assembly code to the serial
line at 115200. A bug was causing it to go to the VGA all the time,
and the higher speeds are just a mess in lilo and I did not want to
mess with the assembly language. Too much of a time sink.
*** IPOD support - Sep 17 2001 - Needs to be fixed.
Reported by Rob.
Our Linux kernel needs to be rebuilt with ping of death support.
*** DummyNet support - Sept 19 2001 - Needs to be fixed.
Reported by Jay.
We need to pass through more DummyNet pipe configuration parameters
for delay nodes. It basically needs some front end parser work (not
too hard), and some changes to the DB where we store that stuff (a few
tables, not too hard), the tmcd to return the additional stuff (not
too hard), and the client delay configuration to use the additional
stuff when configuring the pipes (not too hard).
*** Multiple batch/reload daemons. - Sept 24 2001 - Needs to be fixed.
It is possible to start multiple batch and reload daemons (and many
other TB daemons). Can really mess things up!
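A common way to guard against this is an exclusive, non-blocking lock
on a per-daemon lock file taken at startup. The sketch below is only an
illustration of that idea, not the testbed's code; the function name and
lock file path are hypothetical.

```python
import fcntl
import os
import tempfile


def acquire_single_instance_lock(path: str):
    """Return an open, locked file object, or None if the lock is already held.

    The caller must keep the returned object open for the life of the daemon;
    closing it (or exiting) releases the lock.
    """
    fh = open(path, "a+")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        fh.close()
        return None
    # Record our pid for whoever inspects the lock file later.
    fh.seek(0)
    fh.truncate()
    fh.write(f"{os.getpid()}\n")
    fh.flush()
    return fh


if __name__ == "__main__":
    # Hypothetical lock file location.
    lock_path = os.path.join(tempfile.gettempdir(), "batch_daemon.lock")
    lock = acquire_single_instance_lock(lock_path)
    if lock is None:
        raise SystemExit("another instance appears to be running")
    print("lock acquired; daemon work would start here")
```

Because the lock is tied to the open file, it is released automatically
if the daemon crashes, so a stale lock file does not block a restart.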
*** No text box in approve user page - Sept 24 2001 - Needs to be fixed.
Reported by Mac.
We need to add a text box for sending a message to the applicant, just
like we have in the project approval page.
*** Cleanup of batch experiments - Sep 26 2001 - Needs to be fixed.
Reported by LBS.
Batch experiment system was created before Chris' big rework of the
tb system. At this point, it would be cleaner to combine the batch
experiment table into the normal experiment table (would result in a
couple of extra slots). The two tables make for consistency and
locking problems. What I do like about the current batch system is
that it is more scriptable because most stuff is done in the backend
script; the web interface does just a bit of checking and then passes
the whole thing off. The normal experiment creation path should look
like this too, since splitting stuff between the web interface and the
perl scripts is messy, and the PHP DB interface is not as nice or
robust as the perl interface.
*** LastLogin info in web page - Sep 27 2001 - Done by LBS on Oct 1 2001.
From: Jay Lepreau <>
To: "Leigh B. Stoller" <>
Subject: Re: cvs commit: testbed/www showproject_list.php3
Date: Wed, 26 Sep 2001 19:11:11 MDT
If it were pretty easy to get wtmp info from plastic
into the database, and displayed in a separate column
on this page, that would be great.
It's the info we need to know who's really active and who's inactive,
which is not only interesting but affects who we ask to free up nodes.
LBS Comment: Probably want last web login info too. This would be
easy to add with a last_login DB table that would be updated in
DOLOGIN in the web server.
*** Need to install ssh on Linux - Sep 27 2001 - Needs to be done.
Suggested by Jay.
I think we should install xinetd and rsh by default. (But not turn on
rsh.) For one thing, many computational cluster users will want MPI.
RPMS: /proj/parmc/rpms/xinetd-
*** Needs web page to change proj trust - Oct 3 2001 - Needs to be fixed.
Reported by James Griffioen.
No way to change the trust level for a user in a project. Need to
provide a web page for it.
*** Paperbag/plasticwrap bug - Oct 5 2001 -- Fixed by Rob.
Reported by magnus.
magnus@ops ~> os_load -i UTAHPC-FBSD+LINUX -w pc121 pc122 pc123 pc124
pc125 pc126 pc127
Sorry, you used a forbidden character
SSH failed. You may need to run the following commands:
mkdir -m 0755 /users/magnus/.ssh
ssh-keygen -P '' -f /users/magnus/.ssh/identity
cp /users/magnus/.ssh/ /users/magnus/.ssh/authorized_keys
chmod 600 /users/magnus/.ssh/authorized_keys
*** Inconsistent exit in scripts. Oct 3 2001 - Needs to be fixed
When going to background we should install a __DIE__ hook to make sure
that the log file gets emailed off. Generally, the fatal error and email
stuff is very inconsistent. Normal users should get warm fuzzies.
Informational stuff should go to us.
*** Per User info change - Oct 8 2001 - Needs to be fixed
From: Jay Lepreau <>
To: Leigh Stoller <>
Subject: Re: Feature Request
Date: Mon, 08 Oct 2001 06:14:43 MDT
After I sent that, I just checked it out. Realized that you
(logically) display the info on the general user info page. What I
was originally wanting was more accurate and precise data for the
general project info page. That would take a bunch of processing: for
each proj, go thru its users, find the minimum time for each type of
login and display that.
Don't need it right now, but eventually we're going to need the kind
of info to move projects to 'inactive'.
The same info is what we'll need to deschedule experiments, or arm twist
their users. But for expts we have the creator, who will usually
be the one using the machines.
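The per-project processing described in the message can be sketched
roughly as follows. This reads "minimum time" as the smallest elapsed
time since a login of each type, which is an interpretation. The data
shapes, the login type labels ("web", "shell"), and all names are
hypothetical, not the testbed's actual DB schema.

```python
from typing import Dict, List


def project_idle_times(
    members: Dict[str, List[str]],
    last_logins: Dict[str, Dict[str, int]],
    now: int,
) -> Dict[str, Dict[str, int]]:
    """For each project, return the smallest idle time in seconds per login type.

    members:     project name -> list of user ids
    last_logins: user id -> {login type: last login, in epoch seconds}
    """
    result: Dict[str, Dict[str, int]] = {}
    for project, users in members.items():
        best: Dict[str, int] = {}
        for user in users:
            for kind, stamp in last_logins.get(user, {}).items():
                idle = now - stamp
                if kind not in best or idle < best[kind]:
                    best[kind] = idle
        result[project] = best
    return result


if __name__ == "__main__":
    members = {"projA": ["u1", "u2"], "projB": ["u3"]}
    last_logins = {
        "u1": {"web": 900, "shell": 500},
        "u2": {"web": 700},
        # u3 has never logged in, so projB ends up with no entries.
    }
    print(project_idle_times(members, last_logins, now=1000))
```

A project whose result is empty, or whose smallest idle times are all
large, would be a candidate for the 'inactive' state mentioned above.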
*** Node control changes lost on swapin/out - Oct 30 2001 - Fixed by LBS
> From: Mac Newbold <>
> Subject: Re: swapin/out anomoly
> Date: Tue, 30 Oct 2001 15:53:45 -0700 (MST)
> Sometimes when we update settings, we update only the physical, and not
> the virtual, and the virtual is all that persists between swaps. IMHO, I
> think what we need to do is evaluate which things make sense to keep
> between swaps, which things don't, and which things should offer a
> choice. This affects OSs, delay params, and potentially other things
> too. Perhaps there's even a unified solution we can implement as part of
> the swapout process.
It's probably a good idea to take the nodecontrol web page and remove the
stuff that changes the DB. Instead, let's use a backend perl script that
will update the DB appropriately (and can be used from the command line
too). Basically, I'm not happy about doing that much DB munging in the web
interface, especially virt_nodes and virt_lans (since it would be nice to
present a web interface to change the delay params at some point).
*** Frisbee sucks up CPU - Nov 10 2001 - Needs to be fixed
Frisbee is sucking up 15% of the CPU. Needs to be profiled.
*** Add virtual name to node control form. - Nov 12 2001 -
Fixed by Leigh on Nov 28 2001.
Requested by Dave.
*** Add quotas on /users. - Nov 12 2001 - Needs to be done.
Need to build and install a kernel on ops with quotas configured in.
*** Have Linux kernel source available - Nov 12 2001 - Needs to be done.
Requested by Jay.
*** Use switch port for 10MBs - Nov 28 2001 - Needs to be done.
LBS - Nov 28th: I worked on this. Turns out the switches do not
like it when the nodes force their ports into 10MB full duplex,
and the switch disables the port.
*** Add "Must Change Password" state. - Dec 3rd 2001 - Needs to be done.
Suggested by Mac. When we change the password and email it, we
should require that the user change his password the next time he
logs in.
*** Mysterious TCP drop problem - Nov 21 2001 - Needs to be fixed.
From: Tian Bu <>
To: Mike Hibler <>
cc: <>
Subject: News on packet drop
Date: Thu, 29 Nov 2001 14:52:04 -0500 (EST)
After spending sometime on investigating why the packet drop occur
between a pair of linux nodes, I found that this problem is not
related to the OS. Instead, the mysterious packet drop only occurs
between a pair of nodes where one end is PC850 and the other end
is PC600. The setting I first saw the drop was a link between
a PC850 and a PC600. I did report that the drop does not appear
between a pair of freeBSD nodes. That is misleading because it was
measured on a different setting where the pair of nodes I reserved happen
to be both PC850. :-). Today I start another experiment where there are
a pair of PC850 running Linux and observe no mysterious packet drop
between them. I guess there may be some incompatible features between
either the interfaces installed on these two different types of machines.