- 18 Jan, 2018 1 commit
-
-
David Johnson authored
I've seen this take 10-12 seconds on a heavily-loaded vhost. So now we wait up to 60 seconds for the rootdev to appear; if it doesn't show up, warn and let things continue.
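As a rough sketch (not the actual clientside code; the device path and messages are placeholders), the bounded wait amounts to:

    # Poll for the root device, but give up after 60 seconds and just
    # warn rather than failing the whole vnode setup.
    my $rootdev = "/dev/xvda1";    # placeholder path
    my $waited  = 0;
    while (! -e $rootdev && $waited < 60) {
        sleep(1);
        $waited++;
    }
    warn("*** rootdev $rootdev still not present after ${waited}s; continuing\n")
        if (! -e $rootdev);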
-
- 09 Jan, 2018 1 commit
-
-
David Johnson authored
-
- 08 Jan, 2018 1 commit
-
-
David Johnson authored
If the TBScriptLock caller provides a debug message, it will be stored in a file, and other blocked TBScriptLock callers will get (possibly slightly racy) info about who holds the lock. Then, use this in libvnode_xen to get some info about long calls to xl (create|halt|reboot|etc). Also enable lockdebug in libvnode_xen for now.
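A minimal sketch of the mechanism described (the file name, message format, and helper names are illustrative, not the actual TBScriptLock interface):

    # The lock holder records what it is doing; blocked callers read the
    # note (racily, since the holder may have just exited) so their logs
    # can name the likely culprit.
    my $lockdebugfile = "/var/emulab/lock/debug.xl";    # hypothetical path

    sub note_lock_holder {
        my ($msg) = @_;
        if (open(my $fh, ">", $lockdebugfile)) {
            print $fh "pid $$: $msg\n";
            close($fh);
        }
    }

    sub who_holds_lock {
        open(my $fh, "<", $lockdebugfile) or return "unknown";
        my $info = <$fh>;
        close($fh);
        chomp($info) if (defined($info));
        return (defined($info) ? $info : "unknown");
    }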
-
- 22 Nov, 2017 1 commit
-
-
David Johnson authored
(They must set the XEN_FORCE_HVM node attribute.)
-
- 10 Nov, 2017 1 commit
-
-
Mike Hibler authored
Also, add a print statement in the unmount code that Hussam added so that we can see how often this (still?) happens.
-
- 09 Nov, 2017 1 commit
-
-
Hussamuddin Nasir authored
Many LV segments fail on "dmsetup remove" because they are still mounted. Need to unmount them before the remove operation.
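In outline (the device name is made up; the real code walks the actual mounts), the fix amounts to:

    # Unmount anything mounted from the LV before "dmsetup remove",
    # which fails if a mount is still active.
    my $dev = "/dev/mapper/xen--vg-vnode.0";    # placeholder device
    foreach my $line (`mount`) {
        if ($line =~ /^\Q$dev\E on (\S+)/) {
            system("umount $1") == 0
                or warn("*** could not unmount $1\n");
        }
    }
    system("dmsetup remove $dev");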
-
- 10 Oct, 2017 1 commit
-
-
Mike Hibler authored
This was the case where we would make a copy of the dom0 root FS to create the vnode's FS.
-
- 03 Aug, 2017 1 commit
-
-
Mike Hibler authored
-
- 05 Jul, 2017 1 commit
-
-
Mike Hibler authored
Apparently there are some issues with UFS2 support in Linux. Fsck mostly fixes incorrect block counts ("INCORRECT BLOCK COUNT I=1043374 (0 should be 8)") for inodes that get created by Linux (e.g., /etc/ssh host keys, /var/emulab/boot stuff). Everything seems to be fine after the fsck.

Also: specify "-Zy" when creating LVMs so that old GPTs, superblocks, etc. don't leak through. LVM seems to be frightfully deterministic in its allocation strategies, to the extent that virtual disks created in previous experiments have their metadata show up in newer experiments' LVMs. All the things that are changed ...
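For the "-Zy" piece in isolation, the LV creation amounts to something like this (VG/LV names and size are made up):

    # -Zy zeroes the start of the new LV so stale GPTs/superblocks from
    # a previous experiment's virtual disk don't show through.
    my ($vg, $lv, $size) = ("xen-vg", "vnode.0.disk1", "10g");
    system("lvcreate -Zy -L $size -n $lv $vg") == 0
        or die("*** lvcreate of $vg/$lv failed\n");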
-
- 15 Jun, 2017 1 commit
-
-
Mike Hibler authored
We were expecting it to always contain "ttyS0", but it could be "ttyS1" as well depending on where we last made the image.
-
- 02 Jun, 2017 1 commit
-
-
Mike Hibler authored
-
- 16 May, 2017 1 commit
-
-
Mike Hibler authored
So we can see where things might be getting hung.
-
- 11 Apr, 2017 1 commit
-
-
Mike Hibler authored
Make sure the console is properly configured for "sio1" so we can see it. Kill off leftover QEMU processes that occur with Xen 4.6 (due to its use of "-no-shutdown" on the qemu command line).
-
- 07 Apr, 2017 1 commit
-
-
Mike Hibler authored
-
- 30 Mar, 2017 1 commit
-
-
Mike Hibler authored
-
- 18 Mar, 2017 1 commit
-
-
Mike Hibler authored
If an inner boss/ops has to fsck its filesystems, it can take quite a while and that happens prior to bringing up the network.
-
- 10 Mar, 2017 1 commit
-
-
Mike Hibler authored
-
- 27 Feb, 2017 2 commits
-
-
Leigh B Stoller authored
Hussam found the answer at: http://northernmost.org/blog/gre-tunnels-and-ufw/index.html
-
Mike Hibler authored
This is only done for shared nodes right now.
-
- 16 Feb, 2017 1 commit
-
-
Mike Hibler authored
Not sure how I got headed down this path, but here I am:
* replace use of "ps" and "grep" with, wait for it... "pgrep"!
* explicitly specify type=vif so we don't wind up with the extra vifN.M-emu backend interface that gets left lying around,
* add the -F option to "xl shutdown", which is needed for HVMs; otherwise shutdown will fail, the domain won't go away (qemu left behind), and the FBSD filesystem can be messed up,
* use "hd" instead of "sd" to avoid the emulated SCSI driver, which has caused me grief in the past (though it should never actually get used due to the PVHVM config of the kernel).
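Illustratively (domain, bridge, and LV names are invented), the config and shutdown pieces look roughly like:

    # PV-only vif (type=vif) so no extra vifN.M-emu backend is created,
    # and "hd" disk names so the emulated SCSI (sd) driver is avoided.
    my $config = "";
    $config .= "vif  = ['mac=00:16:3e:00:01:02,bridge=xenbr0,type=vif']\n";
    $config .= "disk = ['phy:/dev/xen-vg/vnode.0.disk1,hda,w']\n";
    print $config;

    # -F falls back to an ACPI power event for HVM domains without PV
    # shutdown support (otherwise the domain lingers and qemu is left behind).
    system("xl shutdown -F -w vnode-0");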
-
- 15 Feb, 2017 1 commit
-
-
Mike Hibler authored
Hopefully, this will decrease the chances that a vnode teardown will fail due to a busy LV.
-
- 18 Nov, 2016 1 commit
-
-
Mike Hibler authored
If you add an "extravifs" file in the same directory as the "xm.conf" file, they will get added to the dynamically created xm.conf file (which is why just adding them to the existing xm.conf doesn't work!)
-
- 24 Oct, 2016 1 commit
-
-
Mike Hibler authored
-
- 29 Sep, 2016 1 commit
-
-
Mike Hibler authored
The biggest improvement happened on day one, when I took out the 20 second sleep between vnode starts in bootvnodes. That appears to have been an artifact of an older time and an older Xen. Or, someone smarter than me saw the potential of getting bogged down for, oh say three weeks, trying to micro-optimize the process and instead just went for the conservative fix!

Following day one, the ensuing couple of weeks was a long strange trip to find the maximum number of simultaneous vnode creations that could be done without failure. In that time I tried a lot of things, generated a lot of graphs, produced and tweaked a lot of new constants, and in the end wound up with the same two magic numbers (3 and 5) that were in the original code! To distinguish myself, I added a third magic number (1, the loneliest of them all). All I can say is that now the choice of 3 or 5 (or 1) is based on more solid evidence than before. Previously it was 5 if you had a thin-provisioning LVM, 3 otherwise. Now it is based more directly on host resources, as described in a long comment in the code, the important part of which is:

    #
    # if (dom0 physical RAM < 1GB) MAX = 1;
    # if (any swap activity) MAX = 1;
    #
    # This captures pc3000s/other old machines and overloaded (RAM) machines.
    #
    # if (# physical CPUs <= 2) MAX = 3;
    # if (# physical spindles == 1) MAX = 3;
    # if (dom0 physical RAM <= 2GB) MAX = 3;
    #
    # This captures d710s, Apt r320, and Cloudlab m510s. We may need to
    # reconsider the latter since its single drive is an NVMe device.
    # But first we have to get Xen working with them (UEFI issues)...
    #
    # else MAX = 5;

In my defense, I did fix some bugs and stuff too (and did I mention the cool graphs?) See comments in the code and gitlab emulab/emulab-devel issue #148.
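Those rules, transcribed literally into code (the resource probing itself is not shown; the inputs below are placeholders, not what bootvnodes actually measures):

    # MAX = how many vnodes to create at once on this host.
    sub compute_max_concurrent {
        my ($dom0_ram_mb, $ncpus, $nspindles, $swapping) = @_;

        return 1 if ($dom0_ram_mb < 1024 || $swapping);
        return 3 if ($ncpus <= 2 || $nspindles == 1 || $dom0_ram_mb <= 2048);
        return 5;
    }

    # e.g. a two-CPU, two-spindle host with 2GB of dom0 RAM => 3
    print compute_max_concurrent(2048, 2, 2, 0), "\n";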
-
- 02 Sep, 2016 1 commit
-
-
Leigh B Stoller authored
elabinelab experiments, because it messes up geni rack builds. Revisit this later, but elabinelab is mostly a Utah thing, and we already block port 111 at the firewall.
-
- 31 Aug, 2016 1 commit
-
-
Leigh B Stoller authored
removing anything else from the file below what we add. This was biting us on elabinelab setup. Also a fix to the fstab fixup for device names.
-
- 12 Aug, 2016 1 commit
-
-
Mike Hibler authored
Also clean up some code that caused spurious error messages in the log.
-
- 28 Jul, 2016 1 commit
-
-
Mike Hibler authored
-
- 21 Jul, 2016 1 commit
-
-
Mike Hibler authored
sfdisk and lvm2 have changed in incompatible ways since Xen 4.4/Ubuntu14.
-
- 07 Jun, 2016 1 commit
-
-
Mike Hibler authored
(Re)initializing the ssh keys and ssl certs breaks the elabinelab.
-
- 09 Feb, 2016 1 commit
-
-
Mike Hibler authored
-
- 04 Jan, 2016 1 commit
-
-
Leigh B Stoller authored
invoke emulab-enet so that the iptables forwarding rules are added.
-
- 21 Dec, 2015 1 commit
-
-
Leigh B Stoller authored
the guest is on the jail network (instead of a routable IP), use the jail IP of the local phys host (we always add a jail network alias to the phys host). This avoids the traffic going up to the "router" and back.
-
- 01 Dec, 2015 1 commit
-
-
Leigh B Stoller authored
through the loop we look to see if signals are pending, and if so we return early with an error. The caller (libvnode_xen) can use this to avoid really long waits when the server has said to stop what it is doing. For example, a vnode setup is waiting for an image lock, but the server comes along and says to stop setting up. Previously we would wait for the lock; now we return early. This is to help with cancellation, where it is nice if the server can stop a CreateSliver() in its tracks when it is safe to do so.
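A rough sketch of the loop-with-early-return idea (the signal bookkeeping and the try_to_get_lock() helper are stand-ins, not the real TBScriptLock internals):

    sub wait_for_lock {
        my $gotsignal = 0;
        local $SIG{TERM} = sub { $gotsignal = 1; };
        local $SIG{INT}  = sub { $gotsignal = 1; };

        for (my $tries = 0; $tries < 300; $tries++) {
            return 0 if (try_to_get_lock());
            if ($gotsignal) {
                warn("*** signal pending; giving up on the lock\n");
                return -1;
            }
            sleep(1);
        }
        return -1;    # timed out
    }

    # Hypothetical stand-in for the real non-blocking lock attempt.
    sub try_to_get_lock { return 0; }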
-
- 24 Nov, 2015 2 commits
-
-
Mike Hibler authored
Do the testing Mike...
-
Mike Hibler authored
We don't want to stripe an SSD in with regular HDs (specifically at Wisconsin) when creating the LVM VG for vnodes. They are typically a lot smaller and won't improve performance of the VG since any operation will likely be limited by the speed of the rotating disks.
-
- 20 Nov, 2015 1 commit
-
-
Mike Hibler authored
Specifically designed to avoid using root SSD on new d430 nodes.
-
- 27 Oct, 2015 3 commits
-
-
Leigh B Stoller authored
-
Leigh B Stoller authored
broken VM is still active, before trying to kill it (which then fails).
-
Leigh B Stoller authored
reboot that it does when the filesystems are first created, so as to redo what might have been lost, say by prepare before snapshot. This required saving some private data in the non-private section of the data blob we store to disk for each VM, so that we have enough info to mount and fix the root FS of the node being rebooted.
-