Commit 49d2f488 authored by Mike Hibler

Resurrect/enhance delta image code. Another multi-day "one hour hack"!

Resurrect: get the basic signature matching code working again.

Enhance: add -U option to have imagezip update (or create) the
signature file. Previously, the signature file was created off-line
on boss with the imagehash command (that would be Mike's "imagehash"
(/usr/testbed/bin/imagehash) and not Leigh's "imagehash"
(/usr/testbed/sbin/imagehash)). Creating it as we create the image
makes a lot of sense...except for how we do it. We actually read and
create the hashes as a separate pass before we re-read, compress, and
create the image--so we read the disk twice. [This is primarily because we
are mooching off of the existing hash checking code (-H option). Doing
this right will require re-writing The Big Loop which makes a single
pass through the data, simultaneously dealing with disk IO, allocated
ranges, and compression blocks all of which have different size/alignment
criteria. But I digress...] Anyway, reading the disk data twice sucks,
but at least it is on the client and not on boss. The takeaway is:
don't create your images on pc600s.

Note that -U will always create a signature file for the complete disk
or partition even when you are creating a delta image (i.e., when combined
with -H).
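
A hedged sketch of how these flags might combine on an image-creation client (device path and file names are illustrative, not taken from this commit):

```shell
# Create a full image and (re)create its signature in one run:
imagezip -U /dev/ad0 full.ndz

# Create a delta against an existing signature; per the note above,
# -U still writes a signature covering the complete disk, not the delta:
imagezip -H full.ndz.sig -U /dev/ad0 delta.ndz
```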

Enhance: add "-P <pct>" option, used with -H, which tells imagezip that if
a resulting delta image would be over <pct> percent of the size of a full
image (where size is the number of uncompressed sectors in the image), then just
create a full image instead. So "-P 50" says if it would be over half the
size, "-P 200" says if it would be over twice the size, etc. If you always want
a delta image to be produced, use -H without -P. If you always want a full
image, don't use -H.

This is part 2 of supporting delta images. Part 1 is the DB and user interface
changes that Leigh is working on. Part 3 is next up and involves modifying
the image creation MFS to download and use signatures along with the
new imagezip when creating images. Stay tuned.
parent 784fa5af
#
# Copyright (c) 2000-2013 University of Utah and the Flux Group.
# Copyright (c) 2000-2014 University of Utah and the Flux Group.
#
# {{{EMULAB-LICENSE
#
......@@ -95,11 +95,11 @@ WITH_NTFS = @WINSUPPORT@
WITH_FAT = @WINSUPPORT@
# Note: requires WITH_CRYPTO
WITH_HASH = 0
WITH_HASH = 1
include $(OBJDIR)/Makeconf
SUBDIRCFLAGS = -Wall -O2 -g # -ansi -pedantic
SUBDIRCFLAGS = -Wall -g -O2 # -ansi -pedantic
ifeq ($(SYSTEM),Linux)
SUBDIRCFLAGS += -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D_THREAD_SAFE -DCONDVARS_WORK -pthread
else
......@@ -232,7 +232,7 @@ imagehash.o: imagehash.c
$(SUBDIRS):
@$(MAKE) SUBDIRCFLAGS="$(SUBDIRCFLAGS)" -C $@ all
imagezip.o: sliceinfo.h imagehdr.h global.h
imagezip.o: sliceinfo.h imagehdr.h global.h range.h hashmap/hashmap.h
imageunzip.o: imagehdr.h
imagehash.o: imagehdr.h imagehash.h
......
......@@ -25,7 +25,7 @@ Things to do for image*:
6. Image hashing.
[ DONE -- as a separate program, imagehash. It would be more efficient
to have imagezip create the signature as it does. ]
to have imagezip create the signature as it goes. ]
Create a "signature" file for an image using a collision-resistant hash
like MD5 or SHA-1. See TODO.hash for more.
......@@ -155,7 +155,7 @@ Things to do for image*:
the image.
So the most important operation is insert, and it isn't necessary to
maintain a sorted listed. However, it might prove to be practical to
maintain a sorted list. However, it might prove to be practical to
keep it sorted if merge operations can be efficiently performed. An
alternative approach, to avoid the invert operation, would be to start
with a single allocated range (blocks 0 - sizeof_disk) and every time
......@@ -189,6 +189,7 @@ Things to do for image*:
but that was still implemented with singly-linked lists.
12. Better handling of "insignificant" free ranges.
[ DONE: via the fixup method mentioned and a new -Z option. ]
The -F option in imagezip lets you say that free ranges below a
certain size should be "forgotten", effectively making that range
allocated. This promotes longer sequential writes at the expense of
......@@ -237,3 +238,13 @@ Things to do for image*:
Hmm...though it appears that FreeBSD doesn't export an "erase" command
to user-mode, only to filesystems. Linux does, at least "hdparm" supports
"--trim-sector-ranges" which uses it to erase sections of a disk.
15. Better filling of chunks.
Look at public-domain fitblk.c as a way to better fill the 1MB chunks
rather than our ad-hoc "give it smaller and smaller pieces the closer
we get to 1MB". Ours is more time efficient (single pass), but fitblk
will probably do a better job of compressing data into chunks (but
makes three compression passes!) Note that our input granularity is
sectors (or possibly 64K hash-block sized pieces) rather than bytes so
it makes the input part a little more complicated.
/*
* Copyright (c) 2000-2012 University of Utah and the Flux Group.
* Copyright (c) 2000-2014 University of Utah and the Flux Group.
*
* {{{EMULAB-LICENSE
*
......@@ -40,7 +40,11 @@ extern void addfixup(off_t offset, off_t poffset, off_t size, void *data,
extern void addfixupfunc(void (*func)(void *, off_t, void *), off_t offset,
off_t poffset, off_t size, void *data, int dsize,
int reloctype);
extern void applyfixups(off_t offset, off_t size, void *data);
extern int hasfixup(uint32_t soffset, uint32_t ssize);
extern void savefixups(void);
extern void restorefixups(int isempty);
extern void dumpfixups(int verbose, int count);
extern SLICEMAP_PROCESS_PROTO(read_bsdslice);
extern SLICEMAP_PROCESS_PROTO(read_linuxslice);
......
/*
* Copyright (c) 2000-2004 University of Utah and the Flux Group.
* Copyright (c) 2000-2014 University of Utah and the Flux Group.
*
* {{{EMULAB-LICENSE
*
......@@ -21,20 +21,9 @@
* }}}
*/
/* XXX from global.h */
extern int secsize;
#define sectobytes(s) ((off_t)(s) * secsize)
#define bytestosec(b) (uint32_t)((b) / secsize)
/* XXX from imagezip.c */
struct range {
uint32_t start; /* In sectors */
uint32_t size; /* In sectors */
void *data;
struct range *next;
};
void free_ranges(struct range **);
#define MAX(x, y) (((x) > (y)) ? (x) : (y))
#define MIN(x, y) (((x) > (y)) ? (y) : (x))
int hashmap_blocksize(void);
void hashmap_update_chunk(uint32_t, uint32_t, int);
int hashmap_compute_delta(struct range *, char *, int, u_int32_t, int,
struct range **);
int hashmap_write_hashfile(char *, u_int32_t);
void hashmap_dump_stats(int pnum);
/*
* Copyright (c) 2000-2013 University of Utah and the Flux Group.
* Copyright (c) 2000-2014 University of Utah and the Flux Group.
*
* {{{EMULAB-LICENSE
*
......@@ -69,7 +69,7 @@ static int doall = 1;
static int detail = 0;
static int create = 0;
static int report = 0;
static int fixedoffset = 0;
static int withchunkno = 1;
static int regfile = 0;
static int nothreads = 0;
static int hashtype = HASH_TYPE_SHA1;
......@@ -77,7 +77,7 @@ static int hashlen = 20;
static long hashblksize = HASHBLK_SIZE;
static int hashblksizeinsec;
static unsigned long long ndatabytes;
static unsigned long nchunks, nregions, nhregions;
static unsigned long nchunks, nregions, nhregions, nsplithashes;
static char *imagename;
static char *fileid = NULL;
static char *sigfile = NULL;
......@@ -88,7 +88,7 @@ static void usage(void);
static int gethashinfo(char *name, struct hashinfo **hinfo);
static int readhashinfo(char *name, struct hashinfo **hinfop);
static int checkhash(char *name, struct hashinfo *hinfo);
static void dumphash(char *name, struct hashinfo *hinfo);
static void dumphash(char *name, struct hashinfo *hinfo, int withchunk);
static int createhash(char *name, struct hashinfo **hinfop);
static int hashimage(char *name, struct hashinfo **hinfop);
static int hashchunk(int chunkno, char *chunkbufp, struct hashinfo **hinfop);
......@@ -121,11 +121,13 @@ main(int argc, char **argv)
extern char build_info[];
struct hashinfo *hashinfo = 0;
while ((ch = getopt(argc, argv, "cb:dvhno:rD:NVRfF:")) != -1)
while ((ch = getopt(argc, argv, "cCb:dvhno:rD:NVRF:")) != -1)
switch(ch) {
case 'b':
hashblksize = atol(optarg);
if (hashblksize < 512 || hashblksize > (32*1024*1024)) {
if (hashblksize < 512 ||
hashblksize > (32*1024*1024) ||
hashblksize != sectobytes(bytestosec(hashblksize))) {
fprintf(stderr, "Invalid hash block size\n");
usage();
}
......@@ -135,11 +137,12 @@ main(int argc, char **argv)
case 'F':
fileid = strdup(optarg);
break;
case 'f':
fixedoffset = 1;
case 'C':
withchunkno = 0;
break;
case 'R':
report++;
break;
case 'c':
create++;
break;
......@@ -191,8 +194,24 @@ main(int argc, char **argv)
exit(0);
}
if ((create && argc < 1) || (!create && argc < 2))
/* XXX part of hack special case to dump a sigfile */
if (report && !create && sigfile == NULL)
create++;
if ((create && argc < 1) || (!create && sigfile == NULL && argc < 2))
usage();
hashblksizeinsec = bytestosec(hashblksize);
/* XXX hack special case to dump a sigfile */
if (!create && sigfile != NULL) {
if (readhashinfo("", &hashinfo) != 0)
exit(2);
detail = 2;
dumphash("", hashinfo, withchunkno);
exit(0);
}
imagename = argv[0];
/*
......@@ -220,8 +239,6 @@ main(int argc, char **argv)
signal(SIGINFO, dump_stats);
#endif
hashblksizeinsec = bytestosec(hashblksize);
/*
* Raw image comparison
*/
......@@ -239,7 +256,7 @@ main(int argc, char **argv)
} else {
if (createhash(argv[0], &hashinfo))
exit(2);
dumphash(argv[0], hashinfo);
dumphash(argv[0], hashinfo, withchunkno);
}
exit(0);
}
......@@ -249,7 +266,7 @@ main(int argc, char **argv)
*/
if (gethashinfo(argv[0], &hashinfo))
exit(2);
dumphash(argv[0], hashinfo);
dumphash(argv[0], hashinfo, withchunkno);
if (checkhash(argv[1], hashinfo))
exit(1);
exit(0);
......@@ -266,6 +283,8 @@ usage(void)
" create a signature file for the specified image\n"
"imagehash -R [-dr] [-b blksize] <image-filename>\n"
" output an ASCII report to stdout rather than creating a signature file\n"
"imagehash -R -o sigfile\n"
" output an ASCII report of the indicated signature file\n"
"imagehash -v\n"
" print version info and exit\n"
"\n"
......@@ -344,7 +363,7 @@ readhashinfo(char *name, struct hashinfo **hinfop)
return -1;
}
if (strcmp((char *)hi.magic, HASH_MAGIC) != 0 ||
hi.version != HASH_VERSION) {
!(hi.version == HASH_VERSION_1 || hi.version == HASH_VERSION_2)) {
fprintf(stderr, "%s: not a valid signature file\n", hname);
goto bad;
}
......@@ -374,6 +393,17 @@ readhashinfo(char *name, struct hashinfo **hinfop)
break;
}
nhregions = hinfo->nregions;
if (hinfo->version > HASH_VERSION_1) {
if (hinfo->blksize != hashblksizeinsec) {
fprintf(stderr,
"WARNING: changing hash blocksize %d -> %d sectors\n",
hashblksizeinsec, hinfo->blksize);
hashblksizeinsec = hinfo->blksize;
hashblksize = sectobytes(hashblksizeinsec);
if (maxreadbufmem < hashblksize)
maxreadbufmem = hashblksize;
}
}
return 0;
}
......@@ -390,6 +420,7 @@ addhash(struct hashinfo **hinfop, int chunkno, uint32_t start, uint32_t size,
struct hashinfo *hinfo = *hinfop;
int nreg;
assert(chunkno >= 0);
if (report) {
printf("%s\t%u\t%u",
spewhash(hash, hashlen), start, size);
......@@ -424,20 +455,40 @@ addhash(struct hashinfo **hinfop, int chunkno, uint32_t start, uint32_t size,
}
static void
dumphash(char *name, struct hashinfo *hinfo)
dumphash(char *name, struct hashinfo *hinfo, int withchunk)
{
uint32_t i;
struct hashregion *reg;
int haschunkrange = 0;
if (hinfo->version > HASH_VERSION_1)
haschunkrange = 1;
if (detail > 1) {
for (i = 0; i < hinfo->nregions; i++) {
reg = &hinfo->regions[i];
printf("[%u-%u]: chunk %d, hash %s\n",
printf("[%u-%u] (%d): ",
reg->region.start,
reg->region.start + reg->region.size-1,
reg->chunkno, spewhash(reg->hash, hashlen));
reg->region.size);
if (withchunk) {
/* upper bit indicates chunkrange */
if (HASH_CHUNKDOESSPAN(reg->chunkno)) {
int chunkno =
HASH_CHUNKNO(reg->chunkno);
printf("chunk %d-%d, ",
chunkno, chunkno + 1);
nsplithashes++;
} else
printf("chunk %d, ",
(int)reg->chunkno);
} else if (HASH_CHUNKDOESSPAN(reg->chunkno))
nsplithashes++;
printf("hash %s\n", spewhash(reg->hash, hashlen));
}
}
if (nsplithashes)
printf("%lu hashes split across chunks\n", nsplithashes);
}
static char *
......@@ -491,8 +542,9 @@ createhash(char *name, struct hashinfo **hinfop)
*/
hinfo = *hinfop;
strcpy((char *)hinfo->magic, HASH_MAGIC);
hinfo->version = HASH_VERSION;
hinfo->version = HASH_VERSION_2;
hinfo->hashtype = hashtype;
hinfo->blksize = hashblksizeinsec;
count = sizeof(*hinfo) + hinfo->nregions*sizeof(struct hashregion);
cc = write(ofd, hinfo, count);
close(ofd);
......@@ -579,9 +631,9 @@ checkhash(char *name, struct hashinfo *hinfo)
fprintf(stderr, "Checking disk contents using %s\n", hashstr);
for (i = 0, reg = hinfo->regions; i < hinfo->nregions; i++, reg++) {
if (chunkno != reg->chunkno) {
if (chunkno != HASH_CHUNKNO(reg->chunkno)) {
nchunks++;
chunkno = reg->chunkno;
chunkno = HASH_CHUNKNO(reg->chunkno);
}
size = sectobytes(reg->region.size);
rbuf = getblock(reg);
......@@ -976,8 +1028,16 @@ hashchunk(int chunkno, char *chunkbufp, struct hashinfo **hinfop)
rstart = regp->start;
rsize = regp->size;
ndatabytes += sectobytes(rsize);
if (fixedoffset)
startoff = rstart % hashblksizeinsec;
/*
* Keep hash blocks aligned with real disk offsets.
* This might result in fragments at the start and
* end of the allocated range if it doesn't line up
* with a hash block boundary and/or is not a multiple
* of the hash block size.
*/
startoff = rstart % hashblksizeinsec;
while (rsize > 0) {
if (startoff) {
hsize = hashblksizeinsec - startoff;
......
/*
* Copyright (c) 2000-2005 University of Utah and the Flux Group.
* Copyright (c) 2000-2014 University of Utah and the Flux Group.
*
* {{{EMULAB-LICENSE
*
......@@ -21,11 +21,18 @@
* }}}
*/
#define HASH_VERSION_1 0x20031107
#define HASH_VERSION_2 0x20140618
#define HASH_VERSION HASH_VERSION_2
#define HASH_MAGIC ".ndzsig"
#define HASH_VERSION 0x20031107
#define HASHBLK_SIZE (64*1024)
#define HASH_MAXSIZE 20
#define HASH_CHUNKNO(c) ((c) & ~(1 << 31))
#define HASH_CHUNKDOESSPAN(c) (((c) & (1 << 31)) ? 1 : 0)
#define HASH_CHUNKSETSPAN(c) ((c) | (1 << 31))
struct hashregion {
struct region region;
uint32_t chunkno;
......@@ -37,7 +44,8 @@ struct hashinfo {
uint32_t version;
uint32_t hashtype;
uint32_t nregions;
uint8_t pad[12];
uint32_t blksize; /* V2: make hash blocksize explicit */
uint8_t pad[8];
struct hashregion regions[0];
};
......
/*
* Copyright (c) 2000-2011 University of Utah and the Flux Group.
* Copyright (c) 2000-2014 University of Utah and the Flux Group.
*
* {{{EMULAB-LICENSE
*
......@@ -31,7 +31,7 @@
* is 1,768,515,945!
*
* V2 introduced the first and last sector fields as well
* as basic relocations.
* as basic relocations. Also dropped maintenance of blocktotal.
*
* V3 introduced LILO relocations for Linux partition images.
* Since an older imageunzip would still work, but potentially
......@@ -58,8 +58,8 @@
struct blockhdr_V1 {
uint32_t magic; /* magic/version */
uint32_t size; /* Size of compressed part */
int32_t blockindex; /* netdisk: which block we are */
int32_t blocktotal; /* netdisk: total number of blocks */
int32_t blockindex; /* which block we are */
int32_t blocktotal; /* V1: total number of blocks */
int32_t regionsize; /* sizeof header + regions */
int32_t regioncount; /* number of regions */
};
......@@ -75,8 +75,8 @@ struct blockhdr_V1 {
struct blockhdr_V2 {
uint32_t magic; /* magic/version */
uint32_t size; /* Size of compressed part */
int32_t blockindex; /* netdisk: which block we are */
int32_t blocktotal; /* netdisk: total number of blocks */
int32_t blockindex; /* which block we are */
int32_t blocktotal; /* V1: total number of blocks */
int32_t regionsize; /* sizeof header + regions */
int32_t regioncount; /* number of regions */
/* V2 follows */
......@@ -112,8 +112,8 @@ struct blockhdr_V2 {
struct blockhdr_V4 {
uint32_t magic; /* magic/version */
uint32_t size; /* Size of compressed part */
int32_t blockindex; /* netdisk: which block we are */
int32_t blocktotal; /* netdisk: total number of blocks */
int32_t blockindex; /* which block we are */
int32_t blocktotal; /* V1: total number of blocks */
int32_t regionsize; /* sizeof header + regions */
int32_t regioncount; /* number of regions */
/* V2 follows */
......
/*
* Copyright (c) 2000-2013 University of Utah and the Flux Group.
* Copyright (c) 2000-2014 University of Utah and the Flux Group.
*
* {{{EMULAB-LICENSE
*
......@@ -59,6 +59,20 @@
#include "sliceinfo.h"
#include "global.h"
#include "checksum.h"
#include "range.h"
#ifdef WITH_HASH
#include "hashmap/hashmap.h"
#endif
/*
* Attempt to split chunks so that hash blocks don't span chunk boundaries.
*
 * XXX nice thought, but it doesn't do a very good job (saves less than 50%
 * of the crossings at the expense of wasting about 4% more space). Unless
 * we come up with a less hacky way to fill chunks (e.g. PD fitblk.c) where
 * we can avoid it entirely, don't even try.
*/
#undef WITH_HASH_CHUNKSPLIT
/* XXX this is a hack right now */
#define USE_HACKSORT 0
......@@ -98,6 +112,8 @@ static unsigned char imageid[UUID_LENGTH];
#ifdef WITH_HASH
char *hashfile;
int newhashfile;
int deltapct = -1;
#endif
#ifdef WITH_CRYPTO
......@@ -125,17 +141,8 @@ extern unsigned long getdisksize(int fd);
unsigned long inputminsec = 0;
unsigned long inputmaxsec = 0; /* 0 means the entire input image */
/*
* A list of data ranges.
*/
struct range {
uint32_t start; /* In sectors */
uint32_t size; /* In sectors */
void *data;
struct range *next;
};
struct range *ranges, *skips, *fixups;
int numranges, numskips;
int numranges, numskips, numfixups;
struct blockreloc *relocs;
int numregions, numrelocs;
......@@ -145,11 +152,11 @@ static void sortrange(struct range **head, int domerge,
int mergeskips(int verbose);
int mergeranges(struct range *head);
void makeranges(void);
void freeranges(struct range *);
void dumpranges(int verbose);
void dumpfixups(int verbose);
uint32_t sectinranges(struct range *range);
void addvalid(uint32_t start, uint32_t size);
void addreloc(off_t offset, off_t size, int reloctype);
void removereloc(off_t offset, off_t size, int reloctype);
static int cmpfixups(struct range *r1, struct range *r2);
static int read_doslabel(int infd, int lsect, int pstart,
struct doslabel *label);
......@@ -160,11 +167,6 @@ int read_raw(void);
int compress_image(void);
void usage(void);
#ifdef WITH_HASH
struct range *hashmap_compute_delta(struct range *, char *, int, u_int32_t);
void report_hash_stats(int pnum);
#endif
static SLICEMAP_PROCESS_PROTO(read_slice);
struct slicemap fsmap[] = {
......@@ -442,7 +444,7 @@ main(int argc, char *argv[])
memset(imageid, '\0', UUID_LENGTH);
gettimeofday(&sstamp, 0);
while ((ch = getopt(argc, argv, "vlbnNdihrs:c:z:ofI:13F:DR:S:XxH:Me:k:u:a:Z")) != -1)
while ((ch = getopt(argc, argv, "vlbnNdihrs:c:z:ofI:13F:DR:S:XxH:UP:Me:k:u:a:Z")) != -1)
switch(ch) {
case 'v':
version++;
......@@ -519,14 +521,26 @@ main(int argc, char *argv[])
case 'X':
forcereads++;
break;
case 'H':
#ifdef WITH_HASH
case 'H':
hashfile = optarg;
break;
case 'U':
newhashfile = 1;
break;
case 'P':
deltapct = atoi(optarg);
if (deltapct < 0)
usage();
break;
#else
fprintf(stderr, "'H' option not supported\n");
case 'H':
case 'U':
case 'P':
fprintf(stderr, "'%c' option not supported\n", ch);
usage();
#endif
break;
#endif
case 'M':
metaoptimize++;
break;
......@@ -645,8 +659,16 @@ main(int argc, char *argv[])
fprintf(stderr, "Must specify an output filename!\n\n");
usage();
}
else
else {
outfilename = argv[1];
#ifdef WITH_HASH
if (strcmp(outfilename, "-") == 0 && newhashfile) {
fprintf(stderr,
"Cannot create hashfile with outfile==stdout\n");
usage();
}
#endif
}
if (!slicemode && !filemode && dorelocs)
dorelocs = 0;
......@@ -771,38 +793,62 @@ main(int argc, char *argv[])
dumpranges(debug > 1);
sortrange(&fixups, 0, cmpfixups);
if (debug > 1)
dumpfixups(debug > 2);
dumpfixups(debug > 2, 0);
fflush(stderr);
#ifdef WITH_HASH
/*
* If we are creating a "delta" image from a hash signature,
* we read in the signature info and reconcile that with the
* known allocated range that we have just computed. The result
* is a new list of ranges that are currently allocated and that
* have changed from the signature version.
* (hashfile != NULL) we read in the signature info and reconcile
* that with the known allocated range that we have just computed.
* The result is a new list of ranges that are currently allocated
* and that have changed from the signature version.
*
* If we are creating a new signature file (newhashfile != 0)
* then we also collect hashinfo along the way, writing out the
* newfile when done.
*/
if (hashfile != NULL) {
struct range *nranges;
if (hashfile || newhashfile) {
struct range *nranges = NULL;
/*
* next compare allocated 'ranges' and 'hinfo' to find out the
* changed blocks -- computing the hashes for some 'ranges'
* in the process
*/