Commit dd90db5d authored by Mike Hibler

Always more TODO...

parent 5aff2d89
@@ -23,11 +23,15 @@ Things to do for image*:
 5. Imageunzip could be triple-threaded like frisbee, i.e., split the
 file reading and decompression that are currently one in imageunzip.
-6. Image hashing.
+6. Image hashing and "delta" images.
 [ DONE -- as a separate program, imagehash. It would be more efficient
 to have imagezip create the signature as it goes. ]
+[ DONE -- imagezip now can create the signature "as it goes". But it
+makes an extra read of the disk data to do it. This could be further
+optimized to compute the hash as it reads/compresses the data. ]
 Create a "signature" file for an image using a collision-resistant hash
-like MD5 or SHA-1. See TODO.hash for more.
+like MD5 or SHA-1. This signature could be used to create incremental
+(delta) images based on a full image. See TODO.hash for more.
 7. Add an option to exclude (skip) disk blocks outside of any DOS partition.
 By default, we want to include these blocks in the image since some
@@ -248,3 +252,7 @@ Things to do for image*:
sectors (or possibly 64K hash-block sized pieces) rather than bytes so
it makes the input part a little more complicated.
16. Better tools for delta images.
There are lots of opportunities for "off-line" manipulation of delta
images; e.g., creating full images from deltas or vice-versa.
1. imagedelta
We need a tool to take two images and produce a delta based on the differences:
imagedelta image1.ndz image2.ndz delta1to2.ndz
Note that order matters here! We are generating a delta to get from "image1"
to "image2"; i.e., doing:
imageunzip image1.ndz /dev/da0
imageunzip delta1to2.ndz /dev/da0
would be identical to:
imageunzip image2.ndz /dev/da0
This gives us an off-line way to convert full images into delta images in
our versioned-image world. For example, after we have created a new version
of a standard image, we might want to replace the old one with a delta:
imagedelta FOO-V2.ndz FOO-V1.ndz FOO-delta.ndz
mv FOO-delta.ndz FOO-V1.ndz
or we could use it to retroactively turn custom images into deltas based on
the base image:
imagedelta FOO.ndz MIKES-FOO.ndz FOO-delta.ndz
mv FOO-delta.ndz MIKES-FOO.ndz
This should not be too hard to implement. We scan the chunk headers of
both images (image1, image2) to produce allocated range lists for both
(R1, R2). We use these to produce a range list for the delta (RD) as follows.
* Anything that is in R1 but not R2 does not go in RD.
* Anything in R2 but not in R1 must go into RD.
* For overlapping areas, we read and hash both and, if different,
include in RD.
* Using RD, select data from image2 that need to be read, decompressed
and then recompressed into the new image.
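The range-list rules above can be sketched roughly as follows. This is an
illustration only, not how imagedelta is (or will be) implemented: range lists
are modeled as sorted, non-overlapping lists of half-open (start, end) sector
pairs, and read1/read2 are hypothetical callbacks returning the decompressed
data for a sector range of image1/image2. The names subtract, intersect,
coalesce and delta_ranges are made up for this sketch, and each overlap is
hashed as a single unit rather than per hash block.

```python
import hashlib


def subtract(a, b):
    """Return the parts of range list a that are not covered by range list b."""
    out = []
    for start, end in a:
        cur = start
        for bs, be in b:
            if be <= cur or bs >= end:
                continue            # no overlap with what is left of this range
            if bs > cur:
                out.append((cur, bs))
            cur = max(cur, be)
            if cur >= end:
                break
        if cur < end:
            out.append((cur, end))
    return out


def intersect(a, b):
    """Return the sector ranges present in both range lists."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            s, e = max(s1, s2), min(e1, e2)
            if s < e:
                out.append((s, e))
    return out


def coalesce(ranges):
    """Sort ranges and join any that touch or overlap."""
    out = []
    for s, e in sorted(ranges):
        if out and s <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out


def delta_ranges(r1, r2, read1, read2):
    """Compute RD from R1 and R2 (hypothetical helper, see the rules above)."""
    # In R2 but not in R1: must go into RD. In R1 but not R2: never added.
    rd = subtract(r2, r1)
    # Overlapping areas: hash both sides and keep the range only if they differ.
    for s, e in intersect(r1, r2):
        h1 = hashlib.sha1(read1(s, e)).digest()
        h2 = hashlib.sha1(read2(s, e)).digest()
        if h1 != h2:
            rd.append((s, e))
    return coalesce(rd)
```

The resulting RD is then the list of sector ranges whose data would be read
from image2, decompressed and recompressed into the delta.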
There is the usual issue of dealing with the difference in granularity and
alignment of ranges (arbitrary multiples of 512 bytes) vs. hash blocks (64K
bytes), but that logic exists in imagezip today.
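To illustrate the granularity issue, here is a small sketch that splits a
sector range at hash-block boundaries. It assumes hash blocks are aligned to
absolute 64K offsets on the disk, which is an assumption of this sketch and
not necessarily what imagezip does; split_on_hash_blocks is a hypothetical
name.

```python
SECTOR_SIZE = 512                     # bytes per sector (from the text)
HASH_BLOCK_SIZE = 64 * 1024           # bytes per hash block (from the text)
SECTORS_PER_HASH_BLOCK = HASH_BLOCK_SIZE // SECTOR_SIZE   # 128


def split_on_hash_blocks(start, end):
    """Split the sector range [start, end) at hash-block boundaries.

    Assumes (for illustration) that hash blocks start at sector numbers
    that are multiples of SECTORS_PER_HASH_BLOCK.
    """
    pieces = []
    cur = start
    while cur < end:
        # First hash-block boundary strictly after cur.
        boundary = (cur // SECTORS_PER_HASH_BLOCK + 1) * SECTORS_PER_HASH_BLOCK
        nxt = min(end, boundary)
        pieces.append((cur, nxt))
        cur = nxt
    return pieces
```

A range that starts or ends mid-block yields partial pieces at its edges, and
those are the cases that need the extra care mentioned above.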
2. imageundelta
The converse operation, producing a full image version of something that
is currently a delta (or string of deltas), is also useful:
imageundelta base.ndz delta1.ndz [... deltaN.ndz] fullimage.ndz
For example, if we wanted to archive old versions of an image, it would
probably be best if they were full images:
imageundelta FOO-V3.ndz FOO-V2.ndz FOO-V1.ndz FOO-V1-full.ndz
imageundelta FOO-V3.ndz FOO-V2.ndz FOO-V2-full.ndz
or, equivalently, one step at a time:
imageundelta FOO-V3.ndz FOO-V2.ndz FOO-V2-full.ndz
imageundelta FOO-V2-full.ndz FOO-V1.ndz FOO-V1-full.ndz
followed by:
scp FOO-V*-full.ndz
The work is similar to imagedelta: reconstruct the range lists
for the full image and all the deltas, "merge" them all to produce the
range list for the new full image, then read/decompress/recompress as appropriate.
When merging the ranges, we need to keep track of which image to read the
data from in the resulting range list.
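A rough sketch of such a merge, under stated assumptions: each image is
reduced to a sorted list of half-open (start, end) sector ranges, layers[0]
is the base and later entries are the deltas in order, and data in a later
layer overrides the same sectors in earlier ones. The name merge_layers is
hypothetical. Note that this simple union never drops a range that was
allocated in the base, so the result may be a superset of what the final
image strictly needs.

```python
def merge_layers(layers):
    """Merge range lists, recording which layer each merged range comes from.

    Returns a sorted list of (start, end, source_index) tuples, where
    source_index is the position in `layers` of the image to read from.
    """
    # Every range boundary is a potential cut point in the merged list.
    points = sorted({p for ranges in layers for r in ranges for p in r})
    merged = []
    for s, e in zip(points, points[1:]):
        src = None
        for idx, ranges in enumerate(layers):
            if any(rs <= s and e <= re for rs, re in ranges):
                src = idx           # later layers override earlier ones
        if src is None:
            continue                # not allocated in any image
        if merged and merged[-1][1] == s and merged[-1][2] == src:
            # Extend the previous range if it is adjacent and has the same source.
            merged[-1] = (merged[-1][0], e, src)
        else:
            merged.append((s, e, src))
    return merged
```

For example, with base = [(0, 10)], delta1 = [(2, 4), (12, 14)] and
delta2 = [(3, 5)], sectors 3-4 are read from delta2, sector 2 and sectors
12-13 from delta1, and everything else from the base.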
3. imagerebase
Another thing we want to do is to be able to squash out ("rebase" in git
terminology) "unused" delta images in a chain of images. E.g., if we have:
BASE.ndz -> D1.ndz -> D2.ndz -> D3.ndz -> D-current.ndz
and no derived images are using D[123], then we would like to be able to
recreate D-current to be directly derived from BASE.
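One way to think about the rebase is as squashing the chain of deltas into a
single delta against BASE. The toy model below is only meant to show that
idea: images are plain dicts mapping a sector number to its data, and
apply_chain and rebase are hypothetical names, not existing tools. A real
implementation would work on range lists and compressed chunks.

```python
def apply_chain(base, deltas):
    """Model of running imageunzip on a base and then on each delta in order."""
    disk = dict(base)
    for delta in deltas:
        disk.update(delta)          # a delta overwrites the sectors it contains
    return disk


def rebase(base, deltas):
    """Squash a chain of deltas into one delta derived directly from base."""
    squashed = {}
    for delta in deltas:
        squashed.update(delta)      # later deltas win over earlier ones
    # Sectors that ended up identical to the base need not be in the new delta.
    return {sec: data for sec, data in squashed.items() if base.get(sec) != data}
```

In this model, applying BASE followed by the rebased delta gives the same
disk contents as applying BASE followed by D1, D2, D3 and D-current.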