- 24 Jan, 2014 40 commits
-
-
Kevin Wolf authored
We can only have a single wait_serialising_requests() call per request because otherwise we can run into deadlocks where requests are waiting for each other. The same is true when wait_serialising_requests() is not at the very beginning of a request, so that other requests can be issued between the start of the tracking and wait_serialising_requests(). Fix this by changing wait_serialising_requests() to ignore requests that are already (directly or indirectly) waiting for the calling request. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Copy on Read wants to serialise with all requests touching the same cluster, so wait_serialising_requests() rounded to cluster boundaries. Other users like alignment RMW will have different requirements, though (requests touching the same sector), so make it dynamic. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Change the API so that specific requests can be marked serialising. Only these requests are checked for overlaps then. This means that during a Copy on Read operation, not all requests overlapping other requests are serialised any more, but only those that actually overlap with the specific COR request. Also remove COR from function and variable names because this functionality can be useful in other contexts. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Odd file sizes could make bdrv_aligned_preadv() shorten the request in non-aligned ways. Fix it by rounding to the required alignment instead of 512 bytes. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Previously, it was not possible to use wait_for_overlapping_requests() between tracked_request_begin()/end() because it would wait for itself. Ignore the current request in the overlap check and run more of the bdrv_co_do_preadv/pwritev code with a BdrvTrackedRequest present. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
This is going to become the bdrv_co_do_preadv() equivalent for writes. In this patch, however, just a function taking byte offsets is created, it doesn't align anything yet. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
First waiting for all COR requests to complete and calling the throttling function afterwards means that the request could be delayed and we still need to wait for the COR request even if it was issued only after the throttled write request. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
This separates the part of bdrv_co_do_writev() that needs to happen before the request is modified to match the backend alignment, and a part that needs to be executed afterwards and passes the request to the BlockDriver. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Similar to bdrv_pread(), which aligns byte-aligned request to 512 byte sectors, bdrv_co_do_preadv() takes a byte-aligned request and aligns it to the alignment specified in bs->request_alignment. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
This separates the part of bdrv_co_do_readv() that needs to happen before the request is modified to match the backend alignment, and a part that needs to be executed afterwards and passes the request to the BlockDriver. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Paolo Bonzini authored
Add a bs->request_alignment field that contains the required offset/length alignment for I/O requests and fill it in the raw block drivers. Use ioctls if possible, else see what alignment it takes for O_DIRECT to succeed. While at it, also expose the memory alignment requirements, which may be (and in practice are) different from the disk alignment requirements. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Paolo Bonzini authored
The alignment field is now set to the value that is promised to the guest, rather than required by the host. The next patches will make QEMU aware of the host-provided values, so make this clear. The alignment is also not about memory buffers, but about the sectors on the disk, change the documentation of the field. At this point, the field is set by the device emulation, but completely ignored by the block layer. Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
bs->buffer_alignment is set by the device emulation and contains the logical block size of the guest device. This isn't something that the block layer should know, and even less something to use for determining the right alignment of buffers to be used for the host. The new BlockLimits field opt_mem_alignment tells the qemu block layer the optimal alignment to be used so that no bounce buffer must be used in the driver. This patch may change the buffer alignment from 4k to 512 for all callers that used qemu_blockalign() with the top-level image format BlockDriverState. The value was never propagated to other levels in the tree, so in particular raw-posix never required anything else than 512. While on disks with 4k sectors direct I/O requires a 4k alignment, memory may still be okay when aligned to 512 byte boundaries. This is what must have happened in practice, because otherwise this would already have failed earlier. Therefore I don't expect regressions even with this intermediate state. Later, raw-posix can implement the hook and expose a different memory alignment requirement. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
For an O_DIRECT request to succeed, it's not only necessary that all base addresses in the qiov are aligned, but also that each length in it is aligned. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com>
-
Kevin Wolf authored
The functions used by qemu_memalign() require an alignment that is at least sizeof(void*). Adjust it if it is too small. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoît Canet <benoit@irqsave.net>
-
Kevin Wolf authored
When reopening with different flags, or when backing files disappear from the chain, the limits may change. Make sure they get updated in these cases. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoît Canet <benoit@irqsave.net>
-
Kevin Wolf authored
When there is a format driver between the backend, it's not guaranteed that exposing the opt_transfer_length for the format driver results in the optimal requests (because of fragmentation etc.), but it can't make things worse, so let's just do it. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Wenchao Xia <xiawenc@linux.vnet.ibm.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoît Canet <benoit@irqsave.net>
-
Kevin Wolf authored
This function separates filling the BlockLimits from bdrv_open(), which allows it to call it from other operations which may change the limits (e.g. modifications to the backing file chain or bdrv_reopen) Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
bdrv_commit() could return 0 or 1 on success, depending on whether or not the last sector was allocated in the overlay and whether the overlay format had a .bdrv_make_empty callback. Most callers ignored it, but qemu-img commit would print an error message while the operation actually succeeded. Also clean up the handling of I/O errors to return the real error code instead of -EIO. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Jeff Cody authored
This updates the documentation for commiting snapshot images. Specifically, this highlights what happens when the base image is either smaller or larger than the snapshot image being committed. In the case of the base image being smaller, it is resized to the larger size of the snapshot image. In the case of the base image being larger, it is not resized automatically, but once the commit has completed it is safe for the user to truncate the base image. Signed-off-by:
Jeff Cody <jcody@redhat.com> Reviewed-by:
Fam Zheng <famz@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Jeff Cody authored
If the top image to commit is the active layer, and also larger than the base image, then an I/O error will likely be returned during block-commit. For instance, if we have a base image with a virtual size 10G, and a active layer image of size 20G, then committing the snapshot via 'block-commit' will likely fail. This will automatically attempt to resize the base image, if the active layer image to be committed is larger. Signed-off-by:
Jeff Cody <jcody@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Jeff Cody authored
Currently, if an image file is logically larger than its backing file, committing it via 'qemu-img commit' will fail. For instance, if we have a base image with a virtual size 10G, and a snapshot image of size 20G, then committing the snapshot offline with 'qemu-img commit' will likely fail. This will automatically attempt to resize the base image, if the snapshot image to be committed is larger. Signed-off-by:
Jeff Cody <jcody@redhat.com> Reviewed-by:
Fam Zheng <famz@redhat.com> Reviewed-by:
Eric Blake <eblake@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Peter Maydell authored
libcurl versions 7.16.0 and later have a timer callback interface which must be implemented in order for libcurl to make forward progress (it will sometimes rely on being called back on the timeout if there are no file descriptors registered). Implement the callback, and use a QEMU AIO timer to ensure we prod libcurl again when it asks us to. Based on Peter's original patch plus my fix to add curl_multi_timeout_do. Should compile just fine even on older versions of libcurl. I also tried copy-on-read and streaming: $ ./qemu-img create -f qcow2 -o \ backing_file=http://download.fedoraproject.org/pub/fedora/linux/releases/20/Live/x86_64/Fedora-Live-Desktop-x86_64-20-1.iso \ foo.qcow2 1G $ x86_64-softmmu/qemu-system-x86_64 \ -drive if=none,file=foo.qcow2,copy-on-read=on,id=cd \ -device ide-cd,drive=cd --enable-kvm -m 1024 Direct http usage is probably too slow, but with copy-on-read ultimately the image does boot! After some time, streaming gets canceled by an EIO, which needs further investigation. Signed-off-by:
Peter Maydell <peter.maydell@linaro.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> There was two candidate ways to implement named node manipulation: 1) { 'command': 'block_passwd', 'data': {'*device': 'str', '*node-name': 'str', 'password': 'str'} } 2) { 'command': 'block_passwd', 'data': {'device': 'str', '*device-is-node': 'bool', 'password': 'str'} } Luiz proposed 1 and says 2 was an abuse of the QMP interface and proposed to rewrite the QMP block interface for 2.0. Luiz does not like in 1 the fact that 2 fields are optional but one of them must be specified leading to an abuse of the QMP semantic. Kevin argumented that 2 what a clear abuse of the device field and would not be practical when reading fast some log file because the user would read "device" and think that a device is manipulated when it's in fact a node name. Documentation of 1 make it pretty clear what to do for the user. Kevin argued that all bs are node including devices ones so 2 does not make sense. Kevin also argued that rewriting the QMP block interface would not make disapear the current one. Kevin pushed the argument that making the QAPI generator compatible with the semantic of the operation would need a rewrite that no one has done yet. A vote has been done on the list to elect the version to use and 1 won. For reference the complete thread is: "[Qemu-devel] [PATCH V4 4/7] qmp: Allow to change password on names block driver states." Signed-off-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Signed-off-by:
Benoit Canet <benoit@irqsave.net> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Benoît Canet authored
Add the minimum of code to prepare for the following patches. Signed-off-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Fam Zheng authored
Currently there is no way to query BlockStats of the backing chain. This adds "backing" field into BlockStats to make it possible. The comment of "parent" is reworded. Signed-off-by:
Fam Zheng <famz@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net> Reviewed-by:
Eric Blake <eblake@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Fam Zheng authored
Previously the field is wrong: $ ./qemu-img create -f vmdk -o subformat=streamOptimized /tmp/a.vmdk 1G $ ./qemu-img info /tmp/a.vmdk image: /tmp/a.vmdk file format: vmdk virtual size: 1.0G (1073741824 bytes) disk size: 12K Format specific information: cid: 1390460459 parent cid: 4294967295 >>> create type: monolithicSparse <snip> Signed-off-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Zhang Min authored
In the function mirror_iteration() -> qemu_iovec_init(), it allocates memory for op->qiov.iov, when the write request calls back, but in the function mirror_iteration_done(), it only frees the op, not free the op->qiov.iov, so this causes memory leak. It should use qemu_iovec_destroy() to free op->qiov. Signed-off-by:
Zhang Min <rudy.zhangmin@huawei.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Liu Yuan authored
It was muted in the previous commit 4bc74be9. Let's revive it since nothing prevents us to do it. With this patch, following command will work as other formats: $ qemu-img map sheepdog:image Cc: qemu-devel@nongnu.org Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by:
Liu Yuan <namei.unix@gmail.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Kevin Wolf authored
Document the SIGUSR1 behaviour of qemu-img. Also, added compare to the list of subcommands that support -p. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Since commit a7aae221 ('Switch SIG_IPI to SIGUSR1'), SIGUSR1 is blocked during startup, breaking the progress report in tools. This patch reenables the signal when initialising a progress report. Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Kevin Wolf authored
Signed-off-by:
Kevin Wolf <kwolf@redhat.com> Reviewed-by:
Benoit Canet <benoit@irqsave.net>
-
Fam Zheng authored
Report an error if file size is even smaller than metadata. Signed-off-by:
Fam Zheng <famz@redhat.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-
Hu Tao authored
Accoring to qcow spec, the offset fields in l1e, l2e and ref table entry start at bit 9. The offset is cluster offset, and the smallest possible cluster size is 512 bytes. Signed-off-by:
Hu Tao <hutao@cn.fujitsu.com> Signed-off-by:
Kevin Wolf <kwolf@redhat.com>
-