- Dec 17, 2009
-
-
Dan Williams authored
Add explicit 11-disk and 12-disk cases to exercise the 0 < src_cnt % 8 < 3 corner case in the ioatdma driver. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
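
The choice of 11 and 12 disks can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that an array of N disks presents N - 2 data sources (the other two members hold P and Q); the helper name `hits_corner_case` is hypothetical and does not appear in the driver.

```python
def hits_corner_case(disks):
    """True when the data-source count lands in the 0 < src_cnt % 8 < 3 window."""
    src_cnt = disks - 2  # assumption: two members are P and Q, the rest are sources
    return 0 < src_cnt % 8 < 3


# Disk counts between 4 and 20 that would exercise the corner case.
corner_disks = [d for d in range(4, 21) if hits_corner_case(d)]
print(corner_disks)
```

Under that assumption, 11 and 12 disks give 9 and 10 sources, the first counts above 8 that fall inside the window.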
-
- Nov 19, 2009
-
-
Dan Williams authored
ioat3.2 does not support asynchronous error notifications, which makes the driver experience latencies when non-zero pq validate results are expected. Provide a mechanism for turning off async_xor_val and async_syndrome_val via Kconfig. This approach is generally useful for any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like to force the async_tx api to fall back to the synchronous path for certain operations. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Oct 20, 2009
-
-
Dan Williams authored
The raid6 recovery code currently requires special handling of the 4-disk and 5-disk recovery scenarios for the native layout. Quoting from commit 0a82a623: "In these situations the default N-disk algorithm will present 0-source or 1-source operations to dma devices. To cover for dma devices where the minimum source count is 2 we implement 4-disk and 5-disk handling in the recovery code." The ddf layout presents disks=6 and disks=7 to the recovery code in these situations. Instead of looking at the number of disks, count the number of non-zero sources in the list and call the special-case code when the number of non-failed sources is 0 or 1. [neilb@suse.de: replace 'ddf' flag with counting good sources] Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
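
The counting rule described above can be modelled in a few lines. This is a sketch rather than the kernel code: `blocks` is assumed to hold only the data slots (P and Q excluded), blanks are modelled as `None`, and both function names are hypothetical.

```python
def count_good_sources(blocks, faila, failb):
    """Count non-blank data blocks, leaving out the two failed slots."""
    return sum(
        1
        for i, blk in enumerate(blocks)
        if blk is not None and i not in (faila, failb)
    )


def needs_special_case(blocks, faila, failb):
    """The generic N-disk path would see 0 or 1 sources, so take the special case."""
    return count_good_sources(blocks, faila, failb) < 2


# Native 5-disk array: three data slots, two of them failed.
native = [b"d0", b"d1", b"d2"]
# DDF-style layout: same three data blocks, but blanks inflate the slot count.
ddf = [b"d0", None, b"d1", None, b"d2"]
print(needs_special_case(native, 0, 1), needs_special_case(ddf, 0, 2))
```

Counting good sources gives the same answer for both layouts, which a test on the raw disk count would not.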
-
Dan Williams authored
The global scribble page is used as a temporary destination buffer when disabling the P or Q result is requested. The local scribble buffer contains memory for performing address conversions. Rename the global variable to avoid confusion. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Oct 19, 2009
-
-
Dan Williams authored
- update the kernel doc for async_syndrome to indicate what NULL in the source list means
- whitespace fixups
Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Oct 15, 2009
-
-
NeilBrown authored
async_syndrome_val checks the P and Q blocks used for RAID6 calculations. With DDF raid6, some of the data blocks might be NULL, so this needs to be handled in the same way that async_gen_syndrome handles it. As async_syndrome_val calls async_xor, also enhance async_xor to detect and skip NULL blocks in the list. Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
md/raid6 passes a list of 'struct page *' to the async_tx routines, which then either DMA map them for offload, or take the page_address for CPU based calculations. For RAID6 we sometimes leave 'blanks' in the list of pages. For CPU based calcs, we want to treat these as a page of zeros. For offloaded calculations, we simply don't pass a page to the hardware. Currently the 'blanks' are encoded as a pointer to raid6_empty_zero_page. This is a 4096 byte memory region, not a 'struct page'. This is mostly handled correctly but is rather ugly. So change the code to pass and expect a NULL pointer for the blanks. When taking page_address of a page, we need to check for a NULL and in that case use raid6_empty_zero_page. Signed-off-by:
NeilBrown <neilb@suse.de>
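
The two treatments of a blank can be illustrated with a small model. This is a sketch under stated assumptions: pages are plain byte strings, `None` stands in for a NULL page pointer, `ZERO_PAGE` stands in for raid6_empty_zero_page, and the helper names are hypothetical.

```python
PAGE_SIZE = 4096
ZERO_PAGE = bytes(PAGE_SIZE)  # stand-in for raid6_empty_zero_page


def cpu_sources(pages):
    """CPU path: a blank is read as a page of zeros."""
    return [ZERO_PAGE if page is None else page for page in pages]


def offload_sources(pages):
    """Offload path: blanks are dropped; the slot index is kept alongside each page."""
    return [(i, page) for i, page in enumerate(pages) if page is not None]


def xor_pages(pages):
    """XOR a list of equally sized pages together."""
    out = bytearray(PAGE_SIZE)
    for page in pages:
        for i, byte in enumerate(page):
            out[i] ^= byte
    return bytes(out)


pages = [bytes([1]) * PAGE_SIZE, None, bytes([4]) * PAGE_SIZE]
cpu_result = xor_pages(cpu_sources(pages))
offload_result = xor_pages([page for _, page in offload_sources(pages)])
```

Both paths produce the same parity because XOR with a page of zeros is a no-op. Keeping the slot index on the offload side is a design choice of this sketch, so that a position-dependent coefficient could still be derived after blanks are removed.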
-
- Sep 21, 2009
-
-
Dan Williams authored
If we are unable to offload async_mult() or async_sum_product(), then unmap the buffers before falling through to the synchronous path. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Sep 16, 2009
-
-
Dan Williams authored
Testing on x86_64 with NDISKS=255 yields:
do_IRQ: modprobe near stack overflow (cur:ffff88007d19c000,sp:ffff88007d19c128)
...and eventually
general protection fault: 0000 [#1]
Moving the scribble buffers off the stack allows the test to complete successfully. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Sep 08, 2009
-
-
Dan Williams authored
Some engines have transfer size and address alignment restrictions. Add a per-operation alignment property to struct dma_device that the async routines and dmatest can use to check alignment capabilities. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Channel switching is problematic for some dmaengine drivers as the architecture precludes separating the ->prep from ->submit. In these cases the driver can select ASYNC_TX_DISABLE_CHANNEL_SWITCH to modify the async_tx allocator to only return channels that support all of the required asynchronous operations. For example MD_RAID456=y selects support for asynchronous xor, xor validate, pq, pq validate, and memcpy. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=y any channel with all these capabilities is marked DMA_ASYNC_TX allowing async_tx_find_channel() to quickly locate compatible channels with the guarantee that dependency chains will remain on one channel. When ASYNC_TX_DISABLE_CHANNEL_SWITCH=n async_tx_find_channel() may select channels that lead to operation chains that need to cross channel boundaries using the async_tx channel switch capability. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
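
The selection rule for the ASYNC_TX_DISABLE_CHANNEL_SWITCH=y case amounts to a subset test. A minimal sketch, assuming capabilities are modelled as sets of strings; the channel names and the function name are hypothetical and do not come from the dmaengine code.

```python
# Operations the commit message lists for MD_RAID456=y.
REQUIRED_OPS = frozenset({"xor", "xor_val", "pq", "pq_val", "memcpy"})


def switch_free_channels(channels, required=REQUIRED_OPS):
    """Return the channels that support every required operation."""
    return [name for name, caps in channels.items() if required <= caps]


channels = {
    "chan0": {"memcpy"},
    "chan1": {"xor", "xor_val", "memcpy"},
    "chan2": {"xor", "xor_val", "pq", "pq_val", "memcpy", "memset"},
}
print(switch_free_channels(channels))
```

Only a channel that covers the full set qualifies, which is what guarantees that a dependency chain never has to leave it.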
-
Dan Williams authored
Some engines optimize operation by reading ahead in the descriptor chain such that descriptor2 may start execution before descriptor1 completes. If descriptor2 depends on the result from descriptor1 then a fence is required (on descriptor2) to disable this optimization. The async_tx api could implicitly identify dependencies via the 'depend_tx' parameter, but that would constrain cases where the dependency chain only specifies a completion order rather than a data dependency. So, provide an ASYNC_TX_FENCE to explicitly identify data dependencies. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Aug 29, 2009
-
-
Dan Williams authored
Port drivers/md/raid6test/test.c to use the async raid6 recovery routines. This is meant as a unit test for raid6 acceleration drivers. In addition to the 16-drive test case this implements tests for the 4-disk and 5-disk special cases (dma devices cannot generically handle fewer than 2 sources), and adds a test for the D+Q case. Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
async_raid6_2data_recov() recovers two data disk failures. async_raid6_datap_recov() recovers a data disk and the P disk. These routines are a port of the synchronous versions found in drivers/md/raid6recov.c. The primary difference is breaking out the xor operations into separate calls to async_xor. Two helper routines are introduced to perform scalar multiplication where needed. async_sum_product() multiplies two sources by scalar coefficients and then sums (xor) the result. async_mult() simply multiplies a single source by a scalar. This implementation also includes, in contrast to the original synchronous-only code, special case handling for the 4-disk and 5-disk array cases. In these situations the default N-disk algorithm will present 0-source or 1-source operations to dma devices. To cover for dma devices where the minimum source count is 2 we implement 4-disk and 5-disk handling in the recovery code. [ Impact: asynchronous raid6 recovery routines for 2data and datap cases ] Cc: Yuri Tikhonov <yur@emcraft.com> Cc: Ilya Yanok <yanok@emcraft.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
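
The arithmetic behind the two helpers can be modelled byte by byte. This is a sketch of the math only, not of the asynchronous API: it assumes the RAID6 field GF(2^8) with polynomial 0x11d, and the Python names `gf_mul`, `mult` and `sum_product` are illustrative.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8) by shift-and-add, reducing by poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def mult(src, coef):
    """Model of async_mult(): scale every byte of one source by a scalar."""
    return bytes(gf_mul(x, coef) for x in src)


def sum_product(src_a, coef_a, src_b, coef_b):
    """Model of async_sum_product(): scale two sources, then xor the results."""
    return bytes(
        gf_mul(x, coef_a) ^ gf_mul(y, coef_b) for x, y in zip(src_a, src_b)
    )


print(mult(b"\x01\x80", 2).hex(), sum_product(b"\x0f", 1, b"\xf0", 1).hex())
```

With both coefficients set to 1 the sum-product degenerates to a plain xor, which is why the recovery code can route that case through async_xor-style operations.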
-
Dan Williams authored
[ Based on an original patch by Yuri Tikhonov ] This adds support for doing asynchronous GF multiplication by adding two additional functions to the async_tx API:
async_gen_syndrome() does simultaneous XOR and Galois field multiplication of sources.
async_syndrome_val() validates the given source buffers against known P and Q values.
When a request is made to run async_pq against more than the hardware maximum number of supported sources, we need to reuse the previously generated P and Q values as sources into the next operation. Care must be taken to remove Q from P' and P from Q'. For example, to perform a 5 source pq op with hardware that only supports 4 sources at a time, the following approach is taken:
p, q = PQ(src0, src1, src2, src3, COEF({01}, {02}, {04}, {08}))
p', q' = PQ(p, q, q, src4, COEF({00}, {01}, {00}, {10}))
p' = p + q + q + src4 = p + src4
q' = {00}*p + {01}*q + {00}*q + {10}*src4 = q + {10}*src4
Note: 4 is the minimum acceptable maxpq; otherwise we punt to the synchronous-software path. The DMA_PREP_CONTINUE flag indicates to the driver to reuse p and q as sources (in the above manner) and fill the remaining slots up to maxpq with the new sources/coefficients.
Note1: Some devices have native support for P+Q continuation and can skip this extra work. Devices with this capability can advertise it with dma_set_maxpq. It is up to each driver how to handle the DMA_PREP_CONTINUE flag.
Note2: The api supports disabling the generation of P when generating Q; this is ignored by the synchronous path but is implemented by some dma devices to save unnecessary writes. In this case the continuation algorithm is simplified to only reuse Q as a source.
Cc: H. Peter Anvin <hpa@zytor.com> Cc: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by:
Yuri Tikhonov <yur@emcraft.com> Signed-off-by:
Ilya Yanok <yanok@emcraft.com> Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
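
The 5-source continuation example from this commit message can be checked numerically. This is a minimal sketch, assuming the RAID6 field GF(2^8) with polynomial 0x11d and one byte per source; `gf_mul` and `pq` are illustrative helpers, not kernel functions, and the source values are arbitrary.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8) by shift-and-add, reducing by poly."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def pq(sources, coefs):
    """Return (p, q): p xors all sources, q xors each source times its coefficient."""
    p = q = 0
    for src, coef in zip(sources, coefs):
        p ^= src
        q ^= gf_mul(src, coef)
    return p, q


src = [0x11, 0x22, 0x33, 0x44, 0x55]

# First pass over four sources, then a continuation that feeds p, q, q back in.
p, q = pq(src[:4], [0x01, 0x02, 0x04, 0x08])
p2, q2 = pq([p, q, q, src[4]], [0x00, 0x01, 0x00, 0x10])

# Reference: a single pass over all five sources.
direct = pq(src, [0x01, 0x02, 0x04, 0x08, 0x10])
print((p2, q2) == direct)
```

Passing q twice cancels it out of p', and the {00} coefficient on p keeps it out of q', so the two-step result matches the single-pass one.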
-
Dan Williams authored
We currently walk the parent chain when waiting for a given tx to complete; however, this walk may race with the driver cleanup routine. The routines in async_raid6_recov.c may fall back to the synchronous path at any point so we need to be prepared to call async_tx_quiesce() (which calls dma_wait_for_async_tx). To remove the ->parent walk we guarantee that every time a dependency is attached ->issue_pending() is invoked, then we can simply poll the initial descriptor until completion. This also allows for a lighter weight 'issue pending' implementation as there is no longer a requirement to iterate through all the channels' ->issue_pending() routines as long as operations have been submitted in an ordered chain. async_tx_issue_pending() is added for this case. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
If module_init and module_exit are nops then neither needs to be defined. [ Impact: pure cleanup ] Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Replace the flat zero_sum_result with a collection of flags to contain the P (xor) zero-sum result, and the soon to be utilized Q (raid6 Reed-Solomon syndrome) zero-sum result. Use the SUM_CHECK_ namespace instead of DMA_ since these flags will be used on non-dma-zero-sum enabled platforms. Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
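
The move from a single zero-sum value to separate P and Q flags can be pictured as a small bit set. This is a sketch only: the bit positions and the convention that a set bit marks a failed check are assumptions for illustration, and the names `SumCheck` and `validate` are hypothetical.

```python
from enum import IntFlag


class SumCheck(IntFlag):
    """Illustrative model of per-parity validation results."""

    P_RESULT = 1 << 0  # assumed: set when the P (xor) check fails
    Q_RESULT = 1 << 1  # assumed: set when the Q (syndrome) check fails


def validate(p_matches, q_matches):
    """Fold two independent comparisons into one flags value."""
    result = SumCheck(0)
    if not p_matches:
        result |= SumCheck.P_RESULT
    if not q_matches:
        result |= SumCheck.Q_RESULT
    return result


only_q_bad = validate(True, False)
print(bool(only_q_bad & SumCheck.Q_RESULT), bool(only_q_bad & SumCheck.P_RESULT))
```

A caller can then tell a P mismatch from a Q mismatch, which a single flat result cannot express.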
-
- Jul 01, 2009
-
-
Dan Williams authored
On HIGHMEM64G systems dma_addr_t is known to be larger than (void *) which precludes async_xor from performing dma address conversions by reusing the input parameter address list. However, other parts of the dmaengine infrastructure do not suffer this constraint, so the HIGHMEM64G restriction can be down-levelled. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Jun 03, 2009
-
-
Dan Williams authored
async_xor() needs space to perform dma and page address conversions. In most cases the code can simply reuse the struct page * array because the size of the native pointer matches the size of a dma/page address. In order to support archs where sizeof(dma_addr_t) is larger than sizeof(struct page *), or to preserve the input parameters, we utilize a memory region passed in by the caller. Since the code is now prepared to handle the case where it cannot perform address conversions on the stack, we no longer need the !HIGHMEM64G dependency in drivers/dma/Kconfig. [ Impact: don't clobber input buffers for address conversions ] Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Prepare the api for the arrival of a new parameter, 'scribble'. This will allow callers to identify scratchpad memory for dma address or page address conversions. As this adds yet another parameter, take this opportunity to convert the common submission parameters (flags, dependency, callback, and callback argument) into an object that is passed by reference. Also, take this opportunity to fix up the kerneldoc and add notes about the relevant ASYNC_TX_* flags for each routine. [ Impact: moves api pass-by-value parameters to a pass-by-reference struct ] Signed-off-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
In support of inter-channel chaining async_tx utilizes an ack flag to gate whether a dependent operation can be chained to another. While the flag is not set the chain can be considered open for appending. Setting the ack flag closes the chain and flags the descriptor for garbage collection. The ASYNC_TX_DEP_ACK flag essentially means "close the chain after adding this dependency". Since each operation can only have one child the api now implicitly sets the ack flag at dependency submission time. This removes an unnecessary management burden from clients of the api. [ Impact: clean up and enforce one dependency per operation ] Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Apr 08, 2009
-
-
Dan Williams authored
'zero_sum' does not properly describe the operation of generating parity and checking that it validates against an existing buffer. Change the name of the operation to 'val' (for 'validate'). This is in anticipation of the p+q case where it is a requirement to identify the target parity buffers separately from the source buffers, because the target parity buffers will not have corresponding pq coefficients. Reviewed-by:
Andre Noll <maan@systemlinux.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Mar 25, 2009
-
-
Dan Williams authored
Provide a config option for blocking the allocation of dma channels to the async_tx api. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
To allow an async_tx routine to be compiled away on HAS_DMA=n archs, it needs to be declared __always_inline; otherwise the compiler may emit code and cause a link error. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Jan 06, 2009
-
-
Dan Williams authored
Now that clients no longer need to be notified of channel arrival dma_async_client_register can simply increment the dmaengine_ref_count. Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
async_tx and net_dma each have open-coded versions of issue_pending_all, so provide a common routine in dmaengine. The implementation needs to walk the global device list, so implement rcu to allow dma_issue_pending_all to run lockless. Clients protect themselves from channel removal events by holding a dmaengine reference. Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Allowing multiple clients to each define their own channel allocation scheme quickly leads to a pathological situation. For memory-to-memory offload all clients can share a central allocator. This simply moves the existing async_tx allocator to dmaengine with minimal fixups:
* async_tx.c:get_chan_ref_by_cap --> dmaengine.c:nth_chan
* async_tx.c:async_tx_rebalance --> dmaengine.c:dma_channel_rebalance
* split out common code from async_tx.c:__async_tx_find_channel --> dma_find_channel
Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Simply, if a client wants any dmaengine channel then prevent all dmaengine modules from being removed. Once the clients are done re-enable module removal. Why? Beyond reducing complication:
1/ Tracking reference counts per-transaction in an efficient manner, as is currently done, requires a complicated scheme to avoid cache-line bouncing effects.
2/ Per-transaction ref-counting gives the false impression that a dma-driver can be gracefully removed ahead of its user (net, md, or dma-slave)
3/ None of the in-tree dma-drivers talk to hot pluggable hardware, but if such an engine were built one day we still would not need to notify clients of remove events. The driver can simply return NULL to a ->prep() request, something that is much easier for a client to handle.
Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Acked-by:
Maciej Sosnowski <maciej.sosnowski@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Jan 05, 2009
-
-
Dan Williams authored
async_tx.ko is a consumer of dma channels. A circular dependency arises if modules in drivers/dma rely on common code in async_tx.ko. It prevents either module from being unloaded. Move dma_wait_for_async_tx and async_tx_run_dependencies to dmaengine.o where they should have been from the beginning. Reviewed-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Dec 08, 2008
-
-
Dan Williams authored
Mapping the destination multiple times is a misuse of the dma-api. Since the destination may be reused as a source, ensure that it is only mapped once and that it is mapped bidirectionally. This appears to add ugliness on the unmap side in that it always reads back the destination address from the descriptor, but gcc can determine that dma_unmap is a nop and not emit the code that calculates its arguments. Cc: <stable@kernel.org> Cc: Saeed Bishara <saeed@marvell.com> Acked-by:
Yuri Tikhonov <yur@emcraft.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Sep 13, 2008
-
-
Dan Williams authored
* Rename 'next' to 'dep'
* Move the channel switch check inside the loop to simplify termination
Acked-by:
Ilya Yanok <yanok@emcraft.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Sep 05, 2008
-
-
Yuri Tikhonov authored
We should clear the next pointer of the TX only if we are sure that the next TX (say NXT) will be submitted to the channel too. Otherwise, we break the chain of descriptors, because we lose the information about the next descriptor to run. The next time async_tx_run_dependencies() is invoked with TX, its TX->next will be NULL, and NXT will never be submitted. Cc: <stable@kernel.org> [2.6.26] Signed-off-by:
Yuri Tikhonov <yur@emcraft.com> Signed-off-by:
Ilya Yanok <yanok@emcraft.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Aug 05, 2008
-
-
Dan Williams authored
Found-by:
Yuri Tikhonov <yur@emcraft.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Jul 17, 2008
-
-
Dan Williams authored
All callers of async_tx_sync_epilog have called async_tx_quiesce on the depend_tx, so async_tx_sync_epilog need only call the callback to complete the operation. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Replace open coded "wait and acknowledge" instances with async_tx_quiesce. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
Ensure forward progress is made when a dmaengine driver is unable to allocate an xor descriptor by breaking the dependency chain with async_tx_quiesce() and issuing any pending descriptors. Tested with iop-adma by setting device->max_xor = 2 to force multiple calls to device_prep_dma_xor for each call to async_xor and limiting the descriptor slot pool to 5. Discovered that the minimum descriptor pool size for iop-adma is 2 * iop_chan_xor_slot_cnt(device->max_xor) + 1. Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
Dan Williams authored
When the number of source buffers for an xor operation exceeds the hardware channel maximum, async_xor creates a chain of dependent operations. The result of one operation is reused as an input to the next to continue the xor calculation. The destination buffer should remain mapped for the duration of the entire chain. To provide this guarantee the code must no longer be allowed to fall back to the synchronous path, as this will preclude the buffer from being unmapped, i.e. the dma-driver will potentially miss the descriptor with !DMA_COMPL_SKIP_DEST_UNMAP. Cc: Neil Brown <neilb@suse.de> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
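
The chaining described above can be modelled without any DMA details. This is a sketch of the data flow only, assuming buffers are byte strings and that each follow-up operation spends one source slot on the previous result; `xor_blocks` and `chained_xor` are hypothetical names.

```python
def xor_blocks(blocks):
    """XOR a list of equally sized byte strings together."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for i, byte in enumerate(blk):
            out[i] ^= byte
    return bytes(out)


def chained_xor(sources, max_src):
    """XOR any number of sources using operations of at most max_src inputs.

    Returns the final destination contents and the number of operations used.
    """
    if max_src < 2:
        raise ValueError("max_src must be at least 2")
    dest = xor_blocks(sources[:max_src])
    ops = 1
    remaining = sources[max_src:]
    while remaining:
        take = max_src - 1  # one input slot is taken by the previous result
        dest = xor_blocks([dest] + remaining[:take])
        remaining = remaining[take:]
        ops += 1
    return dest, ops


sources = [bytes([n]) * 8 for n in (1, 2, 4, 8, 16)]
result, ops = chained_xor(sources, max_src=2)
print(result.hex(), ops)
```

Because the destination is read back as an input at every step after the first, it has to stay valid for the whole chain, which is the constraint the commit message is protecting.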
-
Li Zefan authored
In the rcu update side, don't use list_for_each_entry_rcu(). Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-
- Jul 08, 2008
-
-
Dan Williams authored
commit 636bdeaa 'dmaengine: ack to flags: make use of the unused bits in the 'ack' field' missed an ->ack conversion in crypto/async_tx/async_memset.c Signed-off-by:
Dan Williams <dan.j.williams@intel.com>
-