<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/lib/raid6, branch v4.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-04-21T22:00:42Z</updated>
<entry>
<title>md/raid6 algorithms: xor_syndrome() for SSE2</title>
<updated>2015-04-21T22:00:42Z</updated>
<author>
<name>Markus Stockhausen</name>
<email>stockhausen@collogia.de</email>
</author>
<published>2014-12-15T01:57:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a582564b24bec0443b5c5ff43ee6d1258f8bd658'/>
<id>urn:sha1:a582564b24bec0443b5c5ff43ee6d1258f8bd658</id>
<content type='text'>
The second and (last) optimized XOR syndrome calculation. This version
supports right and left side optimization. All CPUs with architecture
older than Haswell will benefit from it.

It should be noted that SSE2 movntdq kills performance for memory areas
that are read and written simultaneously in chunks smaller than cache
line size. So use movdqa instead for P/Q writes in sse21 and sse22 XOR
functions.

Signed-off-by: Markus Stockhausen &lt;stockhausen@collogia.de&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md/raid6 algorithms: xor_syndrome() for generic int</title>
<updated>2015-04-21T22:00:42Z</updated>
<author>
<name>Markus Stockhausen</name>
<email>stockhausen@collogia.de</email>
</author>
<published>2014-12-15T01:57:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a5ce91d053961b7cc8fa56bd083819a9fc92734'/>
<id>urn:sha1:9a5ce91d053961b7cc8fa56bd083819a9fc92734</id>
<content type='text'>
Start the algorithms with the very basic one. It is left and right
optimized. That means we can avoid all calculations for unneeded pages
above the right stop offset. For pages below the left start offset we
still need the syndrome multiplication but without reading data pages.

Signed-off-by: Markus Stockhausen &lt;stockhausen@collogia.de&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md/raid6 algorithms: improve test program</title>
<updated>2015-04-21T22:00:42Z</updated>
<author>
<name>Markus Stockhausen</name>
<email>stockhausen@collogia.de</email>
</author>
<published>2014-12-15T01:57:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7e92e1d7629b00578cef22b1f4c6ada726663701'/>
<id>urn:sha1:7e92e1d7629b00578cef22b1f4c6ada726663701</id>
<content type='text'>
It is always helpful to have a test tool in place if we implement
new data critical algorithms. So add some test routines to the raid6
checker that can prove if the new xor_syndrome() works as expected.

Run through all permutations of start/stop pages per algorithm and
simulate a xor_syndrome() assisted rmw run. After each rmw check if
the recovery algorithm still confirms that the stripe is fine.

Signed-off-by: Markus Stockhausen &lt;stockhausen@collogia.de&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md/raid6 algorithms: delta syndrome functions</title>
<updated>2015-04-21T22:00:41Z</updated>
<author>
<name>Markus Stockhausen</name>
<email>stockhausen@collogia.de</email>
</author>
<published>2014-12-15T01:57:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe5cbc6e06c7d8b3a86f6f5491d74766bb5c2827'/>
<id>urn:sha1:fe5cbc6e06c7d8b3a86f6f5491d74766bb5c2827</id>
<content type='text'>
v3: s-o-b comment, explanation of performance and descision for
the start/stop implementation

Implementing rmw functionality for RAID6 requires optimized syndrome
calculation. Up to now we can only generate a complete syndrome. The
target P/Q pages are always overwritten. With this patch we provide
a framework for inplace P/Q modification. In the first place simply
fill those functions with NULL values.

xor_syndrome() has two additional parameters: start &amp; stop. These
will indicate the first and last page that are changing during a
rmw run. That makes it possible to avoid several unneccessary loops
and speed up calculation. The caller needs to implement the following
logic to make the functions work.

1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source
blocks inside P/Q between (and including) start and end.

2) modify any block with start &lt;= block &lt;= stop

3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of
source blocks into P/Q between (and including) start and end.

Pages between start and stop that won't be changed should be filled
with a pointer to the kernel zero page. The reasons for not taking NULL
pages are:

1) Algorithms cross the whole source data line by line. Thus avoid
additional branches.

2) Having a NULL page avoids calculating the XOR P parity but still
need calulation steps for the Q parity. Depending on the algorithm
unrolling that might be only a difference of 2 instructions per loop.

The benchmark numbers of the gen_syndrome() functions are displayed in
the kernel log. Do the same for the xor_syndrome() functions. This
will help to analyze performance problems and give an rough estimate
how well the algorithm works. The choice of the fastest algorithm will
still depend on the gen_syndrome() performance.

With the start/stop page implementation the speed can vary a lot in real
life. E.g. a change of page 0 &amp; page 15 on a stripe will be harder to
compute than the case where page 0 &amp; page 1 are XOR candidates. To be not
to enthusiatic about the expected speeds we will run a worse case test
that simulates a change on the upper half of the stripe. So we do:

1) calculation of P/Q for the upper pages

2) continuation of Q for the lower (empty) pages

Signed-off-by: Markus Stockhausen &lt;stockhausen@collogia.de&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>x86/raid6: correctly check for assembler capabilities</title>
<updated>2015-02-03T21:35:51Z</updated>
<author>
<name>Jan Beulich</name>
<email>JBeulich@suse.com</email>
</author>
<published>2015-01-23T08:29:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=75aaf4c3e6a4ed48207230cf133a02258ca5abd5'/>
<id>urn:sha1:75aaf4c3e6a4ed48207230cf133a02258ca5abd5</id>
<content type='text'>
Just like for AVX2 (which simply needs an #if -&gt; #ifdef conversion),
SSSE3 assembler support should be checked for before using it.

Signed-off-by: Jan Beulich &lt;jbeulich@suse.com&gt;
Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>lib/raid6: Add log level to printks</title>
<updated>2014-10-14T02:08:29Z</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2014-10-13T12:03:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b395f75eabb3844c99244928293796ff42feaa3d'/>
<id>urn:sha1:b395f75eabb3844c99244928293796ff42feaa3d</id>
<content type='text'>
Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>Merge tag 'md/3.12' of git://neil.brown.name/md</title>
<updated>2013-09-10T20:03:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-10T20:03:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4d7696f1b05f4aeb586c74868fe3da2731daca4b'/>
<id>urn:sha1:4d7696f1b05f4aeb586c74868fe3da2731daca4b</id>
<content type='text'>
Pull md update from Neil Brown:
 "Headline item is multithreading for RAID5 so that more IO/sec can be
  supported on fast (SSD) devices.  Also TILE-Gx SIMD suppor for RAID6
  calculations and an assortment of bug fixes"

* tag 'md/3.12' of git://neil.brown.name/md:
  raid5: only wakeup necessary threads
  md/raid5: flush out all pending requests before proceeding with reshape.
  md/raid5: use seqcount to protect access to shape in make_request.
  raid5: sysfs entry to control worker thread number
  raid5: offload stripe handle to workqueue
  raid5: fix stripe release order
  raid5: make release_stripe lockless
  md: avoid deadlock when dirty buffers during md_stop.
  md: Don't test all of mddev-&gt;flags at once.
  md: Fix apparent cut-and-paste error in super_90_validate
  raid6/test: replace echo -e with printf
  RAID: add tilegx SIMD implementation of raid6
  md: fix safe_mode buglet.
  md: don't call md_allow_write in get_bitmap_file.
</content>
</entry>
<entry>
<title>raid6/test: replace echo -e with printf</title>
<updated>2013-08-27T06:06:06Z</updated>
<author>
<name>Max Filippov</name>
<email>jcmvbkbc@gmail.com</email>
</author>
<published>2013-08-22T14:53:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c28399b5943a24a214a44e973a3a8002fd36442d'/>
<id>urn:sha1:c28399b5943a24a214a44e973a3a8002fd36442d</id>
<content type='text'>
-e is a non-standard echo option, echo output is
implementation-dependent when it is used. Replace echo -e with printf as
suggested by POSIX echo manual.

Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Jim Kukunas &lt;james.t.kukunas@linux.intel.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Yuanhan Liu &lt;yuanhan.liu@linux.intel.com&gt;
Acked-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
Signed-off-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>RAID: add tilegx SIMD implementation of raid6</title>
<updated>2013-08-27T06:05:50Z</updated>
<author>
<name>Ken Steele</name>
<email>ken@tilera.com</email>
</author>
<published>2013-08-07T16:39:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ae77cbc1e7b90473a2b0963bce0e1eb163873214'/>
<id>urn:sha1:ae77cbc1e7b90473a2b0963bce0e1eb163873214</id>
<content type='text'>
This change adds TILE-Gx SIMD instructions to the software raid
(md), modeling the Altivec implementation. This is only for Syndrome
generation; there is more that could be done to improve recovery,
as in the recent Intel SSE3 recovery implementation.

The code unrolls 8 times; this turns out to be the best on tilegx
hardware among the set 1, 2, 4, 8 or 16.  The code reads one
cache-line of data from each disk, stores P and Q then goes to the
next cache-line.

The test code in sys/linux/lib/raid6/test reports 2008 MB/s data
read rate for syndrome generation using 18 disks (16 data and 2
parity). It was 1512 MB/s before this SIMD optimizations. This is
running on 1 core with all the data in cache.

This is based on the paper The Mathematics of RAID-6.
(http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf).

Signed-off-by: Ken Steele &lt;ken@tilera.com&gt;
Signed-off-by: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>lib/raid6: add ARM-NEON accelerated syndrome calculation</title>
<updated>2013-07-08T21:09:18Z</updated>
<author>
<name>Ard Biesheuvel</name>
<email>ard.biesheuvel@linaro.org</email>
</author>
<published>2013-05-16T15:20:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b'/>
<id>urn:sha1:7d11965ddb9b9b1e0a5d13c58345ada1ccbc663b</id>
<content type='text'>
Rebased/reworked a patch contributed by Rob Herring that uses
NEON intrinsics to perform the RAID-6 syndrome calculations.
It uses the existing unroll.awk code to generate several
unrolled versions of which the best performing one is selected
at boot time.

Signed-off-by: Ard Biesheuvel &lt;ard.biesheuvel@linaro.org&gt;
Acked-by: Nicolas Pitre &lt;nico@linaro.org&gt;
Cc: hpa@linux.intel.com
</content>
</entry>
</feed>
