<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/md, branch v4.9</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.9</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.9'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-11-05T18:34:07Z</updated>
<entry>
<title>Merge tag 'md/4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md</title>
<updated>2016-11-05T18:34:07Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-11-05T18:34:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6c286e812d5f306ebf991c3f78ebc511121896e0'/>
<id>urn:sha1:6c286e812d5f306ebf991c3f78ebc511121896e0</id>
<content type='text'>
Pull MD fixes from Shaohua Li:
 "There are several bug fixes queued:

   - fix raid5-cache recovery bugs

   - fix discard IO error handling for raid1/10

   - fix array sync writes bogus position to superblock

   - fix IO error handling for raid array with external metadata"

* tag 'md/4.9-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md: be careful not lot leak internal curr_resync value into metadata. -- (all)
  raid1: handle read error also in readonly mode
  raid5-cache: correct condition for empty metadata write
  md: report 'write_pending' state when array in sync
  md/raid5: write an empty meta-block when creating log super-block
  md/raid5: initialize next_checkpoint field before use
  RAID10: ignore discard error
  RAID1: ignore discard error
</content>
</entry>
<entry>
<title>md: be careful not lot leak internal curr_resync value into metadata. -- (all)</title>
<updated>2016-10-29T05:04:05Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2016-10-28T04:59:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1217e1d1999ed6c9c1e1b1acae0a74ab70464ae2'/>
<id>urn:sha1:1217e1d1999ed6c9c1e1b1acae0a74ab70464ae2</id>
<content type='text'>
mddev-&gt;curr_resync usually records where the current resync is up to,
but during the starting phase it has some "magic" values.

 1 - means that the array is trying to start a resync, but has yielded
     to another array which shares physical devices, and also needs to
     start a resync
 2 - means the array is trying to start resync, but has found another
     array which shares physical devices and has already started resync.

 3 - means that resync has commensed, but it is possible that nothing
     has actually been resynced yet.

It is important that this value not be visible to user-space and
particularly that it doesn't get written to the metadata, as the
resync or recovery checkpoint.  In part, this is because it may be
slightly higher than the correct value, though this is very rare.
In part, because it is not a multiple of 4K, and some devices only
support 4K aligned accesses.

There are two places where this value is propagates into either
-&gt;curr_resync_completed or -&gt;recovery_cp or -&gt;recovery_offset.
These currently avoid the propagation of values 1 and 3, but will
allow 3 to leak through.

Change them to only propagate the value if it is &gt; 3.

As this can cause an array to fail, the patch is suitable for -stable.

Cc: stable@vger.kernel.org (v3.7+)
Reported-by: Viswesh &lt;viswesh.vichu@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>raid1: handle read error also in readonly mode</title>
<updated>2016-10-29T05:04:04Z</updated>
<author>
<name>Tomasz Majchrzak</name>
<email>tomasz.majchrzak@intel.com</email>
</author>
<published>2016-10-28T12:45:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7449f699b2fb23bdee0a0f03aa4efb5f96fd403f'/>
<id>urn:sha1:7449f699b2fb23bdee0a0f03aa4efb5f96fd403f</id>
<content type='text'>
If write is the first operation on a disk and it happens not to be
aligned to page size, block layer sends read request first. If read
operation fails, the disk is set as failed as no attempt to fix the
error is made because array is in auto-readonly mode. Similarily, the
disk is set as failed for read-only array.

Take the same approach as in raid10. Don't fail the disk if array is in
readonly or auto-readonly mode. Try to redirect the request first and if
unsuccessful, return a read error.

Signed-off-by: Tomasz Majchrzak &lt;tomasz.majchrzak@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>raid5-cache: correct condition for empty metadata write</title>
<updated>2016-10-29T05:04:03Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2016-10-27T22:22:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a8b27fac5bbb77337cc2e5d31d37c9936782d87'/>
<id>urn:sha1:9a8b27fac5bbb77337cc2e5d31d37c9936782d87</id>
<content type='text'>
As long as we recover one metadata block, we should write the empty metadata
write. The original code could make recovery corrupted if only one meta is
valid.

Reported-by: Zhengyuan Liu &lt;liuzhengyuan@kylinos.cn&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'dm-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm</title>
<updated>2016-10-28T16:27:58Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-28T16:27:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e0f3e6a7ccc002be056b6a21768fceee0d44941e'/>
<id>urn:sha1:e0f3e6a7ccc002be056b6a21768fceee0d44941e</id>
<content type='text'>
Pull device mapper fixes from Mike Snitzer:

 - a couple DM raid and DM mirror fixes

 - a couple .request_fn request-based DM NULL pointer fixes

 - a fix for a DM target reference count leak, on target load error,
   that prevented associated DM target kernel module(s) from being
   removed

* tag 'dm-4.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm table: fix missing dm_put_target_type() in dm_table_add_target()
  dm rq: clear kworker_task if kthread_run() returned an error
  dm: free io_barrier after blk_cleanup_queue call
  dm raid: fix activation of existing raid4/10 devices
  dm mirror: use all available legs on multiple failures
  dm mirror: fix read error on recovery after default leg failure
  dm raid: fix compat_features validation
</content>
</entry>
<entry>
<title>md: report 'write_pending' state when array in sync</title>
<updated>2016-10-24T22:28:19Z</updated>
<author>
<name>Tomasz Majchrzak</name>
<email>tomasz.majchrzak@intel.com</email>
</author>
<published>2016-10-24T10:47:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=16f889499a5214ebe038f8bd00f4c0094ed0ed75'/>
<id>urn:sha1:16f889499a5214ebe038f8bd00f4c0094ed0ed75</id>
<content type='text'>
If there is a bad block on a disk and there is a recovery performed from
this disk, the same bad block is reported for a new disk. It involves
setting MD_CHANGE_PENDING flag in rdev_set_badblocks. For external
metadata this flag is not being cleared as array state is reported as
'clean'. The read request to bad block in RAID5 array gets stuck as it
is waiting for a flag to be cleared - as per commit c3cce6cda162
("md/raid5: ensure device failure recorded before write request
returns.").

The meaning of MD_CHANGE_PENDING and MD_CHANGE_CLEAN flags has been
clarified in commit 070dc6dd7103 ("md: resolve confusion of
MD_CHANGE_CLEAN"), however MD_CHANGE_PENDING flag has been used in
personality error handlers since and it doesn't fully comply with
initial purpose. It was supposed to notify that write request is about
to start, however now it is also used to request metadata update.
Initially (in md_allow_write, md_write_start) MD_CHANGE_PENDING flag has
been set and in_sync has been set to 0 at the same time. Error handlers
just set the flag without modifying in_sync value. Sysfs array state is
a single value so now it reports 'clean' when MD_CHANGE_PENDING flag is
set and in_sync is set to 1. Userspace has no idea it is expected to
take some action.

Swap the order that array state is checked so 'write_pending' is
reported ahead of 'clean' ('write_pending' is a misleading name but it
is too late to rename it now).

Signed-off-by: Tomasz Majchrzak &lt;tomasz.majchrzak@intel.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>md/raid5: write an empty meta-block when creating log super-block</title>
<updated>2016-10-24T22:28:18Z</updated>
<author>
<name>Zhengyuan Liu</name>
<email>liuzhengyuan@kylinos.cn</email>
</author>
<published>2016-10-24T08:15:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=56056c2e7d58ee705755efbe780aefff987a1dc8'/>
<id>urn:sha1:56056c2e7d58ee705755efbe780aefff987a1dc8</id>
<content type='text'>
If superblock points to an invalid meta block, r5l_load_log will set
create_super with true and create an new superblock, this runtime path
would always happen if we do no writing I/O to this array since it was
created. Writing an empty meta block could avoid this unnecessary
action at the first time we created log superblock.

Another reason is for the corretness of log recovery. Currently we have
bellow code to guarantee log revocery to be correct.

        if (ctx.seq &gt; log-&gt;last_cp_seq + 1) {
                int ret;

                ret = r5l_log_write_empty_meta_block(log, ctx.pos, ctx.seq + 10);
                if (ret)
                        return ret;
                log-&gt;seq = ctx.seq + 11;
                log-&gt;log_start = r5l_ring_add(log, ctx.pos, BLOCK_SECTORS);
                r5l_write_super(log, ctx.pos);
        } else {
                log-&gt;log_start = ctx.pos;
                log-&gt;seq = ctx.seq;
        }

If we just created a array with a journal device, log-&gt;log_start and
log-&gt;last_checkpoint should all be 0, then we write three meta block
which are valid except mid one and supposed crash happened. The ctx.seq
would equal to log-&gt;last_cp_seq + 1 and log-&gt;log_start would be set to
position of mid invalid meta block after we did a recovery, this will
lead to problems which could be avoided with this patch.

Signed-off-by: Zhengyuan Liu &lt;liuzhengyuan@kylinos.cn&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>md/raid5: initialize next_checkpoint field before use</title>
<updated>2016-10-24T22:28:18Z</updated>
<author>
<name>Zhengyuan Liu</name>
<email>liuzhengyuan@kylinos.cn</email>
</author>
<published>2016-10-24T01:55:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=28cd88e2b4c54a466dcae7eea1efac766d42386b'/>
<id>urn:sha1:28cd88e2b4c54a466dcae7eea1efac766d42386b</id>
<content type='text'>
No initial operation was done to this field when we
load/recovery the log, it got assignment only when IO
to raid disk was finished. So r5l_quiesce may use wrong
next_checkpoint to reclaim log space, that would make
reclaimable space calculation confused.

Signed-off-by: Zhengyuan Liu &lt;liuzhengyuan@kylinos.cn&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>RAID10: ignore discard error</title>
<updated>2016-10-24T22:28:17Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2016-10-06T21:13:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=579ed34f7b751b8add233cba4cf755258dbdd60a'/>
<id>urn:sha1:579ed34f7b751b8add233cba4cf755258dbdd60a</id>
<content type='text'>
This is the counterpart of raid10 fix. If a write error occurs, raid10
will try to rewrite the bio in small chunk size. If the rewrite fails,
raid10 will record the error in bad block. narrow_write_error will
always use WRITE for the bio, but actually it could be a discard. Since
discard bio hasn't payload, write the bio will cause different issues.
But discard error isn't fatal, we can safely ignore it. This is what
this patch does.

This issue should exist since discard is added, but only exposed with
recent arbitrary bio size feature.

Cc: Sitsofe Wheeler &lt;sitsofe@gmail.com&gt;
Cc: stable@vger.kernel.org (v3.6)
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
<entry>
<title>RAID1: ignore discard error</title>
<updated>2016-10-24T22:28:17Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2016-10-06T21:09:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e3f948cd3283e4fbe5907f1f3967c839912f480e'/>
<id>urn:sha1:e3f948cd3283e4fbe5907f1f3967c839912f480e</id>
<content type='text'>
If a write error occurs, raid1 will try to rewrite the bio in small
chunk size. If the rewrite fails, raid1 will record the error in bad
block. narrow_write_error will always use WRITE for the bio, but
actually it could be a discard. Since discard bio hasn't payload, write
the bio will cause different issues. But discard error isn't fatal, we
can safely ignore it. This is what this patch does.

This issue should exist since discard is added, but only exposed with
recent arbitrary bio size feature.

Reported-and-tested-by: Sitsofe Wheeler &lt;sitsofe@gmail.com&gt;
Cc: stable@vger.kernel.org (v3.6)
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
</content>
</entry>
</feed>
