<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/block/drbd, branch v3.16</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.16</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.16'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2014-07-10T09:06:03Z</updated>
<entry>
<title>drbd: fix regression 'out of mem, failed to invoke fence-peer helper'</title>
<updated>2014-07-10T09:06:03Z</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-07-09T19:18:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9'/>
<id>urn:sha1:bbc1c5e8ad6dfebf9d13b8a4ccdf66c92913eac9</id>
<content type='text'>
Since linux kernel 3.13, kthread_run() internally uses
wait_for_completion_killable().  We sometimes may use kthread_run()
while we still have a signal pending, which we used to kick our threads
out of potentially blocking network functions, causing kthread_run() to
mistake that as a new fatal signal and fail.

Fix: flush_signals() before kthread_run().

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: fix NULL pointer deref in blk_add_request_payload</title>
<updated>2014-06-25T15:53:47Z</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-06-25T15:52:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=54ed4ed8f9a5071a3bbae2e037376d8c02b9629b'/>
<id>urn:sha1:54ed4ed8f9a5071a3bbae2e037376d8c02b9629b</id>
<content type='text'>
Discards don't have any payload.
But the scsi layer still expects a bio_vec it can use internally,
see sd_setup_discard_cmnd() and blk_add_request_payload().

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: use list_first_entry_or_null in first_peer_device/first_connection</title>
<updated>2014-04-30T19:46:56Z</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-04-28T16:43:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ec4a340789be16831ae96be5f7552238a7a6e903'/>
<id>urn:sha1:ec4a340789be16831ae96be5f7552238a7a6e903</id>
<content type='text'>
If there are no peer_devices or connections, I'd rather have NULL
than some "arbitrary" address pretending to point to a struct.

Helps to avoid hard to debug symptoms, in case we ever try to use
and dereference a drbd_connection or drbd_peer_device
where we in fact don't have any connection at all.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: Allow attaching of a newly created device to any backing device</title>
<updated>2014-04-30T19:46:56Z</updated>
<author>
<name>Philipp Reisner</name>
<email>philipp.reisner@linbit.com</email>
</author>
<published>2014-04-28T16:43:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=babea49ebe010d0a533b5db20fa63c327402a71c'/>
<id>urn:sha1:babea49ebe010d0a533b5db20fa63c327402a71c</id>
<content type='text'>
A newly created device was never exposed before, i.e. has a
exposed_data_uuid of 0. Then it is valid to attach to any current_uuid
of a backing device (of course also to a newly created one (4))

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: Test cstate while holding req_lock</title>
<updated>2014-04-30T19:46:56Z</updated>
<author>
<name>Philipp Reisner</name>
<email>philipp.reisner@linbit.com</email>
</author>
<published>2014-04-28T16:43:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=02df6fe145715f1d3858c0c65aed991f148b70b4'/>
<id>urn:sha1:02df6fe145715f1d3858c0c65aed991f148b70b4</id>
<content type='text'>
In case a connection transitions into C_TIMEOUT within the timer
function (request_timer_fn()) we need to make sure that the receiver
thread (potentially running on a different CPU) sees the updated
cstate later on.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: use blk_set_stacking_limits()</title>
<updated>2014-04-30T19:46:55Z</updated>
<author>
<name>Philipp Reisner</name>
<email>philipp.reisner@linbit.com</email>
</author>
<published>2014-04-28T16:43:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1b3156f121fd301191e0b4c5fa2fec42cd17871'/>
<id>urn:sha1:c1b3156f121fd301191e0b4c5fa2fec42cd17871</id>
<content type='text'>
...instead directly assigning to q-&gt;limits.discard_zeroes_data

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: evaluate disk and network timeout on different requests</title>
<updated>2014-04-30T19:46:55Z</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars.ellenberg@linbit.com</email>
</author>
<published>2014-04-28T16:43:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=08535466bce6bd91320990b9a614d52a3dc0f21d'/>
<id>urn:sha1:08535466bce6bd91320990b9a614d52a3dc0f21d</id>
<content type='text'>
Just because it is the oldest not yet completed request
does not make it the oldest request waiting for disk.
Or waiting for the peer.

And we completely missed already completed requests
that would still hold references to activity log extents,
waiting only for the barrier ack.

Find two oldest not yet completely processed requests,
one that is still waiting for local completion,
and one that is still waiting for some response from the peer.
These may or may not be the same request object.

Then separately apply the network and disk timeouts, respectively.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: Fix a hole in the challange-response connection authentication</title>
<updated>2014-04-30T19:46:55Z</updated>
<author>
<name>Philipp Reisner</name>
<email>philipp.reisner@linbit.com</email>
</author>
<published>2014-04-28T16:43:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=67cca286caa6e33f3134bd36834d2484538f4f78'/>
<id>urn:sha1:67cca286caa6e33f3134bd36834d2484538f4f78</id>
<content type='text'>
In the implementation as it was, the two peers sent each other
a challenge, and expects the challenge hashed with the shared
secret back.

A attacker could simply wait for the challenge of the peer, and
send the same challenge back. Then it waits for the response, and
sends the same response back.

Prevent this by not accepting a challenge from the peer that is
the same as the challenge sent to the peer.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: always implicitly close last epoch when idle</title>
<updated>2014-04-30T19:46:55Z</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars@linbit.com</email>
</author>
<published>2014-04-28T16:43:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f9c78128f833ae3057884ca219259c8ae5db8898'/>
<id>urn:sha1:f9c78128f833ae3057884ca219259c8ae5db8898</id>
<content type='text'>
Once our sender thread needs to wait_for_work(),
and actually needs to schedule(), just before we do that,
we already check if it is useful to implicitly close the last epoch.

The condition was too strict: only implicitly close the epoch,
if there have been no new (write) requests at all.

The assumption was that if there were new requests, they would
always be communicated one way or another, and would send necessary
epoch separating barriers explicitly.

This is not always true, e.g. when becoming diskless,
or while explicitly starting a full resync.

The last communicated epoch could stay open for a long time,
locking down corresponding activity log extents.

It is safe to always implicitly send that last barrier, as soon as we
determin that there cannot be more requests in the last communicated
epoch, even if there have been (uncommunicated) new requests in new
epochs meanwhile.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>drbd: add back some fairness to AL transactions</title>
<updated>2014-04-30T19:46:55Z</updated>
<author>
<name>Lars Ellenberg</name>
<email>lars@linbit.com</email>
</author>
<published>2014-04-28T16:43:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e4d7d6f4d36daff6aad84f96e48debde8e6ed09e'/>
<id>urn:sha1:e4d7d6f4d36daff6aad84f96e48debde8e6ed09e</id>
<content type='text'>
When batching more updates to the activity log into single transactions,
we lost the ability for new requests to force themselves into the active
set: all preparation steps became non-blocking, and if all currently
hot extents keep busy, they could starve out new incoming requests
to cold extents for quite a while.

This can only happen if your IO backend accepts more IO operations per
average DRBD replication round trip time than you have al-extents
configured.

If we have incoming requests to cold extents,
at least do one blocking update per transaction.

In an artificial worst-case workload on SSD with an asynchronous 600 ms
replication link, with al-extents = 7 (the minimum we allow), and
concurrent full resynch, without this patch, some write requests have
been observed to be starved for 40 seconds.
With this patch, application observed a worst case latency of twice the
replication round trip time.

Signed-off-by: Philipp Reisner &lt;philipp.reisner@linbit.com&gt;
Signed-off-by: Lars Ellenberg &lt;lars.ellenberg@linbit.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
</feed>
