<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ceph, branch v4.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-04-07T16:08:35Z</updated>
<entry>
<title>Revert "libceph: use memalloc flags for net IO"</title>
<updated>2015-04-07T16:08:35Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-04-02T11:40:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6d7fdb0ab351b33d4c12d53fe44be030b90fc9d4'/>
<id>urn:sha1:6d7fdb0ab351b33d4c12d53fe44be030b90fc9d4</id>
<content type='text'>
This reverts commit 89baaa570ab0b476db09408d209578cfed700e9f.

Dirty page throttling should be sufficient for us in the general case
so there is no need to use __GFP_MEMALLOC - it would be needed only in
the swap-over-rbd case, which we currently don't support.  (It would
probably take approximately the commit that is being reverted to add
that support, but we would also need the "swap" option to distinguish
from the general case and make sure swap ceph_client-s aren't shared
with anything else.)  See ceph-devel threads [1] and [2] for the
details of why enabling pfmemalloc reserves for all cases is a bad
thing.

On top of potential system lockups related to drained emergency
reserves, this turned out to cause ceph lockups in case peers are on
the same host and communicating via loopback due to sk_filter()
dropping pfmemalloc skbs on the receiving side because the receiving
loopback socket is not tagged with SOCK_MEMALLOC.

[1] "SOCK_MEMALLOC vs loopback"
    http://www.spinics.net/lists/ceph-devel/msg22998.html
[2] "[PATCH] libceph: don't set memalloc flags in loopback case"
    http://www.spinics.net/lists/ceph-devel/msg23392.html

Conflicts:
	net/ceph/messenger.c [ context: tcp_nodelay option ]

Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Sage Weil &lt;sage@redhat.com&gt;
Cc: stable@vger.kernel.org # 3.18+, needs backporting
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Acked-by: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client</title>
<updated>2015-02-19T22:14:42Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-02-19T22:14:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4533f6e27a366ecc3da4876074ebfe0cc0ea4f0f'/>
<id>urn:sha1:4533f6e27a366ecc3da4876074ebfe0cc0ea4f0f</id>
<content type='text'>
Pull Ceph changes from Sage Weil:
 "On the RBD side, there is a conversion to blk-mq from Christoph,
  several long-standing bug fixes from Ilya, and some cleanup from
  Rickard Strandqvist.

  On the CephFS side there is a long list of fixes from Zheng, including
  improved session handling, a few IO path fixes, some dcache management
  correctness fixes, and several blocking while !TASK_RUNNING fixes.

  The core code gets a few cleanups and Chaitanya has added support for
  TCP_NODELAY (which has been used on the server side for ages but we
  somehow missed on the kernel client).

  There is also an update to MAINTAINERS to fix up some email addresses
  and reflect that Ilya and Zheng are doing most of the maintenance for
  RBD and CephFS these days.  Do not be surprised to see a pull request
  come from one of them in the future if I am unavailable for some
  reason"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits)
  MAINTAINERS: update Ceph and RBD maintainers
  libceph: kfree() in put_osd() shouldn't depend on authorizer
  libceph: fix double __remove_osd() problem
  rbd: convert to blk-mq
  ceph: return error for traceless reply race
  ceph: fix dentry leaks
  ceph: re-send requests when MDS enters reconnecting stage
  ceph: show nocephx_require_signatures and notcp_nodelay options
  libceph: tcp_nodelay support
  rbd: do not treat standalone as flatten
  ceph: fix atomic_open snapdir
  ceph: properly mark empty directory as complete
  client: include kernel version in client metadata
  ceph: provide seperate {inode,file}_operations for snapdir
  ceph: fix request time stamp encoding
  ceph: fix reading inline data when i_size &gt; PAGE_SIZE
  ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_close_sessions)
  ceph: avoid block operation when !TASK_RUNNING (ceph_get_caps)
  ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_sync)
  rbd: fix error paths in rbd_dev_refresh()
  ...
</content>
</entry>
<entry>
<title>libceph: kfree() in put_osd() shouldn't depend on authorizer</title>
<updated>2015-02-19T11:27:51Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-02-16T08:49:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b28ec2f37e6a2bbd0bdf74b39cb89c74e4ad17f3'/>
<id>urn:sha1:b28ec2f37e6a2bbd0bdf74b39cb89c74e4ad17f3</id>
<content type='text'>
a255651d4cad ("ceph: ensure auth ops are defined before use") made
kfree() in put_osd() conditional on the authorizer.  A mechanical
mistake most likely - fix it.

Cc: Alex Elder &lt;elder@linaro.org&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
</entry>
<entry>
<title>libceph: fix double __remove_osd() problem</title>
<updated>2015-02-19T11:27:50Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2015-02-17T16:37:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7eb71e0351fbb1b242ae70abb7bb17107fe2f792'/>
<id>urn:sha1:7eb71e0351fbb1b242ae70abb7bb17107fe2f792</id>
<content type='text'>
It turns out it's possible to get __remove_osd() called twice on the
same OSD.  That doesn't sit well with rb_erase() - depending on the
shape of the tree we can get a NULL dereference, a soft lockup or
a random crash at some point in the future as we end up touching freed
memory.  One scenario that I was able to reproduce is as follows:

            &lt;osd3 is idle, on the osd lru list&gt;
&lt;con reset - osd3&gt;
con_fault_finish()
  osd_reset()
                              &lt;osdmap - osd3 down&gt;
                              ceph_osdc_handle_map()
                                &lt;takes map_sem&gt;
                                kick_requests()
                                  &lt;takes request_mutex&gt;
                                  reset_changed_osds()
                                    __reset_osd()
                                      __remove_osd()
                                  &lt;releases request_mutex&gt;
                                &lt;releases map_sem&gt;
    &lt;takes map_sem&gt;
    &lt;takes request_mutex&gt;
    __kick_osd_requests()
      __reset_osd()
        __remove_osd() &lt;-- !!!

A case can be made that osd refcounting is imperfect and reworking it
would be a proper resolution, but for now Sage and I decided to fix
this by adding a safe guard around __remove_osd().

Fixes: http://tracker.ceph.com/issues/8087

Cc: Sage Weil &lt;sage@redhat.com&gt;
Cc: stable@vger.kernel.org # 3.9+: 7c6e6fc53e73: libceph: assert both regular and lingering lists in __remove_osd()
Cc: stable@vger.kernel.org # 3.9+: cc9f1f518cec: libceph: change from BUG to WARN for __remove_osd() asserts
Cc: stable@vger.kernel.org # 3.9+
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Sage Weil &lt;sage@redhat.com&gt;
Reviewed-by: Alex Elder &lt;elder@linaro.org&gt;
</content>
</entry>
<entry>
<title>libceph: tcp_nodelay support</title>
<updated>2015-02-19T10:31:40Z</updated>
<author>
<name>Chaitanya Huilgol</name>
<email>chaitanya.huilgol@gmail.com</email>
</author>
<published>2015-01-23T11:11:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ba988f87f532cd2b8c4740aa8ec49056521ae833'/>
<id>urn:sha1:ba988f87f532cd2b8c4740aa8ec49056521ae833</id>
<content type='text'>
TCP_NODELAY socket option set on connection sockets,
disables Nagle’s algorithm and improves latency characteristics.
tcp_nodelay(default)/notcp_nodelay option flags provided to
enable/disable setting the socket option.

Signed-off-by: Chaitanya Huilgol &lt;chaitanya.huilgol@sandisk.com&gt;
[idryomov@redhat.com: NO_TCP_NODELAY -&gt; TCP_NODELAY, minor adjustments]
Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
</content>
</entry>
<entry>
<title>libceph: use mon_client.c/put_generic_request() more</title>
<updated>2015-02-19T10:31:37Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@redhat.com</email>
</author>
<published>2014-12-22T16:35:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f646912d1044ace6c6c5b5785b2579177781873b'/>
<id>urn:sha1:f646912d1044ace6c6c5b5785b2579177781873b</id>
<content type='text'>
Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
</content>
</entry>
<entry>
<title>libceph: nuke pool op infrastructure</title>
<updated>2015-02-19T10:31:37Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@redhat.com</email>
</author>
<published>2014-12-22T16:14:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7a6fdeb2b1e93548854063c46c9724458564c76b'/>
<id>urn:sha1:7a6fdeb2b1e93548854063c46c9724458564c76b</id>
<content type='text'>
On Mon, Dec 22, 2014 at 5:35 PM, Sage Weil &lt;sage@newdream.net&gt; wrote:
&gt; On Mon, 22 Dec 2014, Ilya Dryomov wrote:
&gt;&gt; Actually, pool op stuff has been unused for over two years - looks like
&gt;&gt; it was added for rbd create_snap and that got ripped out in 2012.  It's
&gt;&gt; unlikely we'd ever need to manage pools or snaps from the kernel client
&gt;&gt; so I think it makes sense to nuke it.  Sage?
&gt;
&gt; Yep!

Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
</content>
</entry>
<entry>
<title>mm: gup: use get_user_pages_unlocked</title>
<updated>2015-02-12T01:06:05Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2015-02-11T23:27:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7e339128496284cc21977fba5416166ee81f5172'/>
<id>urn:sha1:7e339128496284cc21977fba5416166ee81f5172</id>
<content type='text'>
This allows those get_user_pages calls to pass FAULT_FLAG_ALLOW_RETRY to
the page fault in order to release the mmap_sem during the I/O.

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reviewed-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Andres Lagar-Cavilla &lt;andreslc@google.com&gt;
Cc: Peter Feiner &lt;pfeiner@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>libceph: fix sparse endianness warnings</title>
<updated>2015-01-08T17:36:57Z</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@redhat.com</email>
</author>
<published>2014-12-19T11:00:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d7d5a007b1c64c617ce3ee30c973ed0bb93443d9'/>
<id>urn:sha1:d7d5a007b1c64c617ce3ee30c973ed0bb93443d9</id>
<content type='text'>
The only real issue is the one in auth_x.c and it came with
3.19-rc1 merge.

Signed-off-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
</content>
</entry>
<entry>
<title>libceph: specify position of extent operation</title>
<updated>2014-12-17T17:09:52Z</updated>
<author>
<name>Yan, Zheng</name>
<email>zyan@redhat.com</email>
</author>
<published>2014-11-13T06:40:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=715e4cd405cfd67bd058e410b3e599bab2072645'/>
<id>urn:sha1:715e4cd405cfd67bd058e410b3e599bab2072645</id>
<content type='text'>
allow specifying position of extent operation in multi-operations
osd request. This is required for cephfs to convert inline data to
normal data (compare xattr, then write object).

Signed-off-by: Yan, Zheng &lt;zyan@redhat.com&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@redhat.com&gt;
</content>
</entry>
</feed>
