<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/dax, branch v5.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-06-05T02:06:23Z</updated>
<entry>
<title>device-dax: add memory via add_memory_driver_managed()</title>
<updated>2020-06-05T02:06:23Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-06-04T23:48:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8a725e4694b52ffad755500277d36f3b2eb34755'/>
<id>urn:sha1:8a725e4694b52ffad755500277d36f3b2eb34755</id>
<content type='text'>
Currently, when adding memory, we create entries in /sys/firmware/memmap/
as "System RAM".  This will lead to kexec-tools to add that memory to the
fixed-up initial memmap for a kexec kernel (loaded via kexec_load()).  The
memory will be considered initial System RAM by the kexec'd kernel and can
no longer be reconfigured.  This is not what happens during a real reboot.

Let's add our memory via add_memory_driver_managed() now, so we won't
create entries in /sys/firmware/memmap/ and indicate the memory as "System
RAM (kmem)" in /proc/iomem.  This allows everybody (especially
kexec-tools) to identify that this memory is special and has to be treated
differently than ordinary (hotplugged) System RAM.

Before configuring the namespace:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-33fffffff : namespace0.0
	3280000000-32ffffffff : PCI Bus 0000:00

After configuring the namespace:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-1481fffff : namespace0.0
	  148200000-33fffffff : dax0.0
	3280000000-32ffffffff : PCI Bus 0000:00

After loading kmem before this change:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-1481fffff : namespace0.0
	  150000000-33fffffff : dax0.0
	    150000000-33fffffff : System RAM
	3280000000-32ffffffff : PCI Bus 0000:00

After loading kmem after this change:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-1481fffff : namespace0.0
	  150000000-33fffffff : dax0.0
	    150000000-33fffffff : System RAM (kmem)
	3280000000-32ffffffff : PCI Bus 0000:00

After a proper reboot:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-1481fffff : namespace0.0
	  148200000-33fffffff : dax0.0
	3280000000-32ffffffff : PCI Bus 0000:00

Within the kexec kernel before this change:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-1481fffff : namespace0.0
	  150000000-33fffffff : System RAM
	3280000000-32ffffffff : PCI Bus 0000:00

Within the kexec kernel after this change:
	[root@localhost ~]# cat /proc/iomem
	...
	140000000-33fffffff : Persistent Memory
	  140000000-1481fffff : namespace0.0
	  148200000-33fffffff : dax0.0
	3280000000-32ffffffff : PCI Bus 0000:00

/sys/firmware/memmap/ before this change:
	0000000000000000-000000000009fc00 (System RAM)
	000000000009fc00-00000000000a0000 (Reserved)
	00000000000f0000-0000000000100000 (Reserved)
	0000000000100000-00000000bffdf000 (System RAM)
	00000000bffdf000-00000000c0000000 (Reserved)
	00000000feffc000-00000000ff000000 (Reserved)
	00000000fffc0000-0000000100000000 (Reserved)
	0000000100000000-0000000140000000 (System RAM)
	0000000150000000-0000000340000000 (System RAM)

/sys/firmware/memmap/ after a proper reboot:
	0000000000000000-000000000009fc00 (System RAM)
	000000000009fc00-00000000000a0000 (Reserved)
	00000000000f0000-0000000000100000 (Reserved)
	0000000000100000-00000000bffdf000 (System RAM)
	00000000bffdf000-00000000c0000000 (Reserved)
	00000000feffc000-00000000ff000000 (Reserved)
	00000000fffc0000-0000000100000000 (Reserved)
	0000000100000000-0000000140000000 (System RAM)

/sys/firmware/memmap/ after this change:
	0000000000000000-000000000009fc00 (System RAM)
	000000000009fc00-00000000000a0000 (Reserved)
	00000000000f0000-0000000000100000 (Reserved)
	0000000000100000-00000000bffdf000 (System RAM)
	00000000bffdf000-00000000c0000000 (Reserved)
	00000000feffc000-00000000ff000000 (Reserved)
	00000000fffc0000-0000000100000000 (Reserved)
	0000000100000000-0000000140000000 (System RAM)

kexec-tools already seem to basically ignore any System RAM that's not on
top level when searching for areas to place kexec images - but also for
determining crash areas to dump via kdump.  Changing the resource name
won't have an impact.

Handle unloading of the driver after memory hotremove failed properly, by
duplicating the string if necessary.

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: http://lkml.kernel.org/r/20200508084217.9160-5-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vfs: track per-sb writeback errors and report them to syncfs</title>
<updated>2020-06-02T17:59:05Z</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2020-06-02T04:45:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=735e4ae5ba28c886d249ad04d3c8cc097dad6336'/>
<id>urn:sha1:735e4ae5ba28c886d249ad04d3c8cc097dad6336</id>
<content type='text'>
Patch series "vfs: have syncfs() return error when there are writeback
errors", v6.

Currently, syncfs does not return errors when one of the inodes fails to
be written back.  It will return errors based on the legacy AS_EIO and
AS_ENOSPC flags when syncing out the block device fails, but that's not
particularly helpful for filesystems that aren't backed by a blockdev.
It's also possible for a stray sync to lose those errors.

The basic idea in this set is to track writeback errors at the
superblock level, so that we can quickly and easily check whether
something bad happened without having to fsync each file individually.
syncfs is then changed to reliably report writeback errors after they
occur, much in the same fashion as fsync does now.

This patch (of 2):

Usually we suggest that applications call fsync when they want to ensure
that all data written to the file has made it to the backing store, but
that can be inefficient when there are a lot of open files.

Calling syncfs on the filesystem can be more efficient in some
situations, but the error reporting doesn't currently work the way most
people expect.  If a single inode on a filesystem reports a writeback
error, syncfs won't necessarily return an error.  syncfs only returns an
error if __sync_blockdev fails, and on some filesystems that's a no-op.

It would be better if syncfs reported an error if there were any
writeback failures.  Then applications could call syncfs to see if there
are any errors on any open files, and could then call fsync on all of
the other descriptors to figure out which one failed.

This patch adds a new errseq_t to struct super_block, and has
mapping_set_error also record writeback errors there.

To report those errors, we also need to keep an errseq_t in struct file
to act as a cursor.  This patch adds a dedicated field for that purpose,
which slots nicely into 4 bytes of padding at the end of struct file on
x86_64.

An earlier version of this patch used an O_PATH file descriptor to cue
the kernel that the open file should track the superblock error and not
the inode's writeback error.

I think that API is just too weird though.  This is simpler and should
make syncfs error reporting "just work" even if someone is multiplexing
fsync and syncfs on the same fds.

Signed-off-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Andres Freund &lt;andres@anarazel.de&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Link: http://lkml.kernel.org/r/20200428135155.19223-1-jlayton@kernel.org
Link: http://lkml.kernel.org/r/20200428135155.19223-2-jlayton@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>device-dax: don't leak kernel memory to user space after unloading kmem</title>
<updated>2020-05-23T17:26:31Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-05-23T05:22:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=60858c00e5f018eda711a3aa84cf62214ef62d61'/>
<id>urn:sha1:60858c00e5f018eda711a3aa84cf62214ef62d61</id>
<content type='text'>
Assume we have kmem configured and loaded:

  [root@localhost ~]# cat /proc/iomem
  ...
  140000000-33fffffff : Persistent Memory$
    140000000-1481fffff : namespace0.0
    150000000-33fffffff : dax0.0
      150000000-33fffffff : System RAM

Assume we try to unload kmem. This force-unloading will work, even if
memory cannot get removed from the system.

  [root@localhost ~]# rmmod kmem
  [   86.380228] removing memory fails, because memory [0x0000000150000000-0x0000000157ffffff] is onlined
  ...
  [   86.431225] kmem dax0.0: DAX region [mem 0x150000000-0x33fffffff] cannot be hotremoved until the next reboot

Now, we can reconfigure the namespace:

  [root@localhost ~]# ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax
  [  131.409351] nd_pmem namespace0.0: could not reserve region [mem 0x140000000-0x33fffffff]dax
  [  131.410147] nd_pmem: probe of namespace0.0 failed with error -16namespace0.0 --mode=devdax
  ...

This fails as expected due to the busy memory resource, and the memory
cannot be used.  However, the dax0.0 device is removed, and along its
name.

The name of the memory resource now points at freed memory (name of the
device):

  [root@localhost ~]# cat /proc/iomem
  ...
  140000000-33fffffff : Persistent Memory
    140000000-1481fffff : namespace0.0
    150000000-33fffffff : �_�^7_��/_��wR��WQ���^��� ...
    150000000-33fffffff : System RAM

We have to make sure to duplicate the string.  While at it, remove the
superfluous setting of the name and fixup a stale comment.

Fixes: 9f960da72b25 ("device-dax: "Hotremove" persistent memory that is used like normal RAM")
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[5.3]
Link: http://lkml.kernel.org/r/20200508084217.9160-2-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>dax: Move mandatory -&gt;zero_page_range() check in alloc_dax()</title>
<updated>2020-04-03T02:15:03Z</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2020-04-01T16:11:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e4ced93794acb42adb19484132966defba8f3a6'/>
<id>urn:sha1:4e4ced93794acb42adb19484132966defba8f3a6</id>
<content type='text'>
zero_page_range() dax operation is mandatory for dax devices. Right now
that check happens in dax_zero_page_range() function. Dan thinks that's
too late and its better to do the check earlier in alloc_dax().

I also modified alloc_dax() to return pointer with error code in it in
case of failure. Right now it returns NULL and caller assumes failure
happened due to -ENOMEM. But with this -&gt;zero_page_range() check, I
need to return -EINVAL instead.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Link: https://lore.kernel.org/r/20200401161125.GB9398@redhat.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>dax, pmem: Add a dax operation zero_page_range</title>
<updated>2020-04-03T02:15:03Z</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2020-02-28T16:34:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f605a263e0690177ecc180417eacf2b5507dd177'/>
<id>urn:sha1:f605a263e0690177ecc180417eacf2b5507dd177</id>
<content type='text'>
Add a dax operation zero_page_range, to zero a page. This will also clear any
known poison in the page being zeroed.

As of now, zeroing of one page is allowed in a single call. There
are no callers which are trying to zero more than a page in a single call.
Once we grow the callers which zero more than a page in single call, we
can add that support. Primary reason for not doing that yet is that this
will add little complexity in dm implementation where a range might be
spanning multiple underlying targets and one will have to split the range
into multiple sub ranges and call zero_page_range() on individual targets.

Suggested-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Link: https://lore.kernel.org/r/20200228163456.1587-3-vgoyal@redhat.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>dax: Get rid of fs_dax_get_by_host() helper</title>
<updated>2020-01-16T17:52:27Z</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2020-01-06T18:11:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f01b16a85bfae2e6b4f32de0a1f37ac4050dc316'/>
<id>urn:sha1:f01b16a85bfae2e6b4f32de0a1f37ac4050dc316</id>
<content type='text'>
Looks like nobody is using fs_dax_get_by_host() except fs_dax_get_by_bdev()
and it can easily use dax_get_by_host() instead.

IIUC, fs_dax_get_by_host() was only introduced so that one could compile
with CONFIG_FS_DAX=n and CONFIG_DAX=m. fs_dax_get_by_bdev() achieves
the same purpose and hence it looks like fs_dax_get_by_host() is not
needed anymore.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20200106181117.GA16248@redhat.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'libnvdimm-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm</title>
<updated>2019-12-02T02:43:25Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-12-02T02:43:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d10032dd539c93dbff016f5667e5627c6c2a4467'/>
<id>urn:sha1:d10032dd539c93dbff016f5667e5627c6c2a4467</id>
<content type='text'>
Pull libnvdimm updates from Dan Williams:
 "The highlight this cycle is continuing integration fixes for PowerPC
  and some resulting optimizations.

  Summary:

   - Updates to better support vmalloc space restrictions on PowerPC
     platforms.

   - Cleanups to move common sysfs attributes to core 'struct
     device_type' objects.

   - Export the 'target_node' attribute (the effective numa node if pmem
     is marked online) for regions and namespaces.

   - Miscellaneous fixups and optimizations"

* tag 'libnvdimm-for-5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (21 commits)
  MAINTAINERS: Remove Keith from NVDIMM maintainers
  libnvdimm: Export the target_node attribute for regions and namespaces
  dax: Add numa_node to the default device-dax attributes
  libnvdimm: Simplify root read-only definition for the 'resource' attribute
  dax: Simplify root read-only definition for the 'resource' attribute
  dax: Create a dax device_type
  libnvdimm: Move nvdimm_bus_attribute_group to device_type
  libnvdimm: Move nvdimm_attribute_group to device_type
  libnvdimm: Move nd_mapping_attribute_group to device_type
  libnvdimm: Move nd_region_attribute_group to device_type
  libnvdimm: Move nd_numa_attribute_group to device_type
  libnvdimm: Move nd_device_attribute_group to device_type
  libnvdimm: Move region attribute group definition
  libnvdimm: Move attribute groups to device type
  libnvdimm: Remove prototypes for nonexistent functions
  libnvdimm/btt: fix variable 'rc' set but not used
  libnvdimm/pmem: Delete include of nd-core.h
  libnvdimm/namespace: Differentiate between probe mapping and runtime mapping
  libnvdimm/pfn_dev: Don't clear device memmap area during generic namespace probe
  libnvdimm: Trivial comment fix
  ...
</content>
</entry>
<entry>
<title>dax: Add numa_node to the default device-dax attributes</title>
<updated>2019-11-19T17:52:12Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2019-11-13T01:13:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cb4dd729ee6ccc67ad604b1750990eb8c18783fa'/>
<id>urn:sha1:cb4dd729ee6ccc67ad604b1750990eb8c18783fa</id>
<content type='text'>
It is confusing that device-dax instances publish a 'target_node'
attribute, but not a 'numa_node'. The 'numa_node' information is
available elsewhere in the sysfs device hierarchy, but it is not obvious
and not reliable from one device-dax instance-type (e.g. child devices
of nvdimm namespaces) to the next (e.g. 'hmem' devices defined by EFI
Specific Purpose Memory and the ACPI HMAT).

Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/157309906102.1582359.4262088001244476001.stgit@dwillia2-desk3.amr.corp.intel.com</content>
</entry>
<entry>
<title>dax: Simplify root read-only definition for the 'resource' attribute</title>
<updated>2019-11-19T17:52:12Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2019-11-13T01:12:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=153dd28647d63086dc6e1df5ee06fd0a5d6435a5'/>
<id>urn:sha1:153dd28647d63086dc6e1df5ee06fd0a5d6435a5</id>
<content type='text'>
Rather than update the permission in -&gt;is_visible() set the permission
directly at declaration time.

Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/157309904959.1582359.7281180042781955506.stgit@dwillia2-desk3.amr.corp.intel.com</content>
</entry>
<entry>
<title>dax: Create a dax device_type</title>
<updated>2019-11-19T17:52:12Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2019-11-13T01:12:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=770619a95106340230a72a725c958c037284ec1f'/>
<id>urn:sha1:770619a95106340230a72a725c958c037284ec1f</id>
<content type='text'>
Move the open coded release method and attribute groups to a 'struct
device_type' instance.

Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/157309904365.1582359.5451327195246651379.stgit@dwillia2-desk3.amr.corp.intel.com</content>
</entry>
</feed>
