<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/resource.c, branch v6.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-01-09T19:46:20Z</updated>
<entry>
<title>Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2024-01-09T19:46:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-09T19:46:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9f2a635235823cf016eb8af0aeb3c0b2b25cea64'/>
<id>urn:sha1:9f2a635235823cf016eb8af0aeb3c0b2b25cea64</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:
 "Quite a lot of kexec work this time around. Many singleton patches in
  many places. The notable patch series are:

   - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio
     conversions for file paths'.

   - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2:
     Folio conversions for directory paths'.

   - IA64 remnant removal in Heiko Carstens's 'Remove unused code after
     IA-64 removal'.

   - Arnd Bergmann has enabled the -Wmissing-prototypes warning
     everywhere in 'Treewide: enable -Wmissing-prototypes'. This had
     some followup fixes:

      - Nathan Chancellor has cleaned up the hexagon build in the series
        'hexagon: Fix up instances of -Wmissing-prototypes'.

      - Nathan also addressed some s390 warnings in 's390: A couple of
        fixes for -Wmissing-prototypes'.

      - Arnd Bergmann addresses the same warnings for MIPS in his series
        'mips: address -Wmissing-prototypes warnings'.

   - Baoquan He has made kexec_file operate in a top-down-fitting manner
     similar to kexec_load in the series 'kexec_file: Load kernel at top
     of system RAM if required'

   - Baoquan He has also added the self-explanatory 'kexec_file: print
     out debugging message if required'.

   - Some checkstack maintenance work from Tiezhu Yang in the series
     'Modify some code about checkstack'.

   - Douglas Anderson has disentangled the watchdog code's logging when
     multiple reports are occurring simultaneously. The series is
     'watchdog: Better handling of concurrent lockups'.

   - Yuntao Wang has contributed some maintenance work on the crash code
     in 'crash: Some cleanups and fixes'"

* tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits)
  crash_core: fix and simplify the logic of crash_exclude_mem_range()
  x86/crash: use SZ_1M macro instead of hardcoded value
  x86/crash: remove the unused image parameter from prepare_elf_headers()
  kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE
  scripts/decode_stacktrace.sh: strip unexpected CR from lines
  watchdog: if panicking and we dumped everything, don't re-enable dumping
  watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting
  watchdog/hardlockup: adopt softlockup logic avoiding double-dumps
  kexec_core: fix the assignment to kimage-&gt;control_page
  x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init()
  lib/trace_readwrite.c:: replace asm-generic/io with linux/io
  nilfs2: cpfile: fix some kernel-doc warnings
  stacktrace: fix kernel-doc typo
  scripts/checkstack.pl: fix no space expression between sp and offset
  x86/kexec: fix incorrect argument passed to kexec_dprintk()
  x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs
  nilfs2: add missing set_freezable() for freezable kthread
  kernel: relay: remove relay_file_splice_read dead code, doesn't work
  docs: submit-checklist: remove all of "make namespacecheck"
  ...
</content>
</entry>
<entry>
<title>resource: add walk_system_ram_res_rev()</title>
<updated>2023-12-11T01:21:44Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2023-11-15T13:00:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7acf164b259d9007264d9d8501da1023f140a3b4'/>
<id>urn:sha1:7acf164b259d9007264d9d8501da1023f140a3b4</id>
<content type='text'>
This function, being a variant of walk_system_ram_res() introduced in
commit 8c86e70acead ("resource: provide new functions to walk through
resources"), walks through a list of all the resources of System RAM in
reversed order, i.e., from higher to lower.

It will be used in kexec_file code to load kernel, initrd etc when
preparing kexec reboot.

Link: https://lkml.kernel.org/r/ZVTA6z/06cLnWKUz@MiWiFi-R3L-srv
Signed-off-by: AKASHI Takahiro &lt;takahiro.akashi@linaro.org&gt;
Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/resource: Increment by align value in get_free_mem_region()</title>
<updated>2023-12-05T01:19:03Z</updated>
<author>
<name>Alison Schofield</name>
<email>alison.schofield@intel.com</email>
</author>
<published>2023-11-13T22:13:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=659aa050a53817157b7459529538598a6449c1d3'/>
<id>urn:sha1:659aa050a53817157b7459529538598a6449c1d3</id>
<content type='text'>
Currently get_free_mem_region() searches for available capacity
in increments equal to the region size being requested. This can
cause the search to take giant steps through the resource leaving
needless gaps and missing available space.

Specifically 'cxl create-region' fails with ERANGE even though capacity
of the given size and CXL's expected 256M x InterleaveWays alignment can
be satisfied.

Replace the total-request-size increment with a next alignment increment
so that the next possible address is always examined for availability.

Fixes: 14b80582c43e ("resource: Introduce alloc_free_mem_region()")
Reported-by: Dmytro Adamenko &lt;dmytro.adamenko@intel.com&gt;
Reported-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Alison Schofield &lt;alison.schofield@intel.com&gt;
Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Link: https://lore.kernel.org/r/20231113221324.1118092-1-alison.schofield@intel.com
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>resource: Unify next_resource() and next_resource_skip_children()</title>
<updated>2023-10-05T11:58:07Z</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2023-09-12T16:53:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=10dabdf45ed34caaaad97978306fe6e9ee7581d9'/>
<id>urn:sha1:10dabdf45ed34caaaad97978306fe6e9ee7581d9</id>
<content type='text'>
We have the next_resource() is used once and no user for the
next_resource_skip_children() outside of the for_each_resource().

Unify them by adding skip_children parameter to the next_resource().

Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20230912165312.402422-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>resource: Reuse for_each_resource() macro</title>
<updated>2023-10-05T11:58:07Z</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2023-09-12T16:53:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=441f0dd8fa035a2c7cfe972047bb905d3be05c1b'/>
<id>urn:sha1:441f0dd8fa035a2c7cfe972047bb905d3be05c1b</id>
<content type='text'>
We have a few places where for_each_resource() is open coded.
Replace that by the macro. This makes code easier to read and
understand.

With this, compile r_next() only for CONFIG_PROC_FS=y.

Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20230912165312.402422-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dax/kmem: Fix leak of memory-hotplug resources</title>
<updated>2023-02-17T22:58:01Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2023-02-16T08:36:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e686c32590f40bffc45f105c04c836ffad3e531a'/>
<id>urn:sha1:e686c32590f40bffc45f105c04c836ffad3e531a</id>
<content type='text'>
While experimenting with CXL region removal the following corruption of
/proc/iomem appeared.

Before:
f010000000-f04fffffff : CXL Window 0
  f010000000-f02fffffff : region4
    f010000000-f02fffffff : dax4.0
      f010000000-f02fffffff : System RAM (kmem)

After (modprobe -r cxl_test):
f010000000-f02fffffff : **redacted binary garbage**
  f010000000-f02fffffff : System RAM (kmem)

...and testing further the same is visible with persistent memory
assigned to kmem:

Before:
480000000-243fffffff : Persistent Memory
  480000000-57e1fffff : namespace3.0
  580000000-243fffffff : dax3.0
    580000000-243fffffff : System RAM (kmem)

After (ndctl disable-region all):
480000000-243fffffff : Persistent Memory
  580000000-243fffffff : ***redacted binary garbage***
    580000000-243fffffff : System RAM (kmem)

The corrupted data is from a use-after-free of the "dax4.0" and "dax3.0"
resources, and it also shows that the "System RAM (kmem)" resource is
not being removed. The bug does not appear after "modprobe -r kmem", it
requires the parent of "dax4.0" and "dax3.0" to be removed which
re-parents the leaked "System RAM (kmem)" instances. Those in turn
reference the freed resource as a parent.

First up for the fix is release_mem_region_adjustable() needs to
reliably delete the resource inserted by add_memory_driver_managed().
That is thwarted by a check for IORESOURCE_SYSRAM that predates the
dax/kmem driver, from commit:

65c78784135f ("kernel, resource: check for IORESOURCE_SYSRAM in release_mem_region_adjustable")

That appears to be working around the behavior of HMM's
"MEMORY_DEVICE_PUBLIC" facility that has since been deleted. With that
check removed the "System RAM (kmem)" resource gets removed, but
corruption still occurs occasionally because the "dax" resource is not
reliably removed.

The dax range information is freed before the device is unregistered, so
the driver can not reliably recall (another use after free) what it is
meant to release. Lastly if that use after free got lucky, the driver
was covering up the leak of "System RAM (kmem)" due to its use of
release_resource() which detaches, but does not free, child resources.
The switch to remove_resource() forces remove_memory() to be responsible
for the deletion of the resource added by add_memory_driver_managed().

Fixes: c2f3011ee697 ("device-dax: add an allocation interface for device-dax instances")
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Reviewed-by: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Link: https://lore.kernel.org/r/167653656244.3147810.5705900882794040229.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core</title>
<updated>2022-12-16T11:54:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-12-16T11:54:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=71a7507afbc3f27c346898f13ab9bfd918613c34'/>
<id>urn:sha1:71a7507afbc3f27c346898f13ab9bfd918613c34</id>
<content type='text'>
Pull driver core updates from Greg KH:
 "Here is the set of driver core and kernfs changes for 6.2-rc1.

  The "big" change in here is the addition of a new macro,
  container_of_const() that will preserve the "const-ness" of a pointer
  passed into it.

  The "problem" of the current container_of() macro is that if you pass
  in a "const *", out of it can comes a non-const pointer unless you
  specifically ask for it. For many usages, we want to preserve the
  "const" attribute by using the same call. For a specific example, this
  series changes the kobj_to_dev() macro to use it, allowing it to be
  used no matter what the const value is. This prevents every subsystem
  from having to declare 2 different individual macros (i.e.
  kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
  the const value at build time, which having 2 macros would not do
  either.

  The driver for all of this have been discussions with the Rust kernel
  developers as to how to properly mark driver core, and kobject,
  objects as being "non-mutable". The changes to the kobject and driver
  core in this pull request are the result of that, as there are lots of
  paths where kobjects and device pointers are not modified at all, so
  marking them as "const" allows the compiler to enforce this.

  So, a nice side affect of the Rust development effort has been already
  to clean up the driver core code to be more obvious about object
  rules.

  All of this has been bike-shedded in quite a lot of detail on lkml
  with different names and implementations resulting in the tiny version
  we have in here, much better than my original proposal. Lots of
  subsystem maintainers have acked the changes as well.

  Other than this change, included in here are smaller stuff like:

   - kernfs fixes and updates to handle lock contention better

   - vmlinux.lds.h fixes and updates

   - sysfs and debugfs documentation updates

   - device property updates

  All of these have been in the linux-next tree for quite a while with
  no problems"

* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
  device property: Fix documentation for fwnode_get_next_parent()
  firmware_loader: fix up to_fw_sysfs() to preserve const
  usb.h: take advantage of container_of_const()
  device.h: move kobj_to_dev() to use container_of_const()
  container_of: add container_of_const() that preserves const-ness of the pointer
  driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
  driver core: fix up missed scsi/cxlflash class.devnode() conversion.
  driver core: fix up some missing class.devnode() conversions.
  driver core: make struct class.devnode() take a const *
  driver core: make struct class.dev_uevent() take a const *
  cacheinfo: Remove of_node_put() for fw_token
  device property: Add a blank line in Kconfig of tests
  device property: Rename goto label to be more precise
  device property: Move PROPERTY_ENTRY_BOOL() a bit down
  device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
  kernfs: fix all kernel-doc warnings and multiple typos
  driver core: pass a const * into of_device_uevent()
  kobject: kset_uevent_ops: make name() callback take a const *
  kobject: kset_uevent_ops: make filter() callback take a const *
  kobject: make kobject_namespace take a const *
  ...
</content>
</entry>
<entry>
<title>PCI: Allow drivers to request exclusive config regions</title>
<updated>2022-11-14T18:07:22Z</updated>
<author>
<name>Ira Weiny</name>
<email>ira.weiny@intel.com</email>
</author>
<published>2022-09-26T21:57:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=278294798ac9118412c9624a801d3f20f2279363'/>
<id>urn:sha1:278294798ac9118412c9624a801d3f20f2279363</id>
<content type='text'>
PCI config space access from user space has traditionally been
unrestricted with writes being an understood risk for device operation.

Unfortunately, device breakage or odd behavior from config writes lacks
indicators that can leave driver writers confused when evaluating
failures.  This is especially true with the new PCIe Data Object
Exchange (DOE) mailbox protocol where backdoor shenanigans from user
space through things such as vendor defined protocols may affect device
operation without complete breakage.

A prior proposal restricted read and writes completely.[1]  Greg and
Bjorn pointed out that proposal is flawed for a couple of reasons.
First, lspci should always be allowed and should not interfere with any
device operation.  Second, setpci is a valuable tool that is sometimes
necessary and it should not be completely restricted.[2]  Finally
methods exist for full lock of device access if required.

Even though access should not be restricted it would be nice for driver
writers to be able to flag critical parts of the config space such that
interference from user space can be detected.

Introduce pci_request_config_region_exclusive() to mark exclusive config
regions.  Such regions trigger a warning and kernel taint if accessed
via user space.

Create pci_warn_once() to restrict the user from spamming the log.

[1] https://lore.kernel.org/all/161663543465.1867664.5674061943008380442.stgit@dwillia2-desk3.amr.corp.intel.com/
[2] https://lore.kernel.org/all/YF8NGeGv9vYcMfTV@kroah.com/

Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Suggested-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Link: https://lore.kernel.org/r/20220926215711.2893286-2-ira.weiny@intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>resource: Replace printk(KERN_WARNING) by pr_warn(), printk() by pr_info()</title>
<updated>2022-11-10T17:58:24Z</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2022-11-09T15:56:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2a4e628570d42fcc13a94f1acf25e3cfeaec08f6'/>
<id>urn:sha1:2a4e628570d42fcc13a94f1acf25e3cfeaec08f6</id>
<content type='text'>
Replace printk(KERN_WARNING) by pr_warn() and printk() by pr_info().

While at it, use %pa for the resource_size_t variables. With that,
for the sake of consistency, introduce a temporary variable for
the end address in iomem_map_sanity_check() like it's done in another
function in the same module.

Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Link: https://lore.kernel.org/r/20221109155618.42276-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>resource: Introduce alloc_free_mem_region()</title>
<updated>2022-07-22T00:19:25Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2022-05-20T20:41:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=14b80582c43e4f550acfd93c2b2cadbe36ea0874'/>
<id>urn:sha1:14b80582c43e4f550acfd93c2b2cadbe36ea0874</id>
<content type='text'>
The core of devm_request_free_mem_region() is a helper that searches for
free space in iomem_resource and performs __request_region_locked() on
the result of that search. The policy choices of the implementation
conform to what CONFIG_DEVICE_PRIVATE users want which is memory that is
immediately marked busy, and a preference to search for the first-fit
free range in descending order from the top of the physical address
space.

CXL has a need for a similar allocator, but with the following tweaks:

1/ Search for free space in ascending order

2/ Search for free space relative to a given CXL window

3/ 'insert' rather than 'request' the new resource given downstream
   drivers from the CXL Region driver (like the pmem or dax drivers) are
   responsible for request_mem_region() when they activate the memory
   range.

Rework __request_free_mem_region() into get_free_mem_region() which
takes a set of GFR_* (Get Free Region) flags to control the allocation
policy (ascending vs descending), and "busy" policy (insert_resource()
vs request_region()).

As part of the consolidation of the legacy GFR_REQUEST_REGION case with
the new default of just inserting a new resource into the free space
some minor cleanups like not checking for NULL before calling
devres_free() (which does its own check) is included.

Suggested-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Link: https://lore.kernel.org/linux-cxl/20220420143406.GY2120790@nvidia.com/
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Link: https://lore.kernel.org/r/165784333333.1758207.13703329337805274043.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
</feed>
