<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/edac, branch v5.9</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.9</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.9'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-09-15T07:42:15Z</updated>
<entry>
<title>EDAC/ghes: Check whether the driver is on the safe list correctly</title>
<updated>2020-09-15T07:42:15Z</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2020-09-11T16:17:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=251c54ea26fa6029b01a76161a37a12fde5124e4'/>
<id>urn:sha1:251c54ea26fa6029b01a76161a37a12fde5124e4</id>
<content type='text'>
With CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, a system would try to probe,
unregister and probe again a driver.

When ghes_edac is attempted to be loaded on a system which is not on
the safe platforms list, ghes_edac_register() would return early. The
unregister counterpart ghes_edac_unregister() would still attempt to
unregister and exit early at the refcount test, leading to the refcount
underflow below.

In order to not do *anything* on the unregister path too, reuse the
force_load parameter and check it on that path too, before fumbling with
the refcount.

  ghes_edac: ghes_edac_register: entry
  ghes_edac: ghes_edac_register: return -ENODEV
  ------------[ cut here ]------------
  refcount_t: underflow; use-after-free.
  WARNING: CPU: 10 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xb9/0x100
  Modules linked in:
  CPU: 10 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #12
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  RIP: 0010:refcount_warn_saturate+0xb9/0x100
  Code: 82 e8 fb 8f 4d 00 90 0f 0b 90 90 c3 80 3d 55 4c f5 00 00 75 88 c6 05 4c 4c f5 00 01 90 48 c7 c7 d0 8a 10 82 e8 d8 8f 4d 00 90 &lt;0f&gt; 0b 90 90 c3 80 3d 30 4c f5 00 00 0f 85 61 ff ff ff c6 05 23 4c
  RSP: 0018:ffffc90000037d58 EFLAGS: 00010292
  RAX: 0000000000000026 RBX: ffff88840b8da000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffffffff8216b24f RDI: 00000000ffffffff
  RBP: ffff88840c662e00 R08: 0000000000000001 R09: 0000000000000001
  R10: 0000000000000001 R11: 0000000000000046 R12: 0000000000000000
  R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff88840ee80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000800002211000 CR4: 00000000003506e0
  Call Trace:
   ghes_edac_unregister
   ghes_remove
   platform_drv_remove
   really_probe
   driver_probe_device
   device_driver_attach
   __driver_attach
   ? device_driver_attach
   ? device_driver_attach
   bus_for_each_dev
   bus_add_driver
   driver_register
   ? bert_init
   ghes_init
   do_one_initcall
   ? rcu_read_lock_sched_held
   kernel_init_freeable
   ? rest_init
   kernel_init
   ret_from_fork
   ...
  ghes_edac: ghes_edac_unregister: FALSE, refcount: -1073741824

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200911164950.GB19320@zn.tnic
</content>
</entry>
<entry>
<title>EDAC/ghes: Clear scanned data on unload</title>
<updated>2020-09-15T07:41:28Z</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2020-09-11T10:55:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cd8100f1f3be406a4e0e122e7307ac0fca6d57cc'/>
<id>urn:sha1:cd8100f1f3be406a4e0e122e7307ac0fca6d57cc</id>
<content type='text'>
Commit

  b972fdba8665 ("EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()")

didn't clear all the information from the scanned system and, more
specifically, left ghes_hw.num_dimms to its previous value. On a
second load (CONFIG_DEBUG_TEST_DRIVER_REMOVE=y), the driver would use
the leftover num_dimms value which is not 0 and thus the 0 check in
enumerate_dimms() will get bypassed and it would go directly to the
pointer deref:

  d = &amp;hw-&gt;dimms[hw-&gt;num_dimms];

which is, of course, NULL:

  #PF: supervisor write access in kernel mode
  #PF: error_code(0x0002) - not-present page
  PGD 0 P4D 0
  Oops: 0002 [#1] PREEMPT SMP
  CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #7
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  RIP: 0010:enumerate_dimms.cold+0x7b/0x375

Reset the whole ghes_hw on driver unregister so that no stale values are
used on a second system scan.

Fixes: b972fdba8665 ("EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()")
Cc: Shiju Jose &lt;shiju.jose@huawei.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200911164817.GA19320@zn.tnic
</content>
</entry>
<entry>
<title>Merge tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras</title>
<updated>2020-08-30T17:47:23Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-30T17:47:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=42df60fcdfb8c02fa1800899cb33858e46efd587'/>
<id>urn:sha1:42df60fcdfb8c02fa1800899cb33858e46efd587</id>
<content type='text'>
Pull EDAC fix from Borislav Petkov:
 "A fix to properly clear ghes_edac driver state on driver remove so
  that a subsequent load can probe the system properly (Shiju Jose)"

* tag 'edac_urgent_for_v5.9_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()
</content>
</entry>
<entry>
<title>EDAC/ghes: Fix NULL pointer dereference in ghes_edac_register()</title>
<updated>2020-08-27T16:04:07Z</updated>
<author>
<name>Shiju Jose</name>
<email>shiju.jose@huawei.com</email>
</author>
<published>2020-08-27T14:04:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b972fdba8665d75109ade0357739f46af6415d2a'/>
<id>urn:sha1:b972fdba8665d75109ade0357739f46af6415d2a</id>
<content type='text'>
After

  b9cae27728d1 ("EDAC/ghes: Scan the system once on driver init")

and with CONFIG_DEBUG_TEST_DRIVER_REMOVE enabled, ghes_hw.dimms becomes
a NULL pointer after the second -&gt;probe() (aka ghes_edac_register())
which the config option causes to be called.

This happens because the static variable which holds down whether
the system has been scanned already, doesn't get reset in
ghes_edac_unregister(). Then, on the second probe, ghes_scan_system()
doesn't get to enumerate the DIMMs, leading to ghes_hw.dimms remaining
NULL.

Clear the variable and rename it to something more descriptive so that a
second probe succeeds.

 [ bp: Rewrite commit message. ]

Fixes: b9cae27728d1 ("EDAC/ghes: Scan the system once on driver init")
Suggested-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Shiju Jose &lt;shiju.jose@huawei.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200827140450.1620-1-shiju.jose@huawei.com
</content>
</entry>
<entry>
<title>treewide: Use fallthrough pseudo-keyword</title>
<updated>2020-08-23T22:36:59Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-08-23T22:36:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=df561f6688fef775baa341a0f5d960becd248b11'/>
<id>urn:sha1:df561f6688fef775baa341a0f5d960becd248b11</id>
<content type='text'>
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/{i7core,sb,pnd2,skx}: Fix error event severity</title>
<updated>2020-08-18T13:40:30Z</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2020-07-07T19:43:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=45bc6098a3e279d8e391d22428396687562797e2'/>
<id>urn:sha1:45bc6098a3e279d8e391d22428396687562797e2</id>
<content type='text'>
IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto
the stack as part of machine check delivery is valid or not.

Various drivers copied a code fragment that uses the RIPV bit to
determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED
or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where
RIPV is set as "FATAL").

Reverse the tests so that the error is marked fatal when RIPV is not set.

Reported-by: Gabriele Paoloni &lt;gabriele.paoloni@intel.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20200707194324.14884-1-tony.luck@intel.com
</content>
</entry>
<entry>
<title>Merge tag 'edac_updates_for_5.9_pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras</title>
<updated>2020-08-15T15:25:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-15T15:25:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6ffdcde4ee9a20beda096dec664da89002610d7d'/>
<id>urn:sha1:6ffdcde4ee9a20beda096dec664da89002610d7d</id>
<content type='text'>
Pull edac fix from Tony Luck:
 "Fix for the ie31200 driver that missed the first pull"

* tag 'edac_updates_for_5.9_pt2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/ie31200: Fallback if host bridge device is already initialized
</content>
</entry>
<entry>
<title>EDAC/ie31200: Fallback if host bridge device is already initialized</title>
<updated>2020-08-10T18:13:06Z</updated>
<author>
<name>Jason Baron</name>
<email>jbaron@akamai.com</email>
</author>
<published>2020-07-16T18:25:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=709ed1bcef12398ac1a35c149f3e582db04456c2'/>
<id>urn:sha1:709ed1bcef12398ac1a35c149f3e582db04456c2</id>
<content type='text'>
The Intel uncore driver may claim some of the pci ids from ie31200 which
means that the ie31200 edac driver will not initialize them as part of
pci_register_driver().

Let's add a fallback for this case to 'pci_get_device()' to get a
reference on the device such that it can still be configured. This is
similar in approach to other edac drivers.

Signed-off-by: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: linux-edac &lt;linux-edac@vger.kernel.org&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Link: https://lore.kernel.org/r/1594923911-10885-1-git-send-email-jbaron@akamai.com
</content>
</entry>
<entry>
<title>Merge tag 'edac_updates_for_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras</title>
<updated>2020-08-04T03:01:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-04T03:01:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f8851cb2d0cc2b4f8ac5dadb03148d6b58609b1c'/>
<id>urn:sha1:f8851cb2d0cc2b4f8ac5dadb03148d6b58609b1c</id>
<content type='text'>
Pull EDAC updates from Tony Luck:
 "Boris is on vacation and aske me to send you the EDAC changes"

* tag 'edac_updates_for_5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC: Fix reference count leaks
  EDAC: Remove edac_get_dimm_by_index()
  EDAC/ghes: Scan the system once on driver init
  EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt
  EDAC/ghes: Setup DIMM label from DMI and use it in error reports
  EDAC, {skx,i10nm}: Use CPU stepping macro to pass configurations
  EDAC/mc: Call edac_inc_ue_error() before panic
  EDAC, pnd2: Set MCE_PRIO_EDAC priority for pnd2_mce_dec notifier
</content>
</entry>
<entry>
<title>Merge tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-08-04T00:42:23Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-04T00:42:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e53bc3ff99b413083bfef80d0fdbf7da3a09fc0c'/>
<id>urn:sha1:e53bc3ff99b413083bfef80d0fdbf7da3a09fc0c</id>
<content type='text'>
Pull x86 RAS updates from Ingo Molnar:
 "Boris is on vacation and he asked us to send you the pending RAS bits:

   - Print the PPIN field on CPUs that fill them out

   - Fix an MCE injection bug

   - Simplify a kzalloc in dev_mcelog_init_device()"

* tag 'ras-core-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce, EDAC/mce_amd: Print PPIN in machine check records
  x86/mce/dev-mcelog: Use struct_size() helper in kzalloc()
  x86/mce/inject: Fix a wrong assignment of i_mce.status
</content>
</entry>
</feed>
