<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/edac, branch v3.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-06-11T15:43:16Z</updated>
<entry>
<title>edac: Do alignment logic properly in edac_align_ptr()</title>
<updated>2012-06-11T15:43:16Z</updated>
<author>
<name>Chris Metcalf</name>
<email>cmetcalf@tilera.com</email>
</author>
<published>2012-06-06T17:11:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8447c4d15e357a458c9051ddc84aa6c8b9c27000'/>
<id>urn:sha1:8447c4d15e357a458c9051ddc84aa6c8b9c27000</id>
<content type='text'>
The logic was checking the size of the structure being allocated to
determine whether an alignment fixup was required.  This isn't right;
what we actually care about is the alignment of the actual pointer that's
about to be returned.  This became an issue recently because struct
edac_mc_layer has a size that is not zero modulo eight, so we were
taking the correctly-aligned pointer and forcing it to be misaligned.
On Tile this caused an alignment exception.
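
As an illustration only, the idea of aligning on the pointer value
rather than on the structure size can be sketched as below. The name
align_up is hypothetical and this is not the kernel's edac_align_ptr()
itself; it only assumes that the fixup should depend on the address.

```c
/* Hypothetical helper, not the kernel's edac_align_ptr(): round an
 * address up to the next multiple of align, based on the address
 * itself rather than on the size of the object stored there. */
unsigned long align_up(unsigned long addr, unsigned long align)
{
	unsigned long rem = addr % align;

	if (rem == 0)
		return addr;	/* already aligned: leave it alone */
	return addr + (align - rem);
}
```

With this sketch, an address that is already a multiple of eight is
returned unchanged, whatever the size of the structure placed there.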

Signed-off-by: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>mpc85xx_edac: fix error: too few arguments to function 'edac_mc_alloc'</title>
<updated>2012-06-11T14:49:51Z</updated>
<author>
<name>Kim Phillips</name>
<email>kim.phillips@freescale.com</email>
</author>
<published>2012-06-07T00:49:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b9bc5ddb1b76d3f7ee14c533300aa95907c6969e'/>
<id>urn:sha1:b9bc5ddb1b76d3f7ee14c533300aa95907c6969e</id>
<content type='text'>
Commit ca0907b "edac: Remove the legacy EDAC ABI" broke mpc85xx_edac
in the following manner:

mpc85xx_edac.c:983:35: error: too few arguments to function 'edac_mc_alloc'

This patch puts back the missing 'layers' argument.

[mchehab@redhat.com: As Ben sent a similar fix, I added his SOB on this patch]
Signed-off-by: Kim Phillips &lt;kim.phillips@freescale.com&gt;
Signed-off-by: Ben Collins &lt;bcollins@ubuntu.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>edac: fix the error about memory type detection on SandyBridge</title>
<updated>2012-06-11T14:49:51Z</updated>
<author>
<name>Chen Gong</name>
<email>gong.chen@linux.intel.com</email>
</author>
<published>2012-05-14T08:51:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2cbb587d3bc41a305168e91b4f3c5b6944a12566'/>
<id>urn:sha1:2cbb587d3bc41a305168e91b4f3c5b6944a12566</id>
<content type='text'>
On SandyBridge, DDRIOA(Dev: 17 Func: 0 Offset: 328) is used
to detect whether DIMM is RDIMM/LRDIMM, not TA(Dev: 15 Func: 0).

Signed-off-by: Chen Gong &lt;gong.chen@linux.intel.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>edac: avoid mce decoding crash after edac driver unloaded</title>
<updated>2012-06-11T14:49:51Z</updated>
<author>
<name>Chen Gong</name>
<email>gong.chen@linux.intel.com</email>
</author>
<published>2012-05-08T23:40:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e35fca4791fcdd43dc1fd769797df40c562ab491'/>
<id>urn:sha1:e35fca4791fcdd43dc1fd769797df40c562ab491</id>
<content type='text'>
Some edac drivers register themselves as mce decoders via
notifier_chain. But the current notifier_chain implementation does not
support registering the same notifier twice: doing so corrupts the
list when the element is added or removed. For example, on one
SandyBridge platform, removing module sb_edac and then triggering
one error causes an oops, because no mce decoder is registered any
more but the related notifier_chain still points to an invalid
callback function. Here is an example:

Call Trace:
 [&lt;ffffffff8150ef6a&gt;] atomic_notifier_call_chain+0x1a/0x20
 [&lt;ffffffff8102b936&gt;] mce_log+0x46/0x180
 [&lt;ffffffff8102eaea&gt;] apei_mce_report_mem_error+0x4a/0x60
 [&lt;ffffffff812e19d2&gt;] ghes_do_proc+0x192/0x210
 [&lt;ffffffff812e2066&gt;] ghes_proc+0x46/0x70
 [&lt;ffffffff812e20d8&gt;] ghes_notify_sci+0x48/0x80
 [&lt;ffffffff8150ef05&gt;] notifier_call_chain+0x55/0x80
 [&lt;ffffffff81076f1a&gt;] __blocking_notifier_call_chain+0x5a/0x80
 [&lt;ffffffff812aea11&gt;] ? acpi_os_wait_events_complete+0x23/0x23
 [&lt;ffffffff81076f56&gt;] blocking_notifier_call_chain+0x16/0x20
 [&lt;ffffffff812ddc4d&gt;] acpi_hed_notify+0x19/0x1b
 [&lt;ffffffff812b16bd&gt;] acpi_device_notify+0x19/0x1b
 [&lt;ffffffff812beb38&gt;] acpi_ev_notify_dispatch+0x67/0x7f
 [&lt;ffffffff812aea3a&gt;] acpi_os_execute_deferred+0x29/0x36
 [&lt;ffffffff81069dc2&gt;] process_one_work+0x132/0x450
 [&lt;ffffffff8106bbcb&gt;] worker_thread+0x17b/0x3c0
 [&lt;ffffffff8106ba50&gt;] ? manage_workers+0x120/0x120
 [&lt;ffffffff81070aee&gt;] kthread+0x9e/0xb0
 [&lt;ffffffff81514724&gt;] kernel_thread_helper+0x4/0x10
 [&lt;ffffffff81070a50&gt;] ? kthread_freezable_should_stop+0x70/0x70
 [&lt;ffffffff81514720&gt;] ? gs_change+0x13/0x13
Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d &lt;4c&gt; 8b
79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
RIP  [&lt;ffffffff8150eef6&gt;] notifier_call_chain+0x46/0x80
 RSP &lt;ffff88042868fb20&gt;
CR2: ffffffffa01af838
---[ end trace 0100930068e73e6f ]---
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [&lt;ffffffff810705b0&gt;] kthread_data+0x10/0x20
PGD 1a0d067 PUD 1a0e067 PMD 0
Oops: 0000 [#2] SMP

Only i7core_edac and sb_edac have this issue, because they have more
than one memory controller, which means they register the mce
decoder several times.
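
A minimal sketch of why a double registration leaves a stale entry
behind is given below. The struct and function names are hypothetical
and the list is a simplified push-front one; the real kernel chain is
priority-ordered, so this only illustrates the failure mode.

```c
/* Simplified stand-ins for a notifier block and its chain; the names
 * are hypothetical and the real kernel list is priority-ordered. */
struct node { struct node *next; };
struct chain { struct node *head; };

/* Push-front registration with no duplicate check. */
void chain_register(struct chain *c, struct node *n)
{
	n->next = c->head;
	c->head = n;
}

/* Unlink the first occurrence of n, if present. */
void chain_unregister(struct chain *c, struct node *n)
{
	struct node *prev = 0;
	struct node *cur = c->head;

	while (cur != 0) {
		if (cur == n) {
			if (prev != 0)
				prev->next = cur->next;
			else
				c->head = cur->next;
			return;
		}
		prev = cur;
		cur = cur->next;
	}
}
```

In this sketch, registering the same node twice makes it point to
itself, so a single unregister leaves the head still pointing at the
node. Once the module owning that node is unloaded, walking the chain
follows a dangling pointer.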

Cc: &lt;stable@vger.kernel.org&gt; # 3.2 and upper
Signed-off-by: Chen Gong &lt;gong.chen@linux.intel.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'x86/trampoline' into x86/urgent</title>
<updated>2012-05-30T19:11:32Z</updated>
<author>
<name>H. Peter Anvin</name>
<email>hpa@zytor.com</email>
</author>
<published>2012-05-30T19:11:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bbd771474ec44b516107685d77e1c80bbe09f141'/>
<id>urn:sha1:bbd771474ec44b516107685d77e1c80bbe09f141</id>
<content type='text'>
x86/trampoline contains an urgent commit which is necessarily on a
newer baseline.

Signed-off-by: H. Peter Anvin &lt;hpa@zytor.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'x86/mce' into x86/urgent</title>
<updated>2012-05-30T12:12:06Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2012-05-30T12:12:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=403e1c5b7495d7b80fae9fc4d0a7a6f5abdc3307'/>
<id>urn:sha1:403e1c5b7495d7b80fae9fc4d0a7a6f5abdc3307</id>
<content type='text'>
Merge in these fixlets.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac</title>
<updated>2012-05-30T01:32:37Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-05-30T01:32:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=87a5af24e54857e7b15c1f1b0468512ee65c94e3'/>
<id>urn:sha1:87a5af24e54857e7b15c1f1b0468512ee65c94e3</id>
<content type='text'>
Pull EDAC internal API changes from Mauro Carvalho Chehab:
 "This changeset is the first part of a series of patches that fixes the
  EDAC subsystem.  In this set, it changes the Kernel EDAC API in order
  to properly represent the Intel i3/i5/i7, Xeon 3xxx/5xxx/7xxx, and
  Intel E5-xxxx memory controllers.

  The EDAC core used to assume that:

       - the DRAM chip select pin is directly accessed by the memory
         controller

       - when multiple channels are used, they're all filled with the
         same type of memory.

  None of the above premises has been true on Intel memory controllers
  since 2002, when RAMBUS and FB-DIMMs were introduced and the Advanced
  Memory Buffer, or some similar technology, started hiding direct
  access to the DRAM pins.

  So, the existing drivers for those chipsets had to lie to the EDAC
  core, in general reporting that just one channel is filled.  That
  produces some hard-to-understand error messages like:

       EDAC MC0: CE row 3, channel 0, label "DIMM1": 1 Unknown error(s): memory read error on FATAL area : cpu=0 Err=0008:00c2 (ch=2), addr = 0xad1f73480 =&gt; socket=0, Channel=0(mask=2), rank=1

  The location information there (row 3, channel 0) is completely bogus:
  it has no physical meaning, and is just a set of arbitrary values that
  the driver uses to talk with the EDAC core.  The error actually
  happened at CPU socket 0, channel 0, slot 1, but this is not reported
  anywhere, as the EDAC core doesn't know anything about the memory
  layout.  So, only advanced users who know how the EDAC driver works
  and who test their systems to see how DIMMs are mapped can actually
  benefit from such error logs.

  This patch series fixes the error report logic, in order to allow the
  EDAC drivers to expose the memory architecture they use to the EDAC
  core.  So, as the EDAC core now understands how the memory is
  organized, it can provide a useful report:

       EDAC MC0: CE memory read error on DIMM1 (channel:0 slot:1 page:0x364b1b offset:0x600 grain:32 syndrome:0x0 - count:1 area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:4)

  The location of the DIMM where the error happened is reported by "MC0"
  (cpu socket #0), at "channel:0 slot:1" location, and matches the
  physical location of the DIMM.

  There are two remaining issues not covered by this patch series:

       - The EDAC sysfs API will still report bogus values.  So,
         userspace tools like edac-utils will still use the bogus data;

       - A new tracepoint-based way to get the binary information
         about the errors is still to be added.

  Those are in a second series of patches (also in -next), but will
  probably miss the train for 3.5, due to the slow review process."

Fix up trivial conflict (due to spelling correction of removed code) in
drivers/edac/edac_device.c

* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac: (42 commits)
  i7core: fix ranks information at the per-channel struct
  i5000: Fix the fatal error handling
  i5100_edac: Fix a warning when compiled with 32 bits
  i82975x_edac: Test nr_pages earlier to save a few CPU cycles
  e752x_edac: provide more info about how DIMMS/ranks are mapped
  i5000_edac: Fix the logic that retrieves memory information
  i5400_edac: improve debug messages to better represent the filled memory
  edac: Cleanup the logs for i7core and sb edac drivers
  edac: Initialize the dimm label with the known information
  edac: Remove the legacy EDAC ABI
  x38_edac: convert driver to use the new edac ABI
  tile_edac: convert driver to use the new edac ABI
  sb_edac: convert driver to use the new edac ABI
  r82600_edac: convert driver to use the new edac ABI
  ppc4xx_edac: convert driver to use the new edac ABI
  pasemi_edac: convert driver to use the new edac ABI
  mv64x60_edac: convert driver to use the new edac ABI
  mpc85xx_edac: convert driver to use the new edac ABI
  i82975x_edac: convert driver to use the new edac ABI
  i82875p_edac: convert driver to use the new edac ABI
  ...
</content>
</entry>
<entry>
<title>i7core: fix ranks information at the per-channel struct</title>
<updated>2012-05-28T22:13:55Z</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab@redhat.com</email>
</author>
<published>2012-04-26T14:47:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0bf09e829dd4b07227ed5a8bc4ac85752a044458'/>
<id>urn:sha1:0bf09e829dd4b07227ed5a8bc4ac85752a044458</id>
<content type='text'>
There is a flag at the per-channel struct that indicates whether there
is any 4R DIMM on it. The way this flag was reported is not OK, as it
might give the false idea that the channel was filled with 2R
memories:

[  580.588701] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f7431): 2 ranks, UDIMMs
[  580.588704] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, just one 1R memory is filled on channel 1)

So, use a better way to represent the per-channel ranks information.
After the patch, it will show:

[ 2002.233978] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f7431): UDIMMs
[ 2002.233982] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
[ 2002.233988] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, there are no 4R memories)

Reported-by: Borislav Petkov &lt;borislav.petkov@amd.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>i5000: Fix the fatal error handling</title>
<updated>2012-05-28T22:13:54Z</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab@redhat.com</email>
</author>
<published>2012-04-25T14:47:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=486dfb1638bc49e9f3bbbefbe4832024ba6abe0d'/>
<id>urn:sha1:486dfb1638bc49e9f3bbbefbe4832024ba6abe0d</id>
<content type='text'>
The fatal error channel bits point to a single channel, and not
to a range of channels. Fix the code to properly report it,
instead of printing messages like:
	kernel: EDAC MC0: INTERNAL ERROR: channel-b out of range (4 &gt;= 4)

Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
<entry>
<title>i5100_edac: Fix a warning when compiled with 32 bits</title>
<updated>2012-05-28T22:13:54Z</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab@redhat.com</email>
</author>
<published>2012-03-29T11:41:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9f70d08a4c4581eee802563b709f710ad492d966'/>
<id>urn:sha1:9f70d08a4c4581eee802563b709f710ad492d966</id>
<content type='text'>
drivers/edac/i5100_edac.c: In function ‘i5100_init_csrows’:
drivers/edac/i5100_edac.c:862:3: warning: format ‘%zd’ expects argument of type ‘signed size_t’, but argument 5 has type ‘long unsigned int’ [-Wformat]

Reviewed-by: Aristeu Rozanski &lt;arozansk@redhat.com&gt;
Cc: "Niklas Söderlund" &lt;niklas.soderlund@ericsson.com&gt;
Cc: Borislav Petkov &lt;borislav.petkov@amd.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab@redhat.com&gt;
</content>
</entry>
</feed>
