aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/stackcollapse.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-01-06cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-StateHuang Rui1-3/+3
The AMD P-State driver is based on ACPI CPPC function, so ACPI should be dependence of this driver in the kernel config. In file included from ../drivers/cpufreq/amd-pstate.c:40:0: ../include/acpi/processor.h:226:2: error: unknown type name ‘phys_cpuid_t’ phys_cpuid_t phys_id; /* CPU hardware ID such as APIC ID for x86 */ ^~~~~~~~~~~~ ../include/acpi/processor.h:355:1: error: unknown type name ‘phys_cpuid_t’; did you mean ‘phys_addr_t’? phys_cpuid_t acpi_get_phys_id(acpi_handle, int type, u32 acpi_id); ^~~~~~~~~~~~ phys_addr_t CC drivers/rtc/rtc-rv3029c2.o ../include/acpi/processor.h:356:1: error: unknown type name ‘phys_cpuid_t’; did you mean ‘phys_addr_t’? phys_cpuid_t acpi_map_madt_entry(u32 acpi_id); ^~~~~~~~~~~~ phys_addr_t ../include/acpi/processor.h:357:20: error: unknown type name ‘phys_cpuid_t’; did you mean ‘phys_addr_t’? int acpi_map_cpuid(phys_cpuid_t phys_id, u32 acpi_id); ^~~~~~~~~~~~ phys_addr_t See https://lore.kernel.org/lkml/20e286d4-25d7-fb6e-31a1-4349c805aae3@infradead.org/. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Huang Rui <ray.huang@amd.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-06cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc commentYang Li1-0/+2
Add the description of @req and @boost_supported in struct amd_cpudata kernel-doc comment to remove warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. drivers/cpufreq/amd-pstate.c:104: warning: Function parameter or member 'req' not described in 'amd_cpudata' drivers/cpufreq/amd-pstate.c:104: warning: Function parameter or member 'boost_supported' not described in 'amd_cpudata' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-05cpuidle: use default_groups in kobj_typeGreg Kroah-Hartman1-2/+4
There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the cpuidle sysfs code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-04x86: intel_epb: Allow model specific normal EPB valueSrinivas Pandruvada1-13/+32
The current EPB "normal" is defined as 6 and set whenever power-up EPB value is 0. This setting resulted in the desired out of box power and performance for several CPU generations. But this value is not suitable for AlderLake mobile CPUs, as this resulted in higher uncore power. Since EPB is model specific, this is not unreasonable to have different behavior. Allow a capability where "normal" EPB can be redefined. For AlderLake mobile CPUs this desired normal value is 7. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-01-02Linux 5.16-rc8v5.16-rc8Linus Torvalds1-1/+1
2022-01-02perf top: Fix TUI exit screen refresh race conditionyaowenbin1-3/+5
When the following command is executed several times, a coredump file is generated. $ timeout -k 9 5 perf top -e task-clock ******* ******* ******* 0.01% [kernel] [k] __do_softirq 0.01% libpthread-2.28.so [.] __pthread_mutex_lock 0.01% [kernel] [k] __ll_sc_atomic64_sub_return double free or corruption (!prev) perf top --sort comm,dso timeout: the monitored command dumped core When we terminate "perf top" using sending signal method, SLsmg_reset_smg() called. SLsmg_reset_smg() resets the SLsmg screen management routines by freeing all memory allocated while it was active. However SLsmg_reinit_smg() maybe be called by another thread. SLsmg_reinit_smg() will free the same memory accessed by SLsmg_reset_smg(), thus it results in a double free. SLsmg_reinit_smg() is called already protected by ui__lock, so we fix the problem by adding pthread_mutex_trylock of ui__lock when calling SLsmg_reset_smg(). Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wuxu.wu@huawei.com Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.com Signed-off-by: Hewenliang <hewenliang4@huawei.com> Signed-off-by: yaowenbin <yaowenbin1@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-01-02perf pmu: Fix alias events listJohn Garry1-6/+17
Commit 0e0ae8742207c3b4 ("perf list: Display hybrid PMU events with cpu type") changes the event list for uncore PMUs or arm64 heterogeneous CPU systems, such that duplicate aliases are incorrectly listed per PMU (which they should not be), like: # perf list ... unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_es [Unit: uncore_cbox L3 Lookup any request that access cache and found line in E or S-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] unc_cbo_cache_lookup.any_i [Unit: uncore_cbox L3 Lookup any request that access cache and found line in I-state] ... Notice how the events are listed twice. The named commit changed how we remove duplicate events, in that events for different PMUs are not treated as duplicates. I suppose this is to handle how "Each hybrid pmu event has been assigned with a pmu name". Fix PMU alias listing by restoring behaviour to remove duplicates for non-hybrid PMUs. Fixes: 0e0ae8742207c3b4 ("perf list: Display hybrid PMU events with cpu type") Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-31mm: vmscan: reduce throttling due to a failure to make progress -fixMel Gorman1-1/+2
Hugh Dickins reported the following My tmpfs swapping load (tweaked to use huge pages more heavily than in real life) is far from being a realistic load: but it was notably slowed down by your throttling mods in 5.16-rc, and this patch makes it well again - thanks. But: it very quickly hit NULL pointer until I changed that last line to if (first_pgdat) consider_reclaim_throttle(first_pgdat, sc); The likely issue is that huge pages are a major component of the test workload. When this is the case, first_pgdat may never get set if compaction is ready to continue due to this check if (IS_ENABLED(CONFIG_COMPACTION) && sc->order > PAGE_ALLOC_COSTLY_ORDER && compaction_ready(zone, sc)) { sc->compaction_ready = true; continue; } If this was true for every zone in the zonelist, first_pgdat would never get set resulting in a NULL pointer exception. Link: https://lkml.kernel.org/r/20211209095453.GM3366@techsingularity.net Fixes: 1b4e3f26f9f75 ("mm: vmscan: Reduce throttling due to a failure to make progress") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Reported-by: Hugh Dickins <hughd@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@surriel.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Darrick J. Wong <djwong@kernel.org> Cc: Shakeel Butt <shakeelb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31mm: vmscan: Reduce throttling due to a failure to make progressMel Gorman3-10/+59
Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar problems due to reclaim throttling for excessive lengths of time. In Alexey's case, a memory hog that should go OOM quickly stalls for several minutes before stalling. In Mike and Darrick's cases, a small memcg environment stalled excessively even though the system had enough memory overall. Commit 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made") introduced the problem although commit a19594ca4a8b ("mm/vmscan: increase the timeout if page reclaim is not making progress") made it worse. Systems at or near an OOM state that cannot be recovered must reach OOM quickly and memcg should kill tasks if a memcg is near OOM. To address this, only stall for the first zone in the zonelist, reduce the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if the scan control nr_reclaimed is 0, kswapd is still active and there were excessive pages pending for writeback. If kswapd has stopped reclaiming due to excessive failures, do not stall at all so that OOM triggers relatively quickly. Similarly, if an LRU is simply congested, only lightly throttle similar to NOPROGRESS. Alexey's original case was the most straight forward for i in {1..3}; do tail /dev/zero; done On vanilla 5.16-rc1, this test stalled heavily, after the patch the test completes in a few seconds similar to 5.15. Alexey's second test case added watching a youtube video while tail runs 10 times. On 5.15, playback only jitters slightly, 5.16-rc1 stalls a lot with lots of frames missing and numerous audio glitches. With this patch applies, the video plays similarly to 5.15. [lkp@intel.com: Fix W=1 build warning] Link: https://lore.kernel.org/r/99e779783d6c7fce96448a3402061b9dc1b3b602.camel@gmx.de Link: https://lore.kernel.org/r/20211124011954.7cab9bb4@mail.inbox.lv Link: https://lore.kernel.org/r/20211022144651.19914-1-mgorman@techsingularity.net Link: https://lore.kernel.org/r/20211202150614.22440-1-mgorman@techsingularity.net Link: https://linux-regtracking.leemhuis.info/regzbot/regression/20211124011954.7cab9bb4@mail.inbox.lv/ Reported-and-tested-by: Alexey Avramov <hakavlad@inbox.lv> Reported-and-tested-by: Mike Galbraith <efault@gmx.de> Reported-and-tested-by: Darrick J. Wong <djwong@kernel.org> Reported-by: kernel test robot <lkp@intel.com> Acked-by: Hugh Dickins <hughd@google.com> Tracked-by: Thorsten Leemhuis <regressions@leemhuis.info> Fixes: 69392a403f49 ("mm/vmscan: throttle reclaim when no progress is being made") Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'SeongJae Park1-2/+7
DAMON debugfs interface increases the reference counts of 'struct pid's for targets from the 'target_ids' file write callback ('dbgfs_target_ids_write()'), but decreases the counts only in DAMON monitoring termination callback ('dbgfs_before_terminate()'). Therefore, when 'target_ids' file is repeatedly written without DAMON monitoring start/termination, the reference count is not decreased and therefore memory for the 'struct pid' cannot be freed. This commit fixes this issue by decreasing the reference counts when 'target_ids' is written. Link: https://lkml.kernel.org/r/20211229124029.23348-1-sj@kernel.org Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31userfaultfd/selftests: fix hugetlb area allocationsMike Kravetz1-6/+10
Currently, userfaultfd selftest for hugetlb as run from run_vmtests.sh or any environment where there are 'just enough' hugetlb pages will always fail with: testing events (fork, remap, remove): ERROR: UFFDIO_COPY error: -12 (errno=12, line=616) The ENOMEM error code implies there are not enough hugetlb pages. However, there are free hugetlb pages but they are all reserved. There is a basic problem with the way the test allocates hugetlb pages which has existed since the test was originally written. Due to the way 'cleanup' was done between different phases of the test, this issue was masked until recently. The issue was uncovered by commit 8ba6e8640844 ("userfaultfd/selftests: reinitialize test context in each test"). For the hugetlb test, src and dst areas are allocated as PRIVATE mappings of a hugetlb file. This means that at mmap time, pages are reserved for the src and dst areas. At the start of event testing (and other tests) the src area is populated which results in allocation of huge pages to fill the area and consumption of reserves associated with the area. Then, a child is forked to fault in the dst area. Note that the dst area was allocated in the parent and hence the parent owns the reserves associated with the mapping. The child has normal access to the dst area, but can not use the reserves created/owned by the parent. Thus, if there are no other huge pages available allocation of a page for the dst by the child will fail. Fix by not creating reserves for the dst area. In this way the child can use free (non-reserved) pages. Also, MAP_PRIVATE of a file only makes sense if you are interested in the contents of the file before making a COW copy. The test does not do this. So, just use MAP_ANONYMOUS | MAP_HUGETLB to create an anonymous hugetlb mapping. There is no need to create a hugetlb file in the non-shared case. Link: https://lkml.kernel.org/r/20211217172919.7861-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-31Docs: Fixes link to I2C specificationDeep Majumder1-3/+5
The link to the I2C specification is broken. Although "https://www.nxp.com" hosts Rev 7 (2021) of this specification, it is behind a login-wall. Thus, an additional link has been added (which doesn't require a login) and the NXP official docs link has been updated. Signed-off-by: Deep Majumder <deep@fastmail.in> [wsa: minor updates to text and commit message] Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-31i2c: validate user data in compat ioctlPavel Skripkin1-0/+3
Wrong user data may cause warning in i2c_transfer(), ex: zero msgs. Userspace should not be able to trigger warnings, so this patch adds validation checks for user data in compact ioctl to prevent reported warnings Reported-and-tested-by: syzbot+e417648b303855b91d8a@syzkaller.appspotmail.com Fixes: 7d5cb45655f2 ("i2c compat ioctls: move to ->compat_ioctl()") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2021-12-30Input: spaceball - fix parsing of movement data packetsLeo L. Schwab1-2/+9
The spaceball.c module was not properly parsing the movement reports coming from the device. The code read axis data as signed 16-bit little-endian values starting at offset 2. In fact, axis data in Spaceball movement reports are signed 16-bit big-endian values starting at offset 3. This was determined first by visually inspecting the data packets, and later verified by consulting: http://spacemice.org/pdf/SpaceBall_2003-3003_Protocol.pdf If this ever worked properly, it was in the time before Git... Signed-off-by: Leo L. Schwab <ewhac@ewhac.org> Link: https://lore.kernel.org/r/20211221101630.1146385-1-ewhac@ewhac.org Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2021-12-30Input: appletouch - initialize work before device registrationPavel Skripkin1-2/+2
Syzbot has reported warning in __flush_work(). This warning is caused by work->func == NULL, which means missing work initialization. This may happen, since input_dev->close() calls cancel_work_sync(&dev->work), but dev->work initalization happens _after_ input_register_device() call. So this patch moves dev->work initialization before registering input device Fixes: 5a6eb676d3bc ("Input: appletouch - improve powersaving for Geyser3 devices") Reported-and-tested-by: syzbot+b88c5eae27386b252bbd@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211230141151.17300-1-paskripkin@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2021-12-30fs/mount_setattr: always cleanup mount_kattrChristian Brauner1-5/+4
Make sure that finish_mount_kattr() is called after mount_kattr was succesfully built in both the success and failure case to prevent leaking any references we took when we built it. We returned early if path lookup failed thereby risking to leak an additional reference we took when building mount_kattr when an idmapped mount was requested. Cc: linux-fsdevel@vger.kernel.org Cc: stable@vger.kernel.org Fixes: 9caccd41541a ("fs: introduce MOUNT_ATTR_IDMAP") Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-30MAINTAINERS: Add AMD P-State driver maintainer entryHuang Rui1-0/+7
I will continue to add new feature and processor support, optimize the performance, and handle the issues for AMD P-State driver. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30Documentation: amd-pstate: Add AMD P-State driver introductionHuang Rui3-0/+385
Introduce the AMD P-State driver design and implementation. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Add AMD P-State performance attributesHuang Rui1-0/+18
Introduce sysfs attributes to get the different level AMD P-State performances. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Add AMD P-State frequencies attributesHuang Rui1-0/+47
Introduce sysfs attributes to get the different level processor frequencies. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Add boost mode support for AMD P-StateHuang Rui1-3/+66
If the sbios supports the boost mode of AMD P-State, let's switch to boost enabled by default. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Add trace for AMD P-State moduleHuang Rui4-1/+88
Add trace event to monitor the performance value changes which is controlled by cpu governors. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Introduce the support for the processors with shared ↵Huang Rui1-8/+97
memory solution In some of Zen2 and Zen3 based processors, they are using the shared memory that exposed from ACPI SBIOS. In this kind of the processors, there is no MSR support, so we add acpi cppc function as the backend for them. It is using a module param (shared_mem) to enable related processors manually. We will enable this by default once we address performance issue on this solution. Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Add fast switch function for AMD P-StateHuang Rui1-0/+36
Introduce the fast switch function for AMD P-State on the AMD processors which support the full MSR register control. It's able to decrease the latency on interrupt context. Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future ↵Huang Rui3-0/+404
processors AMD P-State is the AMD CPU performance scaling driver that introduces a new CPU frequency control mechanism on AMD Zen based CPU series in Linux kernel. The new mechanism is based on Collaborative processor performance control (CPPC) which is finer grain frequency management than legacy ACPI hardware P-States. Current AMD CPU platforms are using the ACPI P-states driver to manage CPU frequency and clocks with switching only in 3 P-states. AMD P-State is to replace the ACPI P-states controls, allows a flexible, low-latency interface for the Linux kernel to directly communicate the performance hints to hardware. AMD P-State leverages the Linux kernel governors such as *schedutil*, *ondemand*, etc. to manage the performance hints which are provided by CPPC hardware functionality. The first version for AMD P-State is to support one of the Zen3 processors, and we will support more in future after we verify the hardware and SBIOS functionalities. There are two types of hardware implementations for AMD P-State: one is full MSR support and another is shared memory support. It can use X86_FEATURE_CPPC feature flag to distinguish the different types. Using the new AMD P-State method + kernel governors (*schedutil*, *ondemand*, ...) to manage the frequency update is the most appropriate bridge between AMD Zen based hardware processor and Linux kernel, the processor is able to adjust to the most efficiency frequency according to the kernel scheduler loading. Please check the detailed CPU feature and MSR register description in Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors: https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30ACPI: CPPC: Add CPPC enable register functionJinzhou Su2-0/+50
Add a new function to enable CPPC feature. This function will write Continuous Performance Control package EnableRegister field on the processor. CPPC EnableRegister register described in section 8.4.7.1 of ACPI 6.4: This element is optional. If supported, contains a resource descriptor with a single Register() descriptor that describes a register to which OSPM writes a One to enable CPPC on this processor. Before this register is set, the processor will be controlled by legacy mechanisms (ACPI Pstates, firmware, etc.). This register will be used for AMD processors to enable AMD P-State function instead of legacy ACPI P-States. Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30ACPI: CPPC: Check present CPUs for determining _CPC is validMario Limonciello1-1/+1
As this is a static check, it should be based upon what is currently present on the system. This makes probeing more deterministic. While local APIC flags field (lapic_flags) of cpu core in MADT table is 0, then the cpu core won't be enabled. In this case, _CPC won't be found in this core, and return back to _CPC invalid with walking through possible cpus (include disable cpus). This is not expected, so switch to check present CPUs instead. Reported-by: Jinzhou Su <Jinzhou.Su@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30ACPI: CPPC: Implement support for SystemIO registersSteven Noonan1-3/+49
According to the ACPI v6.2 (and later) specification, SystemIO can be used for _CPC registers. This teaches cppc_acpi how to handle such registers. This patch was tested using the amd_pstate driver on my Zephyrus G15 (model GA503QS) using the current version 410 BIOS, which uses a SystemIO register for the HighestPerformance element in _CPC. Signed-off-by: Steven Noonan <steven@valvesoftware.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30x86/msr: Add AMD CPPC MSR definitionsHuang Rui1-0/+17
AMD CPPC (Collaborative Processor Performance Control) function uses MSR registers to manage the performance hints. So add the MSR register macro here. Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature ↵Huang Rui1-0/+1
flag Add Collaborative Processor Performance Control feature flag for AMD processors. This feature flag will be used on the following AMD P-State driver. The AMD P-State driver has two approaches to implement the frequency control behavior. That depends on the CPU hardware implementation. One is "Full MSR Support" and another is "Shared Memory Support". The feature flag indicates the current processors with "Full MSR Support". Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-12-30fsl/fman: Fix missing put_device() call in fman_port_probeMiaoqian Lin1-5/+7
The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the and error handling paths. Fixes: 18a6c85fcc78 ("fsl/fman: Add FMan Port Support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30selftests: net: using ping6 for IPv6 in udpgro_fwd.shJianguo Wu1-1/+3
udpgro_fwd.sh output following message: ping: 2001:db8:1::100: Address family for hostname not supported Using ping6 when pinging IPv6 addresses. Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30Documentation: fix outdated interpretation of ip_no_pmtu_discxu xin1-2/+4
The updating way of pmtu has changed, but documentation is still in the old way. So this patch updates the interpretation of ip_no_pmtu_disc and min_pmtu. See commit 28d35bcdd3925 ("net: ipv4: don't let PMTU updates increase route MTU") Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-29net/ncsi: check for error return from call to nla_put_u32Jiasheng Jiang1-1/+5
As we can see from the comment of the nla_put() that it could return -EMSGSIZE if the tailroom of the skb is insufficient. Therefore, it should be better to check the return value of the nla_put_u32 and return the error code if error accurs. Also, there are many other functions have the same problem, and if this patch is correct, I will commit a new version to fix all. Fixes: 955dc68cb9b2 ("net/ncsi: Add generic netlink family") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Link: https://lore.kernel.org/r/20211229032118.1706294-1-jiasheng@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helperNikolay Aleksandrov1-3/+3
We need to first check if the context is a vlan one, then we need to check the global bridge multicast vlan snooping flag, and finally the vlan's multicast flag, otherwise we will unnecessarily enable vlan mcast processing (e.g. querier timers). Fixes: 7b54aaaf53cb ("net: bridge: multicast: add vlan state initialization and control") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Link: https://lore.kernel.org/r/20211228153142.536969-1-nikolay@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29net: fix use-after-free in tw_timer_handlerMuchun Song1-6/+4
A real world panic issue was found as follow in Linux 5.4. BUG: unable to handle page fault for address: ffffde49a863de28 PGD 7e6fe62067 P4D 7e6fe62067 PUD 7e6fe63067 PMD f51e064067 PTE 0 RIP: 0010:tw_timer_handler+0x20/0x40 Call Trace: <IRQ> call_timer_fn+0x2b/0x120 run_timer_softirq+0x1ef/0x450 __do_softirq+0x10d/0x2b8 irq_exit+0xc7/0xd0 smp_apic_timer_interrupt+0x68/0x120 apic_timer_interrupt+0xf/0x20 This issue was also reported since 2017 in the thread [1], unfortunately, the issue was still can be reproduced after fixing DCCP. The ipv4_mib_exit_net is called before tcp_sk_exit_batch when a net namespace is destroyed since tcp_sk_ops is registered befrore ipv4_mib_ops, which means tcp_sk_ops is in the front of ipv4_mib_ops in the list of pernet_list. There will be a use-after-free on net->mib.net_statistics in tw_timer_handler after ipv4_mib_exit_net if there are some inflight time-wait timers. This bug is not introduced by commit f2bf415cfed7 ("mib: add net to NET_ADD_STATS_BH") since the net_statistics is a global variable instead of dynamic allocation and freeing. Actually, commit 61a7e26028b9 ("mib: put net statistics on struct net") introduces the bug since it put net statistics on struct net and free it when net namespace is destroyed. Moving init_ipv4_mibs() to the front of tcp_init() to fix this bug and replace pr_crit() with panic() since continuing is meaningless when init_ipv4_mibs() fails. [1] https://groups.google.com/g/syzkaller/c/p1tn-_Kc6l4/m/smuL_FMAAgAJ?pli=1 Fixes: 61a7e26028b9 ("mib: put net statistics on struct net") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Cong Wang <cong.wang@bytedance.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211228104145.9426-1-songmuchun@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29selftests: net: Fix a typo in udpgro_fwd.shJianguo Wu1-1/+1
$rvs -> $rcv Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Link: https://lore.kernel.org/r/d247d7c8-a03a-0abf-3c71-4006a051d133@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29selftests/net: udpgso_bench_tx: fix dst ip argumentwujianguo1-1/+7
udpgso_bench_tx call setup_sockaddr() for dest address before parsing all arguments, if we specify "-p ${dst_port}" after "-D ${dst_ip}", then ${dst_port} will be ignored, and using default cfg_port 8000. This will cause test case "multiple GRO socks" failed in udpgro.sh. Setup sockaddr after parsing all arguments. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/ff620d9f-5b52-06ab-5286-44b945453002@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29x86/build: Use the proper name CONFIG_FW_LOADERLukas Bulwahn1-1/+1
Commit in Fixes intends to add the expression regex only when FW_LOADER is enabled - not FW_LOADER_BUILTIN. Latter is a leftover from a previous patchset and not a valid config item. So, adjust the condition to the actual name of the config. [ bp: Cleanup commit message. ] Fixes: c8dcf655ec81 ("x86/build: Tuck away built-in firmware under FW_LOADER") Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20211229111553.5846-1-lukas.bulwahn@gmail.com
2021-12-29net: bridge: mcast: add and enforce startup query interval minimumNikolay Aleksandrov5-3/+22
As reported[1] if startup query interval is set too low in combination with large number of startup queries and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the admin know that the startup interval has been set to the minimum. It doesn't make sense to make the startup interval lower than the normal query interval so use the same value of 1 second. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29net: bridge: mcast: add and enforce query interval minimumNikolay Aleksandrov5-3/+22
As reported[1] if query interval is set too low and we have multiple bridges or even a single bridge with multiple querier vlans configured we can crash the machine. Add a 1 second minimum which must be enforced by overwriting the value if set lower (i.e. without returning an error) to avoid breaking user-space. If that happens a log message is emitted to let the administrator know that the interval has been set to the minimum. The issue has been present since these intervals could be user-controlled. [1] https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/ Fixes: d902eee43f19 ("bridge: Add multicast count/interval sysfs entries") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29ipv6: raw: check passed optlen before readingTamir Duberstein1-0/+3
Add a check that the user-provided option is at least as long as the number of bytes we intend to read. Before this patch we would blindly read sizeof(int) bytes even in cases where the user passed optlen<sizeof(int), which would potentially read garbage or fault. Discovered by new tests in https://github.com/google/gvisor/pull/6957 . The original get_user call predates history in the git repo. Signed-off-by: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20211229200947.2862255-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-29xsk: Initialise xskb free_list_nodeCiara Loftus1-0/+1
This commit initialises the xskb's free_list_node when the xskb is allocated. This prevents a potential false negative returned from a call to list_empty for that node, such as the one introduced in commit 199d983bc015 ("xsk: Fix crash on double free in buffer pool") In my environment this issue caused packets to not be received by the xdpsock application if the traffic was running prior to application launch. This happened when the first batch of packets failed the xskmap lookup and XDP_PASS was returned from the bpf program. This action is handled in the i40e zc driver (and others) by allocating an skbuff, freeing the xdp_buff and adding the associated xskb to the xsk_buff_pool's free_list if it hadn't been added already. Without this fix, the xskb is not added to the free_list because the check to determine if it was added already returns an invalid positive result. Later, this caused allocation errors in the driver and the failure to receive packets. Fixes: 199d983bc015 ("xsk: Fix crash on double free in buffer pool") Fixes: 2b43470add8c ("xsk: Introduce AF_XDP buffer allocation API") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20211220155250.2746-1-ciara.loftus@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28net/mlx5e: Fix wrong features assignment in case of errorGal Pressman1-6/+5
In case of an error in mlx5e_set_features(), 'netdev->features' must be updated with the correct state of the device to indicate which features were updated successfully. To do that we maintain a copy of 'netdev->features' and update it after successful feature changes, so we can assign it to back to 'netdev->features' if needed. However, since not all netdev features are handled by the driver (e.g. GRO/TSO/etc), some features may not be updated correctly in case of an error updating another feature. For example, while requesting to disable TSO (feature which is not handled by the driver) and enable HW-GRO, if an error occurs during HW-GRO enable, 'oper_features' will be assigned with 'netdev->features' and HW-GRO turned off. TSO will remain enabled in such case, which is a bug. To solve that, instead of using 'netdev->features' as the baseline of 'oper_features' and changing it on set feature success, use 'features' instead and update it in case of errors. Fixes: 75b81ce719b7 ("net/mlx5e: Don't override netdev features field unless in error flow") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-28net/mlx5e: TC, Fix memory leak with rules with internal portRoi Dayan1-0/+2
Fix a memory leak with decap rule with internal port as destination device. The driver allocates a modify hdr action but doesn't set the flow attr modify hdr action which results in skipping releasing the modify hdr action when releasing the flow. backtrace: [<000000005f8c651c>] krealloc+0x83/0xd0 [<000000009f59b143>] alloc_mod_hdr_actions+0x156/0x310 [mlx5_core] [<000000002257f342>] mlx5e_tc_match_to_reg_set_and_get_id+0x12a/0x360 [mlx5_core] [<00000000b44ea75a>] mlx5e_tc_add_fdb_flow+0x962/0x1470 [mlx5_core] [<0000000003e384a0>] __mlx5e_add_fdb_flow+0x54c/0xb90 [mlx5_core] [<00000000ed8b22b6>] mlx5e_configure_flower+0xe45/0x4af0 [mlx5_core] [<00000000024f4ab5>] mlx5e_rep_indr_offload.isra.0+0xfe/0x1b0 [mlx5_core] [<000000006c3bb494>] mlx5e_rep_indr_setup_tc_cb+0x90/0x130 [mlx5_core] [<00000000d3dac2ea>] tc_setup_cb_add+0x1d2/0x420 Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-28ionic: Initialize the 'lif->dbid_inuse' bitmapChristophe JAILLET1-1/+1
When allocated, this bitmap is not initialized. Only the first bit is set a few lines below. Use bitmap_zalloc() to make sure that it is cleared before being used. Fixes: 6461b446f2a0 ("ionic: Add interrupts and doorbells") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Shannon Nelson <snelson@pensando.io> Link: https://lore.kernel.org/r/6a478eae0b5e6c63774e1f0ddb1a3f8c38fa8ade.1640527506.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-28drm/amd/display: Changed pipe split policy to allow for multi-display pipe splitAngus Wang8-8/+8
[WHY] Current implementation of pipe split policy prevents pipe split with multiple displays connected, which caused the MCLK speed to be stuck at max [HOW] Changed the pipe split policies so that pipe split is allowed for multi-display configurations Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1522 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1709 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1655 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1403 Note this is a backport of this commit from amdgpu drm-next for 5.16. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Angus Wang <angus.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-12-28drm/amd/display: Fix USB4 null pointer dereference in update_psp_stream_configNicholas Kazlauskas1-4/+1
[Why] A porting error on a previous patch left the block of code that causes the crash from a NULL pointer dereference. More specifically, we try to access link_enc before it's assigned in the USB4 case in the following assignment: config.dio_output_idx = link_enc->transmitter - TRANSMITTER_UNIPHY_A; [How] That assignment occurs later depending on the ASIC version. It's only needed on DCN31 and only after link_enc is already assigned. Fixes: 986430446c917b ("drm/amd/display: fix a crash on USB4 over C20 PHY") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28drm/amd/display: Set optimize_pwr_state for DCN31Nicholas Kazlauskas1-0/+1
[Why] We'll exit optimized power state to do link detection but we won't enter back into the optimized power state. This could potentially block s2idle entry depending on the sequencing, but it also means we're losing some power during the transition period. [How] Hook up the handler like DCN21. It was also missed like the exit_optimized_pwr_state callback. Fixes: 64b1d0e8d500 ("drm/amd/display: Add DCN3.1 HWSEQ") Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Eric Yang <Eric.Yang2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-28drm/amd/display: Send s0i2_rdy in stream_count == 0 optimizationNicholas Kazlauskas1-0/+1
[Why] Otherwise SMU won't mark Display as idle when trying to perform s2idle. [How] Mark the bit in the dcn31 codepath, doesn't apply to older ASIC. It needed to be split from phy refclk off to prevent entering s2idle when PSR was engaged but driver was not ready. Fixes: 118a33151658 ("drm/amd/display: Add DCN3.1 clock manager support") Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Eric Yang <Eric.Yang2@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>