linux/arch/arc/include, branch v5.3

arc: prefer __section from compiler_attributes.h

2019-08-26T17:07:12Z

Reported-by: Sedat Dilek
Suggested-by: Josh Poimboeuf
Signed-off-by: Nick Desaulniers
Signed-off-by: Vineet Gupta

ARCv2: entry: early return from exception need not clear U & DE bits

2019-08-05T07:01:29Z

Exception handlers call FAKE_RET_FROM_EXCPN to:
- clear AE bit: drop down from exception active to pure kernel mode, allowing further exceptions
- set IE bit: re-enable interrupts

It additionally clears the U bit (user mode) and the DE bit (delay slot execution), which is redundant as hardware already does that on any taken exception. Moreover, the current software clearing is bogus anyway, as the KFLAG instruction used for the purpose can't possibly write those bits. So don't pretend to clear them.

Signed-off-by: Alexey Brodkin
Signed-off-by: Vineet Gupta
[vgupta: rewrote changelog]
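The reasoning in this changelog can be modelled in a few lines of C. This is only an illustrative sketch, not kernel code: the bit positions, the `hw_take_exception()` and `kflag_write()` helpers, and the two `fake_ret_*()` functions are all hypothetical names and assumptions made for the example. It models a flag write that cannot touch U and DE, and shows that masking those bits in software beforehand changes nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define ST_AE (1u << 5)   /* exception active */
#define ST_DE (1u << 6)   /* delay slot execution */
#define ST_U  (1u << 7)   /* user mode */
#define ST_IE (1u << 31)  /* interrupts enabled */

/* Hypothetical model of the hardware on a taken exception:
 * U and DE are cleared, interrupts are disabled, AE is set. */
static uint32_t hw_take_exception(uint32_t status)
{
    status &= ~(ST_U | ST_DE | ST_IE);
    return status | ST_AE;
}

/* Hypothetical model of a KFLAG-style write: U and DE are not
 * writable, so they keep whatever value the register already has. */
static uint32_t kflag_write(uint32_t status, uint32_t operand)
{
    const uint32_t not_writable = ST_U | ST_DE;

    return (status & not_writable) | (operand & ~not_writable);
}

/* Old behaviour: also "clears" U and DE before the flag write. */
static uint32_t fake_ret_old(uint32_t status)
{
    uint32_t r = (status & ~(ST_U | ST_DE | ST_AE)) | ST_IE;

    return kflag_write(status, r);
}

/* New behaviour: only clears AE and sets IE. */
static uint32_t fake_ret_new(uint32_t status)
{
    uint32_t r = (status & ~ST_AE) | ST_IE;

    return kflag_write(status, r);
}

/* Both variants agree for a few representative register states. */
static void check_equivalence(void)
{
    const uint32_t states[] = { 0, ST_AE, ST_AE | ST_U, ST_AE | ST_DE | ST_U };

    for (unsigned int i = 0; i < sizeof(states) / sizeof(states[0]); i++)
        assert(fake_ret_old(states[i]) == fake_ret_new(states[i]));
}
```

In this model the extra masking in `fake_ret_old()` is dead code: the bits it clears are discarded by `kflag_write()` either way, which is the point the changelog makes about the real instruction.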

Merge branch 'akpm' (patches from Andrew)

2019-07-17T15:58:04Z

Merge more updates from Andrew Morton:

"VM:
- z3fold fixes and enhancements by Henry Burns and Vitaly Wool
- more accurate reclaimed slab caches calculations by Yafang Shao
- fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by Christoph Hellwig
- !CONFIG_MMU fixes by Christoph Hellwig
- new novmcoredd parameter to omit device dumps from vmcore, by Kairui Song
- new test_meminit module for testing heap and pagealloc initialization, by Alexander Potapenko
- ioremap improvements for huge mappings, by Anshuman Khandual
- generalize kprobe page fault handling, by Anshuman Khandual
- device-dax hotplug fixes and improvements, by Pavel Tatashin
- enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V
- add pte_devmap() support for arm64, by Robin Murphy
- unify locked_vm accounting with a helper, by Daniel Jordan
- several misc fixes

core/lib:
- new typeof_member() macro including some users, by Alexey Dobriyan
- make BIT() and GENMASK() available in asm, by Masahiro Yamada
- changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better code generation, by Alexey Dobriyan
- rbtree code size optimizations, by Michel Lespinasse
- convert struct pid count to refcount_t, by Joel Fernandes

get_maintainer.pl:
- add --no-moderated switch to skip moderated ML's, by Joe Perches

misc:
- ptrace PTRACE_GET_SYSCALL_INFO interface
- coda updates
- gdb scripts, various"

[ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ]

* emailed patches from Andrew Morton: (100 commits)
  fs/select.c: use struct_size() in kmalloc()
  mm: add account_locked_vm utility function
  arm64: mm: implement pte_devmap support
  mm: introduce ARCH_HAS_PTE_DEVMAP
  mm: clean up is_device_*_page() definitions
  mm/mmap: move common defines to mman-common.h
  mm: move MAP_SYNC to asm-generic/mman-common.h
  device-dax: "Hotremove" persistent memory that is used like normal RAM
  mm/hotplug: make remove_memory() interface usable
  device-dax: fix memory and resource leak if hotplug fails
  include/linux/lz4.h: fix spelling and copy-paste errors in documentation
  ipc/mqueue.c: only perform resource calculation if user valid
  include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
  scripts/gdb: add helpers to find and list devices
  scripts/gdb: add lx-genpd-summary command
  drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
  kernel/pid.c: convert struct pid count to refcount_t
  drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
  select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
  select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR
  ...

arch: replace _BITUL() in kernel-space headers with BIT()

2019-07-17T02:23:22Z

Now that BIT() can be used from assembly code, we can safely replace _BITUL() with equivalent BIT(). UAPI headers are still required to use _BITUL(), but there is no more reason to use it in kernel headers. BIT() is shorter.

Link: http://lkml.kernel.org/r/20190609153941.17249-2-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Cc: Vineet Gupta
Cc: Catalin Marinas
Cc: Will Deacon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
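To make the equivalence concrete, here is a small user-space sketch in C. The two macros below are simplified local stand-ins, not copies of the kernel headers: the in-kernel definitions are built from helper macros so that the `UL` suffix can be dropped when the header is included from assembly, which this sketch does not attempt to reproduce. `EXAMPLE_FLAG_BIT` and `set_flag()` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Simplified stand-ins for the two macros (assumed shapes, C only). */
#define _BITUL(x)  (1UL << (x))   /* UAPI-style spelling */
#define BIT(nr)    (1UL << (nr))  /* kernel-internal spelling */

/* Hypothetical flag position, for illustration only. */
#define EXAMPLE_FLAG_BIT  5

/* The same mask written in both styles. */
#define EXAMPLE_FLAG_MASK_OLD  _BITUL(EXAMPLE_FLAG_BIT)
#define EXAMPLE_FLAG_MASK_NEW  BIT(EXAMPLE_FLAG_BIT)

/* Set bit 'nr' in 'reg' and return the new value. */
static unsigned long set_flag(unsigned long reg, unsigned int nr)
{
    return reg | BIT(nr);
}

/* The replacement is purely cosmetic: both spellings produce
 * the same value with the same type. */
static void check_same_mask(void)
{
    assert(EXAMPLE_FLAG_MASK_OLD == EXAMPLE_FLAG_MASK_NEW);
    assert(sizeof(_BITUL(0)) == sizeof(BIT(0)));
}
```

Because both macros expand to the same shift of an `unsigned long` one in C, swapping one for the other in a kernel-only header should not change generated code; the restriction on UAPI headers remains because `BIT()` is not available to user space.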

Merge tag 'arc-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

2019-07-16T22:07:51Z

Pull ARC updates from Vineet Gupta:
- long due rewrite of do_page_fault
- refactoring of entry/exit code to utilize the double load/store instructions
- hsdk platform updates

* tag 'arc-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: [plat-hsdk]: Enable AXI DW DMAC in defconfig
  ARC: [plat-hsdk]: enable DW SPI controller
  ARC: hide unused function unw_hdr_alloc
  ARC: [haps] Add Virtio support
  ARCv2: entry: simplify return to Delay Slot via interrupt
  ARC: entry: EV_Trap expects r10 (vs. r9) to have exception cause
  ARCv2: entry: rewrite to enable use of double load/stores LDD/STD
  ARCv2: entry: avoid a branch
  ARCv2: entry: push out the Z flag unclobber from common EXCEPTION_PROLOGUE
  ARCv2: entry: comments about hardware auto-save on taken interrupts
  ARC: mm: do_page_fault refactor #8: release mmap_sem sooner
  ARC: mm: do_page_fault refactor #7: fold the various error handling
  ARC: mm: do_page_fault refactor #6: error handlers to use same pattern
  ARC: mm: do_page_fault refactor #5: scoot no_context to end
  ARC: mm: do_page_fault refactor #4: consolidate retry related logic
  ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  ARC: mm: do_page_fault refactor #2: remove short lived variable
  ARC: mm: do_page_fault refactor #1: remove label @good_area

Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2019-07-08T23:12:03Z

Pull locking updates from Ingo Molnar:

"The main changes in this cycle are:

- rwsem scalability improvements, phase #2, by Waiman Long, which are rather impressive:

  "On a 2-socket 40-core 80-thread Skylake system with 40 reader and writer locking threads, the min/mean/max locking operations done in a 5-second testing window before the patchset were:

    40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810
    40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255

  After the patchset, they became:

    40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741
    40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098"

  There's a lot of changes to the locking implementation that makes it similar to qrwlock, including owner handoff for more fair locking.

  Another microbenchmark shows how across the spectrum the improvements are:

  "With a locking microbenchmark running on 5.1 based kernel, the total locking rates (in kops/s) on a 2-socket Skylake system with equal numbers of readers and writers (mixed) before and after this patchset were:

    # of Threads  Before Patch  After Patch
    ------------  ------------  -----------
          2           2,618        4,193
          4           1,202        3,726
          8             802        3,622
         16             729        3,359
         32             319        2,826
         64             102        2,744"

  The changes are extensive and the patch-set has been through several iterations addressing various locking workloads. There might be more regressions, but unless they are pathological I believe we want to use this new implementation as the baseline going forward.

- jump-label optimizations by Daniel Bristot de Oliveira: the primary motivation was to remove IPI disturbance of isolated RT-workload CPUs, which resulted in the implementation of batched jump-label updates. Beyond the improvement of the real-time characteristics of the kernel, in one test this patchset improved static key update overhead from 57 msecs to just 1.4 msecs - which is a nice speedup as well.

- atomic64_t cross-arch type cleanups by Mark Rutland: over the last ~10 years of atomic64_t existence the various types used by the APIs only had to be self-consistent within each architecture - which means they became wildly inconsistent across architectures. Mark puts an end to this by reworking all the atomic64 implementations to use 's64' as the base type for atomic64_t, and to ensure that this type is consistently used for parameters and return values in the API, avoiding further problems in this area.

- A large set of small improvements to lockdep by Yuyang Du: type cleanups, output cleanups, function return type and other cleanups all around the place.

- A set of percpu ops cleanups and fixes by Peter Zijlstra.

- Misc other changes - please see the Git log for more details"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits)
  locking/lockdep: increase size of counters for lockdep statistics
  locking/atomics: Use sed(1) instead of non-standard head(1) option
  locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING
  x86/jump_label: Make tp_vec_nr static
  x86/percpu: Optimize raw_cpu_xchg()
  x86/percpu, sched/fair: Avoid local_clock()
  x86/percpu, x86/irq: Relax {set,get}_irq_regs()
  x86/percpu: Relax smp_processor_id()
  x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()
  locking/rwsem: Guard against making count negative
  locking/rwsem: Adaptive disabling of reader optimistic spinning
  locking/rwsem: Enable time-based spinning on reader-owned rwsem
  locking/rwsem: Make rwsem->owner an atomic_long_t
  locking/rwsem: Enable readers spinning on writer
  locking/rwsem: Clarify usage of owner's nonspinaable bit
  locking/rwsem: Wake up almost all readers in wait queue
  locking/rwsem: More optimal RT task handling of null owner
  locking/rwsem: Always release wait_lock before waking up tasks
  locking/rwsem: Implement lock handoff to prevent lock starvation
  locking/rwsem: Make rwsem_spin_on_owner() return owner state
  ...

ARC: entry: EV_Trap expects r10 (vs. r9) to have exception cause

2019-07-08T08:24:44Z

Avoids one MOV instruction in light of the double load/store code.

Signed-off-by: Vineet Gupta

ARCv2: entry: rewrite to enable use of double load/stores LDD/STD

2019-07-01T18:02:22Z

- the motivation was to remove blatant copy-paste due to hasty addition of CONFIG_ARC_IRQ_NO_AUTOSAVE support
- but with refactoring we could use LDD/STD to greatly optimize the code

Signed-off-by: Vineet Gupta

ARCv2: entry: avoid a branch

2019-07-01T18:02:22Z

Signed-off-by: Vineet Gupta

ARCv2: entry: push out the Z flag unclobber from common EXCEPTION_PROLOGUE

2019-07-01T18:02:22Z

Upon a taken interrupt/exception from User mode, HS hardware auto sets the Z flag. This helps shave a few instructions from EXCEPTION_PROLOGUE by eliding re-reading ERSTATUS and some bit fiddling.

However, the TLB Miss Exception handler can clobber the CPU flags and still end up in EXCEPTION_PROLOGUE in the slow path TLB handling case:

  EV_TLBMissD
    do_slow_path_pf
      EV_TLBProtV (aliased to call_do_page_fault)
        EXCEPTION_PROLOGUE

As a result, EXCEPTION_PROLOGUE needs to "unclobber" the Z flag, which this patch changes. The unclobber is now pushed out to the TLB Miss Exception handler. The reasons being:

- The flag restoration is only needed for slowpath TLB Miss Exception handling, but currently being in EXCEPTION_PROLOGUE penalizes all exceptions such as ProtV and syscall Trap, where the Z flag is already as expected.
- Pushing the unclobber out to where the flag was clobbered is much cleaner and also serves to document the fact.
- Makes EXCEPTION_PROLOGUE similar to INTERRUPT_PROLOGUE, so it is easier to refactor the common parts, which is what this series aims to do.

Signed-off-by: Vineet Gupta