From aa5b395b69b65450e008b95ec623b4fc4b175f9f Mon Sep 17 00:00:00 2001 From: Mikhail Zaslonko Date: Thu, 30 Jan 2020 22:16:17 -0800 Subject: lib/zlib: add s390 hardware support for kernel zlib_deflate Patch series "S390 hardware support for kernel zlib", v3. With IBM z15 mainframe the new DFLTCC instruction is available. It implements deflate algorithm in hardware (Nest Acceleration Unit - NXU) with estimated compression and decompression performance orders of magnitude faster than the current zlib. This patchset adds s390 hardware compression support to kernel zlib. The code is based on the userspace zlib implementation: https://github.com/madler/zlib/pull/410 The coding style is also preserved for future maintainability. There is only limited set of userspace zlib functions represented in kernel. Apart from that, all the memory allocation should be performed in advance. Thus, the workarea structures are extended with the parameter lists required for the DEFLATE CONVENTION CALL instruction. Since kernel zlib itself does not support gzip headers, only Adler-32 checksum is processed (also can be produced by DFLTCC facility). Like it was implemented for userspace, kernel zlib will compress in hardware on level 1, and in software on all other levels. Decompression will always happen in hardware (when enabled). Two DFLTCC compression calls produce the same results only when they both are made on machines of the same generation, and when the respective buffers have the same offset relative to the start of the page. Therefore care should be taken when using hardware compression when reproducible results are desired. However it does always produce the standard conform output which can be inflated anyway. The new kernel command line parameter 'dfltcc' is introduced to configure s390 zlib hardware support: Format: { on | off | def_only | inf_only | always } on: s390 zlib hardware support for compression on level 1 and decompression (default) off: No s390 zlib hardware support def_only: s390 zlib hardware support for deflate only (compression on level 1) inf_only: s390 zlib hardware support for inflate only (decompression) always: Same as 'on' but ignores the selected compression level always using hardware support (used for debugging) The main purpose of the integration of the NXU support into the kernel zlib is the use of hardware deflate in btrfs filesystem with on-the-fly compression enabled. Apart from that, hardware support can also be used during boot for decompressing the kernel or the ramdisk image With the patch for btrfs expanding zlib buffer from 1 to 4 pages (patch 6) the following performance results have been achieved using the ramdisk with btrfs. These are relative numbers based on throughput rate and compression ratio for zlib level 1: Input data Deflate rate Inflate rate Compression ratio NXU/Software NXU/Software NXU/Software stream of zeroes 1.46 1.02 1.00 random ASCII data 10.44 3.00 0.96 ASCII text (dickens) 6,21 3.33 0.94 binary data (vmlinux) 8,37 3.90 1.02 This means that s390 hardware deflate can provide up to 10 times faster compression (on level 1) and up to 4 times faster decompression (refers to all compression levels) for btrfs zlib. Disclaimer: Performance results are based on IBM internal tests using DD command-line utility on btrfs on a Fedora 30 based internal driver in native LPAR on a z15 system. Results may vary based on individual workload, configuration and software levels. This patch (of 9): Create zlib_dfltcc library with the s390 DEFLATE CONVERSION CALL implementation and related compression functions. Update zlib_deflate functions with the hooks for s390 hardware support and adjust workspace structures with extra parameter lists required for hardware deflate. Link: http://lkml.kernel.org/r/20200103223334.20669-2-zaslonko@linux.ibm.com Signed-off-by: Ilya Leoshkevich Signed-off-by: Mikhail Zaslonko Co-developed-by: Ilya Leoshkevich Cc: Chris Mason Cc: Christian Borntraeger Cc: David Sterba Cc: Eduard Shishkin Cc: Heiko Carstens Cc: Josef Bacik Cc: Richard Purdie Cc: Vasily Gorbik Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- lib/Makefile | 1 + 1 file changed, 1 insertion(+) (limited to 'lib/Makefile') diff --git a/lib/Makefile b/lib/Makefile index c20b1debe9b4..5fb02b93b1dc 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -140,6 +140,7 @@ obj-$(CONFIG_842_COMPRESS) += 842/ obj-$(CONFIG_842_DECOMPRESS) += 842/ obj-$(CONFIG_ZLIB_INFLATE) += zlib_inflate/ obj-$(CONFIG_ZLIB_DEFLATE) += zlib_deflate/ +obj-$(CONFIG_ZLIB_DFLTCC) += zlib_dfltcc/ obj-$(CONFIG_REED_SOLOMON) += reed_solomon/ obj-$(CONFIG_BCH) += bch.o obj-$(CONFIG_LZO_COMPRESS) += lzo/ -- cgit v1.2.3 From 43e76af85fa7e75ac9b71fc2fcc250abb1889bff Mon Sep 17 00:00:00 2001 From: Dmitry Vyukov Date: Thu, 30 Jan 2020 22:17:35 -0800 Subject: kcov: ignore fault-inject and stacktrace Don't instrument 3 more files that contain debugging facilities and produce large amounts of uninteresting coverage for every syscall. The following snippets are sprinkled all over the place in kcov traces in a debugging kernel. We already try to disable instrumentation of stack unwinding code and of most debug facilities. I guess we did not use fault-inject.c at the time, and stacktrace.c was somehow missed (or something has changed in kernel/configs). This change both speeds up kcov (kernel doesn't need to store these PCs, user-space doesn't need to process them) and frees trace buffer capacity for more useful coverage. should_fail lib/fault-inject.c:149 fail_dump lib/fault-inject.c:45 stack_trace_save kernel/stacktrace.c:124 stack_trace_consume_entry kernel/stacktrace.c:86 stack_trace_consume_entry kernel/stacktrace.c:89 ... a hundred frames skipped ... stack_trace_consume_entry kernel/stacktrace.c:93 stack_trace_consume_entry kernel/stacktrace.c:86 Link: http://lkml.kernel.org/r/20200116111449.217744-1-dvyukov@gmail.com Signed-off-by: Dmitry Vyukov Reviewed-by: Andrey Konovalov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- kernel/Makefile | 1 + lib/Makefile | 1 + mm/Makefile | 1 + 3 files changed, 3 insertions(+) (limited to 'lib/Makefile') diff --git a/kernel/Makefile b/kernel/Makefile index f2cc0d118a0b..4cb4130ced32 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -27,6 +27,7 @@ KCOV_INSTRUMENT_softirq.o := n # and produce insane amounts of uninteresting coverage. KCOV_INSTRUMENT_module.o := n KCOV_INSTRUMENT_extable.o := n +KCOV_INSTRUMENT_stacktrace.o := n # Don't self-instrument. KCOV_INSTRUMENT_kcov.o := n KASAN_SANITIZE_kcov.o := n diff --git a/lib/Makefile b/lib/Makefile index 5fb02b93b1dc..23ca78d43d24 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -16,6 +16,7 @@ KCOV_INSTRUMENT_rbtree.o := n KCOV_INSTRUMENT_list_debug.o := n KCOV_INSTRUMENT_debugobjects.o := n KCOV_INSTRUMENT_dynamic_debug.o := n +KCOV_INSTRUMENT_fault-inject.o := n # Early boot use of cmdline, don't instrument it ifdef CONFIG_AMD_MEM_ENCRYPT diff --git a/mm/Makefile b/mm/Makefile index 1937cc251883..32f08e22e824 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -20,6 +20,7 @@ KCOV_INSTRUMENT_kmemleak.o := n KCOV_INSTRUMENT_memcontrol.o := n KCOV_INSTRUMENT_mmzone.o := n KCOV_INSTRUMENT_vmstat.o := n +KCOV_INSTRUMENT_failslab.o := n CFLAGS_init-mm.o += $(call cc-disable-warning, override-init) CFLAGS_init-mm.o += $(call cc-disable-warning, initializer-overrides) -- cgit v1.2.3