path: root/arch
AgeCommit message (Collapse)Author
2019-02-17powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()Michael Ellerman
In v4.20 we changed our pgd/pud_present() to check for _PAGE_PRESENT rather than just checking that the value is non-zero, e.g.: static inline int pgd_present(pgd_t pgd) { - return !pgd_none(pgd); + return (pgd_raw(pgd) & cpu_to_be64(_PAGE_PRESENT)); } Unfortunately this is broken on big endian, as the result of the bitwise & is truncated to int, which is always zero because _PAGE_PRESENT is 0x8000000000000000ul. This means pgd_present() and pud_present() are always false at compile time, and the compiler elides the subsequent code. Remarkably with that bug present we are still able to boot and run with few noticeable effects. However under some work loads we are able to trigger a warning in the ext4 code: WARNING: CPU: 11 PID: 29593 at fs/ext4/inode.c:3927 .ext4_set_page_dirty+0x70/0xb0 CPU: 11 PID: 29593 Comm: debugedit Not tainted 4.20.0-rc1 #1 ... NIP .ext4_set_page_dirty+0x70/0xb0 LR .set_page_dirty+0xa0/0x150 Call Trace: .set_page_dirty+0xa0/0x150 .unmap_page_range+0xbf0/0xe10 .unmap_vmas+0x84/0x130 .unmap_region+0xe8/0x190 .__do_munmap+0x2f0/0x510 .__vm_munmap+0x80/0x110 .__se_sys_munmap+0x14/0x30 system_call+0x5c/0x70 The fix is simple, we need to convert the result of the bitwise & to an int before returning it. Thanks to Erhard, Jan Kara and Aneesh for help with debugging. Fixes: da7ad366b497 ("powerpc/mm/book3s: Update pmd_present to look at _PAGE_PRESENT bit") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Erhard F. <erhard_f@mailbox.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-02-01powerpc/papr_scm: Use the correct bind addressOliver O'Halloran
When binding an SCM volume to a physical address the hypervisor has the option to return early with a continue token with the expectation that the guest will resume the bind operation until it completes. A quirk of this interface is that the bind address will only be returned by the first bind h-call and the subsequent calls will return 0xFFFF_FFFF_FFFF_FFFF for the bind address. We currently do not save the address returned by the first h-call. As a result we will use the junk address as the base of the bound region if the hypervisor decides to split the bind across multiple h-calls. This bug was found when testing with very large SCM volumes where the bind process would take more time than they hypervisor's internal h-call time limit would allow. This patch fixes the issue by saving the bind address from the first call. Cc: stable@vger.kernel.org Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-31powerpc/radix: Fix kernel crash with mremap()Aneesh Kumar K.V
With support for split pmd lock, we use pmd page pmd_huge_pte pointer to store the deposited page table. In those config when we move page tables we need to make sure we move the deposited page table to the correct pmd page. Otherwise this can result in crash when we withdraw of deposited page table because we can find the pmd_huge_pte NULL. eg: __split_huge_pmd+0x1070/0x1940 __split_huge_pmd+0xe34/0x1940 (unreliable) vma_adjust_trans_huge+0x110/0x1c0 __vma_adjust+0x2b4/0x9b0 __split_vma+0x1b8/0x280 __do_munmap+0x13c/0x550 sys_mremap+0x220/0x7e0 system_call+0x5c/0x70 Fixes: 675d995297d4 ("powerpc/book3s64: Enable split pmd ptlock.") Cc: stable@vger.kernel.org # v4.18+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15powerpc/syscalls: Fix syscall tracingMichael Ellerman
Recently in commit fbf508da7440 ("powerpc: split compat syscall table out from native table") we changed the layout of the system call table. Instead of having two entries for each syscall number, one for the regular entry point and one for the compat entry point, we now have separate tables for regular and compat entry points. This inadvertently broke syscall tracing (CONFIG_FTRACE_SYSCALLS), because our implementation of arch_syscall_addr() knew about the layout of the table (it did nr * 2). We can fix it just by dropping our version of arch_syscall_addr() and using the generic version which does: return (unsigned long)sys_call_table[nr]; Fixes: fbf508da7440 ("powerpc: split compat syscall table out from native table") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-15powerpc/pseries: Fix build break due to pnv_npu2_init()Jason A. Donenfeld
Commit 3be2df00e299 ("powerpc/pseries/npu: Enable platform support") added a call to pnv_npu2_init() in pseries code. This causes a build break if we build with CONFIG_PPC_PSERIES && !CONFIG_PPC_POWERNV: powerpc64le-pc-linux-gnu-ld: arch/powerpc/platforms/pseries/pci.o: in function `pSeries_final_fixup': pci.c:(.init.text+0x1b0): undefined reference to `pnv_npu2_init' This commit therefore wraps that line in an ifdef, so that pseries builds without powernv. Fixes: 3be2df00e299 ("powerpc/pseries/npu: Enable platform support") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Frob change log a bit to blame a different commit] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-11powerpc/4xx/ocm: Fix fix for phys_addr_t printf warningsMichael Ellerman
My recent commit to fix the printf warnings in ocm.c got the format specifier wrong, because I copied it from the documentation without realising the square brackets are not meant as literals. This results in the address being suffixed with a literal "[p]". Actually tested this time: # cat info /sys/kernel/debug/ppc4xx_ocm PhysAddr : 0x0000000400040000 ... NC.PhysAddr : 0x0000000400040000 ... C.PhysAddr : 0x0000000000000000 Fixes: 52b88fa1e8c7 ("powerpc/4xx/ocm: Fix phys_addr_t printf warnings") Reported-by: Christian Lamparter <chunkeey@gmail.com> Tested-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-11powerpc/powernv/npu: Fix oops in pnv_try_setup_npu_table_group()Frederic Barrat
With a recent change around IOMMU group, a system with an opencapi adapter is no longer booting and we get a kernel oops: BUG: Kernel NULL pointer dereference at 0x00000028 Faulting instruction address: 0xc0000000000aa38c ... NIP pnv_try_setup_npu_table_group+0x1c/0x1a0 LR pnv_pci_ioda_fixup+0x1f8/0x660 Call Trace: pnv_try_setup_npu_table_group+0x60/0x pnv_pci_ioda_fixup+0x20c/0x660 pcibios_resource_survey+0x2c8/0x31c pcibios_init+0xb0/0xe4 do_one_initcall+0x64/0x264 kernel_init_freeable+0x36c/0x468 kernel_init+0x2c/0x148 ret_from_kernel_thread+0x5c/0x68 An opencapi device is using a device PE, so the current code breaks because pe->pbus is not defined. More generally, there's no need to define an IOMMU group for opencapi, as the device sends real addresses directly (admittedly, the virtualization story is yet to be written). So let's fix it by skipping the IOMMU group setup for opencapi PHBs. Fixes: 0bd971676e68 ("powerpc/powernv/npu: Add compound IOMMU groups") Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-11powerpc/tm: Limit TM code inside PPC_TRANSACTIONAL_MEMBreno Leitao
Commit e1c3743e1a20 ("powerpc/tm: Set MSR[TS] just prior to recheckpoint") moved a code block around and this block uses a 'msr' variable outside of the CONFIG_PPC_TRANSACTIONAL_MEM, however the 'msr' variable is declared inside a CONFIG_PPC_TRANSACTIONAL_MEM block, causing a possible error when CONFIG_PPC_TRANSACTION_MEM is not defined. error: 'msr' undeclared (first use in this function) This is not causing a compilation error in the mainline kernel, because 'msr' is being used as an argument of MSR_TM_ACTIVE(), which is defined as the following when CONFIG_PPC_TRANSACTIONAL_MEM is *not* set: #define MSR_TM_ACTIVE(x) 0 This patch just fixes this issue avoiding the 'msr' variable usage outside the CONFIG_PPC_TRANSACTIONAL_MEM block, avoiding trusting in the MSR_TM_ACTIVE() definition. Cc: stable@vger.kernel.org Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Fixes: e1c3743e1a20 ("powerpc/tm: Set MSR[TS] just prior to recheckpoint") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-11powerpc/8xx: fix setting of pagetable for Abatron BDI debug tool.Christophe Leroy
Commit 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") moved the loading of r6 earlier in the code. As some functions are called inbetween, r6 needs to be loaded again with the address of swapper_pg_dir in order to set PTE pointers for the Abatron BDI. Fixes: 8c8c10b90d88 ("powerpc/8xx: fix handling of early NULL pointer dereference") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-11powerpc/powernv/npu: Allocate enough memory in pnv_try_setup_npu_table_group()Dan Carpenter
There is a typo so we accidentally allocate enough memory for a pointer when we wanted to allocate enough for a struct. Fixes: 0bd971676e68 ("powerpc/powernv/npu: Add compound IOMMU groups") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-08powerpc/perf: Update perf_regs structure to include MMCRAMadhavan Srinivasan
On each sample, Monitor Mode Control Register A (MMCRA) content is saved in pt_regs. MMCRA does not have a entry as-is in the pt_regs but instead, MMCRA content is saved in the "dsisr" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "MMCRA" printing which internally maps to the "dsisr" of pt_regs. It also check for the MMCRA availability in the platform and present value accordingly mpe: This was the 2nd patch in a series with commit 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") but I accidentally only merged the 1st patch, so merge this one now. Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-01-06Merge tag 'kbuild-v4.21-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - improve boolinit.cocci and use_after_iter.cocci semantic patches - fix alignment for kallsyms - move 'asm goto' compiler test to Kconfig and clean up jump_label CONFIG option - generate asm-generic wrappers automatically if arch does not implement mandatory UAPI headers - remove redundant generic-y defines - misc cleanups * tag 'kbuild-v4.21-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: rename generated .*conf-cfg to *conf-cfg kbuild: remove unnecessary stubs for archheader and archscripts kbuild: use assignment instead of define ... endef for filechk_* rules arch: remove redundant UAPI generic-y defines kbuild: generate asm-generic wrappers if mandatory headers are missing arch: remove stale comments "UAPI Header export list" riscv: remove redundant kernel-space generic-y kbuild: change filechk to surround the given command with { } kbuild: remove redundant target cleaning on failure kbuild: clean up rule_dtc_dt_yaml kbuild: remove UIMAGE_IN and UIMAGE_OUT jump_label: move 'asm goto' support test to Kconfig kallsyms: lower alignment on ARM scripts: coccinelle: boolinit: drop warnings on named constants scripts: coccinelle: check for redeclaration kconfig: remove unused "file" field of yylval union nds32: remove redundant kernel-space generic-y nios2: remove unneeded HAS_DMA define
2019-01-06Fix 'acccess_ok()' on alpha and SHLinus Torvalds
Commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'") broke both alpha and SH booting in qemu, as noticed by Guenter Roeck. It turns out that the bug wasn't actually in that commit itself (which would have been surprising: it was mostly a no-op), but in how the addition of access_ok() to the strncpy_from_user() and strnlen_user() functions now triggered the case where those functions would test the access of the very last byte of the user address space. The string functions actually did that user range test before too, but they did it manually by just comparing against user_addr_max(). But with user_access_begin() doing the check (using "access_ok()"), it now exposed problems in the architecture implementations of that function. For example, on alpha, the access_ok() helper macro looked like this: #define __access_ok(addr, size) \ ((get_fs().seg & (addr | size | (addr+size))) == 0) and what it basically tests is of any of the high bits get set (the USER_DS masking value is 0xfffffc0000000000). And that's completely wrong for the "addr+size" check. Because it's off-by-one for the case where we check to the very end of the user address space, which is exactly what the strn*_user() functions do. Why? Because "addr+size" will be exactly the size of the address space, so trying to access the last byte of the user address space will fail the __access_ok() check, even though it shouldn't. As a result, the user string accessor functions failed consistently - because they literally don't know how long the string is going to be, and the max access is going to be that last byte of the user address space. Side note: that alpha macro is buggy for another reason too - it re-uses the arguments twice. And SH has another version of almost the exact same bug: #define __addr_ok(addr) \ ((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg) so far so good: yes, a user address must be below the limit. But then: #define __access_ok(addr, size) \ (__addr_ok((addr) + (size))) is wrong with the exact same off-by-one case: the case when "addr+size" is exactly _equal_ to the limit is actually perfectly fine (think "one byte access at the last address of the user address space") The SH version is actually seriously buggy in another way: it doesn't actually check for overflow, even though it did copy the _comment_ that talks about overflow. So it turns out that both SH and alpha actually have completely buggy implementations of access_ok(), but they happened to work in practice (although the SH overflow one is a serious serious security bug, not that anybody likely cares about SH security). This fixes the problems by using a similar macro on both alpha and SH. It isn't trying to be clever, the end address is based on this logic: unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b; which basically says "add start and length, and then subtract one unless the length was zero". We can't subtract one for a zero length, or we'd just hit an underflow instead. For a lot of access_ok() users the length is a constant, so this isn't actually as expensive as it initially looks. Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Cc: Matt Turner <mattst88@gmail.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-06Merge tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: "Fix various regressions introduced in this cycles: - fix dma-debug tracking for the map_page / map_single consolidatation - properly stub out DMA mapping symbols for !HAS_DMA builds to avoid link failures - fix AMD Gart direct mappings - setup the dma address for no kernel mappings using the remap allocator" * tag 'dma-mapping-4.21-1' of git://git.infradead.org/users/hch/dma-mapping: dma-direct: fix DMA_ATTR_NO_KERNEL_MAPPING for remapped allocations x86/amd_gart: fix unmapping of non-GART mappings dma-mapping: remove a few unused exports dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA dma-mapping: remove dmam_{declare,release}_coherent_memory dma-mapping: implement dmam_alloc_coherent using dmam_alloc_attrs dma-mapping: implement dma_map_single_attrs using dma_map_page_attrs
2019-01-05Merge tag 'pci-v4.21-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - Remove unused lists from ASPM pcie_link_state (Frederick Lawler) - Fix Broadcom CNB20LE host bridge unintended sign extension (Colin Ian King) - Expand Kconfig "PF" acronyms (Randy Dunlap) - Update MAINTAINERS for arch/x86/kernel/early-quirks.c (Bjorn Helgaas) - Add missing include to drivers/pci.h (Alexandru Gagniuc) - Override Synopsys USB 3.x HAPS device class so dwc3-haps can claim it instead of xhci (Thinh Nguyen) - Clean up P2PDMA documentation (Randy Dunlap) - Allow runtime PM even if driver doesn't supply callbacks (Jarkko Nikula) - Remove status check after submitting Switchtec MRPC Firmware Download commands to avoid Completion Timeouts (Kelvin Cao) - Set Switchtec coherent DMA mask to allow 64-bit DMA (Boris Glimcher) - Fix Switchtec SWITCHTEC_IOCTL_EVENT_IDX_ALL flag overwrite issue (Joey Zhang) - Enable write combining for Switchtec MRPC Input buffers (Kelvin Cao) - Add Switchtec MRPC DMA mode support (Wesley Sheng) - Skip VF scanning on powerpc, which does this in firmware (Sebastian Ott) - Add Amlogic Meson PCIe controller driver and DT bindings (Yue Wang) - Constify histb dw_pcie_host_ops structure (Julia Lawall) - Support multiple power domains for imx6 (Leonard Crestez) - Constify layerscape driver data (Stefan Agner) - Update imx6 Kconfig to allow imx6 PCIe in imx7 kernel (Trent Piepho) - Support armada8k GPIO reset (Baruch Siach) - Support suspend/resume support on imx6 (Leonard Crestez) - Don't hard-code DesignWare DBI/ATU offst (Stephen Warren) - Skip i.MX6 PHY setup on i.MX7D (Andrey Smirnov) - Remove Jianguo Sun from HiSilicon STB maintainers (Lorenzo Pieralisi) - Mask DesignWare interrupts instead of disabling them to avoid lost interrupts (Marc Zyngier) - Add locking when acking DesignWare interrupts (Marc Zyngier) - Ack DesignWare interrupts in the proper callbacks (Marc Zyngier) - Use devm resource parser in mediatek (Honghui Zhang) - Remove unused mediatek "num-lanes" DT property (Honghui Zhang) - Add UniPhier PCIe controller driver and DT bindings (Kunihiko Hayashi) - Enable MSI for imx6 downstream components (Richard Zhu) * tag 'pci-v4.21-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (40 commits) PCI: imx: Enable MSI from downstream components s390/pci: skip VF scanning PCI/IOV: Add flag so platforms can skip VF scanning PCI/IOV: Factor out sriov_add_vfs() PCI: uniphier: Add UniPhier PCIe host controller support dt-bindings: PCI: Add UniPhier PCIe host controller description PCI: amlogic: Add the Amlogic Meson PCIe controller driver dt-bindings: PCI: meson: add DT bindings for Amlogic Meson PCIe controller arm64: dts: mt7622: Remove un-used property for PCIe arm: dts: mt7623: Remove un-used property for PCIe dt-bindings: PCI: MediaTek: Remove un-used property PCI: mediatek: Remove un-used variant in struct mtk_pcie_port MAINTAINERS: Remove Jianguo Sun from HiSilicon STB DWC entry PCI: dwc: Don't hard-code DBI/ATU offset PCI: imx: Add imx6sx suspend/resume support PCI: armada8k: Add support for gpio controlled reset signal PCI: dwc: Adjust Kconfig to allow IMX6 PCIe host on IMX7 PCI: dwc: layerscape: Constify driver data PCI: imx: Add multi-pd support PCI: Override Synopsys USB 3.x HAPS device class ...
2019-01-06kbuild: use assignment instead of define ... endef for filechk_* rulesMasahiro Yamada
You do not have to use define ... endef for filechk_* rules. For simple cases, the use of assignment looks cleaner, IMHO. I updated the usage for scripts/Kbuild.include in case somebody misunderstands the 'define ... endif' is the requirement. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2019-01-06arch: remove redundant UAPI generic-y definesMasahiro Yamada
Now that Kbuild automatically creates asm-generic wrappers for missing mandatory headers, it is redundant to list the same headers in generic-y and mandatory-y. Suggested-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Sam Ravnborg <sam@ravnborg.org>
2019-01-06arch: remove stale comments "UAPI Header export list"Masahiro Yamada
These comments are leftovers of commit fcc8487d477a ("uapi: export all headers under uapi directories"). Prior to that commit, exported headers must be explicitly added to header-y. Now, all headers under the uapi/ directories are exported. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06riscv: remove redundant kernel-space generic-yMasahiro Yamada
This commit removes redundant generic-y defines in arch/riscv/include/asm/Kbuild. [1] It is redundant to define the same generic-y in both arch/$(ARCH)/include/asm/Kbuild and arch/$(ARCH)/include/uapi/asm/Kbuild. Remove the following generic-y: errno.h fcntl.h ioctl.h ioctls.h ipcbuf.h mman.h msgbuf.h param.h poll.h posix_types.h resource.h sembuf.h setup.h shmbuf.h signal.h socket.h sockios.h stat.h statfs.h swab.h termbits.h termios.h types.h [2] It is redundant to define generic-y when arch-specific implementation exists in arch/$(ARCH)/include/asm/*.h Remove the following generic-y: cacheflush.h module.h Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06kbuild: change filechk to surround the given command with { }Masahiro Yamada
filechk_* rules often consist of multiple 'echo' lines. They must be surrounded with { } or ( ) to work correctly. Otherwise, only the string from the last 'echo' would be written into the target. Let's take care of that in the 'filechk' in scripts/Kbuild.include to clean up filechk_* rules. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06kbuild: remove redundant target cleaning on failureMasahiro Yamada
Since commit 9c2af1c7377a ("kbuild: add .DELETE_ON_ERROR special target"), the target file is automatically deleted on failure. The boilerplate code ... || { rm -f $@; false; } is unneeded. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06jump_label: move 'asm goto' support test to KconfigMasahiro Yamada
Currently, CONFIG_JUMP_LABEL just means "I _want_ to use jump label". The jump label is controlled by HAVE_JUMP_LABEL, which is defined like this: #if defined(CC_HAVE_ASM_GOTO) && defined(CONFIG_JUMP_LABEL) # define HAVE_JUMP_LABEL #endif We can improve this by testing 'asm goto' support in Kconfig, then make JUMP_LABEL depend on CC_HAS_ASM_GOTO. Ugly #ifdef HAVE_JUMP_LABEL will go away, and CONFIG_JUMP_LABEL will match to the real kernel capability. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2019-01-06nds32: remove redundant kernel-space generic-yMasahiro Yamada
This commit removes redundant generic-y defines in arch/nds32/include/asm/Kbuild. [1] It is redundant to define the same generic-y in both arch/$(ARCH)/include/asm/Kbuild and arch/$(ARCH)/include/uapi/asm/Kbuild. Remove the following generic-y: bitsperlong.h bpf_perf_event.h errno.h fcntl.h ioctl.h ioctls.h mman.h shmbuf.h stat.h [2] It is redundant to define generic-y when arch-specific implementation exists in arch/$(ARCH)/include/asm/*.h Remove the following generic-y: ftrace.h Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-01-06nios2: remove unneeded HAS_DMA defineMasahiro Yamada
kernel/dma/Kconfig globally defines HAS_DMA as follows: config HAS_DMA bool depends on !NO_DMA default y Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-01-05Merge tag 'trace-v4.21-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace sh build fix from Steven Rostedt: "It appears that the zero-day bot did find a bug in my sh build. And that I didn't have the bad code in my config file when I cross compiled it, although there are a few other errors in sh that makes it not build for me, I missed that I added one more" * tag 'trace-v4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: sh: ftrace: Fix missing parenthesis in WARN_ON()
2019-01-05Merge branch 'mount.part1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs mount API prep from Al Viro: "Mount API prereqs. Mostly that's LSM mount options cleanups. There are several minor fixes in there, but nothing earth-shattering (leaks on failure exits, mostly)" * 'mount.part1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (27 commits) mount_fs: suppress MAC on MS_SUBMOUNT as well as MS_KERNMOUNT smack: rewrite smack_sb_eat_lsm_opts() smack: get rid of match_token() smack: take the guts of smack_parse_opts_str() into a new helper LSM: new method: ->sb_add_mnt_opt() selinux: rewrite selinux_sb_eat_lsm_opts() selinux: regularize Opt_... names a bit selinux: switch away from match_token() selinux: new helper - selinux_add_opt() LSM: bury struct security_mnt_opts smack: switch to private smack_mnt_opts selinux: switch to private struct selinux_mnt_opts LSM: hide struct security_mnt_opts from any generic code selinux: kill selinux_sb_get_mnt_opts() LSM: turn sb_eat_lsm_opts() into a method nfs_remount(): don't leak, don't ignore LSM options quietly btrfs: sanitize security_mnt_opts use selinux; don't open-code a loop in sb_finish_set_opts() LSM: split ->sb_set_mnt_opts() out of ->sb_kern_mount() new helper: security_sb_eat_lsm_opts() ...
2019-01-05Merge tag 'mips_fixes_4.21_1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A few early MIPS fixes for 4.21: - The Broadcom BCM63xx platform sees a fix for resetting the BCM6368 ethernet switch, and the removal of a platform device we've never had a driver for. - The Alchemy platform sees a few fixes for bitrot that occurred within the past few cycles. - We now enable vectored interrupt support for the MediaTek MT7620 SoC, which makes sense since they're supported by the SoC but in this case also works around a bug relating to the location of exception vectors when using a recent version of U-Boot. - The atomic64_fetch_*_relaxed() family of functions see a fix for a regression in MIPS64 kernels since v4.19. - Cavium Octeon III CN7xxx systems will now disable their RGMII interfaces rather than attempt to enable them & warn about the lack of support for doing so, as they did since initial CN7xxx ethernet support was added in v4.7. - The Microsemi/Microchip MSCC SoCs gain a MAINTAINERS entry. - .mailmap now provides consistency for Dengcheng Zhu's name & current email address" * tag 'mips_fixes_4.21_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: OCTEON: mark RGMII interface disabled on OCTEON III MIPS: Fix a R10000_LLSC_WAR logic in atomic.h MIPS: BCM63XX: drop unused and broken DSP platform device mailmap: Update name spelling and email for Dengcheng Zhu MIPS: ralink: Select CONFIG_CPU_MIPSR2_IRQ_VI on MT7620/8 MAINTAINERS: Add a maintainer for MSCC MIPS SoCs MIPS: Alchemy: update dma masks for devboard devices MIPS: Alchemy: update cpu-feature-overrides MIPS: Alchemy: drop DB1000 IrDA support bits MIPS: alchemy: cpu_all_mask is forbidden for clock event devices MIPS: BCM63XX: fix switch core reset on BCM6368
2019-01-05Merge tag 'powerpc-4.21-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "A fix for the recent access_ok() change, which broke the build. We recently added a use of type in order to squash a warning elsewhere about type being unused. A handful of other minor build fixes, and one defconfig update. Thanks to: Christian Lamparter, Christophe Leroy, Diana Craciun, Mathieu Malaterre" * tag 'powerpc-4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Drop use of 'type' from access_ok() KVM: PPC: Book3S HV: radix: Fix uninitialized var build error powerpc/configs: Add PPC4xx_OCM to ppc40x_defconfig powerpc/4xx/ocm: Fix phys_addr_t printf warnings powerpc/4xx/ocm: Fix compilation error due to PAGE_KERNEL usage powerpc/fsl: Fixed warning: orphan section `__btb_flush_fixup'
2019-01-05Merge branch 'parisc-4.21-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fix from Helge Deller: "Fix boot issues with a series of parisc servers since kernel 4.20. Remapping kernel text with set_kernel_text_rw() missed to remap from lowest up until the highest huge-page aligned kernel text addresss" * 'parisc-4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Remap hugepage-aligned pages in set_kernel_text_rw()
2019-01-05Merge tag 'for-4.21' of git://git.sourceforge.jp/gitroot/uclinux-h8/linuxLinus Torvalds
Pull h8300 fix from Yoshinori Sato: "Build problem fix" * tag 'for-4.21' of git://git.sourceforge.jp/gitroot/uclinux-h8/linux: h8300: pci: Remove local declaration of pcibios_penalize_isa_irq
2019-01-05Merge tag 'armsoc-late' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull more ARM SoC updates from Olof Johansson: "A few updates that we merged late but are low risk for regressions for other platforms (and a few other straggling patches): - I mis-tagged the 'drivers' branch, and missed 3 patches. Merged in here. They're for a driver for the PL353 SRAM controller and a build fix for the qualcomm scm driver. - A new platform, RDA Micro RDA8810PL (Cortex-A5 w/ integrated Vivante GPU, 256MB RAM, Wifi). This includes some acked platform-specific drivers (serial, etc). This also include DTs for two boards with this SoC, OrangePi 2G and OrangePi i86. - i.MX8 is another new platform (NXP, 4x Cortex-A53 + Cortex-M4, 4K video playback offload). This is the first i.MX 64-bit SoC. - Some minor updates to Samsung boards (adding a few peripherals in DTs). - Small rework for SMP bootup on STi platforms. - A couple of TEE driver fixes. - A couple of new config options (bcm2835 thermal, Uniphier MDMAC) enabled in defconfigs" * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (27 commits) ARM: multi_v7_defconfig: enable CONFIG_UNIPHIER_MDMAC arm64: defconfig: Re-enable bcm2835-thermal driver MAINTAINERS: Add entry for RDA Micro SoC architecture tty: serial: Add RDA8810PL UART driver ARM: dts: rda8810pl: Add interrupt support for UART dt-bindings: serial: Document RDA Micro UART ARM: dts: rda8810pl: Add timer support ARM: dts: Add devicetree for OrangePi i96 board ARM: dts: Add devicetree for OrangePi 2G IoT board ARM: dts: Add devicetree for RDA8810PL SoC ARM: Prepare RDA8810PL SoC dt-bindings: arm: Document RDA8810PL and reference boards dt-bindings: Add RDA Micro vendor prefix ARM: sti: remove pen_release and boot_lock arm64: dts: exynos: Add Bluetooth chip to TM2(e) boards arm64: dts: imx8mq-evk: enable watchdog arm64: dts: imx8mq: add watchdog devices MAINTAINERS: add i.MX8 DT path to i.MX architecture arm64: add support for i.MX8M EVK board arm64: add basic DTS for i.MX8MQ ...
2019-01-05Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "I'm safely chained back up to my desk, so please pull these arm64 fixes for -rc1 that address some issues that cropped up during the merge window: - Prevent KASLR from mapping the top page of the virtual address space - Fix device-tree probing of SDEI driver - Fix incorrect register offset definition in Hisilicon DDRC PMU driver - Fix compilation issue with older binutils not liking unsigned immediates - Fix uapi headers so that libc can provide its own sigcontext definition - Fix handling of private compat syscalls - Hook up compat io_pgetevents() syscall for 32-bit tasks - Cleanup to arm64 Makefile (including now to avoid silly conflicts)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: compat: Hook up io_pgetevents() for 32-bit tasks arm64: compat: Don't pull syscall number from regs in arm_compat_syscall arm64: compat: Avoid sending SIGILL for unallocated syscall numbers arm64/sve: Disentangle <uapi/asm/ptrace.h> from <uapi/asm/sigcontext.h> arm64/sve: ptrace: Fix SVE_PT_REGS_OFFSET definition drivers/perf: hisi: Fixup one DDRC PMU register offset arm64: replace arm64-obj-* in Makefile with obj-* arm64: kaslr: Reserve size of ARM64_MEMSTART_ALIGN in linear region firmware: arm_sdei: Fix DT platform device creation firmware: arm_sdei: fix wrong of_node_put() in init function arm64: entry: remove unused register aliases arm64: smp: Fix compilation error
2019-01-05Merge tag 'for-4.21' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: "Included in this update: - Florian Fainelli noticed that userspace segfaults caused by the lack of kernel-userspace helpers was hard to diagnose; we now issue a warning when userspace tries to use the helpers but the kernel has them disabled. - Ben Dooks wants compatibility for the old ATAG serial number with DT systems. - Some cleanup of assembly by Nicolas Pitre. - User accessors optimisation from Vincent Whitchurch. - More robust kdump on SMP systems from Yufen Wang. - Sebastian Andrzej Siewior noticed problems with the SMP "boot_lock" on RT kernels, and so we convert the Versatile series of platforms to use a raw spinlock instead, consolidating the Versatile implementation. We entirely remove the boot_lock on OMAP systems, where it's unnecessary. Further patches for other systems will be submitted for the following merge window. - Start switching old StrongARM-11x0 systems to use gpiolib rather than their private GPIO implementation - mostly PCMCIA bits. - ARM Kconfig cleanups. - Cleanup a mostly harmless mistake in the recent Spectre patch in 4.20 (which had the effect that data that can be placed into the init sections was incorrectly always placed in the rodata section)" * tag 'for-4.21' of git://git.armlinux.org.uk/~rmk/linux-arm: (25 commits) ARM: omap2: remove unnecessary boot_lock ARM: versatile: rename and comment SMP implementation ARM: versatile: convert boot_lock to raw ARM: vexpress/realview: consolidate immitation CPU hotplug ARM: fix the cockup in the previous patch ARM: sa1100/cerf: switch to using gpio_led_register_device() ARM: sa1100/assabet: switch to using gpio leds ARM: sa1100/assabet: add gpio keys support for right-hand two buttons ARM: sa1111: remove legacy GPIO interfaces pcmcia: sa1100*: remove redundant bvd1/bvd2 setting ARM: pxa/lubbock: switch PCMCIA to MAX1600 library ARM: pxa/mainstone: switch PCMCIA to MAX1600 library and gpiod APIs ARM: sa1100/neponset: switch PCMCIA to MAX1600 library and gpiod APIs ARM: sa1100/jornada720: switch PCMCIA to gpiod APIs pcmcia: add MAX1600 library ARM: sa1100: explicitly register sa11x0-pcmcia devices ARM: 8813/1: Make aligned 2-byte getuser()/putuser() atomic on ARMv6+ ARM: 8812/1: Optimise copy_{from/to}_user for !CPU_USE_DOMAINS ARM: 8811/1: always list both ldrd/strd registers explicitly ARM: 8808/1: kexec:offline panic_smp_self_stop CPU ...
2019-01-05Merge tag 'csky-for-linus-4.21' of git://github.com/c-sky/csky-linuxLinus Torvalds
Pull arch/csky updates from Guo Ren: "Here are three main features (cpu_hotplug, basic ftrace, basic perf) and some bugfixes: Features: - Add CPU-hotplug support for SMP - Add ftrace with function trace and function graph trace - Add Perf support - Add EM_CSKY_OLD 39 - optimize kernel panic print. - remove syscall_exit_work Bugfixes: - fix abiv2 mmap(... O_SYNC) failure - fix gdb coredump error - remove vdsp implement for kernel - fix qemu failure to bootup sometimes - fix ftrace call-graph panic - fix device tree node reference leak - remove meaningless header-y - fix save hi,lo,dspcr regs in switch_stack - remove unused members in processor.h" * tag 'csky-for-linus-4.21' of git://github.com/c-sky/csky-linux: csky: Add perf support for C-SKY csky: Add EM_CSKY_OLD 39 clocksource/drivers/c-sky: fixup ftrace call-graph panic csky: ftrace call graph supported. csky: basic ftrace supported csky: remove unused members in processor.h csky: optimize kernel panic print. csky: stacktrace supported. csky: CPU-hotplug supported for SMP clocksource/drivers/c-sky: fixup qemu fail to bootup sometimes. csky: fixup save hi,lo,dspcr regs in switch_stack. csky: remove syscall_exit_work csky: fixup remove vdsp implement for kernel. csky: bugfix gdb coredump error. csky: fixup abiv2 mmap(... O_SYNC) failed. csky: define syscall_get_arch() elf-em.h: add EM_CSKY csky: remove meaningless header-y csky: Don't leak device tree node reference
2019-01-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: - procfs updates - various misc bits - lib/ updates - epoll updates - autofs - fatfs - a few more MM bits * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (58 commits) mm/page_io.c: fix polled swap page in checkpatch: add Co-developed-by to signature tags docs: fix Co-Developed-by docs drivers/base/platform.c: kmemleak ignore a known leak fs: don't open code lru_to_page() fs/: remove caller signal_pending branch predictions mm/: remove caller signal_pending branch predictions arch/arc/mm/fault.c: remove caller signal_pending_branch predictions kernel/sched/: remove caller signal_pending branch predictions kernel/locking/mutex.c: remove caller signal_pending branch predictions mm: select HAVE_MOVE_PMD on x86 for faster mremap mm: speed up mremap by 20x on large regions mm: treewide: remove unused address argument from pte_alloc functions initramfs: cleanup incomplete rootfs scripts/gdb: fix lx-version string output kernel/kcov.c: mark write_comp_data() as notrace kernel/sysctl: add panic_print into sysctl panic: add options to print system info when panic happens bfs: extra sanity checking and static inode bitmap exec: separate MM_ANONPAGES and RLIMIT_STACK accounting ...
2019-01-05x86/amd_gart: fix unmapping of non-GART mappingsChristoph Hellwig
In many cases we don't have to create a GART mapping at all, which also means there is nothing to unmap. Fix the range check that was incorrectly modified when removing the mapping_error method. Fixes: 9e8aa6b546 ("x86/amd_gart: remove the mapping_error dma_map_ops method") Reported-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Michal Kubecek <mkubecek@suse.cz>
2019-01-04ia64: fix compile without swiotlbChristoph Hellwig
Some non-generic ia64 configs don't build swiotlb, and thus should not pull in the generic non-coherent DMA infrastructure. Fixes: 68c608345c ("swiotlb: remove dma_mark_clean") Reported-by: Tony Luck <tony.luck@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04x86: re-introduce non-generic memcpy_{to,from}ioLinus Torvalds
This has been broken forever, and nobody ever really noticed because it's purely a performance issue. Long long ago, in commit 6175ddf06b61 ("x86: Clean up mem*io functions") Brian Gerst simplified the memory copies to and from iomem, since on x86, the instructions to access iomem are exactly the same as the regular instructions. That is technically true, and things worked, and nobody said anything. Besides, back then the regular memcpy was pretty simple and worked fine. Nobody noticed except for David Laight, that is. David has a testing a TLP monitor he was writing for an FPGA, and has been occasionally complaining about how memcpy_toio() writes things one byte at a time. Which is completely unacceptable from a performance standpoint, even if it happens to technically work. The reason it's writing one byte at a time is because while it's technically true that accesses to iomem are the same as accesses to regular memory on x86, the _granularity_ (and ordering) of accesses matter to iomem in ways that they don't matter to regular cached memory. In particular, when ERMS is set, we default to using "rep movsb" for larger memory copies. That is indeed perfectly fine for real memory, since the whole point is that the CPU is going to do cacheline optimizations and executes the memory copy efficiently for cached memory. With iomem? Not so much. With iomem, "rep movsb" will indeed work, but it will copy things one byte at a time. Slowly and ponderously. Now, originally, back in 2010 when commit 6175ddf06b61 was done, we didn't use ERMS, and this was much less noticeable. Our normal memcpy() was simpler in other ways too. Because in fact, it's not just about using the string instructions. Our memcpy() these days does things like "read and write overlapping values" to handle the last bytes of the copy. Again, for normal memory, overlapping accesses isn't an issue. For iomem? It can be. So this re-introduces the specialized memcpy_toio(), memcpy_fromio() and memset_io() functions. It doesn't particularly optimize them, but it tries to at least not be horrid, or do overlapping accesses. In fact, this uses the existing __inline_memcpy() function that we still had lying around that uses our very traditional "rep movsl" loop followed by movsw/movsb for the final bytes. Somebody may decide to try to improve on it, but if we've gone almost a decade with only one person really ever noticing and complaining, maybe it's not worth worrying about further, once it's not _completely_ broken? Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04Use __put_user_goto in __put_user_size() and unsafe_put_user()Linus Torvalds
This actually enables the __put_user_goto() functionality in unsafe_put_user(). For an example of the effect of this, this is the code generated for the unsafe_put_user(signo, &infop->si_signo, Efault); in the waitid() system call: movl %ecx,(%rbx) # signo, MEM[(struct __large_struct *)_2] It's just one single store instruction, along with generating an exception table entry pointing to the Efault label case in case that instruction faults. Before, we would generate this: xorl %edx, %edx movl %ecx,(%rbx) # signo, MEM[(struct __large_struct *)_3] testl %edx, %edx jne .L309 with the exception table generated for that 'mov' instruction causing us to jump to a stub that set %edx to -EFAULT and then jumped back to the 'testl' instruction. So not only do we now get rid of the extra code in the normal sequence, we also avoid unnecessarily keeping that extra error register live across it all. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04x86 uaccess: Introduce __put_user_gotoLinus Torvalds
This is finally the actual reason for the odd error handling in the "unsafe_get/put_user()" functions, introduced over three years ago. Using a "jump to error label" interface is somewhat odd, but very convenient as a programming interface, and more importantly, it fits very well with simply making the target be the exception handler address directly from the inline asm. The reason it took over three years to actually do this? We need "asm goto" support for it, which only became the default on x86 last year. It's now been a year that we've forced asm goto support (see commit e501ce957a78 "x86: Force asm-goto"), and so let's just do it here too. [ Side note: this commit was originally done back in 2016. The above commentary about timing is obviously about it only now getting merged into my real upstream tree - Linus ] Sadly, gcc still only supports "asm goto" with asms that do not have any outputs, so we are limited to only the put_user case for this. Maybe in several more years we can do the get_user case too. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-05parisc: Remap hugepage-aligned pages in set_kernel_text_rw()Helge Deller
The alternative coding patch for parisc in kernel 4.20 broke booting machines with PA8500-PA8700 CPUs. The problem is, that for such machines the parisc kernel automatically utilizes huge pages to access kernel text code, but the set_kernel_text_rw() function, which is used shortly before applying any alternative patches, didn't used the correctly hugepage-aligned addresses to remap the kernel text read-writeable. Fixes: 3847dab77421 ("parisc: Add alternative coding infrastructure") Cc: <stable@vger.kernel.org> [4.20] Signed-off-by: Helge Deller <deller@gmx.de>
2019-01-04Merge branch 'next/drivers' into next/lateOlof Johansson
Merge in a few missing patches from the pull request (my copy of the branch was behind the staged version in linux-next). * next/drivers: memory: pl353: Add driver for arm pl353 static memory controller dt-bindings: memory: Add pl353 smc controller devicetree binding information firmware: qcom: scm: fix compilation error when disabled Signed-off-by: Olof Johansson <olof@lixom.net>
2019-01-04ARM: multi_v7_defconfig: enable CONFIG_UNIPHIER_MDMACMasahiro Yamada
Enable the UniPhier MIO DMAC driver. This is used as the DMA engine for accelerating the SD/eMMC controller drivers. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Olof Johansson <olof@lixom.net>
2019-01-04arch/arc/mm/fault.c: remove caller signal_pending_branch predictionsDavidlohr Bueso
This is already done for us internally by the signal machinery. Link: http://lkml.kernel.org/r/20181116002713.8474-4-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04mm: select HAVE_MOVE_PMD on x86 for faster mremapJoel Fernandes (Google)
Moving page-tables at the PMD-level on x86 is known to be safe. Enable this option so that we can do fast mremap when possible. Link: http://lkml.kernel.org/r/20181108181201.88826-4-joelaf@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Michal Hocko <mhocko@kernel.org> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04mm: speed up mremap by 20x on large regionsJoel Fernandes (Google)
Android needs to mremap large regions of memory during memory management related operations. The mremap system call can be really slow if THP is not enabled. The bottleneck is move_page_tables, which is copying each pte at a time, and can be really slow across a large map. Turning on THP may not be a viable option, and is not for us. This patch speeds up the performance for non-THP system by copying at the PMD level when possible. The speedup is an order of magnitude on x86 (~20x). On a 1GB mremap, the mremap completion times drops from 3.4-3.6 milliseconds to 144-160 microseconds. Before: Total mremap time for 1GB data: 3521942 nanoseconds. Total mremap time for 1GB data: 3449229 nanoseconds. Total mremap time for 1GB data: 3488230 nanoseconds. After: Total mremap time for 1GB data: 150279 nanoseconds. Total mremap time for 1GB data: 144665 nanoseconds. Total mremap time for 1GB data: 158708 nanoseconds. If THP is enabled the optimization is mostly skipped except in certain situations. [joel@joelfernandes.org: fix 'move_normal_pmd' unused function warning] Link: http://lkml.kernel.org/r/20181108224457.GB209347@google.com Link: http://lkml.kernel.org/r/20181108181201.88826-3-joelaf@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Reviewed-by: William Kucharski <william.kucharski@oracle.com> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Michal Hocko <mhocko@kernel.org> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04mm: treewide: remove unused address argument from pte_alloc functionsJoel Fernandes (Google)
Patch series "Add support for fast mremap". This series speeds up the mremap(2) syscall by copying page tables at the PMD level even for non-THP systems. There is concern that the extra 'address' argument that mremap passes to pte_alloc may do something subtle architecture related in the future that may make the scheme not work. Also we find that there is no point in passing the 'address' to pte_alloc since its unused. This patch therefore removes this argument tree-wide resulting in a nice negative diff as well. Also ensuring along the way that the enabled architectures do not do anything funky with the 'address' argument that goes unnoticed by the optimization. Build and boot tested on x86-64. Build tested on arm64. The config enablement patch for arm64 will be posted in the future after more testing. The changes were obtained by applying the following Coccinelle script. (thanks Julia for answering all Coccinelle questions!). Following fix ups were done manually: * Removal of address argument from pte_fragment_alloc * Removal of pte_alloc_one_fast definitions from m68k and microblaze. // Options: --include-headers --no-includes // Note: I split the 'identifier fn' line, so if you are manually // running it, please unsplit it so it runs for you. virtual patch @pte_alloc_func_def depends on patch exists@ identifier E2; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; type T2; @@ fn(... - , T2 E2 ) { ... } @pte_alloc_func_proto_noarg depends on patch exists@ type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1, T2); + T3 fn(T1); | - T3 fn(T1, T2, T4); + T3 fn(T1, T2); ) @pte_alloc_func_proto depends on patch exists@ identifier E1, E2, E4; type T1, T2, T3, T4; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; @@ ( - T3 fn(T1 E1, T2 E2); + T3 fn(T1 E1); | - T3 fn(T1 E1, T2 E2, T4 E4); + T3 fn(T1 E1, T2 E2); ) @pte_alloc_func_call depends on patch exists@ expression E2; identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; @@ fn(... -, E2 ) @pte_alloc_macro depends on patch exists@ identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$"; identifier a, b, c; expression e; position p; @@ ( - #define fn(a, b, c) e + #define fn(a, b) e | - #define fn(a, b) e + #define fn(a) e ) Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Suggested-by: Kirill A. Shutemov <kirill@shutemov.name> Acked-by: Kirill A. Shutemov <kirill@shutemov.name> Cc: Michal Hocko <mhocko@kernel.org> Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: William Kucharski <william.kucharski@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04fls: change parameter to unsigned intMatthew Wilcox
When testing in userspace, UBSAN pointed out that shifting into the sign bit is undefined behaviour. It doesn't really make sense to ask for the highest set bit of a negative value, so just turn the argument type into an unsigned int. Some architectures (eg ppc) already had it declared as an unsigned int, so I don't expect too many problems. Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org Signed-off-by: Matthew Wilcox <willy@infradead.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04make 'user_access_begin()' do 'access_ok()'Linus Torvalds
Originally, the rule used to be that you'd have to do access_ok() separately, and then user_access_begin() before actually doing the direct (optimized) user access. But experience has shown that people then decide not to do access_ok() at all, and instead rely on it being implied by other operations or similar. Which makes it very hard to verify that the access has actually been range-checked. If you use the unsafe direct user accesses, hardware features (either SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged Access Never - on ARM) do force you to use user_access_begin(). But nothing really forces the range check. By putting the range check into user_access_begin(), we actually force people to do the right thing (tm), and the range check vill be visible near the actual accesses. We have way too long a history of people trying to avoid them. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-04Fix access_ok() fallout for sparc32 and powerpcLinus Torvalds
These two architectures actually had an intentional use of the 'type' argument to access_ok() just to avoid warnings. I had actually noticed the powerpc one, but forgot to then fix it up. And I missed the sparc32 case entirely. This is hopefully all of it. Reported-by: Mathieu Malaterre <malat@debian.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: 96d4f267e40f ("Remove 'type' argument from access_ok() function") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>