aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-04-02mm/gup: split get_user_pages_remote() into two routinesJohn Hubbard
Patch series "mm/gup: track FOLL_PIN pages", v6. This activates tracking of FOLL_PIN pages. This is in support of fixing the get_user_pages()+DMA problem described in [1]-[4]. FOLL_PIN support is now in the main linux tree. However, the patch to use FOLL_PIN to track pages was *not* submitted, because Leon saw an RDMA test suite failure that involved (I think) page refcount overflows when huge pages were used. This patch definitively solves that kind of overflow problem, by adding an exact pincount, for compound pages (of order > 1), in the 3rd struct page of a compound page. If available, that form of pincounting is used, instead of the GUP_PIN_COUNTING_BIAS approach. Thanks again to Jan Kara for that idea. Other interesting changes: * dump_page(): added one, or two new things to report for compound pages: head refcount (for all compound pages), and map_pincount (for compound pages of order > 1). * Documentation/core-api/pin_user_pages.rst: removed the "TODO" for the huge page refcount upper limit problems, and added notes about how it works now. Also added a note about the dump_page() enhancements. * Added some comments in gup.c and mm.h, to explain that there are two ways to count pinned pages: exact (for compound pages of order > 1) and fuzzy (GUP_PIN_COUNTING_BIAS: for all other pages). ============================================================ General notes about the tracking patch: This is a prerequisite to solving the problem of proper interactions between file-backed pages, and [R]DMA activities, as discussed in [1], [2], [3], [4] and in a remarkable number of email threads since about 2017. :) In contrast to earlier approaches, the page tracking can be incrementally applied to the kernel call sites that, until now, have been simply calling get_user_pages() ("gup"). In other words, opt-in by changing from this: get_user_pages() (sets FOLL_GET) put_page() to this: pin_user_pages() (sets FOLL_PIN) unpin_user_page() ============================================================ Future steps: * Convert more subsystems from get_user_pages() to pin_user_pages(). The first probably needs to be bio/biovecs, because any filesystem testing is too difficult without those in place. * Change VFS and filesystems to respond appropriately when encountering dma-pinned pages. * Work with Ira and others to connect this all up with file system leases. [1] Some slow progress on get_user_pages() (Apr 2, 2019): https://lwn.net/Articles/784574/ [2] DMA and get_user_pages() (LPC: Dec 12, 2018): https://lwn.net/Articles/774411/ [3] The trouble with get_user_pages() (Apr 30, 2018): https://lwn.net/Articles/753027/ [4] LWN kernel index: get_user_pages() https://lwn.net/Kernel/Index/#Memory_management-get_user_pages This patch (of 12): An upcoming patch requires reusing the implementation of get_user_pages_remote(). Split up get_user_pages_remote() into an outer routine that checks flags, and an implementation routine that will be reused. This makes subsequent changes much easier to understand. There should be no change in behavior due to this patch. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: http://lkml.kernel.org/r/20200211001536.1027652-2-jhubbard@nvidia.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/filemap.c: rewrite pagecache_get_page documentationMatthew Wilcox (Oracle)
- These were never called PCG flags; they've been called FGP flags since their introduction in 2014. - The FGP_FOR_MMAP flag was misleadingly documented as if it was an alternative to FGP_CREAT instead of an option to it. - Rename the 'offset' parameter to 'index'. - Capitalisation, formatting, rewording. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/20200318140253.6141-9-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/filemap.c: unexport find_get_entryMatthew Wilcox (Oracle)
No in-tree users (proc, madvise, memcg, mincore) can be built as a module. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/20200318140253.6141-8-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/page-writeback.c: use VM_BUG_ON_PAGE in clear_page_dirty_for_ioMatthew Wilcox (Oracle)
Dumping the page information in this circumstance helps for debugging. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/20200318140253.6141-7-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02include/linux/pagemap.h: rename arguments to find_subpageMatthew Wilcox (Oracle)
This isn't just a random struct page, it's known to be a head page, and calling it head makes the function better self-documenting. The pgoff_t is less confusing if it's named index instead of offset. Also add a couple of comments to explain why we're doing various things. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20200318140253.6141-3-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/filemap.c: use vm_fault error code directlyMatthew Wilcox (Oracle)
Use VM_FAULT_OOM instead of indirecting through vmf_error(-ENOMEM). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Link: http://lkml.kernel.org/r/20200318140253.6141-2-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/filemap.c: remove unused argument from shrink_readahead_size_eio()Souptick Joarder
The first argument of shrink_readahead_size_eio() is not used. Hence remove it from the function definition and from all the callers. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1583868093-24342-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/filemap.c: clear page error before actual readXianting Tian
Mount failure issue happens under the scenario: Application forked dozens of threads to mount the same number of cramfs images separately in docker, but several mounts failed with high probability. Mount failed due to the checking result of the page(read from the superblock of loop dev) is not uptodate after wait_on_page_locked(page) returned in function cramfs_read: wait_on_page_locked(page); if (!PageUptodate(page)) { ... } The reason of the checking result of the page not uptodate: systemd-udevd read the loopX dev before mount, because the status of loopX is Lo_unbound at this time, so loop_make_request directly trigger the calling of io_end handler end_buffer_async_read, which called SetPageError(page). So It caused the page can't be set to uptodate in function end_buffer_async_read: if(page_uptodate && !PageError(page)) { SetPageUptodate(page); } Then mount operation is performed, it used the same page which is just accessed by systemd-udevd above, Because this page is not uptodate, it will launch a actual read via submit_bh, then wait on this page by calling wait_on_page_locked(page). When the I/O of the page done, io_end handler end_buffer_async_read is called, because no one cleared the page error(during the whole read path of mount), which is caused by systemd-udevd reading, so this page is still in "PageError" status, which can't be set to uptodate in function end_buffer_async_read, then caused mount failure. But sometimes mount succeed even through systemd-udeved read loopX dev just before, The reason is systemd-udevd launched other loopX read just between step 3.1 and 3.2, the steps as below: 1, loopX dev default status is Lo_unbound; 2, systemd-udved read loopX dev (page is set to PageError); 3, mount operation 1) set loopX status to Lo_bound; ==>systemd-udevd read loopX dev<== 2) read loopX dev(page has no error) 3) mount succeed As the loopX dev status is set to Lo_bound after step 3.1, so the other loopX dev read by systemd-udevd will go through the whole I/O stack, part of the call trace as below: SYS_read vfs_read do_sync_read blkdev_aio_read generic_file_aio_read do_generic_file_read: ClearPageError(page); mapping->a_ops->readpage(filp, page); here, mapping->a_ops->readpage() is blkdev_readpage. In latest kernel, some function name changed, the call trace as below: blkdev_read_iter generic_file_read_iter generic_file_buffered_read: /* * A previous I/O error may have been due to temporary * failures, eg. mutipath errors. * Pg_error will be set again if readpage fails. */ ClearPageError(page); /* Start the actual read. The read will unlock the page*/ error=mapping->a_ops->readpage(flip, page); We can see ClearPageError(page) is called before the actual read, then the read in step 3.2 succeed. This patch is to add the calling of ClearPageError just before the actual read of read path of cramfs mount. Without the patch, the call trace as below when performing cramfs mount: do_mount cramfs_read cramfs_blkdev_read read_cache_page do_read_cache_page: filler(data, page); or mapping->a_ops->readpage(data, page); With the patch, the call trace as below when performing mount: do_mount cramfs_read cramfs_blkdev_read read_cache_page: do_read_cache_page: ClearPageError(page); <== new add filler(data, page); or mapping->a_ops->readpage(data, page); With the patch, mount operation trigger the calling of ClearPageError(page) before the actual read, the page has no error if no additional page error happen when I/O done. Signed-off-by: Xianting Tian <xianting_tian@126.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Jan Kara <jack@suse.cz> Cc: <yubin@h3c.com> Link: http://lkml.kernel.org/r/1583318844-22971-1-git-send-email-xianting_tian@126.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/page-writeback.c: write_cache_pages(): deduplicate identical checksMauricio Faria de Oliveira
There used to be a 'retry' label in between the two (identical) checks when first introduced in commit f446daaea9d4 ("mm: implement writeback livelock avoidance using page tagging"), and later modified/updated in commit 6e6938b6d313 ("writeback: introduce .tagged_writepages for the WB_SYNC_NONE sync stage"). The label has been removed in commit 64081362e8ff ("mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock"), and the (identical) checks are now present / performed immediately one after another. So, remove/deduplicate the latter check, moving tag_pages_for_writeback() into the former check before the 'tag' variable assignment, so it's clear that it's not used in this (similarly-named) function call but only later in pagevec_lookup_range_tag(). Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Kara <jack@suse.cz> Link: http://lkml.kernel.org/r/20200218221716.1648-1-mfo@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/filemap.c: don't bother dropping mmap_sem for zero size readaheadJan Kara
When handling a page fault, we drop mmap_sem to start async readahead so that we don't block on IO submission with mmap_sem held. However there's no point to drop mmap_sem in case readahead is disabled. Handle that case to avoid pointless dropping of mmap_sem and retrying the fault. This was actually reported to block mlockall(MCL_CURRENT) indefinitely. Fixes: 6b4c9f446981 ("filemap: drop the mmap_sem for all blocking operations") Reported-by: Minchan Kim <minchan@kernel.org> Reported-by: Robert Stupp <snazy@gmx.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Minchan Kim <minchan@kernel.org> Link: http://lkml.kernel.org/r/20200212101356.30759-1-jack@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/Makefile: disable KCSAN for kmemleakQian Cai
Kmemleak could scan task stacks while plain writes happens to those stack variables which could results in data races. For example, in sys_rt_sigaction and do_sigaction(), it could have plain writes in a 32-byte size. Since the kmemleak does not care about the actual values of a non-pointer and all do_sigaction() call sites only copy to stack variables, just disable KCSAN for kmemleak to avoid annotating anything outside Kmemleak just because Kmemleak scans everything. Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Marco Elver <elver@google.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: http://lkml.kernel.org/r/1583263716-25150-1-git-send-email-cai@lca.pw Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/kmemleak.c: use address-of operator on section symbolsNathan Chancellor
Clang warns: mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare] if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata) ^ mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare] if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata) These are not true arrays, they are linker defined symbols, which are just addresses. Using the address of operator silences the warning and does not change the resulting assembly with either clang/ld.lld or gcc/ld (tested with diff + objdump -Dr). Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://github.com/ClangBuiltLinux/linux/issues/895 Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02revert "topology: add support for node_to_mem_node() to determine the ↵Vlastimil Babka
fallback node" This reverts commit ad2c8144418c6a81cefe65379fd47bbe8344cef2. The function node_to_mem_node() was introduced by that commit for use in SLUB on systems with memoryless nodes, but it turned out to be unreliable on some architectures/configurations and a simpler solution exists than fixing it up. Thus commit 0715e6c516f1 ("mm, slub: prevent kmalloc_node crashes and memory leaks") removed the only user of node_to_mem_node() and we can revert the commit that introduced the function. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Christopher Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: PUVICHAKRAVARTHY RAMACHANDRAN <puvichakravarthy@in.ibm.com> Cc: Sachin Sant <sachinp@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20200320115533.9604-2-vbabka@suse.cz Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02slub: relocate freelist pointer to middle of objectKees Cook
In a recent discussion[1] with Vitaly Nikolenko and Silvio Cesare, it became clear that moving the freelist pointer away from the edge of allocations would likely improve the overall defensive posture of the inline freelist pointer. My benchmarks show no meaningful change to performance (they seem to show it being faster), so this looks like a reasonable change to make. Instead of having the freelist pointer at the very beginning of an allocation (offset 0) or at the very end of an allocation (effectively offset -sizeof(void *) from the next allocation), move it away from the edges of the allocation and into the middle. This provides some protection against small-sized neighboring overflows (or underflows), for which the freelist pointer is commonly the target. (Large or well controlled overwrites are much more likely to attack live object contents, instead of attempting freelist corruption.) The vaunted kernel build benchmark, across 5 runs. Before: Mean: 250.05 Std Dev: 1.85 and after, which appears mysteriously faster: Mean: 247.13 Std Dev: 0.76 Attempts at running "sysbench --test=memory" show the change to be well in the noise (sysbench seems to be pretty unstable here -- it's not really measuring allocation). Hackbench is more allocation-heavy, and while the std dev is above the difference, it looks like may manifest as an improvement as well: 20 runs of "hackbench -g 20 -l 1000", before: Mean: 36.322 Std Dev: 0.577 and after: Mean: 36.056 Std Dev: 0.598 [1] https://twitter.com/vnik5287/status/1235113523098685440 Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christoph Lameter <cl@linux.com> Cc: Vitaly Nikolenko <vnik@duasynt.com> Cc: Silvio Cesare <silvio.cesare@gmail.com> Cc: Christoph Lameter <cl@linux.com>Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/202003051624.AAAC9AECC@keescook Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02slub: improve bit diffusion for freelist ptr obfuscationKees Cook
Under CONFIG_SLAB_FREELIST_HARDENED=y, the obfuscation was relatively weak in that the ptr and ptr address were usually so close that the first XOR would result in an almost entirely 0-byte value[1], leaving most of the "secret" number ultimately being stored after the third XOR. A single blind memory content exposure of the freelist was generally sufficient to learn the secret. Add a swab() call to mix bits a little more. This is a cheap way (1 cycle) to make attacks need more than a single exposure to learn the secret (or to know _where_ the exposure is in memory). kmalloc-32 freelist walk, before: ptr ptr_addr stored value secret ffff90c22e019020@ffff90c22e019000 is 86528eb656b3b5bd (86528eb656b3b59d) ffff90c22e019040@ffff90c22e019020 is 86528eb656b3b5fd (86528eb656b3b59d) ffff90c22e019060@ffff90c22e019040 is 86528eb656b3b5bd (86528eb656b3b59d) ffff90c22e019080@ffff90c22e019060 is 86528eb656b3b57d (86528eb656b3b59d) ffff90c22e0190a0@ffff90c22e019080 is 86528eb656b3b5bd (86528eb656b3b59d) ... after: ptr ptr_addr stored value secret ffff9eed6e019020@ffff9eed6e019000 is 793d1135d52cda42 (86528eb656b3b59d) ffff9eed6e019040@ffff9eed6e019020 is 593d1135d52cda22 (86528eb656b3b59d) ffff9eed6e019060@ffff9eed6e019040 is 393d1135d52cda02 (86528eb656b3b59d) ffff9eed6e019080@ffff9eed6e019060 is 193d1135d52cdae2 (86528eb656b3b59d) ffff9eed6e0190a0@ffff9eed6e019080 is f93d1135d52cdac2 (86528eb656b3b59d) [1] https://blog.infosectcbr.com.au/2020/03/weaknesses-in-linux-kernel-heap.html Fixes: 2482ddec670f ("mm: add SLUB free list pointer obfuscation") Reported-by: Silvio Cesare <silvio.cesare@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/202003051623.AF4F8CB@keescook Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/slub.c: replace kmem_cache->cpu_partial with wrapped APIschenqiwu
There are slub_cpu_partial() and slub_set_cpu_partial() APIs to wrap kmem_cache->cpu_partial. This patch will use the two APIs to replace kmem_cache->cpu_partial in slub code. Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/1582079562-17980-1-git-send-email-qiwuchen55@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02mm/slub.c: replace cpu_slab->partial with wrapped APIschenqiwu
There are slub_percpu_partial() and slub_set_percpu_partial() APIs to wrap kmem_cache->cpu_partial. This patch will use the two to replace cpu_slab->partial in slub code. Signed-off-by: chenqiwu <chenqiwu@xiaomi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/1581951895-3038-1-git-send-email-qiwuchen55@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02fs_parse: remove pr_notice() about each validationKees Cook
This notice fills my boot logs with scary-looking asterisks but doesn't really tell me anything. Let's just remove it; validation errors are already reported separately, so this is just a redundant list of filesystems. $ dmesg | grep VALIDATE [ 0.306256] *** VALIDATE tmpfs *** [ 0.307422] *** VALIDATE proc *** [ 0.308355] *** VALIDATE cgroup *** [ 0.308741] *** VALIDATE cgroup2 *** [ 0.813256] *** VALIDATE bpf *** [ 0.815272] *** VALIDATE ramfs *** [ 0.815665] *** VALIDATE hugetlbfs *** [ 0.876970] *** VALIDATE nfs *** [ 0.877383] *** VALIDATE nfs4 *** Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Seth Arnold <seth.arnold@canonical.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/202003061617.A8835CAAF@keescook Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: use memalloc_nofs_save instead of memalloc_noio_saveMatthew Wilcox (Oracle)
OCFS2 doesn't mind if memory reclaim makes I/Os happen; it just cares that it won't be reentered, so it can use memalloc_nofs_save() instead of memalloc_noio_save(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200326200214.1102-1-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: use scnprintf() for avoiding potential buffer overflowTakashi Iwai
Since snprintf() returns the would-be-output size instead of the actual output size, the succeeding calls may go beyond the given buffer limit. Fix it by replacing with scnprintf(). Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200311093516.25300-1-tiwai@suse.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: roll back the reference count modification of the parent directory if ↵wangjian
an error occurs Under some conditions, the directory cannot be deleted. The specific scenarios are as follows: (for example, /mnt/ocfs2 is the mount point) 1. Create the /mnt/ocfs2/p_dir directory. At this time, the i_nlink corresponding to the inode of the /mnt/ocfs2/p_dir directory is equal to 2. 2. During the process of creating the /mnt/ocfs2/p_dir/s_dir directory, if the call to the inc_nlink function in ocfs2_mknod succeeds, the functions such as ocfs2_init_acl, ocfs2_init_security_set, and ocfs2_dentry_attach_lock fail. At this time, the i_nlink corresponding to the inode of the /mnt/ocfs2/p_dir directory is equal to 3, but /mnt/ocfs2/p_dir/s_dir is not added to the /mnt/ocfs2/p_dir directory entry. 3. Delete the /mnt/ocfs2/p_dir directory (rm -rf /mnt/ocfs2/p_dir). At this time, it is found that the i_nlink corresponding to the inode corresponding to the /mnt/ocfs2/p_dir directory is equal to 3. Therefore, the /mnt/ocfs2/p_dir directory cannot be deleted. Signed-off-by: Jian wang <wangjian161@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jun Piao <piaojun@huawei.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/a44f6666-bbc4-405e-0e6c-0f4e922eeef6@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: ocfs2_fs.h: replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://urldefense.com/v3/__https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html__;!!GqivPVa7Brio!OKPotRhYhHbCG2kibo8Q6_6CuKaa28d_74h1svxyR6rbshrK2L_BdrQpNbvJWBWb40QCkg$ [2] https://urldefense.com/v3/__https://github.com/KSPP/linux/issues/21__;!!GqivPVa7Brio!OKPotRhYhHbCG2kibo8Q6_6CuKaa28d_74h1svxyR6rbshrK2L_BdrQpNbvJWBUhNn9M6g$ [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200309202155.GA8432@embeddedor Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: dlm: replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://urldefense.com/v3/__https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html__;!!GqivPVa7Brio!OVOYL_CouISa5L1Lw-20EEFQntw6cKMx-j8UdY4z78uYgzKBUFcfpn50GaurvbV5v7YiUA$ [2] https://urldefense.com/v3/__https://github.com/KSPP/linux/issues/21__;!!GqivPVa7Brio!OVOYL_CouISa5L1Lw-20EEFQntw6cKMx-j8UdY4z78uYgzKBUFcfpn50GaurvbXs8Eh8eg$ [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200309202016.GA8210@embeddedor Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: cluster: replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://urldefense.com/v3/__https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html__;!!GqivPVa7Brio!NzMr-YRl2zy-K3lwLVVatz7x0uD2z7-ykQag4GrGigxmfWU8TWzDy6xrkTiW3hYl00czlw$ [2] https://urldefense.com/v3/__https://github.com/KSPP/linux/issues/21__;!!GqivPVa7Brio!NzMr-YRl2zy-K3lwLVVatz7x0uD2z7-ykQag4GrGigxmfWU8TWzDy6xrkTiW3hYHG1nAnw$ [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200309201907.GA8005@embeddedor Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: replace zero-length array with flexible-array memberGustavo A. R. Silva
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200213160244.GA6088@embeddedor Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: add missing annotations for ocfs2_refcount_cache_lock() and ↵Jules Irenge
ocfs2_refcount_cache_unlock() Sparse reports warnings at ocfs2_refcount_cache_lock() and ocfs2_refcount_cache_unlock() warning: context imbalance in ocfs2_refcount_cache_lock() - wrong count at exit warning: context imbalance in ocfs2_refcount_cache_unlock() - unexpected unlock The root cause is the missing annotation at ocfs2_refcount_cache_lock() and at ocfs2_refcount_cache_unlock() Add the missing __acquires(&rf->rf_lock) annotation to ocfs2_refcount_cache_lock() Add the missing __releases(&rf->rf_lock) annotation to ocfs2_refcount_cache_unlock() Signed-off-by: Jules Irenge <jbi.octave@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/20200224204130.18178-1-jbi.octave@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: remove useless errAlex Shi
We don't need 'err' in these 2 places, better to remove them. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: ChenGang <cg.chen@huawei.com> Cc: Richard Fontana <rfontana@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1579577836-251879-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: correct annotation from "l_next_rec" to "l_next_free_rec"wangyan
Correct annotation from "l_next_rec" to "l_next_free_rec" Signed-off-by: Yan Wang <wangyan122@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jun Piao <piaojun@huawei.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Link: http://lkml.kernel.org/r/5e76c953-3479-1280-023c-ad05e4c75608@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: there is no need to log twice in several functionswangyan
There is no need to log twice in several functions. Signed-off-by: Yan Wang <wangyan122@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jun Piao <piaojun@huawei.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Link: http://lkml.kernel.org/r/77eec86a-f634-5b98-4f7d-0cd15185a37b@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: remove dlm_lock_is_remoteAlex Shi
This macro has been unused since it was introduced. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/1579578203-254451-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: use OCFS2_SEC_BITS in macroAlex Shi
This macro should be used. Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/1579577840-251956-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: remove unused macrosAlex Shi
O2HB_DEFAULT_BLOCK_BITS/DLM_THREAD_MAX_ASTS/DLM_MIGRATION_RETRY_MS and OCFS2_MAX_RESV_WINDOW_BITS/OCFS2_MIN_RESV_WINDOW_BITS have been unused since commit 66effd3c6812 ("ocfs2/dlm: Do not migrate resource to a node that is leaving the domain"). Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: ChenGang <cg.chen@huawei.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Richard Fontana <rfontana@redhat.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joseph Qi <jiangqi903@gmail.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/1579577827-251796-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02ocfs2: remove FS_OCFS2_NMAlex Shi
This macro is unused since commit ab09203e302b ("sysctl fs: Remove dead binary sysctl support"). Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Link: http://lkml.kernel.org/r/1579577812-251572-1-git-send-email-alex.shi@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02scripts/spelling.txt: add more spellings to spelling.txtColin Ian King
Here are some of the more common spelling mistakes and typos that I've found while fixing up spelling mistakes in the kernel since November 2019 Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Joe Perches <joe@perches.com> Link: http://lkml.kernel.org/r/20200313174946.228216-1-colin.king@canonical.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02scripts/spelling.txt: add syfs/sysfs patternJonathan Neuschäfer
There are a few cases in the tree where "sysfs" is misspelled as "syfs". Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.ne> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Colin Ian King <colin.king@canonical.com> Cc: Xiong <xndchn@gmail.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Chris Paterson <chris.paterson2@renesas.com> Cc: Luca Ceresoli <luca@lucaceresoli.net> Link: http://lkml.kernel.org/r/20200218152010.27349-1-j.neuschaefer@gmx.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02asm-generic: make more kernel-space headers mandatoryMasahiro Yamada
Change a header to mandatory-y if both of the following are met: [1] At least one architecture (except um) specifies it as generic-y in arch/*/include/asm/Kbuild [2] Every architecture (except um) either has its own implementation (arch/*/include/asm/*.h) or specifies it as generic-y in arch/*/include/asm/Kbuild This commit was generated by the following shell script. ----------------------------------->8----------------------------------- arches=$(cd arch; ls -1 | sed -e '/Kconfig/d' -e '/um/d') tmpfile=$(mktemp) grep "^mandatory-y +=" include/asm-generic/Kbuild > $tmpfile find arch -path 'arch/*/include/asm/Kbuild' | xargs sed -n 's/^generic-y += \(.*\)/\1/p' | sort -u | while read header do mandatory=yes for arch in $arches do if ! grep -q "generic-y += $header" arch/$arch/include/asm/Kbuild && ! [ -f arch/$arch/include/asm/$header ]; then mandatory=no break fi done if [ "$mandatory" = yes ]; then echo "mandatory-y += $header" >> $tmpfile for arch in $arches do sed -i "/generic-y += $header/d" arch/$arch/include/asm/Kbuild done fi done sed -i '/^mandatory-y +=/d' include/asm-generic/Kbuild LANG=C sort $tmpfile >> include/asm-generic/Kbuild ----------------------------------->8----------------------------------- One obvious benefit is the diff stat: 25 files changed, 52 insertions(+), 557 deletions(-) It is tedious to list generic-y for each arch that needs it. So, mandatory-y works like a fallback default (by just wrapping asm-generic one) when arch does not have a specific header implementation. See the following commits: def3f7cefe4e81c296090e1722a76551142c227c a1b39bae16a62ce4aae02d958224f19316d98b24 It is tedious to convert headers one by one, so I processed by a shell script. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Simek <michal.simek@xilinx.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Arnd Bergmann <arnd@arndb.de> Link: http://lkml.kernel.org/r/20200210175452.5030-1-masahiroy@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02kthread: mark timer used by delayed kthread works as IRQ safePetr Mladek
The timer used by delayed kthread works are IRQ safe because the used kthread_delayed_work_timer_fn() is IRQ safe. It is properly marked when initialized by KTHREAD_DELAYED_WORK_INIT(). But TIMER_IRQSAFE flag is missing when initialized by kthread_init_delayed_work(). The missing flag might trigger invalid warning from del_timer_sync() when kthread_mod_delayed_work() is called with interrupts disabled. This patch is result of a discussion about using the API, see https://lkml.kernel.org/r/cfa886ad-e3b7-c0d2-3ff8-58d94170eab5@ti.com Reported-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20200217120709.1974-1-pmladek@suse.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-02tools/accounting/getdelays.c: fix netlink attribute lengthDavid Ahern
A recent change to the netlink code: 6e237d099fac ("netlink: Relax attr validation for fixed length types") logs a warning when programs send messages with invalid attributes (e.g., wrong length for a u32). Yafang reported this error message for tools/accounting/getdelays.c. send_cmd() is wrongly adding 1 to the attribute length. As noted in include/uapi/linux/netlink.h nla_len should be NLA_HDRLEN + payload length, so drop the +1. Fixes: 9e06d3f9f6b1 ("per task delay accounting taskstats interface: documentation fix") Reported-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Yafang Shao <laoar.shao@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200327173111.63922-1-dsahern@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-31x86: get rid of 'errret' argument to __get_user_xyz() macrossLinus Torvalds
Every remaining user just has the error case returning -EFAULT. In fact, the exception was __get_user_asm_nozero(), which was removed in commit 4b842e4e25b1 ("x86: get rid of small constant size cases in raw_copy_{to,from}_user()"), and the other __get_user_xyz() macros just followed suit for consistency. Fix up some macro whitespace while at it. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-31x86: remove __put_user_asm() infrastructureLinus Torvalds
The last user was removed by commit 4b842e4e25b1 ("x86: get rid of small constant size cases in raw_copy_{to,from}_user()"). Get rid of the left-overs before somebody tries to use it again. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-03-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextLinus Torvalds
Pull networking updates from David Miller: "Highlights: 1) Fix the iwlwifi regression, from Johannes Berg. 2) Support BSS coloring and 802.11 encapsulation offloading in hardware, from John Crispin. 3) Fix some potential Spectre issues in qtnfmac, from Sergey Matyukevich. 4) Add TTL decrement action to openvswitch, from Matteo Croce. 5) Allow paralleization through flow_action setup by not taking the RTNL mutex, from Vlad Buslov. 6) A lot of zero-length array to flexible-array conversions, from Gustavo A. R. Silva. 7) Align XDP statistics names across several drivers for consistency, from Lorenzo Bianconi. 8) Add various pieces of infrastructure for offloading conntrack, and make use of it in mlx5 driver, from Paul Blakey. 9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki. 10) Lots of parallelization improvements during configuration changes in mlxsw driver, from Ido Schimmel. 11) Add support to devlink for generic packet traps, which report packets dropped during ACL processing. And use them in mlxsw driver. From Jiri Pirko. 12) Support bcmgenet on ACPI, from Jeremy Linton. 13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei Starovoitov, and your's truly. 14) Support XDP meta-data in virtio_net, from Yuya Kusakabe. 15) Fix sysfs permissions when network devices change namespaces, from Christian Brauner. 16) Add a flags element to ethtool_ops so that drivers can more simply indicate which coalescing parameters they actually support, and therefore the generic layer can validate the user's ethtool request. Use this in all drivers, from Jakub Kicinski. 17) Offload FIFO qdisc in mlxsw, from Petr Machata. 18) Support UDP sockets in sockmap, from Lorenz Bauer. 19) Fix stretch ACK bugs in several TCP congestion control modules, from Pengcheng Yang. 20) Support virtual functiosn in octeontx2 driver, from Tomasz Duszynski. 21) Add region operations for devlink and use it in ice driver to dump NVM contents, from Jacob Keller. 22) Add support for hw offload of MACSEC, from Antoine Tenart. 23) Add support for BPF programs that can be attached to LSM hooks, from KP Singh. 24) Support for multiple paths, path managers, and counters in MPTCP. From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti, and others. 25) More progress on adding the netlink interface to ethtool, from Michal Kubecek" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits) net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline cxgb4/chcr: nic-tls stats in ethtool net: dsa: fix oops while probing Marvell DSA switches net/bpfilter: remove superfluous testing message net: macb: Fix handling of fixed-link node net: dsa: ksz: Select KSZ protocol tag netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write net: stmmac: add EHL 2.5Gbps PCI info and PCI ID net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID net: stmmac: create dwmac-intel.c to contain all Intel platform net: dsa: bcm_sf2: Support specifying VLAN tag egress rule net: dsa: bcm_sf2: Add support for matching VLAN TCI net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT net: dsa: bcm_sf2: Disable learning for ASP port net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 net: dsa: b53: Restore VLAN entries upon (re)configuration net: dsa: bcm_sf2: Fix overflow checks hv_netvsc: Remove unnecessary round_up for recv_completion_cnt ...
2020-03-31Merge tag 'ntb-5.7' of git://github.com/jonmason/ntbLinus Torvalds
Pull NTB updates from Jon Mason: "Bug fixes for a few printing issues, link status detection bug on AMD hardware, and a DMA address issue with ntb_perf. Also, large series of AMD NTB patches" * tag 'ntb-5.7' of git://github.com/jonmason/ntb: (21 commits) NTB: add pci shutdown handler for AMD NTB NTB: send DB event when driver is loaded or un-loaded NTB: remove redundant setting of DB valid mask NTB: return link up status correctly for PRI and SEC NTB: add helper functions to set and clear sideinfo NTB: move ntb_ctrl handling to init and deinit NTB: handle link up, D0 and D3 events correctly NTB: handle link down event correctly NTB: remove handling of peer_sta from amd_link_is_up NTB: set peer_sta within event handler itself NTB: return the side info status from amd_poll_link NTB: define a new function to get link status NTB: Enable link up and down event notification NTB: clear interrupt status register NTB: Fix access to link status and control register MAINTAINERS: update maintainer list for AMD NTB driver NTB: ntb_transport: Use scnprintf() for avoiding potential buffer overflow ntb_hw_switchtec: Fix ntb_mw_clear_trans error if size == 0 ntb_tool: Fix printk format NTB: ntb_perf: Fix address err in perf_copy_chunk ...
2020-03-31Merge tag 'platform-drivers-x86-v5.7-1' of ↵Linus Torvalds
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver updates from Andy Shevchenko: - Fix for improper handling of fan_boost_mode in sysfs for ASUS laptops. - On newer ASUS laptops the 1st battery is named differently, here is a fix. - Fix Lex 2I385SW to allow both network cards to be used. - The power integrated circuit driver for Surface 3 has been added. - Refactor and clean up of Intel PMC driver and enable it on Intel Jasper Lake. - Clean up of Dell RBU driver. - Big update for Intel Speed Select technology support tool and driver. * tag 'platform-drivers-x86-v5.7-1' of git://git.infradead.org/linux-platform-drivers-x86: (75 commits) platform/x86: surface3_power: Fix always true condition in mshw0011_space_handler() platform/x86: surface3_power: Fix Kconfig section ordering platform/x86: surface3_power: Add missed headers platform/x86: surface3_power: Reformat GUID assignment platform/x86: surface3_power: Drop useless macro ACPI_PTR() platform/x86: surface3_power: Prefix POLL_INTERVAL with SURFACE_3 platform/x86: surface3_power: Simplify mshw0011_adp_psr() to one liner platform/x86: surface3_power: Use dev_err() instead of pr_err() platform/x86: surface3_power: Drop unused structure definition platform/x86: surface3_power: MSHW0011 rev-eng implementation platform/x86: intel_pmc_core: Make pmc_core_substate_res_show() generic platform/x86: intel_pmc_core: Make pmc_core_lpm_display() generic for platforms that support sub-states tools/power/x86/intel-speed-select: Fix a typo in error message tools/power/x86/intel-speed-select: Update version tools/power/x86/intel-speed-select: Avoid duplicate Package strings for json tools/power/x86/intel-speed-select: Add display for enabled cpus count tools/power/x86/intel-speed-select: Print friendly warning for bad command line tools/power/x86/intel-speed-select: Fix avx options for turbo-freq feature tools/power/x86/intel-speed-select: Improve CLX commands tools/power/x86/intel-speed-select: Show error for invalid CPUs in the options ...
2020-03-31Merge tag 'tty-5.7-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big set of TTY / Serial patches for 5.7-rc1 Lots of console fixups and reworking in here, serial core tweaks (doesn't that ever get old, why are we still creating new serial devices?), serial driver updates, line-protocol driver updates, and some vt cleanups and fixes included in here as well. All have been in linux-next with no reported issues" * tag 'tty-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (161 commits) serial: 8250: Optimize irq enable after console write serial: 8250: Fix rs485 delay after console write vt: vt_ioctl: fix use-after-free in vt_in_use() vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use virtual console tty: serial: make SERIAL_SPRD depend on COMMON_CLK tty: serial: fsl_lpuart: fix return value checking tty: serial: fsl_lpuart: move dma_request_chan() ARM: dts: tango4: Make /serial compatible with ns16550a ARM: dts: mmp*: Make the serial ports compatible with xscale-uart ARM: dts: mmp*: Fix serial port names ARM: dts: mmp2-brownstone: Don't redeclare phandle references ARM: dts: pxa*: Make the serial ports compatible with xscale-uart ARM: dts: pxa*: Fix serial port names ARM: dts: pxa*: Don't redeclare phandle references serial: omap: drop unused dt-bindings header serial: 8250: 8250_omap: Add DMA support for UARTs on K3 SoCs serial: 8250: 8250_omap: Work around errata causing spurious IRQs with DMA serial: 8250: 8250_omap: Extend driver data to pass FIFO trigger info serial: 8250: 8250_omap: Move locking out from __dma_rx_do_complete() serial: 8250: 8250_omap: Account for data in flight during DMA teardown ...
2020-03-31Merge tag 'mmc-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds
Pull MMC updates from Ulf Hansson: "MMC core: - Add support for host software queue for (e)MMC/SD - Throttle polling rate for CMD6 - Update CMD13 busy condition check for CMD6 commands - Improve busy detect polling for erase/trim/discard/HPI - Fixup support for HW busy detection for HPI commands - Re-work and improve support for eMMC sanitize commands MMC host: - mmci: * Add support for sdmmc variant revision 2.0 - mmci_sdmmc: * Improve support for busyend detection * Fixup support for signal voltage switch * Add support for tuning with delay block - mtk-sd: * Fix another SDIO irq issue - sdhci: * Disable native card detect when GPIO based type exist - sdhci: * Add option to defer request completion - sdhci_am654: * Add support to set a tap value per speed mode - sdhci-esdhc-imx: * Add support for i.MX8MM based variant * Fixup support for standard tuning on i.MX8 usdhc * Optimize for strobe/clock dll settings * Fixup support for system and runtime suspend/resume - sdhci-iproc: * Update regulator/bus-voltage management for bcm2711 - sdhci-msm: * Prevent clock gating with PWRSAVE_DLL on broken variants * Fix management of CQE during SDHCI reset - sdhci-of-arasan: * Add support for auto tuning on ZynqMP based platforms - sdhci-omap: * Add support for system suspend/resume - sdhci-sprd: * Add support for HW busy detection * Enable support host software queue - sdhci-tegra: * Add support for HW busy detection - tmio/renesas_sdhi: * Enforce retune after runtime suspend - renesas_sdhi: * Use manual tap correction for HS400 on some variants * Add support for manual correction of tap values for tunings" * tag 'mmc-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (86 commits) mmc: cavium-octeon: remove nonsense variable coercion mmc: mediatek: fix SDIO irq issue mmc: mmci_sdmmc: Fix clear busyd0end irq flag dt-bindings: mmc: Fix node name in an example mmc: core: Re-work the code for eMMC sanitize mmc: sdhci: use FIELD_GET for preset value bit masks mmc: sdhci-of-at91: Display clock changes for debug purpose only mmc: sdhci: iproc: Add custom set_power() callback for bcm2711 mmc: sdhci: am654: Use sdhci_set_power_and_voltage() mmc: sdhci: at91: Use sdhci_set_power_and_voltage() mmc: sdhci: milbeaut: Use sdhci_set_power_and_voltage() mmc: sdhci: arasan: Use sdhci_set_power_and_voltage() mmc: sdhci: Introduce sdhci_set_power_and_bus_voltage() mmc: vub300: Use scnprintf() for avoiding potential buffer overflow dt-bindings: mmc: synopsys-dw-mshc: fix clock-freq-min-max in example sdhci: tegra: Enable MMC_CAP_WAIT_WHILE_BUSY host capability sdhci: tegra: Implement Tegra specific set_timeout callback mmc: sdhci-omap: Add Support for Suspend/Resume mmc: renesas_sdhi: simplify execute_tuning mmc: renesas_sdhi: Use BITS_PER_LONG helper ...
2020-03-31Merge tag 'kbuild-v5.7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: "Build system: - add CONFIG_UNUSED_KSYMS_WHITELIST, which will be useful to define a fixed set of export symbols for Generic Kernel Image (GKI) - allow to run 'make dt_binding_check' without .config - use full schema for checking DT examples in *.yaml files - make modpost fail for missing MODULE_IMPORT_NS(), which makes more sense because we know the produced modules are never loadable - Remove unused 'AS' variable Kconfig: - sanitize DEFCONFIG_LIST, and remove ARCH_DEFCONFIG from Kconfig files - relax the 'imply' behavior so that symbols implied by 'y' can become 'm' - make 'imply' obey 'depends on' in order to make 'imply' really weak Misc: - add documentation on building the kernel with Clang/LLVM - revive __HAVE_ARCH_STRLEN for 32bit sparc to use optimized strlen() - fix warning from deb-pkg builds when CONFIG_DEBUG_INFO=n - various script and Makefile cleanups" * tag 'kbuild-v5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits) Makefile: Update kselftest help information kbuild: deb-pkg: fix warning when CONFIG_DEBUG_INFO is unset kbuild: add outputmakefile to no-dot-config-targets kbuild: remove AS variable net: wan: wanxl: refactor the firmware rebuild rule net: wan: wanxl: use $(M68KCC) instead of $(M68KAS) for rebuilding firmware net: wan: wanxl: use allow to pass CROSS_COMPILE_M68k for rebuilding firmware kbuild: add comment about grouped target kbuild: add -Wall to KBUILD_HOSTCXXFLAGS kconfig: remove unused variable in qconf.cc sparc: revive __HAVE_ARCH_STRLEN for 32bit sparc kbuild: refactor Makefile.dtbinst more kbuild: compute the dtbs_install destination more simply Makefile: disallow data races on gcc-10 as well kconfig: make 'imply' obey the direct dependency kconfig: allow symbols implied by y to become m net: drop_monitor: use IS_REACHABLE() to guard net_dm_hw_report() modpost: return error if module is missing ns imports and MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS=n modpost: rework and consolidate logging interface kbuild: allow to run dt_binding_check without kernel configuration ...
2020-03-31Merge branch 'next-general' of ↵Linus Torvalds
git://git.kernel.org:/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Two minor updates for the core security subsystem: - kernel-doc warning fixes from Randy Dunlap - header cleanup from YueHaibing" * 'next-general' of git://git.kernel.org:/pub/scm/linux/kernel/git/jmorris/linux-security: security: remove duplicated include from security.h security: <linux/lsm_hooks.h>: fix all kernel-doc warnings
2020-03-31Merge tag 'selinux-pr-20200330' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux updates from Paul Moore: "We've got twenty SELinux patches for the v5.7 merge window, the highlights are below: - Deprecate setting /sys/fs/selinux/checkreqprot to 1. This flag was originally created to deal with legacy userspace and the READ_IMPLIES_EXEC personality flag. We changed the default from 1 to 0 back in Linux v4.4 and now we are taking the next step of deprecating it, at some point in the future we will take the final step of rejecting 1. - Allow kernfs symlinks to inherit the SELinux label of the parent directory. In order to preserve backwards compatibility this is protected by the genfs_seclabel_symlinks SELinux policy capability. - Optimize how we store filename transitions in the kernel, resulting in some significant improvements to policy load times. - Do a better job calculating our internal hash table sizes which resulted in additional policy load improvements and likely general SELinux performance improvements as well. - Remove the unused initial SIDs (labels) and improve how we handle initial SIDs. - Enable per-file labeling for the bpf filesystem. - Ensure that we properly label NFS v4.2 filesystems to avoid a temporary unlabeled condition. - Add some missing XFS quota command types to the SELinux quota access controls. - Fix a problem where we were not updating the seq_file position index correctly in selinuxfs. - We consolidate some duplicated code into helper functions. - A number of list to array conversions. - Update Stephen Smalley's email address in MAINTAINERS" * tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: clean up indentation issue with assignment statement NFS: Ensure security label is set for root inode MAINTAINERS: Update my email address selinux: avtab_init() and cond_policydb_init() return void selinux: clean up error path in policydb_init() selinux: remove unused initial SIDs and improve handling selinux: reduce the use of hard-coded hash sizes selinux: Add xfs quota command types selinux: optimize storage of filename transitions selinux: factor out loop body from filename_trans_read() security: selinux: allow per-file labeling for bpffs selinux: generalize evaluate_cond_node() selinux: convert cond_expr to array selinux: convert cond_av_list to array selinux: convert cond_list to array selinux: sel_avc_get_stat_idx should increase position index selinux: allow kernfs symlinks to inherit parent directory context selinux: simplify evaluate_cond_node() Documentation,selinux: deprecate setting checkreqprot to 1 selinux: move status variables out of selinux_ss
2020-03-31Merge tag 'audit-pr-20200330' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "We've got two audit patches for the v5.7 merge window with a stellar 14 lines changed between the two patches. The patch descriptions are far more lengthy than the patches themselves, which is a very good thing for patches this size IMHO. The patches pass our test suites and a quick summary is below: - Stop logging inode information when updating an audit file watch. Since we are not changing the inode, or the fact that we are watching the associated file, the inode information is just noise that we can do without. - Fix a problem where mandatory audit records were missing their accompanying audit records (e.g. SYSCALL records were missing). The missing records often meant that we didn't have the necessary context to understand what was going on when the event occurred" * tag 'audit-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: trigger accompanying records when no rules present audit: CONFIG_CHANGE don't log internal bookkeeping as an event
2020-03-31Merge tag '5.7-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull cifs updates from Steve French: "First part of cifs/smb3 changes for merge window (others are still being tested). Various RDMA (smbdirect) fixes, addition of SMB3.1.1 POSIX support in readdir, 3 fixes for stable, and a fix for flock. Summary: New feature: - SMB3.1.1 POSIX support in readdir Fixes: - various RDMA (smbdirect) fixes - fix for flock - fallocate fix - some improved mount warnings - two timestamp related fixes - reconnect fix - three fixes for stable" * tag '5.7-rc-smb3-fixes-part1' of git://git.samba.org/sfrench/cifs-2.6: (28 commits) cifs: update internal module version number cifs: Allocate encryption header through kmalloc cifs: smbd: Check and extend sender credits in interrupt context cifs: smbd: Calculate the correct maximum packet size for segmented SMBDirect send/receive smb3: use SMB2_SIGNATURE_SIZE define CIFS: Fix bug which the return value by asynchronous read is error CIFS: check new file size when extending file by fallocate SMB3: Minor cleanup of protocol definitions SMB3: Additional compression structures SMB3: Add new compression flags cifs: smb2pdu.h: Replace zero-length array with flexible-array member cifs: clear PF_MEMALLOC before exiting demultiplex thread cifs: cifspdu.h: Replace zero-length array with flexible-array member CIFS: Warn less noisily on default mount fs/cifs: fix gcc warning in sid_to_id cifs: allow unlock flock and OFD lock across fork cifs: do d_move in rename cifs: add SMB2_open() arg to return POSIX data cifs: plumb smb2 POSIX dir enumeration cifs: add smb2 POSIX info level ...