aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-08-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-08-05 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix bpftool percpu_array dump by using correct roundup to next multiple of 8 for the value size, from Yonghong. 2) Fix in AF_XDP's __xsk_rcv_zc() to not returning frames back to allocator since driver will recycle frame anyway in case of an error, from Jakub. 3) Fix up BPF test_lwt_seg6local test cases to final iproute2 syntax, from Mathieu. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-04net/smc: no cursor update send in state SMC_INITUrsula Braun
If a writer blocked condition is received without data, the current consumer cursor is immediately sent. Servers could already receive this condition in state SMC_INIT without finished tx-setup. This patch avoids sending a consumer cursor update in this case. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-04jfs: Fix usercopy whitelist for inline inode dataKees Cook
Bart Massey reported what turned out to be a usercopy whitelist false positive in JFS when symlink contents exceeded 128 bytes. The inline inode data (i_inline) is actually designed to overflow into the "extended area" following it (i_inline_ea) when needed. So the whitelist needed to be expanded to include both i_inline and i_inline_ea (the whole size of which is calculated internally using IDATASIZE, 256, instead of sizeof(i_inline), 128). $ cd /mnt/jfs $ touch $(perl -e 'print "B" x 250') $ ln -s B* b $ ls -l >/dev/null [ 249.436410] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'jfs_ip' (offset 616, size 250)! Reported-by: Bart Massey <bart.massey@gmail.com> Fixes: 8d2704d382a9 ("jfs: Define usercopy region in jfs_ip slab cache") Cc: Dave Kleikamp <shaggy@kernel.org> Cc: jfs-discussion@lists.sourceforge.net Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org>
2018-08-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "Two vmx bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: kvm: x86: vmx: fix vpid leak KVM: vmx: use local variable for current_vmptr when emulating VMPTRST
2018-08-03l2tp: fix missing refcount drop in pppol2tp_tunnel_ioctl()Guillaume Nault
If 'session' is not NULL and is not a PPP pseudo-wire, then we fail to drop the reference taken by l2tp_session_get(). Fixes: ecd012e45ab5 ("l2tp: filter out non-PPP sessions in pppol2tp_tunnel_ioctl()") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03Merge branch 'mlxsw-Fix-ACL-actions-error-condition-handling'David S. Miller
Ido Schimmel says: ==================== mlxsw: Fix ACL actions error condition handling Nir says: Two issues were lately noticed within mlxsw ACL actions error condition handling. The first patch deals with conflicting actions such as: # tc filter add dev swp49 parent ffff: \ protocol ip pref 10 flower skip_sw dst_ip 192.168.101.1 \ action goto chain 100 \ action mirred egress redirect dev swp4 The second action will never execute, however SW model allows this configuration, while the mlxsw driver cannot allow for it as it implements actions in sets of up to three actions per set with a single termination marking. Conflicting actions create a contradiction over this single marking and thus cannot be configured. The fix replaces a misplaced warning with an error code to be returned. Patches 2-4 fix a condition of duplicate destruction of resources. Some actions require allocation of specific resource prior to setting the action itself. On error condition this resource was destroyed twice, leading to a crash when using mirror action, and to a redundant destruction in other cases, since for error condition rule destruction also takes care of resource destruction. In order to fix this state a symmetry in behavior is added and resource destruction also takes care of removing the resource from rule's resource list. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03mlxsw: core_acl_flex_actions: Remove redundant mirror resource destructionNir Dotan
In previous patch mlxsw_afa_resource_del() was added to avoid a duplicate resource detruction scenario. For mirror actions, such duplicate destruction leads to a crash as in: # tc qdisc add dev swp49 ingress # tc filter add dev swp49 parent ffff: \ protocol ip chain 100 pref 10 \ flower skip_sw dst_ip 192.168.101.1 action drop # tc filter add dev swp49 parent ffff: \ protocol ip pref 10 \ flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \ action mirred egress mirror dev swp4 Therefore add a call to mlxsw_afa_resource_del() in mlxsw_afa_mirror_destroy() in order to clear that resource from rule's resources. Fixes: d0d13c1858a1 ("mlxsw: spectrum_acl: Add support for mirror action") Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03mlxsw: core_acl_flex_actions: Remove redundant counter destructionNir Dotan
Each tc flower rule uses a hidden count action. As counter resource may not be available due to limited HW resources, update _counter_create() and _counter_destroy() pair to follow previously introduced symmetric error condition handling, add a call to mlxsw_afa_resource_del() as part of the counter resource destruction. Fixes: c18c1e186ba8 ("mlxsw: core: Make counter index allocated inside the action append") Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03mlxsw: core_acl_flex_actions: Remove redundant resource destructionNir Dotan
Some ACL actions require the allocation of a separate resource prior to applying the action itself. When facing an error condition during the setup phase of the action, resource should be destroyed. For such actions the destruction was done twice which is dangerous and lead to a potential crash. The destruction took place first upon error on action setup phase and then as the rule was destroyed. The following sequence generated a crash: # tc qdisc add dev swp49 ingress # tc filter add dev swp49 parent ffff: \ protocol ip chain 100 pref 10 \ flower skip_sw dst_ip 192.168.101.1 action drop # tc filter add dev swp49 parent ffff: \ protocol ip pref 10 \ flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \ action mirred egress mirror dev swp4 Therefore add mlxsw_afa_resource_del() as a complement of mlxsw_afa_resource_add() to add symmetry to resource_list membership handling. Call this from mlxsw_afa_fwd_entry_ref_destroy() to make the _fwd_entry_ref_create() and _fwd_entry_ref_destroy() pair of calls a NOP. Fixes: 140ce421217e ("mlxsw: core: Convert fwd_entry_ref list to be generic per-block resource list") Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03mlxsw: core_acl_flex_actions: Return error for conflicting actionsNir Dotan
Spectrum switch ACL action set is built in groups of three actions which may point to additional actions. A group holds a single record which can be set as goto record for pointing at a following group or can be set to mark the termination of the lookup. This is perfectly adequate for handling a series of actions to be executed on a packet. While the SW model allows configuration of conflicting actions where it is clear that some actions will never execute, the mlxsw driver must block such configurations as it creates a conflict over the single terminate/goto record value. For a conflicting actions configuration such as: # tc filter add dev swp49 parent ffff: \ protocol ip pref 10 \ flower skip_sw dst_ip 192.168.101.1 \ action goto chain 100 \ action mirred egress mirror dev swp4 Where it is clear that the last action will never execute, the mlxsw driver was issuing a warning instead of returning an error. Therefore replace that warning with an error for this specific case. Fixes: 4cda7d8d7098 ("mlxsw: core: Introduce flexible actions support") Signed-off-by: Nir Dotan <nird@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fix from Jason Gunthorpe: "One bug for missing user input validation: refuse invalid port numbers in the modify_qp system call" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/uverbs: Expand primary and alt AV port checks
2018-08-03Merge tag 'for-linus-20180803' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fix from Jens Axboe: "Just a single fix, from Ming, fixing a regression in this cycle where the busy tag iteration was changed to only calling the callback function for requests that are started. We really want all non-free requests. This fixes a boot regression on certain VM setups" * tag 'for-linus-20180803' of git://git.kernel.dk/linux-block: blk-mq: fix blk_mq_tagset_busy_iter
2018-08-03Merge tag 'nfs-for-4.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds
Pull NFS client bugfix from Trond Myklebust: "Fix a NFSv4 file locking regression" * tag 'nfs-for-4.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Fix _nfs4_do_setlk()
2018-08-03Merge tag 'powerpc-4.18-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix for a regression in a recent TLB flush optimisation, which caused us to incorrectly not send TLB invalidations to coprocessors. Thanks to Frederic Barrat, Nicholas Piggin, Vaibhav Jain" * tag 'powerpc-4.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/radix: Fix missing global invalidations when removing copro
2018-08-03Merge tag 'drm-fixes-2018-08-03' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "Nothing too major at this late stage: - adv7511: reset fix - vc4: scaling fix - two atomic core fixes - one legacy core error handling fix I had a bunch of driver fixes from hdlcd but I think I'll leave them for -next at this point" * tag 'drm-fixes-2018-08-03' of git://anongit.freedesktop.org/drm/drm: drm/vc4: Reset ->{x, y}_scaling[1] when dealing with uniplanar formats drm/atomic: Initialize variables in drm_atomic_helper_async_check() to make gcc happy drm/atomic: Check old_plane_state->crtc in drm_atomic_helper_async_check() drm: re-enable error handling drm/bridge: adv7511: Reset registers on hotplug
2018-08-03Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a memory corruption in the padlock-aes driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: padlock-aes - Fix Nano workaround data corruption
2018-08-03nohz: Fix missing tick reprogram when interrupting an inline softirqFrederic Weisbecker
The full nohz tick is reprogrammed in irq_exit() only if the exit is not in a nesting interrupt. This stands as an optimization: whether a hardirq or a softirq is interrupted, the tick is going to be reprogrammed when necessary at the end of the inner interrupt, with even potential new updates on the timer queue. When soft interrupts are interrupted, it's assumed that they are executing on the tail of an interrupt return. In that case tick_nohz_irq_exit() is called after softirq processing to take care of the tick reprogramming. But the assumption is wrong: softirqs can be processed inline as well, ie: outside of an interrupt, like in a call to local_bh_enable() or from ksoftirqd. Inline softirqs don't reprogram the tick once they are done, as opposed to interrupt tail softirq processing. So if a tick interrupts an inline softirq processing, the next timer will neither be reprogrammed from the interrupting tick's irq_exit() nor after the interrupted softirq processing. This situation may leave the tick unprogrammed while timers are armed. To fix this, simply keep reprogramming the tick even if a softirq has been interrupted. That can be optimized further, but for now correctness is more important. Note that new timers enqueued in nohz_full mode after a softirq gets interrupted will still be handled just fine through self-IPIs triggered by the timer code. Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de> Cc: stable@vger.kernel.org # 4.14+ Link: https://lkml.kernel.org/r/1533303094-15855-1-git-send-email-frederic@kernel.org
2018-08-03genirq: Make force irq threading setup more robustThomas Gleixner
The support of force threading interrupts which are set up with both a primary and a threaded handler wreckaged the setup of regular requested threaded interrupts (primary handler == NULL). The reason is that it does not check whether the primary handler is set to the default handler which wakes the handler thread. Instead it replaces the thread handler with the primary handler as it would do with force threaded interrupts which have been requested via request_irq(). So both the primary and the thread handler become the same which then triggers the warnon that the thread handler tries to wakeup a not configured secondary thread. Fortunately this only happens when the driver omits the IRQF_ONESHOT flag when requesting the threaded interrupt, which is normaly caught by the sanity checks when force irq threading is disabled. Fix it by skipping the force threading setup when a regular threaded interrupt is requested. As a consequence the interrupt request which lacks the IRQ_ONESHOT flag is rejected correctly instead of silently wreckaging it. Fixes: 2a1d3ab8986d ("genirq: Handle force threading of irqs with primary and thread handler") Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de> Cc: stable@vger.kernel.org
2018-08-03selftests/bpf: update test_lwt_seg6local.sh according to iproute2Mathieu Xhonneux
The shell file for test_lwt_seg6local contains an early iproute2 syntax for installing a seg6local End.BPF route. iproute2 support for this feature has recently been upstreamed, but with an additional keyword required. This patch updates test_lwt_seg6local.sh to the definitive iproute2 syntax Signed-off-by: Mathieu Xhonneux <m.xhonneux@gmail.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-02Merge tag 'media/v4.18-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - a deadlock regression at vsp1 driver - some Remote Controller fixes related to the new BPF filter logic added on it for Kernel 4.18. * tag 'media/v4.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: v4l: vsp1: Fix deadlock in VSPDL DRM pipelines media: rc: read out of bounds if bpf reports high protocol number media: bpf: ensure bpf program is freed on detach media: rc: be less noisy when driver misbehaves
2018-08-02Merge tag 'arc-4.18-final' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: "Another batch of fixes for ARC, this time mainly DMA API rework wreckage: - Fix software managed DMA wreckage after rework in 4.17 [Euginey] * missing cache flush * SMP_CACHE_BYTES vs cache_line_size - Fix allmodconfig build errors [Randy] - Maintainer update for Mellanox (EZChip) NPS platform" * tag 'arc-4.18-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: arc: fix type warnings in arc/mm/cache.c arc: fix build errors in arc/include/asm/delay.h arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c arc: [plat-eznps] fix data type errors in platform headers ARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc ARC: add SMP_CACHE_BYTES value validate ARC: dma [non-IOC] setup SMP_CACHE_BYTES and cache_line_size ARC: dma [non IOC]: fix arc_dma_sync_single_for_(device|cpu) ARC: Add Ofer Levi as plat-eznps maintainer
2018-08-02Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "3 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK fails ipc/shm.c add ->pagesize function to shm_vm_ops memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failure
2018-08-02userfaultfd: remove uffd flags from vma->vm_flags if UFFD_EVENT_FORK failsMike Rapoport
The fix in commit 0cbb4b4f4c44 ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") cleared the vma->vm_userfaultfd_ctx but kept userfaultfd flags in vma->vm_flags that were copied from the parent process VMA. As the result, there is an inconsistency between the values of vma->vm_userfaultfd_ctx.ctx and vma->vm_flags which triggers BUG_ON in userfaultfd_release(). Clearing the uffd flags from vma->vm_flags in case of UFFD_EVENT_FORK failure resolves the issue. Link: http://lkml.kernel.org/r/1532931975-25473-1-git-send-email-rppt@linux.vnet.ibm.com Fixes: 0cbb4b4f4c44 ("userfaultfd: clear the vma->vm_userfaultfd_ctx if UFFD_EVENT_FORK fails") Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reported-by: syzbot+121be635a7a35ddb7dcb@syzkaller.appspotmail.com Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Eric Biggers <ebiggers3@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02ipc/shm.c add ->pagesize function to shm_vm_opsJane Chu
Commit 05ea88608d4e ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") adds a new ->pagesize() function to hugetlb_vm_ops, intended to cover all hugetlbfs backed files. With System V shared memory model, if "huge page" is specified, the "shared memory" is backed by hugetlbfs files, but the mappings initiated via shmget/shmat have their original vm_ops overwritten with shm_vm_ops, so we need to add a ->pagesize function to shm_vm_ops. Otherwise, vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma, result in below BUG: fs/hugetlbfs/inode.c 443 if (unlikely(page_mapped(page))) { 444 BUG_ON(truncate_op); resulting in hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is deprecated ------------[ cut here ]------------ kernel BUG at fs/hugetlbfs/inode.c:444! Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ... CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 4.14.35-1829.el7uek.x86_64 #2 RIP: 0010:remove_inode_hugepages+0x3db/0x3e2 .... Call Trace: hugetlbfs_evict_inode+0x1e/0x3e evict+0xdb/0x1af iput+0x1a2/0x1f7 dentry_unlink_inode+0xc6/0xf0 __dentry_kill+0xd8/0x18d dput+0x1b5/0x1ed __fput+0x18b/0x216 ____fput+0xe/0x10 task_work_run+0x90/0xa7 exit_to_usermode_loop+0xdd/0x116 do_syscall_64+0x187/0x1ae entry_SYSCALL_64_after_hwframe+0x150/0x0 [jane.chu@oracle.com: relocate comment] Link: http://lkml.kernel.org/r/20180731044831.26036-1-jane.chu@oracle.com Link: http://lkml.kernel.org/r/20180727211727.5020-1-jane.chu@oracle.com Fixes: 05ea88608d4e13 ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct") Signed-off-by: Jane Chu <jane.chu@oracle.com> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Manfred Spraul <manfred@colorfullife.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02memcg: remove memcg_cgroup::id from IDR on mem_cgroup_css_alloc() failureKirill Tkhai
In case of memcg_online_kmem() failure, memcg_cgroup::id remains hashed in mem_cgroup_idr even after memcg memory is freed. This leads to leak of ID in mem_cgroup_idr. This patch adds removal into mem_cgroup_css_alloc(), which fixes the problem. For better readability, it adds a generic helper which is used in mem_cgroup_alloc() and mem_cgroup_id_put_many() as well. Link: http://lkml.kernel.org/r/152354470916.22460.14397070748001974638.stgit@localhost.localdomain Fixes 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02drivers: net: lmc: fix case value for target abort errorColin Ian King
Current value for a target abort error is 0x010, however, this value should in fact be 0x002. As it stands, the range of error is 0..7 so it is currently never being detected. This bug has been in the driver since the early 2.6.12 days (or before). Detected by CoverityScan, CID#744290 ("Logically dead code") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02blk-mq: fix blk_mq_tagset_busy_iterMing Lei
Commit d250bf4e776ff09d5("blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iter") uses 'blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT' to replace 'blk_mq_request_started(req)', this way is wrong, and causes lots of test system hang during booting. Fix the issue by using blk_mq_request_started(req) inside bt_tags_iter(). Fixes: d250bf4e776ff09d5 ("blk-mq: only iterate over inflight requests in blk_mq_tagset_busy_iter") Cc: Josef Bacik <josef@toxicpanda.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Guenter Roeck <linux@roeck-us.net> Cc: Mark Brown <broonie@kernel.org> Cc: Matt Hart <matthew.hart@linaro.org> Cc: Johannes Thumshirn <jthumshirn@suse.de> Cc: John Garry <john.garry@huawei.com> Cc: Hannes Reinecke <hare@suse.com>, Cc: "Martin K. Petersen" <martin.petersen@oracle.com>, Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: linux-scsi@vger.kernel.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-08-02fs: fix iomap_bmap position calculationEric Sandeen
The position calculation in iomap_bmap() shifts bno the wrong way, so we don't progress properly and end up re-mapping block zero over and over, yielding an unchanging physical block range as the logical block advances: # filefrag -Be file ext: logical_offset: physical_offset: length: expected: flags: 0: 0.. 0: 21.. 21: 1: merged 1: 1.. 1: 21.. 21: 1: 22: merged Discontinuity: Block 1 is at 21 (was 22) 2: 2.. 2: 21.. 21: 1: 22: merged Discontinuity: Block 2 is at 21 (was 22) 3: 3.. 3: 21.. 21: 1: 22: merged This breaks the FIBMAP interface for anyone using it (XFS), which in turn breaks LILO, zipl, etc. Bug-actually-spotted-by: Darrick J. Wong <darrick.wong@oracle.com> Fixes: 89eb1906a953 ("iomap: add an iomap-based bmap implementation") Cc: stable@vger.kernel.org Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-08-02Merge tag 'pci-v4.18-fixes-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - Fix integer overflow in new mobiveil driver (Dan Carpenter) - Fix race during NVMe removal/rescan (Hari Vyas) * tag 'pci-v4.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Fix is_added/is_busmaster race condition PCI: mobiveil: Avoid integer overflow in IB_WIN_SIZE
2018-08-02selftest/net: fix protocol family to work for IPv4.Maninder Singh
use actual protocol family passed by user rather than hardcoded AF_INTE6 to cerate sockets. current code is not working for IPv4. Signed-off-by: Maninder Singh <maninder1.s@samsung.com> Signed-off-by: Vaneet Narang <v.narang@samsung.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 regression fix from Will Deacon: "Ard found a nasty arm64 regression in 4.18 where the AES ghash/gcm code doesn't notify the kernel about its use of the vector registers, therefore potentially corrupting live user state. The fix is straightforward and Herbert agreed for it to go via arm64" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair
2018-08-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "Fixes keep trickling in: 1) Various IP fragmentation memory limit hardening changes from Eric Dumazet. 2) Revert ipv6 metrics leak change, it causes more problems than it fixes for now. 3) Fix WoL regression in stmmac driver, from Jose Abreu. 4) Netlink socket spectre v1 gadget fix, from Jeremy Cline" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: Revert "net/ipv6: fix metrics leak" rxrpc: Fix user call ID check in rxrpc_service_prealloc_one net: dsa: Do not suspend/resume closed slave_dev netlink: Fix spectre v1 gadget in netlink_create() Documentation: dpaa2: Use correct heading adornment net: stmmac: Fix WoL for PCI-based setups bonding: avoid lockdep confusion in bond_get_stats() enic: do not call enic_change_mtu in enic_probe ipv4: frags: handle possible skb truesize change inet: frag: enforce memory limits earlier net/mlx5e: IPoIB, Set the netdevice sw mtu in ipoib enhanced flow net/mlx5e: Fix null pointer access when setting MTU of vport representor net/mlx5e: Set port trust mode to PCP as default net/mlx5e: E-Switch, Initialize eswitch only if eswitch manager net: dsa: mv88e6xxx: Fix SERDES support on 88E6141/6341 brcmfmac: fix regression in parsing NVRAM for multiple devices iwlwifi: add more card IDs for 9000 series
2018-08-02Squashfs: Compute expected length from inode size rather than block lengthPhillip Lougher
Previously in squashfs_readpage() when copying data into the page cache, it used the length of the datablock read from the filesystem (after decompression). However, if the filesystem has been corrupted this data block may be short, which will leave pages unfilled. The fix for this is to compute the expected number of bytes to copy from the inode size, and use this to detect if the block is short. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Tested-by: Willy Tarreau <w@1wt.eu> Cc: Анатолий Тросиненко <anatoly.trosinenko@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02squashfs: more metadata hardeningLinus Torvalds
The squashfs fragment reading code doesn't actually verify that the fragment is inside the fragment table. The end result _is_ verified to be inside the image when actually reading the fragment data, but before that is done, we may end up taking a page fault because the fragment table itself might not even exist. Another report from Anatoly and his endless squashfs image fuzzing. Reported-by: Анатолий Тросиненко <anatoly.trosinenko@gmail.com> Acked-by:: Phillip Lougher <phillip.lougher@gmail.com>, Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-02x86/boot/compressed/64: Validate trampoline placement against E820Kirill A. Shutemov
There were two report of boot failure cased by trampoline placed into a reserved memory region. It can happen on machines that don't report EBDA correctly. Fix the problem by re-validating the found address against the E820 table. If the address is in a reserved area, find the next usable region below the initial address. Fixes: 3548e131ec6a ("x86/boot/compressed/64: Find a place for 32-bit trampoline") Reported-by: Dmitry Malkin <d.malkin@real-time-systems.com> Reported-by: youling 257 <youling257@gmail.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lkml.kernel.org/r/20180801133225.38121-1-kirill.shutemov@linux.intel.com
2018-08-01Revert "net/ipv6: fix metrics leak"David S. Miller
This reverts commit df18b50448fab1dff093731dfd0e25e77e1afcd1. This change causes other problems and use-after-free situations as found by syzbot. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01NFSv4: Fix _nfs4_do_setlk()Trond Myklebust
The patch to fix the case where a lock request was interrupted ended up changing default handling of errors such as NFS4ERR_DENIED and caused the client to immediately resend the lock request. Let's do a partial revert of that request so that the default is now to exit, but change the way we handle resends to take into account the fact that the user may have interrupted the request. Reported-by: Kenneth Johansson <ken@kenjo.org> Fixes: a3cf9bca2ace ("NFSv4: Don't add a new lock on an interrupted wait..") Cc: Benjamin Coddington <bcodding@redhat.com> Cc: Jeff Layton <jlayton@kernel.org> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
2018-08-01Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM fix from Russell King: "Just a single fix this time around for recent binutils causing build problems when generating Thumb-2 code" * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 8781/1: Fix Thumb-2 syscall return for binutils 2.29+
2018-08-01mm: do not initialize TLB stack vma's with vma_init()Linus Torvalds
Commit 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments") tried to initialize various left-over ad-hoc vma's "properly", but actually made things worse for the temporary vma's used for TLB flushing. vma_init() doesn't actually initialize all of the vma, just a few fields, so doing something like - struct vm_area_struct vma = { .vm_mm = tlb->mm, }; + struct vm_area_struct vma; + + vma_init(&vma, tlb->mm); was actually very bad: instead of having a nicely initialized vma with every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma with only a couple of fields initialized. And they weren't even fields that the code in question mostly cared about. The flush_tlb_range() function takes a "struct vma" rather than a "struct mm_struct", because a few architectures actually care about what kind of range it is - being able to only do an ITLB flush if it's a range that doesn't have data accesses enabled, for example. And all the normal users already have the vma for doing the range invalidation. But a few people want to call flush_tlb_range() with a range they just made up, so they also end up using a made-up vma. x86 just has a special "flush_tlb_mm_range()" function for this, but other architectures (arm and ia64) do the "use fake vma" thing instead, and thus got caught up in the vma_init() changes. At the same time, the TLB flushing code really doesn't care about most other fields in the vma, so vma_init() is just unnecessary and pointless. This fixes things by having an explicit "this is just an initializer for the TLB flush" initializer macro, which is used by the arm/arm64/ia64 people who mis-use this interface with just a dummy vma. Fixes: 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments") Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: John Stultz <john.stultz@linaro.org> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01mm: delete historical BUG from zap_pmd_range()Hugh Dickins
Delete the old VM_BUG_ON_VMA() from zap_pmd_range(), which asserted that mmap_sem must be held when splitting an "anonymous" vma there. Whether that's still strictly true nowadays is not entirely clear, but the danger of sometimes crashing on the BUG is now fairly clear. Even with the new stricter rules for anonymous vma marking, the condition it checks for can possible trigger. Commit 44960f2a7b63 ("staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem pages") is good, and originally I thought it was safe from that VM_BUG_ON_VMA(), because the /dev/ashmem fd exposed to the user is disconnected from the vm_file in the vma, and madvise(,,MADV_REMOVE) insists on VM_SHARED. But after I read John's earlier mail, drawing attention to the vfs_fallocate() in there: I may be wrong, and I don't know if Android has THP in the config anyway, but it looks to me like an unmap_mapping_range() from ashmem's vfs_fallocate() could hit precisely the VM_BUG_ON_VMA(), once it's vma_is_anonymous(). Signed-off-by: Hugh Dickins <hughd@google.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01rxrpc: Fix user call ID check in rxrpc_service_prealloc_oneYueHaibing
There just check the user call ID isn't already in use, hence should compare user_call_ID with xcall->user_call_ID, which is current node's user_call_ID. Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01Merge tag 'mmc-v4.18-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fix from Ulf Hansson: "MMC host: mxcmmc: Fix build error for powerpc" * tag 'mmc-v4.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: mxcmmc: Fix missing parentheses and brace
2018-08-01Merge tag 'pm-urgent-4.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the scope of a recent intel_pstate driver optimization used incorrectly on some systems due to processor identification ambiguity and fix a few issues in the turbostat utility, including three recent regressions. Specifics: - Use ACPI FADT preferred PM Profile to distinguish Skylake desktop processors from some server ones with the same model number in order to limit the scope of the recent IO-wait boost optimization to servers, as intended (Srinivas Pandruvada). - Fix several issues in the turbostat utility: * Fix the -S option on 1-CPU systems (Len Brown). * Fix computations using incorrect processor core counts (Artem Bityutskiy). * Fix the x2apic debug message (Len Brown). * Fix logical node enumeration to allow for non-sequential physical nodes (Prarit Bhargava). * Fix reported family on modern AMD processors (Calvin Walton). * Clarify the RAPL column information in the man page (Len Brown)" * tag 'pm-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms tools/power turbostat: version 18.07.27 tools/power turbostat: Read extended processor family from CPUID tools/power turbostat: Fix logical node enumeration to allow for non-sequential physical nodes tools/power turbostat: fix x2apic debug message output file tools/power turbostat: fix bogus summary values tools/power turbostat: fix -S on UP systems tools/power turbostat: Update turbostat(8) RAPL throttling column description
2018-08-01squashfs metadata 2: electric boogalooLinus Torvalds
Anatoly continues to find issues with fuzzed squashfs images. This time, corrupt, missing, or undersized data for the page filling wasn't checked for, because the squashfs_{copy,read}_cache() functions did the squashfs_copy_data() call without checking the resulting data size. Which could result in the page cache pages being incompletely filled in, and no error indication to the user space reading garbage data. So make a helper function for the "fill in pages" case, because the exact same incomplete sequence existed in two places. [ I should have made a squashfs branch for these things, but I didn't intend to start doing them in the first place. My historical connection through cramfs is why I got into looking at these issues at all, and every time I (continue to) think it's a one-off. Because _this_ time is always the last time. Right? - Linus ] Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Tested-by: Willy Tarreau <w@1wt.eu> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01staging: ashmem: Fix SIGBUS crash when traversing mmaped ashmem pagesJohn Stultz
Amit Pundir and Youling in parallel reported crashes with recent mainline kernels running Android: F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** F DEBUG : Build fingerprint: 'Android/db410c32_only/db410c32_only:Q/OC-MR1/102:userdebug/test-key F DEBUG : Revision: '0' F DEBUG : ABI: 'arm' F DEBUG : pid: 2261, tid: 2261, name: zygote >>> zygote <<< F DEBUG : signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0xec00008 ... <snip> ... F DEBUG : backtrace: F DEBUG : #00 pc 00001c04 /system/lib/libc.so (memset+48) F DEBUG : #01 pc 0010c513 /system/lib/libart.so (create_mspace_with_base+82) F DEBUG : #02 pc 0015c601 /system/lib/libart.so (art::gc::space::DlMallocSpace::CreateMspace(void*, unsigned int, unsigned int)+40) F DEBUG : #03 pc 0015c3ed /system/lib/libart.so (art::gc::space::DlMallocSpace::CreateFromMemMap(art::MemMap*, std::__1::basic_string<char, std::__ 1::char_traits<char>, std::__1::allocator<char>> const&, unsigned int, unsigned int, unsigned int, unsigned int, bool)+36) ... This was bisected back to commit bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives"). create_mspace_with_base() in the trace above, utilizes ashmem, and with ashmem, for shared mappings we use shmem_zero_setup(), which sets the vma->vm_ops to &shmem_vm_ops. But for private ashmem mappings nothing sets the vma->vm_ops. Looking at the problematic patch, it seems to add a requirement that one call vma_set_anonymous() on a vma, otherwise the dummy_vm_ops will be used. Using the dummy_vm_ops seem to triggger SIGBUS when traversing unmapped pages. Thus, this patch adds a call to vma_set_anonymous() for ashmem private mappings and seems to avoid the reported problem. Fixes: bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hugh Dickins <hughd@google.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Colin Cross <ccross@google.com> Cc: Matthew Wilcox <willy@infradead.org> Reported-by: Amit Pundir <amit.pundir@linaro.org> Reported-by: Youling 257 <youling257@gmail.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01ia64: mark special ia64 memory areas anonymousLinus Torvalds
Commit bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") made newly allocated vma's have a dummy vm_ops field so that they wouldn't be mistaken for anonymous mappings, and if you wanted an anonymous vma you had to explicitly say so by calling "vma_set_anonymous()" on it. However, it missed the two special vmas that ia64 processes have: the register backing store and the NaT page. So they wouldn't actually act like anonymous ranges, and page faults on them caused a SIGBUS rather than the creation of a new anon page in them. That obviously will make any ia64 binary very unhappy indeed, and the boot fails early. Fixes: bfd40eaff5ab ("mm: fix vma_is_anonymous() false-positives") Reported-by: Tony Luck <tony.luck@intel.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: John Stultz <john.stultz@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-01net: dsa: Do not suspend/resume closed slave_devFlorian Fainelli
If a DSA slave network device was previously disabled, there is no need to suspend or resume it. Fixes: 2446254915a7 ("net: dsa: allow switch drivers to implement suspend/resume hooks") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01netlink: Fix spectre v1 gadget in netlink_create()Jeremy Cline
'protocol' is a user-controlled value, so sanitize it after the bounds check to avoid using it for speculative out-of-bounds access to arrays indexed by it. This addresses the following accesses detected with the help of smatch: * net/netlink/af_netlink.c:654 __netlink_create() warn: potential spectre issue 'nlk_cb_mutex_keys' [w] * net/netlink/af_netlink.c:654 __netlink_create() warn: potential spectre issue 'nlk_cb_mutex_key_strings' [w] * net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre issue 'nl_table' [w] (local cap) Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01Documentation: dpaa2: Use correct heading adornmentIoana Ciornei
Add overline heading adornment to document title in order to comply with kernel doc requirements. Fixes: 60b9131 staging: fsl-mc: Convert documentation to rst format Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-01net: stmmac: Fix WoL for PCI-based setupsJose Abreu
WoL won't work in PCI-based setups because we are not saving the PCI EP state before entering suspend state and not allowing D3 wake. Fix this by using a wrapper around stmmac_{suspend/resume} which correctly sets the PCI EP state. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Cc: David S. Miller <davem@davemloft.net> Cc: Joao Pinto <jpinto@synopsys.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>