aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2012-06-01new helpers: {clear,test,test_and_clear}_restore_sigmask()Al Viro
helpers parallel to set_restore_sigmask(), used in the next commits Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2012-05-31Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull second pile of signal handling patches from Al Viro: "This one is just task_work_add() series + remaining prereqs for it. There probably will be another pull request from that tree this cycle - at least for helpers, to get them out of the way for per-arch fixes remaining in the tree." Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile had brought in commit 97fd75b7b8e0 ("kernel/irq/manage.c: use the pr_foo() infrastructure to prefix printks") which changed one of the pr_err() calls that this merge moves around. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: keys: kill task_struct->replacement_session_keyring keys: kill the dummy key_replace_session_keyring() keys: change keyctl_session_to_parent() to use task_work_add() genirq: reimplement exit_irq_thread() hook via task_work_add() task_work_add: generic process-context callbacks avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers parisc: need to check NOTIFY_RESUME when exiting from syscall move key_repace_session_keyring() into tracehook_notify_resume() TIF_NOTIFY_RESUME is defined on all targets now
2012-05-31Merge branch 'akpm' (Andrew's patch-bomb)Linus Torvalds
Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...
2012-05-31syscalls, x86: add __NR_kcmp syscallCyrill Gorcunov
While doing the checkpoint-restore in the user space one need to determine whether various kernel objects (like mm_struct-s of file_struct-s) are shared between tasks and restore this state. The 2nd step can be solved by using appropriate CLONE_ flags and the unshare syscall, while there's currently no ways for solving the 1st one. One of the ways for checking whether two tasks share e.g. mm_struct is to provide some mm_struct ID of a task to its proc file, but showing such info considered to be not that good for security reasons. Thus after some debates we end up in conclusion that using that named 'comparison' syscall might be the best candidate. So here is it -- __NR_kcmp. It takes up to 5 arguments - the pids of the two tasks (which characteristics should be compared), the comparison type and (in case of comparison of files) two file descriptors. Lookups for pids are done in the caller's PID namespace only. At moment only x86 is supported and tested. [akpm@linux-foundation.org: fix up selftests, warnings] [akpm@linux-foundation.org: include errno.h] [akpm@linux-foundation.org: tweak comment text] Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Andrey Vagin <avagin@openvz.org> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Glauber Costa <glommer@parallels.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Tejun Heo <tj@kernel.org> Cc: Matt Helsley <matthltc@us.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Valdis.Kletnieks@vt.edu Cc: Michal Marek <mmarek@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31um: properly check all process' threads for a live mmAnton Vorontsov
kill_off_processes() might miss a valid process, this is because checking for process->mm is not enough. Process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held). Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31um: fix possible race on task->mmAnton Vorontsov
Checking for task->mm is dangerous as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). We can't use get_task_mm()/mmput() pair as mmput() might sleep, so let's take the task lock while we care about its mm. Note that we should also use find_lock_task_mm() to check all process' threads for a valid mm, but for uml we'll do it in a separate patch. Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31um: should hold tasklist_lock while traversing processesAnton Vorontsov
Traversing the tasks requires holding tasklist_lock, otherwise it is unsafe. p.s. However, I'm not sure that calling os_kill_ptraced_process() in the atomic context is correct. It seem to work, but please take a closer look. Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31blackfin: fix possible deadlock in decode_address()Anton Vorontsov
Oleg Nesterov found an interesting deadlock possibility: > sysrq_showregs_othercpus() does smp_call_function(showacpu) > and showacpu() show_stack()->decode_address(). Now suppose that IPI > interrupts the task holding read_lock(tasklist). To fix this, blackfin should not grab the write_ variant of the tasklist lock, read_ one is enough. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31blackfin: a couple of task->mm handling fixesAnton Vorontsov
The patch fixes two problems: 1. Working with task->mm w/o getting mm or grabing the task lock is dangerous as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). We can't use get_task_mm()/mmput() pair as mmput() might sleep, so we have to take the task lock while handle its mm. 2. Checking for process->mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To catch this we use find_lock_task_mm(), which walks up all threads and returns an appropriate task (with task lock held). Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31sh: use clear_tasks_mm_cpumask()Anton Vorontsov
Checking for process->mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() has the issue fixed, so let's use it. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31powerpc: use clear_tasks_mm_cpumask()Anton Vorontsov
Current CPU hotplug code has some task->mm handling issues: 1. Working with task->mm w/o getting mm or grabing the task lock is dangerous as ->mm might disappear (exit_mm() assigns NULL under task_lock(), so tasklist lock is not enough). We can't use get_task_mm()/mmput() pair as mmput() might sleep, so we must take the task lock while handle its mm. 2. Checking for process->mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() has all the issues fixed, so let's use it. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31arm: use clear_tasks_mm_cpumask()Anton Vorontsov
Checking for process->mm is not enough because process' main thread may exit or detach its mm via use_mm(), but other threads may still have a valid mm. To fix this we would need to use find_lock_task_mm(), which would walk up all threads and returns an appropriate task (with task lock held). clear_tasks_mm_cpumask() has this issue fixed, so let's use it. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org> Cc: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31um/kernel/trap.c: port OOM changes to handle_page_fault()Kautuk Consul
Commit d065bd810b6d ("mm: retry page fault when blocking on disk transfer") and commit 37b23e0525d3 ("x86,mm: make pagefault killable") introduced changes into the x86 pagefault handler for making the page fault handler retryable as well as killable. These changes reduce the mmap_sem hold time, which is crucial during OOM killer invocation. Port these changes to um. Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-31Merge git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull two small kvm fixes from Avi Kivity: "A build fix for non-kvm archs and a transparent hugepage refcount bugfix on hosts with 4M pages." * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Export asm-generic/kvm_para.h KVM: MMU: fix huge page adapted on non-PAE host
2012-05-31Merge tag 'please-pull-mce' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull mce cleanup from Tony Luck: "One more mce cleanup before the 3.5 merge window closes" * tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Cleanup timer mess
2012-05-31Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 patches from Heiko Carstens: "A couple of s390 patches for the 3.5 merge window. Just a collection of bug fixes and cleanups." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/uaccess: fix access_ok compile warnings s390/cmpxchg: select HAVE_CMPXCHG_LOCAL option s390/cmpxchg: fix sign extension bugs s390/cmpxchg: fix 1 and 2 byte memory accesses s390/cmpxchg: fix compile warnings specific to s390 s390/cmpxchg: add missing memory barrier to cmpxchg64 s390/cpu: remove cpu "capabilities" sysfs attribute s390/kernel: Fix smp_call_ipl_cpu() for offline CPUs s390/kernel: Introduce memcpy_absolute() function s390/headers: replace __s390x__ with CONFIG_64BIT where possible s390/headers: remove #ifdef __KERNEL__ from not exported headers s390/irq: split irq stats for cpu-measurement alert facilities s390/kexec: Move early_pgm_check_handler() to text section s390/kdump: Use real mode for PSW restart and kexec s390/kdump: Account /sys/kernel/kexec_crash_size changes in OS info s390/kernel: Remove OS info init function call and diag 308 for kdump
2012-05-31Merge tag 'parisc-misc' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 Pull misc parisc updates from James Bottomley: "This is a couple of updates to complete our fixes and one to fix a compile failure caused during the merge window. Additionally, we now switch to the generic strncopy_from_user." * tag 'parisc-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6: [PARISC] update parisc to use generic strncpy_from_user() [PARISC] Fix parisc compile failure after smp: Add task_struct argument to __cpu_up() [PARISC] fix TLB fault path on PA2.0 narrow systems [PARISC] fix boot failure on 32-bit systems caused by branch stubs placed before .text
2012-05-31Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull additional x86 fixes from Peter Anvin. * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, amd, xen: Avoid NULL pointer paravirt references x86, mtrr: Fix a type overflow in range_to_mtrr func x86, realmode: Unbreak the ia64 build of drivers/acpi/sleep.c x86/mm/pat: Improve scaling of pat_pagerange_is_ram() x86: hpet: Fix copy-and-paste mistake in earlier change x86/mce: Fix 32-bit build x86/bitops: Move BIT_64() for a wider use
2012-05-31Merge tag 'devel-late' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull late-merged development and first fixes for arm-soc from Olof Johansson: "This branch contains a few development patches for Samsung and Versatile Express that were submitted to arm-soc near the beginning of the merge window. We picked them up with the agreement that they would need to sit in linux-next for a while, and now they have. There are also two fixes: - One long-standing build breakage on ixp4xx due to missing gpiolib dependencies. - The other is for some gpio device tree changes needed on lpc32xx." * tag 'devel-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: LPC32xx: Adjust dts files to gpio dt binding ixp4xx: fix compilation by adding gpiolib support ARM: vexpress: Remove twice included header files ARM: vexpress: Device Tree updates ARM: EXYNOS: Support suspend and resume for EXYNOS5250 ARM: EXYNOS: Add Clock register list for save and restore ARM: EXYNOS: Add PMU table for EXYNOS5250 ARM: EXYNOS: Rename of function for pm.c ARM: EXYNOS: Remove GIC save & restore function ARM: dts: Add node for interrupt combiner controller on EXYNOS5250 ARM: S3C24XX: add support for second irq set of S3C2416 ARM: S3C64XX: use timekeeping wrapper on cpuidle ARM: S3C64XX: declare the states with the new api on cpuidle ARM: S3C64XX: Hook up carrier class modules on Cragganmore ARM: S3C64XX: Initial hookup for Bells module on Cragganmore
2012-05-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull Sparc updates from David S. Miller: 1) Remove the idiotic situation wherein Leon was a special case in all of the TLB/cache handling code. The worst side effect of this bogosity is that you couldn't build a kernel with Leon support enabled (to get better build coverage), and test boot it on a non-LEON cpu. Leon is, in all core respects, programatically identical to the 32-bit SRMMU. Except that they put the TLB registers in a different alternate address space location. Through code patching (for fast paths) and run time checks, this issue is now a thing of the past. From Sam Ravnborg. 2) There was a mis-merge of arch/sparc/Kconfig for one of the clockevents changes that went in, causing 32-bit sparc to start failing to build. I merged in your tree to get those clockevents changes (and added a note to the merge commit) then added Stephen Rothwell's fix for the merge error. 3) Software quad floating point emulation was not working properly on more recent Niagara chips, because the way the situation is reported by the cpu has changed. Nobody noticed because gcc emits calls to software emulation routines in glibc. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: (25 commits) sparc: fix sparc64 build due to leon.h inclusion sparc32: remove unused variable in head_32.S sparc32,leon: fix leon bootup sparc32: Export leon_dma_ops to modules. sparc32: support leon + sun in dma_make_coherent() sparc32,leon: always support leon in ioport sparc32,leon: always include leon_pmc in build sparc32: refactor cpu_idle() sparc32: srmmu_probe now knows about leon too sparc32: drop LEON hack for ASI_M_MMUREGS sparc32: introduce run-time patching of srmmu access functions sparc32: introduce support for run-time patching for all shared assembler code sparc32,leon: fix section mismatch warning sparc32,leon: always include leon_smp + leon_mm in build sparc32,leon: always include leon_kernel in build sparc32,leon: clean up leon.h sparc32: handle leon in cpu.c sparc32: handle leon in irq_32.c sparc32: add support for run-time patching of leon/sun single instructions sparc32: introduce sparc32_start_kernel called from head_32.S ...
2012-05-31[PARISC] update parisc to use generic strncpy_from_user()James Bottomley
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2012-05-30Merge branches 'fixes' and 'fixes2' into devel-lateOlof Johansson
* fixes: ixp4xx: fix compilation by adding gpiolib support * fixes2: ARM: LPC32xx: Adjust dts files to gpio dt binding
2012-05-30ARM: LPC32xx: Adjust dts files to gpio dt bindingRoland Stigge
The GPIO devicetree binding in 3.5 doesn't register the various LPC32xx GPIO banks via DT subnodes but always all at once, and changes the gpio referencing to 3 cells (bank, gpio, flags). This patch adjusts the DTS files to this binding that was just accepted to the gpio subsystem. Signed-off-by: Roland Stigge <stigge@antcom.de> Signed-off-by: Olof Johansson <olof@lixom.net>
2012-05-30x86, amd, xen: Avoid NULL pointer paravirt referencesKonrad Rzeszutek Wilk
Stub out MSR methods that aren't actually needed. This fixes a crash as Xen Dom0 on AMD Trinity systems. A bigger patch should be added to remove the paravirt machinery completely for the methods which apparently have no users! Reported-by: Andre Przywara <andre.przywara@amd.com> Link: http://lkml.kernel.org/r/20120530222356.GA28417@andromeda.dapyr.net Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org>
2012-05-30Merge branch 'late/board' into devel-lateOlof Johansson
* late/board: ARM: S3C64XX: Hook up carrier class modules on Cragganmore ARM: S3C64XX: Initial hookup for Bells module on Cragganmore
2012-05-30Merge branch 'late/soc' into devel-lateOlof Johansson
* late/soc: ARM: vexpress: Remove twice included header files ARM: vexpress: Device Tree updates ARM: EXYNOS: Support suspend and resume for EXYNOS5250 ARM: EXYNOS: Add Clock register list for save and restore ARM: EXYNOS: Add PMU table for EXYNOS5250 ARM: EXYNOS: Rename of function for pm.c ARM: EXYNOS: Remove GIC save & restore function ARM: dts: Add node for interrupt combiner controller on EXYNOS5250 ARM: S3C24XX: add support for second irq set of S3C2416
2012-05-30Merge branch 'late/cleanup' into devel-lateOlof Johansson
* late/cleanup: ARM: S3C64XX: use timekeeping wrapper on cpuidle ARM: S3C64XX: declare the states with the new api on cpuidle
2012-05-30x86/mce: Cleanup timer messThomas Gleixner
Use unsigned long for dealing with jiffies not int. Rename the callback to something sensible. Use __this_cpu_read/write for accessing per cpu data. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-05-30x86, mtrr: Fix a type overflow in range_to_mtrr funczhenzhong.duan
When boot on sun G5+ with 4T mem, see an overflow in mtrr cleanup as below. *BAD*gran_size: 2G chunk_size: 2G num_reg: 10 lose cover RAM: -18014398505283592M This is because 1<<31 sign extended. Use an unsigned long constant to fix it. Useful for mem larger than or equal to 4T. -v2: Use 64bit constant instead of explicit type conversion as suggested by Yinghai. Description updated too. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Link: http://lkml.kernel.org/r/4FC5A77F.6060505@oracle.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-05-30Merge branch 'x86/trampoline' into x86/urgentH. Peter Anvin
x86/trampoline contains an urgent commit which is necessarily on a newer baseline. Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-05-30x86, realmode: Unbreak the ia64 build of drivers/acpi/sleep.cH. Peter Anvin
Revert usage of acpi_wakeup_address and move definition to x86 architecture code in order to make compilation work in ia64. [jsakkine: tested compilation in ia64/x86-64 and added proper commit message] Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Originally-by: H. Peter Anvin <hpa@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Link: http://lkml.kernel.org/r/1338370421-27735-1-git-send-email-jarkko.sakkinen@intel.com Cc: Tony Luck <tony.luck@intel.com> Cc: Len Brown <lenb@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-05-30Merge branch 'x86/mce' into x86/urgentIngo Molnar
Merge in these fixlets. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30x86/mm/pat: Improve scaling of pat_pagerange_is_ram()John Dykstra
Function pat_pagerange_is_ram() scales poorly to large address ranges, because it probes the resource tree for each page. On a 2.6 GHz Opteron, this function consumes 34 ms for a 1 GB range. It is called twice during untrack_pfn_vma(), slowing process cleanup and handicapping the OOM killer. This replacement consumes less than 1ms, under the same conditions. Signed-off-by: John Dykstra <jdykstra@cray.com> on behalf of Cray Inc. Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337980366.1979.6.camel@redwood [ Small stylistic cleanups and renames ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30s390/uaccess: fix access_ok compile warningsHeiko Carstens
On s390 access_ok is a macro which discards all parameters and always returns 1. This can result in compile warnings which warn about unused variables like this: fs/read_write.c: In function 'rw_copy_check_uvector': fs/read_write.c:684:16: warning: unused variable 'buf' [-Wunused-variable] Fix this by adding a __range_ok() function which consumes all parameters but still always returns 1. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/cmpxchg: select HAVE_CMPXCHG_LOCAL optionHeiko Carstens
Now that hopefully all cmpxchg/xchg bugs have been fixed select HAVE_CMPXCHG_LOCAL option which uncovered a couple of bugs on s390. The only call site which is affected seems to be within mm/vmstat.c. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/cmpxchg: fix sign extension bugsHeiko Carstens
For 1 and 2 byte operands for xchg and cmpxchg the old and new values get or'ed into the larger 4 byte old value before the compare and swap instruction gets executed. This is done without using the proper byte mask before or'ing the values. If the caller passed in negative old or new values these got sign extended by the caller. Which in turn means that either the old value never matches, or, even worse, unrelated bytes would be changed in memory. Luckily there don't seem to be any callers around yet, since that would have resulted in the specification exception fixed in an earlies patch. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/cmpxchg: fix 1 and 2 byte memory accessesHeiko Carstens
When accessing a 1 or 2 byte memory operand we cannot use the passed address since the compare and swap instruction only works for 4 byte aligned memory operands. Hence we calculate an aligned address so that compare and swap works correctly. However we don't pass the calculated address to the inline assembly. This results in incorrect memory accesses and in a specification exception if used on non 4 byte aligned memory operands. Since this didn't happen until now, there don't seem to be too many users of cmpxchg on unaligned addresses. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/cmpxchg: fix compile warnings specific to s390Heiko Carstens
The cmpxchg macros and functions are a bit different than on other architectures. In particular the macros do not store the return value of a __cmpxchg function call in a variable before returning the value. This causes compile warnings that only occur on s390 like this one: net/ipv4/af_inet.c: In function 'build_ehash_secret': net/ipv4/af_inet.c:241:2: warning: value computed is not used [-Wunused-value] To get rid of these warnings use the same construct that we already use for the xchg macro, which was introduced for the same reason. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/cmpxchg: add missing memory barrier to cmpxchg64Heiko Carstens
All cmpxchg functions imply a memory barrier. cmpxch64 did not have one for 31 bit code, so add it. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/cpu: remove cpu "capabilities" sysfs attributeHeiko Carstens
It has been a big mistage to add the capabilities attribute to the cpus in sysfs: First the attribute only contains the cpu capability of primary cpus, which however is not necessarily (or better: unlikely) the type of cpu the kernel runs on, which is typically an IFL. In addition all information that is necessary is available in /proc/sysinfo already. So this attribute partially duplicated informations. So programs should look into the sysinfo file to retrieve all informations they are interested in. Since with this kernel release also the powersavings cpu attributes are removed this seems to be a good opportunity to remove another broken interface. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/kernel: Fix smp_call_ipl_cpu() for offline CPUsMichael Holzheu
If the IPL CPU is offline, currently the pcpu_delegate() function used by smp_call_ipl_cpu() does not work because pcpu_delegate() modifies the lowcore of the target CPU. In case of an offline IPL CPU currently the prefix register is zero but pcpu->lowcore still points to the old prefix page. Therefore the lowcore changes done by pcpu_delegate() have no effect. With this fix pcpu_delegate() now uses memcpy_absolute() and therefore also prepares the absolute zero lowcore if the target CPU has prefix register zero. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-30s390/kernel: Introduce memcpy_absolute() functionMichael Holzheu
This patch introduces the new function memcpy_absolute() that allows to copy memory using absolute addressing. This means that the prefix swap does not apply when this function is used. With this patch also all s390 kernel code that accesses absolute zero now uses the new memcpy_absolute() function. The old and less generic copy_to_absolute_zero() function is removed. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-29Merge branch 'x86-trampoline-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 trampoline rework from H. Peter Anvin: "This code reworks all the "trampoline"/"realmode" code (various bits that need to live in the first megabyte of memory, most but not all of which runs in real mode at some point) in the kernel into a single object. The main reason for doing this is that it eliminates the last place in the kernel where we needed pages to be mapped RWX. This code separates all that code into proper R/RW/RX pages." Fix up conflicts in arch/x86/kernel/Makefile (mca removed next to reboot code), and arch/x86/kernel/reboot.c (reboot code moved around in one branch, modified in this one), and arch/x86/tools/relocs.c (mostly same code came in earlier due to working around the ld bugs just before the 3.4 release). Also remove stale x86-relocs entry from scripts/.gitignore as per Peter Anvin. * commit '61f5446169046c217a5479517edac3a890c3bee7': (36 commits) x86, realmode: Move end signature into header.S x86, relocs: When printing an error, say relative or absolute x86, relocs: More relocations which may end up as absolute x86, relocs: Workaround for binutils 2.22.52.0.1 section bug xen-acpi-processor: Add missing #include <xen/xen.h> acpi, bgrd: Add missing <linux/io.h> to drivers/acpi/bgrt.c x86, realmode: Change EFER to a single u64 field x86, realmode: Move kernel/realmode.c to realmode/init.c x86, realmode: Move not-common bits out of trampoline_common.S x86, realmode: Mask out EFER.LMA when saving trampoline EFER x86, realmode: Fix no cache bits test in reboot_32.S x86, realmode: Make sure all generated files are listed in targets x86, realmode: build fix: remove duplicate build x86, realmode: read cr4 and EFER from kernel for 64-bit trampoline x86, realmode: fixes compilation issue in tboot.c x86, realmode: move relocs from scripts/ to arch/x86/tools x86, realmode: header for trampoline code x86, realmode: flattened rm hierachy x86, realmode: don't copy real_mode_header x86, realmode: fix 64-bit wakeup sequence ...
2012-05-29Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds
Pull MIPS updates from Ralf Baechle: "The whole series has been sitting in -next for quite a while with no complaints. The last change to the series was before the weekend the removal of an SPI patch which Grant - even though previously acked by himself - appeared to raise objections. So I removed it until the situation is clarified. Other than that all the patches have the acks from their respective maintainers, all MIPS and x86 defconfigs are building fine and I'm not aware of any problems introduced by this series. Among the key features for this patch series is a sizable patchset for Lantiq which among other things introduces support for Lantiq's flagship product, the FALCON SOC. It also means that the opensource developers behind this patchset have overtaken Lantiq's competing inhouse development team that was working behind closed doors. Less noteworthy the ath79 patchset which adds support for a few more chip variants, cleanups and fixes. Finally the usual dose of tweaking of generic code." Fix up trivial conflicts in arch/mips/lantiq/xway/gpio_{ebu,stp}.c where printk spelling fixes clashed with file move and eventual removal of the printk. * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (81 commits) MIPS: lantiq: remove orphaned code MIPS: Remove all -Wall and almost all -Werror usage from arch/mips. MIPS: lantiq: implement support for FALCON soc MTD: MIPS: lantiq: verify that the NOR interface is available on falcon soc MTD: MIPS: lantiq: implement OF support watchdog: MIPS: lantiq: implement OF support and minor fixes SERIAL: MIPS: lantiq: implement OF support GPIO: MIPS: lantiq: convert gpio-stp-xway to OF GPIO: MIPS: lantiq: convert gpio-mm-lantiq to OF and of_mm_gpio GPIO: MIPS: lantiq: move gpio-stp and gpio-ebu to the subsystem folder MIPS: pci: convert lantiq driver to OF MIPS: lantiq: convert dma to platform driver MIPS: lantiq: implement support for clkdev api MIPS: lantiq: drop ltq_gpio_request() and gpio_to_irq() OF: MIPS: lantiq: implement irq_domain support OF: MIPS: lantiq: implement OF support MIPS: lantiq: drop mips_machine support OF: PCI: const usage needed by MIPS MIPS: Cavium: Remove smp_reserve_lock. MIPS: Move cache setup to setup_arch(). ...
2012-05-29Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-armLinus Torvalds
Pull arm updates from Russell King: "This contains both some fixes found when trying to get the Assabet+neponset setup as a replacement firewall with a 3c589 PCMCIA card, and a bunch of changes from Al to fix up the ARM signal handling, particularly some of the restart behaviour." * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: ARM: neponset: make sure neponset_ncr_frob() is exported ARM: fix out[bwl]() arm: don't open-code ptrace_report_syscall() arm: bury unused _TIF_RESTORE_SIGMASK arm: remove unused restart trampoline arm: new way of handling ERESTART_RESTARTBLOCK arm: if we get into work_pending while returning to kernel mode, just go away arm: don't call try_to_freeze() from do_signal() arm: if there's no handler we need to restore sigmask, syscall or no syscall arm: trim _TIF_WORK_MASK, get rid of useless test and branch... arm: missing checks of __get_user()/__put_user() return values
2012-05-29rtc: rename CONFIG_RTC_MXC to CONFIG_RTC_DRV_MXCFabio Estevam
In order to keep consistency with other rtc drivers,rename CONFIG_RTC_MXC to CONFIG_RTC_DRV_MXC. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Acked-by: Wolfram Sang <w.sang@pengutronix.de> Cc: Alessandro Zummo <a.zummo@towertech.it> [akpm@linux-foundation.org: fix missed arch/arm/configs/imx_v6_v7_defconfig] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race conditionAndrea Arcangeli
When holding the mmap_sem for reading, pmd_offset_map_lock should only run on a pmd_t that has been read atomically from the pmdp pointer, otherwise we may read only half of it leading to this crash. PID: 11679 TASK: f06e8000 CPU: 3 COMMAND: "do_race_2_panic" #0 [f06a9dd8] crash_kexec at c049b5ec #1 [f06a9e2c] oops_end at c083d1c2 #2 [f06a9e40] no_context at c0433ded #3 [f06a9e64] bad_area_nosemaphore at c043401a #4 [f06a9e6c] __do_page_fault at c0434493 #5 [f06a9eec] do_page_fault at c083eb45 #6 [f06a9f04] error_code (via page_fault) at c083c5d5 EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP: 00000000 DS: 007b ESI: 9e201000 ES: 007b EDI: 01fb4700 GS: 00e0 CS: 0060 EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246 #7 [f06a9f38] _spin_lock at c083bc14 #8 [f06a9f44] sys_mincore at c0507b7d #9 [f06a9fb0] system_call at c083becd start len EAX: ffffffda EBX: 9e200000 ECX: 00001000 EDX: 6228537f DS: 007b ESI: 00000000 ES: 007b EDI: 003d0f00 SS: 007b ESP: 62285354 EBP: 62285388 GS: 0033 CS: 0073 EIP: 00291416 ERR: 000000da EFLAGS: 00000286 This should be a longstanding bug affecting x86 32bit PAE without THP. Only archs with 64bit large pmd_t and 32bit unsigned long should be affected. With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad() would partly hide the bug when the pmd transition from none to stable, by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is enabled a new set of problem arises by the fact could then transition freely in any of the none, pmd_trans_huge or pmd_trans_stable states. So making the barrier in pmd_none_or_trans_huge_or_clear_bad() unconditional isn't good idea and it would be a flakey solution. This should be fully fixed by introducing a pmd_read_atomic that reads the pmd in order with THP disabled, or by reading the pmd atomically with cmpxchg8b with THP enabled. Luckily this new race condition only triggers in the places that must already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix is localized there but this bug is not related to THP. NOTE: this can trigger on x86 32bit systems with PAE enabled with more than 4G of ram, otherwise the high part of the pmd will never risk to be truncated because it would be zero at all times, in turn so hiding the SMP race. This bug was discovered and fully debugged by Ulrich, quote: ---- [..] pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and eax. 496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd) 497 { 498 /* depend on compiler for an atomic pmd read */ 499 pmd_t pmdval = *pmd; // edi = pmd pointer 0xc0507a74 <sys_mincore+548>: mov 0x8(%esp),%edi ... // edx = PTE page table high address 0xc0507a84 <sys_mincore+564>: mov 0x4(%edi),%edx ... // eax = PTE page table low address 0xc0507a8e <sys_mincore+574>: mov (%edi),%eax [..] Please note that the PMD is not read atomically. These are two "mov" instructions where the high order bits of the PMD entry are fetched first. Hence, the above machine code is prone to the following race. - The PMD entry {high|low} is 0x0000000000000000. The "mov" at 0xc0507a84 loads 0x00000000 into edx. - A page fault (on another CPU) sneaks in between the two "mov" instructions and instantiates the PMD. - The PMD entry {high|low} is now 0x00000003fda38067. The "mov" at 0xc0507a8e loads 0xfda38067 into eax. ---- Reported-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29x86: print physical addresses consistently with other parts of kernelBjorn Helgaas
Print physical address info in a style consistent with the %pR style used elsewhere in the kernel. For example: -found SMP MP-table at [ffff8800000fce90] fce90 +found SMP MP-table at [mem 0x000fce90-0x000fce9f] mapped at [ffff8800000fce90] -initial memory mapped : 0 - 20000000 +initial memory mapped: [mem 0x00000000-0x1fffffff] -Base memory trampoline at [ffff88000009c000] 9c000 size 8192 +Base memory trampoline [mem 0x0009c000-0x0009dfff] mapped at [ffff88000009c000] -SRAT: Node 0 PXM 0 0-80000000 +SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29x86: print e820 physical addresses consistently with other parts of kernelBjorn Helgaas
Print physical address info in a style consistent with the %pR style used elsewhere in the kernel. For example: -BIOS-provided physical RAM map: +e820: BIOS-provided physical RAM map: - BIOS-e820: 0000000000000100 - 000000000009e000 (usable) +BIOS-e820: [mem 0x0000000000000100-0x000000000009dfff] usable -Allocating PCI resources starting at 90000000 (gap: 90000000:6ed1c000) +e820: [mem 0x90000000-0xfed1bfff] available for PCI devices -reserve RAM buffer: 000000000009e000 - 000000000009ffff +e820: reserve RAM buffer [mem 0x0009e000-0x0009ffff] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-05-29cris: select GENERIC_ATOMIC64Cong Wang
Cris doesn't implement atomic64 operations neither, should select GENERIC_ATOMIC64. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>