path: root/include/trace/events
AgeCommit message (Collapse)Author
2013-06-20ARM: bL_switcher/trace: Add trace trigger for trace bootstrappingDave Martin
When tracing switching, an external tracer needs a way to bootstrap its knowledge of the logical<->physical CPU mapping. This patch adds a sysfs attribute trace_trigger. A write to this attribute will generate a power:cpu_migrate_current event for each online CPU, indicating the current physical CPU for each logical CPU. Activating or deactivating the switcher also generates these events, so that the tracer knows about the resulting remapping of affected CPUs. Signed-off-by: Dave Martin <dave.martin@linaro.org>
2013-06-20ARM: bL_switcher: Basic trace events supportDave Martin
This patch adds simple trace events to the b.L switcher code to allow tracing of CPU migration events. To make use of the trace events, you will need: CONFIG_FTRACE=y CONFIG_ENABLE_DEFAULT_TRACERS=y The following events are added: * power:cpu_migrate_begin * power:cpu_migrate_finish each with the following data: u64 timestamp; u32 cpu_hwid; power:cpu_migrate_begin occurs immediately before the switcher-specific migration operations start. power:cpu_migrate_finish occurs immediately when migration is completed. The cpu_hwid field contains the ID fields of the MPIDR. * For power:cpu_migrate_begin, cpu_hwid is the ID of the outbound physical CPU (equivalent to (from_phys_cpu,from_phys_cluster)). * For power:cpu_migrate_finish, cpu_hwid is the ID of the inbound physical CPU (equivalent to (to_phys_cpu,to_phys_cluster)). By design, the cpu_hwid field is masked in the same way as the device tree cpu node reg property, allowing direct correlation to the DT description of the hardware. The timestamp is added in order to minimise timing noise. An accurate system-wide clock should be used for generating this (hopefully getnstimeofday is appropriate, but it could be changed). It could be any monotonic shared clock, since the aim is to allow accurate deltas to be computed. We don't necessarily care about accurate synchronisation with wall clock time. In practice, each switch takes place on a single logical CPU, and the trace infrastructure should guarantee that events are well-ordered with respect to a single logical CPU. Signed-off-by: Dave Martin <dave.martin@linaro.org> Signed-off-by: Nicolas Pitre <nico@linaro.org>
2013-05-14Merge tag 'ext4_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 update from Ted Ts'o: "Fixed regressions (two stability regressions and a performance regression) introduced during the 3.10-rc1 merge window. Also included is a bug fix relating to allocating blocks after resizing an ext3 file system when using the ext4 file system driver" * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: jbd,jbd2: fix oops in jbd2_journal_put_journal_head() ext4: revert "ext4: use io_end for multiple bios" ext4: limit group search loop for non-extent files ext4: fix fio regression
2013-05-08Merge tag 'f2fs-for-v3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "This patch-set includes the following major enhancement patches. - introduce a new gloabl lock scheme - add tracepoints on several major functions - fix the overall cleaning process focused on victim selection - apply the block plugging to merge IOs as much as possible - enhance management of free nids and its list - enhance the readahead mode for node pages - address several cretical deadlock conditions - reduce lock_page calls The other minor bug fixes and enhancements are as follows. - calculation mistakes: overflow - bio types: READ, READA, and READ_SYNC - fix the recovery flow, data races, and null pointer errors" * tag 'f2fs-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits) f2fs: cover free_nid management with spin_lock f2fs: optimize scan_nat_page() f2fs: code cleanup for scan_nat_page() and build_free_nids() f2fs: bugfix for alloc_nid_failed() f2fs: recover when journal contains deleted files f2fs: continue to mount after failing recovery f2fs: avoid deadlock during evict after f2fs_gc f2fs: modify the number of issued pages to merge IOs f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug entry. f2fs: fix inconsistent using of NM_WOUT_THRESHOLD f2fs: check truncation of mapping after lock_page f2fs: enhance alloc_nid and build_free_nids flows f2fs: add a tracepoint on f2fs_new_inode f2fs: check nid == 0 in add_free_nid f2fs: add REQ_META about metadata requests for submit f2fs: give a chance to merge IOs by IO scheduler f2fs: avoid frequent background GC f2fs: add tracepoints to debug checkpoint request f2fs: add tracepoints for write page operations f2fs: add tracepoints to debug the block allocation ...
2013-05-08Merge branch 'for-3.10/drivers' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block driver updates from Jens Axboe: "It might look big in volume, but when categorized, not a lot of drivers are touched. The pull request contains: - mtip32xx fixes from Micron. - A slew of drbd updates, this time in a nicer series. - bcache, a flash/ssd caching framework from Kent. - Fixes for cciss" * 'for-3.10/drivers' of git://git.kernel.dk/linux-block: (66 commits) bcache: Use bd_link_disk_holder() bcache: Allocator cleanup/fixes cciss: bug fix to prevent cciss from loading in kdump crash kernel cciss: add cciss_allow_hpsa module parameter drivers/block/mg_disk.c: add CONFIG_PM_SLEEP to suspend/resume functions mtip32xx: Workaround for unaligned writes bcache: Make sure blocksize isn't smaller than device blocksize bcache: Fix merge_bvec_fn usage for when it modifies the bvm bcache: Correctly check against BIO_MAX_PAGES bcache: Hack around stuff that clones up to bi_max_vecs bcache: Set ra_pages based on backing device's ra_pages bcache: Take data offset from the bdev superblock. mtip32xx: mtip32xx: Disable TRIM support mtip32xx: fix a smatch warning bcache: Disable broken btree fuzz tester bcache: Fix a format string overflow bcache: Fix a minor memory leak on device teardown bcache: Documentation updates bcache: Use WARN_ONCE() instead of __WARN() bcache: Add missing #include <linux/prefetch.h> ...
2013-05-08Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block core updates from Jens Axboe: - Major bit is Kents prep work for immutable bio vecs. - Stable candidate fix for a scheduling-while-atomic in the queue bypass operation. - Fix for the hang on exceeded rq->datalen 32-bit unsigned when merging discard bios. - Tejuns changes to convert the writeback thread pool to the generic workqueue mechanism. - Runtime PM framework, SCSI patches exists on top of these in James' tree. - A few random fixes. * 'for-3.10/core' of git://git.kernel.dk/linux-block: (40 commits) relay: move remove_buf_file inside relay_close_buf partitions/efi.c: replace useless kzalloc's by kmalloc's fs/block_dev.c: fix iov_shorten() criteria in blkdev_aio_read() block: fix max discard sectors limit blkcg: fix "scheduling while atomic" in blk_queue_bypass_start Documentation: cfq-iosched: update documentation help for cfq tunables writeback: expose the bdi_wq workqueue writeback: replace custom worker pool implementation with unbound workqueue writeback: remove unused bdi_pending_list aoe: Fix unitialized var usage bio-integrity: Add explicit field for owner of bip_buf block: Add an explicit bio flag for bios that own their bvec block: Add bio_alloc_pages() block: Convert some code to bio_for_each_segment_all() block: Add bio_for_each_segment_all() bounce: Refactor __blk_queue_bounce to not use bi_io_vec raid1: use bio_copy_data() pktcdvd: Use bio_reset() in disabled code to kill bi_idx usage pktcdvd: use bio_copy_data() block: Add bio_copy_data() ...
2013-05-05Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Gleb Natapov: "Highlights of the updates are: general: - new emulated device API - legacy device assignment is now optional - irqfd interface is more generic and can be shared between arches x86: - VMCS shadow support and other nested VMX improvements - APIC virtualization and Posted Interrupt hardware support - Optimize mmio spte zapping ppc: - BookE: in-kernel MPIC emulation with irqfd support - Book3S: in-kernel XICS emulation (incomplete) - Book3S: HV: migration fixes - BookE: more debug support preparation - BookE: e6500 support ARM: - reworking of Hyp idmaps s390: - ioeventfd for virtio-ccw And many other bug fixes, cleanups and improvements" * tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits) kvm: Add compat_ioctl for device control API KVM: x86: Account for failing enable_irq_window for NMI window request KVM: PPC: Book3S: Add API for in-kernel XICS emulation kvm/ppc/mpic: fix missing unlock in set_base_addr() kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write kvm/ppc/mpic: remove users kvm/ppc/mpic: fix mmio region lists when multiple guests used kvm/ppc/mpic: remove default routes from documentation kvm: KVM_CAP_IOMMU only available with device assignment ARM: KVM: iterate over all CPUs for CPU compatibility check KVM: ARM: Fix spelling in error message ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally KVM: ARM: Fix API documentation for ONE_REG encoding ARM: KVM: promote vfp_host pointer to generic host cpu context ARM: KVM: add architecture specific hook for capabilities ARM: KVM: perform HYP initilization for hotplugged CPUs ARM: KVM: switch to a dual-step HYP init code ARM: KVM: rework HYP page table freeing ARM: KVM: enforce maximum size for identity mapped code ARM: KVM: move to a KVM provided HYP idmap ...
2013-05-05Merge branch 'timers-nohz-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull 'full dynticks' support from Ingo Molnar: "This tree from Frederic Weisbecker adds a new, (exciting! :-) core kernel feature to the timer and scheduler subsystems: 'full dynticks', or CONFIG_NO_HZ_FULL=y. This feature extends the nohz variable-size timer tick feature from idle to busy CPUs (running at most one task) as well, potentially reducing the number of timer interrupts significantly. This feature got motivated by real-time folks and the -rt tree, but the general utility and motivation of full-dynticks runs wider than that: - HPC workloads get faster: CPUs running a single task should be able to utilize a maximum amount of CPU power. A periodic timer tick at HZ=1000 can cause a constant overhead of up to 1.0%. This feature removes that overhead - and speeds up the system by 0.5%-1.0% on typical distro configs even on modern systems. - Real-time workload latency reduction: CPUs running critical tasks should experience as little jitter as possible. The last remaining source of kernel-related jitter was the periodic timer tick. - A single task executing on a CPU is a pretty common situation, especially with an increasing number of cores/CPUs, so this feature helps desktop and mobile workloads as well. The cost of the feature is mainly related to increased timer reprogramming overhead when a CPU switches its tick period, and thus slightly longer to-idle and from-idle latency. Configuration-wise a third mode of operation is added to the existing two NOHZ kconfig modes: - CONFIG_HZ_PERIODIC: [formerly !CONFIG_NO_HZ], now explicitly named as a config option. This is the traditional Linux periodic tick design: there's a HZ tick going on all the time, regardless of whether a CPU is idle or not. - CONFIG_NO_HZ_IDLE: [formerly CONFIG_NO_HZ=y], this turns off the periodic tick when a CPU enters idle mode. - CONFIG_NO_HZ_FULL: this new mode, in addition to turning off the tick when a CPU is idle, also slows the tick down to 1 Hz (one timer interrupt per second) when only a single task is running on a CPU. The .config behavior is compatible: existing !CONFIG_NO_HZ and CONFIG_NO_HZ=y settings get translated to the new values, without the user having to configure anything. CONFIG_NO_HZ_FULL is turned off by default. This feature is based on a lot of infrastructure work that has been steadily going upstream in the last 2-3 cycles: related RCU support and non-periodic cputime support in particular is upstream already. This tree adds the final pieces and activates the feature. The pull request is marked RFC because: - it's marked 64-bit only at the moment - the 32-bit support patch is small but did not get ready in time. - it has a number of fresh commits that came in after the merge window. The overwhelming majority of commits are from before the merge window, but still some aspects of the tree are fresh and so I marked it RFC. - it's a pretty wide-reaching feature with lots of effects - and while the components have been in testing for some time, the full combination is still not very widely used. That it's default-off should reduce its regression abilities and obviously there are no known regressions with CONFIG_NO_HZ_FULL=y enabled either. - the feature is not completely idempotent: there is no 100% equivalent replacement for a periodic scheduler/timer tick. In particular there's ongoing work to map out and reduce its effects on scheduler load-balancing and statistics. This should not impact correctness though, there are no known regressions related to this feature at this point. - it's a pretty ambitious feature that with time will likely be enabled by most Linux distros, and we'd like you to make input on its design/implementation, if you dislike some aspect we missed. Without flaming us to crisp! :-) Future plans: - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut off the periodic tick altogether when there's a single busy task on a CPU. We'd first like 1 Hz to be exposed more widely before we go for the 0 Hz target though. - once we reach 0 Hz we can remove the periodic tick assumption from nr_running>=2 as well, by essentially interrupting busy tasks only as frequently as the sched_latency constraints require us to do - once every 4-40 msecs, depending on nr_running. I am personally leaning towards biting the bullet and doing this in v3.10, like the -rt tree this effort has been going on for too long - but the final word is up to you as usual. More technical details can be found in Documentation/timers/NO_HZ.txt" * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits) sched: Keep at least 1 tick per second for active dynticks tasks rcu: Fix full dynticks' dependency on wide RCU nocb mode nohz: Protect smp_processor_id() in tick_nohz_task_switch() nohz_full: Add documentation. cputime_nsecs: use math64.h for nsec resolution conversion helpers nohz: Select VIRT_CPU_ACCOUNTING_GEN from full dynticks config nohz: Reduce overhead under high-freq idling patterns nohz: Remove full dynticks' superfluous dependency on RCU tree nohz: Fix unavailable tick_stop tracepoint in dynticks idle nohz: Add basic tracing nohz: Select wide RCU nocb for full dynticks nohz: Disable the tick when irq resume in full dynticks CPU nohz: Re-evaluate the tick for the new task after a context switch nohz: Prepare to stop the tick on irq exit nohz: Implement full dynticks kick nohz: Re-evaluate the tick from the scheduler IPI sched: New helper to prevent from stopping the tick in full dynticks sched: Kick full dynticks CPU that have more than one task enqueued. perf: New helper to prevent full dynticks CPUs from stopping tick perf: Kick full dynticks CPU if events rotation is needed ...
2013-05-03ext4: fix fio regressionYan, Zheng
We (Linux Kernel Performance project) found a regression introduced by commit: f7fec032aa ext4: track all extent status in extent status tree The commit causes about 20% performance decrease in fio random write test. Profiler shows that rb_next() uses a lot of CPU time. The call stack is: rb_next ext4_es_find_delayed_extent ext4_map_blocks _ext4_get_block ext4_get_block_write __blockdev_direct_IO ext4_direct_IO generic_file_direct_write __generic_file_aio_write ext4_file_write aio_rw_vect_retry aio_run_iocb do_io_submit sys_io_submit system_call_fastpath io_submit td_io_getevents io_u_queued_complete thread_main main __libc_start_main The cause is that ext4_es_find_delayed_extent() doesn't have an upper bound, it keeps searching until a delayed extent is found. When there are a lots of non-delayed entries in the extent state tree, ext4_es_find_delayed_extent() may uses a lot of CPU time. Reported-by: LKP project <lkp@linux.intel.com> Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Cc: "Theodore Ts'o" <tytso@mit.edu>
2013-05-02Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds
Pull drm updates from Dave Airlie: "This is the main drm pull request for 3.10. Wierd bits: - OMAP drm changes required OMAP dss changes, in drivers/video, so I took them in here. - one more fbcon fix for font handover - VT switch avoidance in pm code - scatterlist helpers for gpu drivers - have acks from akpm Highlights: - qxl kms driver - driver for the spice qxl virtual GPU Nouveau: - fermi/kepler VRAM compression - GK110/nvf0 modesetting support. Tegra: - host1x core merged with 2D engine support i915: - vt switchless resume - more valleyview support - vblank fixes - modesetting pipe config rework radeon: - UVD engine support - SI chip tiling support - GPU registers initialisation from golden values. exynos: - device tree changes - fimc block support Otherwise: - bunches of fixes all over the place." * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (513 commits) qxl: update to new idr interfaces. drm/nouveau: fix build with nv50->nvc0 drm/radeon: fix handling of v6 power tables drm/radeon: clarify family checks in pm table parsing drm/radeon: consolidate UVD clock programming drm/radeon: fix UPLL_REF_DIV_MASK definition radeon: add bo tracking debugfs drm/radeon: add new richland pci ids drm/radeon: add some new SI PCI ids drm/radeon: fix scratch reg handling for UVD fence drm/radeon: allocate SA bo in the requested domain drm/radeon: fix possible segfault when parsing pm tables drm/radeon: fix endian bugs in atom_allocate_fb_scratch() OMAPDSS: TFP410: return EPROBE_DEFER if the i2c adapter not found OMAPDSS: VENC: Add error handling for venc_probe_pdata OMAPDSS: HDMI: Add error handling for hdmi_probe_pdata OMAPDSS: RFBI: Add error handling for rfbi_probe_pdata OMAPDSS: DSI: Add error handling for dsi_probe_pdata OMAPDSS: SDI: Add error handling for sdi_probe_pdata OMAPDSS: DPI: Add error handling for dpi_probe_pdata ...
2013-05-02Merge commit '8700c95adb03' into timers/nohzFrederic Weisbecker
The full dynticks tree needs the latest RCU and sched upstream updates in order to fix some dependencies. Merge a common upstream merge point that has these updates. Conflicts: include/linux/perf_event.h kernel/rcutree.h kernel/rcutree_plugin.h Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2013-05-01Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Mostly performance and bug fixes, plus some cleanups. The one new feature this merge window is a new ioctl EXT4_IOC_SWAP_BOOT which allows installation of a hidden inode designed for boot loaders." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (50 commits) ext4: fix type-widening bug in inode table readahead code ext4: add check for inodes_count overflow in new resize ioctl ext4: fix Kconfig documentation for CONFIG_EXT4_DEBUG ext4: fix online resizing for ext3-compat file systems jbd2: trace when lock_buffer in do_get_write_access takes a long time ext4: mark metadata blocks using bh flags buffer: add BH_Prio and BH_Meta flags ext4: mark all metadata I/O with REQ_META ext4: fix readdir error in case inline_data+^dir_index. ext4: fix readdir error in the case of inline_data+dir_index jbd2: use kmem_cache_zalloc instead of kmem_cache_alloc/memset ext4: mext_insert_extents should update extent block checksum ext4: move quota initialization out of inode allocation transaction ext4: reserve xattr index for Rich ACL support jbd2: reduce journal_head size ext4: clear buffer_uninit flag when submitting IO ext4: use io_end for multiple bios ext4: make ext4_bio_write_page() use BH_Async_Write flags ext4: Use kstrtoul() instead of parse_strtoul() ext4: defragmentation code cleanup ...
2013-04-30Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "Usual stuff, mostly comment fixes, typo fixes, printk fixes and small code cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (45 commits) mm: Convert print_symbol to %pSR gfs2: Convert print_symbol to %pSR m32r: Convert print_symbol to %pSR iostats.txt: add easy-to-find description for field 6 x86 cmpxchg.h: fix wrong comment treewide: Fix typo in printk and comments doc: devicetree: Fix various typos docbook: fix 8250 naming in device-drivers pata_pdc2027x: Fix compiler warning treewide: Fix typo in printks mei: Fix comments in drivers/misc/mei treewide: Fix typos in kernel messages pm44xx: Fix comment for "CONFIG_CPU_IDLE" doc: Fix typo "CONFIG_CGROUP_CGROUP_MEMCG_SWAP" mmzone: correct "pags" to "pages" in comment. kernel-parameters: remove outdated 'noresidual' parameter Remove spurious _H suffixes from ifdef comments sound: Remove stray pluses from Kconfig file radio-shark: Fix printk "CONFIG_LED_CLASS" doc: put proper reference to CONFIG_MODULE_SIG_ENFORCE ...
2013-04-30Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main changes in this cycle are mostly related to preparatory work for the full-dynticks work: - Remove restrictions on no-CBs CPUs, make RCU_FAST_NO_HZ take advantage of numbered callbacks, do callback accelerations based on numbered callbacks. Posted to LKML at https://lkml.org/lkml/2013/3/18/960 - RCU documentation updates. Posted to LKML at https://lkml.org/lkml/2013/3/18/570 - Miscellaneous fixes. Posted to LKML at https://lkml.org/lkml/2013/3/18/594" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) rcu: Make rcu_accelerate_cbs() note need for future grace periods rcu: Abstract rcu_start_future_gp() from rcu_nocb_wait_gp() rcu: Rename n_nocb_gp_requests to need_future_gp rcu: Push lock release to rcu_start_gp()'s callers rcu: Repurpose no-CBs event tracing to future-GP events rcu: Rearrange locking in rcu_start_gp() rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks rcu: Accelerate RCU callbacks at grace-period end rcu: Export RCU_FAST_NO_HZ parameters to sysfs rcu: Distinguish "rcuo" kthreads by RCU flavor rcu: Add event tracing for no-CBs CPUs' grace periods rcu: Add event tracing for no-CBs CPUs' callback registration rcu: Introduce proper blocking to no-CBs kthreads GP waits rcu: Provide compile-time control for no-CBs CPUs rcu: Tone down debugging during boot-up and shutdown. rcu: Add softirq-stall indications to stall-warning messages rcu: Documentation update rcu: Make bugginess of code sample more evident rcu: Fix hlist_bl_set_first_rcu() annotation rcu: Delete unused rcu_node "wakemask" field ...
2013-04-29Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge second batch of fixes from Andrew Morton: - various misc bits - some printk updates - a new "SRAM" driver. - MAINTAINERS updates - the backlight driver queue - checkpatch updates - a few init/ changes - a huge number of drivers/rtc changes - fatfs updates - some lib/idr.c work - some renaming of the random driver interfaces * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (285 commits) net: rename random32 to prandom net/core: remove duplicate statements by do-while loop net/core: rename random32() to prandom_u32() net/netfilter: rename random32() to prandom_u32() net/sched: rename random32() to prandom_u32() net/sunrpc: rename random32() to prandom_u32() scsi: rename random32() to prandom_u32() lguest: rename random32() to prandom_u32() uwb: rename random32() to prandom_u32() video/uvesafb: rename random32() to prandom_u32() mmc: rename random32() to prandom_u32() drbd: rename random32() to prandom_u32() kernel/: rename random32() to prandom_u32() mm/: rename random32() to prandom_u32() lib/: rename random32() to prandom_u32() x86: rename random32() to prandom_u32() x86: pageattr-test: remove srandom32 call uuid: use prandom_bytes() raid6test: use prandom_bytes() sctp: convert sctp_assoc_set_id() to use idr_alloc_cyclic() ...
2013-04-29printk/tracing: rework console tracingzhangwei(Jovi)
Commit 7ff9554bb578 ("printk: convert byte-buffer to variable-length record buffer") removed start and end parameters from call_console_drivers, but those parameters still exist in include/trace/events/printk.h. Without start and end parameters handling, printk tracing became more simple as: trace_console(text, len); Signed-off-by: zhangwei(Jovi) <jovi.zhangwei@huawei.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Kay Sievers <kay@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29Merge branch 'akpm' (incoming from Andrew)Linus Torvalds
Merge first batch of fixes from Andrew Morton: - A couple of kthread changes - A few minor audit patches - A number of fbdev patches. Florian remains AWOL so I'm picking up some of these. - A few kbuild things - ocfs2 updates - Almost all of the MM queue (And in the meantime, I already have the second big batch from Andrew pending in my mailbox ;^) * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (149 commits) memcg: take reference before releasing rcu_read_lock mem hotunplug: fix kfree() of bootmem memory mmKconfig: add an option to disable bounce mm, nobootmem: do memset() after memblock_reserve() mm, nobootmem: clean-up of free_low_memory_core_early() fs/buffer.c: remove unnecessary init operation after allocating buffer_head. numa, cpu hotplug: change links of CPU and node when changing node number by onlining CPU mm: fix memory_hotplug.c printk format warning mm: swap: mark swap pages writeback before queueing for direct IO swap: redirty page if page write fails on swap file mm, memcg: give exiting processes access to memory reserves thp: fix huge zero page logic for page with pfn == 0 memcg: avoid accessing memcg after releasing reference fs: fix fsync() error reporting memblock: fix missing comment of memblock_insert_region() mm: Remove unused parameter of pages_correctly_reserved() firmware, memmap: fix firmware_map_entry leak mm/vmstat: add note on safety of drain_zonestat mm: thp: add split tail pages to shrink page list in page reclaim mm: allow for outstanding swap writeback accounting ...
2013-04-29Merge tag 'regmap-v3.10' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap updates from Mark Brown: "In user visible terms just a couple of enhancements here, though there was a moderate amount of refactoring required in order to support the register cache sync performance improvements. - Support for block and asynchronous I/O during register cache syncing; this provides a use case dependant performance improvement. - Additional debugfs information on the memory consuption and register set" * tag 'regmap-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (23 commits) regmap: don't corrupt work buffer in _regmap_raw_write() regmap: cache: Fix format specifier in dev_dbg regmap: cache: Make regcache_sync_block_raw static regmap: cache: Write consecutive registers in a single block write regmap: cache: Split raw and non-raw syncs regmap: cache: Factor out block sync regmap: cache: Factor out reg_present support from rbtree cache regmap: cache: Use raw I/O to sync rbtrees if we can regmap: core: Provide regmap_can_raw_write() operation regmap: cache: Provide a get address of value operation regmap: Cut down on the average # of nodes in the rbtree cache regmap: core: Make raw write available to regcache regmap: core: Warn on invalid operation combinations regmap: irq: Clarify error message when we fail to request primary IRQ regmap: rbtree Expose total memory consumption in the rbtree debugfs entry regmap: debugfs: Add a registers `range' file regmap: debugfs: Simplify calculation of `c->max_reg' regmap: cache: Store caches in native register format where possible regmap: core: Split out in place value parsing regmap: cache: Use regcache_get_value() to check if we updated ...
2013-04-29mm: trace filemap add and delRobert Jarzmik
Use the events API to trace filemap loading and unloading of file pieces into the page cache. This patch aims at tracing the eviction reload cycle of executable and shared libraries pages in a memory constrained environment. The typical usage is to spot a specific device and inode (for example /lib/libc.so) to see the eviction cycles, and find out if frequently used code is rather spread across many pages (bad) or coallesced (good). Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr> Cc: Dave Chinner <david@fromorbit.com> Cc: Hugh Dickins <hughd@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29f2fs: add a tracepoint on f2fs_new_inodeJaegeuk Kim
This can help when debugging the free nid allocation flows. Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-26KVM: Extract generic irqchip logic into irqchip.cAlexander Graf
The current irq_comm.c file contains pieces of code that are generic across different irqchip implementations, as well as code that is fully IOAPIC specific. Split the generic bits out into irqchip.c. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2013-04-24nohz: Fix unavailable tick_stop tracepoint in dynticks idleFrederic Weisbecker
The trace_tick_stop() tracepoint is only available in full dynticks. But it's also used by dynticks-idle so let's build it for the latter config as well. This fixes: kernel/time/tick-sched.c: In function tick_nohz_stop_sched_tick: kernel/time/tick-sched.c:644: error: implicit declaration of function trace_tick_stop make[2]: *** [kernel/time/tick-sched.o] Erreur 1 Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-23f2fs: add tracepoints to debug checkpoint requestNamjae Jeon
Add tracepoints to debug checkpoint request. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: change expressions] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23f2fs: add tracepoints for write page operationsNamjae Jeon
Add tracepoints to debug the various page write operation like data pages, meta pages. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: remove unnecessary tracepoints] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23f2fs: add tracepoints to debug the block allocationNamjae Jeon
Add tracepoints to debug the block allocation & fallocate. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: enhance information] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23f2fs: add tracepoints for GC threadsNamjae Jeon
Add tracepoints for tracing the garbage collector threads in f2fs with status of collection & type. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: modify slightly to show information] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23f2fs: add tracepoint for tracing the page i/oNamjae Jeon
Add tracepoints for page i/o operations and block allocation tracing during page read operation. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: combine and modify the tracepoint structures] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23f2fs: add tracepoints for truncate operationNamjae Jeon
add tracepoints for tracing the truncate operations like truncate node/data blocks, f2fs_truncate etc. Tracepoints are added at entry and exit of operation to trace the success & failure of operation. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: combine and modify the tracepoint structures] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-23f2fs: add tracepoints for sync & inode operationsNamjae Jeon
Add tracepoints in f2fs for tracing the syncing operations like filesystem sync, file sync enter/exit. It will helf to trace the code under debugging scenarios. Also add tracepoints for tracing the various inode operations like building inode, eviction of inode, link/unlike of inodes. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> [Jaegeuk: combine and modify the tracepoint structures] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
2013-04-22nohz: Add basic tracingFrederic Weisbecker
It's not obvious to find out why the full dynticks subsystem doesn't always stop the tick: whether this is due to kthreads, posix timers, perf events, etc... These new tracepoints are here to help the user diagnose the failures and test this feature. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kevin Hilman <khilman@linaro.org> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de>
2013-04-22gpu: host1x: Add channel supportTerje Bergstrom
Add support for host1x client modules, and host1x channels to submit work to the clients. Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
2013-04-22gpu: host1x: Add host1x driverTerje Bergstrom
Add host1x, the driver for host1x and its client unit 2D. The Tegra host1x module is the DMA engine for register access to Tegra's graphics- and multimedia-related modules. The modules served by host1x are referred to as clients. host1x includes some other functionality, such as synchronization. Signed-off-by: Arto Merilainen <amerilainen@nvidia.com> Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Thierry Reding <thierry.reding@avionic-design.de> Tested-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
2013-04-21jbd2: trace when lock_buffer in do_get_write_access takes a long timeTheodore Ts'o
While investigating interactivity problems it was clear that processes sometimes stall for long periods of times if an attempt is made to lock a buffer which is undergoing writeback. It would stall in a trace looking something like [<ffffffff811a39de>] __lock_buffer+0x2e/0x30 [<ffffffff8123a60f>] do_get_write_access+0x43f/0x4b0 [<ffffffff8123a7cb>] jbd2_journal_get_write_access+0x2b/0x50 [<ffffffff81220f79>] __ext4_journal_get_write_access+0x39/0x80 [<ffffffff811f3198>] ext4_reserve_inode_write+0x78/0xa0 [<ffffffff811f3209>] ext4_mark_inode_dirty+0x49/0x220 [<ffffffff811f57d1>] ext4_dirty_inode+0x41/0x60 [<ffffffff8119ac3e>] __mark_inode_dirty+0x4e/0x2d0 [<ffffffff8118b9b9>] update_time+0x79/0xc0 [<ffffffff8118ba98>] file_update_time+0x98/0x100 [<ffffffff81110ffc>] __generic_file_aio_write+0x17c/0x3b0 [<ffffffff811112aa>] generic_file_aio_write+0x7a/0xf0 [<ffffffff811ea853>] ext4_file_write+0x83/0xd0 [<ffffffff81172b23>] do_sync_write+0xa3/0xe0 [<ffffffff811731ae>] vfs_write+0xae/0x180 [<ffffffff8117361d>] sys_write+0x4d/0x90 [<ffffffff8159d62d>] system_call_fastpath+0x1a/0x1f [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-18Revert "block: add missing block_bio_complete() tracepoint"Linus Torvalds
This reverts commit 3a366e614d0837d9fc23f78cdb1a1186ebc3387f. Wanlong Gao reports that it causes a kernel panic on his machine several minutes after boot. Reverting it removes the panic. Jens says: "It's not quite clear why that is yet, so I think we should just revert the commit for 3.9 final (which I'm assuming is pretty close). The wifi is crap at the LSF hotel, so sending this email instead of queueing up a revert and pull request." Reported-by: Wanlong Gao <gaowanlong@cn.fujitsu.com> Requested-by: Jens Axboe <axboe@kernel.dk> Cc: Tejun Heo <tj@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-12kthread: Prevent unpark race which puts threads on the wrong cpuThomas Gleixner
The smpboot threads rely on the park/unpark mechanism which binds per cpu threads on a particular core. Though the functionality is racy: CPU0 CPU1 CPU2 unpark(T) wake_up_process(T) clear(SHOULD_PARK) T runs leave parkme() due to !SHOULD_PARK bind_to(CPU2) BUG_ON(wrong CPU) We cannot let the tasks move themself to the target CPU as one of those tasks is actually the migration thread itself, which requires that it starts running on the target cpu right away. The solution to this problem is to prevent wakeups in park mode which are not from unpark(). That way we can guarantee that the association of the task to the target cpu is working correctly. Add a new task state (TASK_PARKED) which prevents other wakeups and use this state explicitly for the unpark wakeup. Peter noticed: Also, since the task state is visible to userspace and all the parked tasks are still in the PID space, its a good hint in ps and friends that these tasks aren't really there for the moment. The migration thread has another related issue. CPU0 CPU1 Bring up CPU2 create_thread(T) park(T) wait_for_completion() parkme() complete() sched_set_stop_task() schedule(TASK_PARKED) The sched_set_stop_task() call is issued while the task is on the runqueue of CPU1 and that confuses the hell out of the stop_task class on that cpu. So we need the same synchronizaion before sched_set_stop_task(). Reported-by: Dave Jones <davej@redhat.com> Reported-and-tested-by: Dave Hansen <dave@sr71.net> Reported-and-tested-by: Borislav Petkov <bp@alien8.de> Acked-by: Peter Ziljstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: dhillf@gmail.com Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304091635430.21884@ionos Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-04-09ext4: fix miscellaneous big endian warningsTheodore Ts'o
None of these result in any bug, but they makes sparse complain. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2013-04-03ext4: collapse handling of data=ordered and data=writeback codepathsTheodore Ts'o
The only difference between how we handle data=ordered and data=writeback is a single call to ext4_jbd2_file_inode(). Eliminate code duplication by factoring out redundant the code paths. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com>
2013-04-02Merge branch 'writeback-workqueue' of ↵Jens Axboe
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq into for-3.10/core Tejun writes: ----- This is the pull request for the earlier patchset[1] with the same name. It's only three patches (the first one was committed to workqueue tree) but the merge strategy is a bit involved due to the dependencies. * Because the conversion needs features from wq/for-3.10, block/for-3.10/core is based on rc3, and wq/for-3.10 has conflicts with rc3, I pulled mainline (rc5) into wq/for-3.10 to prevent those workqueue conflicts from flaring up in block tree. * Resolving the issue that Jan and Dave raised about debugging requires arch-wide changes. The patchset is being worked on[2] but it'll have to go through -mm after these changes show up in -next, and not included in this pull request. The three commits are located in the following git branch. git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git writeback-workqueue Pulling it into block/for-3.10/core produces a conflict in drivers/md/raid5.c between the following two commits. e3620a3ad5 ("MD RAID5: Avoid accessing gendisk or queue structs when not available") 2f6db2a707 ("raid5: use bio_reset()") The conflict is trivial - one removes an "if ()" conditional while the other removes "rbi->bi_next = NULL" right above it. We just need to remove both. The merged branch is available at git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git block-test-merge so that you can use it for verification. The test merge commit has proper merge description. While these changes are a bit of pain to route, they make code simpler and even have, while minute, measureable performance gain[3] even on a workload which isn't particularly favorable to showing the benefits of this conversion. ---- Fixed up the conflict. Conflicts: drivers/md/raid5.c Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-04-01writeback: replace custom worker pool implementation with unbound workqueueTejun Heo
Writeback implements its own worker pool - each bdi can be associated with a worker thread which is created and destroyed dynamically. The worker thread for the default bdi is always present and serves as the "forker" thread which forks off worker threads for other bdis. there's no reason for writeback to implement its own worker pool when using unbound workqueue instead is much simpler and more efficient. This patch replaces custom worker pool implementation in writeback with an unbound workqueue. The conversion isn't too complicated but the followings are worth mentioning. * bdi_writeback->last_active, task and wakeup_timer are removed. delayed_work ->dwork is added instead. Explicit timer handling is no longer necessary. Everything works by either queueing / modding / flushing / canceling the delayed_work item. * bdi_writeback_thread() becomes bdi_writeback_workfn() which runs off bdi_writeback->dwork. On each execution, it processes bdi->work_list and reschedules itself if there are more things to do. The function also handles low-mem condition, which used to be handled by the forker thread. If the function is running off a rescuer thread, it only writes out limited number of pages so that the rescuer can serve other bdis too. This preserves the flusher creation failure behavior of the forker thread. * INIT_LIST_HEAD(&bdi->bdi_list) is used to tell bdi_writeback_workfn() about on-going bdi unregistration so that it always drains work_list even if it's running off the rescuer. Note that the original code was broken in this regard. Under memory pressure, a bdi could finish unregistration with non-empty work_list. * The default bdi is no longer special. It now is treated the same as any other bdi and bdi_cap_flush_forker() is removed. * BDI_pending is no longer used. Removed. * Some tracepoints become non-applicable. The following TPs are removed - writeback_nothread, writeback_wake_thread, writeback_wake_forker_thread, writeback_thread_start, writeback_thread_stop. Everything, including devices coming and going away and rescuer operation under simulated memory pressure, seems to work fine in my test setup. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Jens Axboe <axboe@kernel.dk> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Jeff Moyer <jmoyer@redhat.com>
2013-03-26rcu: Repurpose no-CBs event tracing to future-GP eventsPaul E. McKenney
Dyntick-idle CPUs need to be able to pre-announce their need for grace periods. This can be done using something similar to the mechanism used by no-CB CPUs to announce their need for grace periods. This commit moves in this direction by renaming the no-CBs grace-period event tracing to suit the new future-grace-period needs. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-26rcu: Add event tracing for no-CBs CPUs' grace periodsPaul E. McKenney
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2013-03-23bcache: A block layer cacheKent Overstreet
Does writethrough and writeback caching, handles unclean shutdown, and has a bunch of other nifty features motivated by real world usage. See the wiki at http://bcache.evilpiepirate.org for more. Signed-off-by: Kent Overstreet <koverstreet@google.com>
2013-03-23block: Use bio_sectors() more consistentlyKent Overstreet
Bunch of places in the code weren't using it where they could be - this'll reduce the size of the patch that puts bi_sector/bi_size/bi_idx into a struct bvec_iter. Signed-off-by: Kent Overstreet <koverstreet@google.com> CC: Jens Axboe <axboe@kernel.dk> CC: "Ed L. Cashin" <ecashin@coraid.com> CC: Nick Piggin <npiggin@kernel.dk> CC: Jiri Kosina <jkosina@suse.cz> CC: Jim Paris <jim@jtan.com> CC: Geoff Levand <geoff@infradead.org> CC: Alasdair Kergon <agk@redhat.com> CC: dm-devel@redhat.com CC: Neil Brown <neilb@suse.de> CC: Steven Rostedt <rostedt@goodmis.org> Acked-by: Ed Cashin <ecashin@coraid.com>
2013-03-18treewide: Fix typos in printk and commentMasanari Iida
Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2013-03-04regmap: async: Add tracepoints for async I/OMark Brown
Trace when we start and complete async writes, and when we start and finish blocking for their completion. This is useful for performance analysis of the resulting I/O patterns. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-03-02Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: "Various bug fixes for ext4. The most important is a fix for the new extent cache's slab shrinker which can cause significant, user-visible pauses when the system is under memory pressure." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: enable quotas before orphan cleanup ext4: don't allow quota mount options when quota feature enabled ext4: fix a warning from sparse check for ext4_dir_llseek ext4: convert number of blocks to clusters properly ext4: fix possible memory leak in ext4_remount() jbd2: fix ERR_PTR dereference in jbd2__journal_start ext4: use percpu counter for extent cache count ext4: optimize ext4_es_shrink()
2013-02-28ext4: optimize ext4_es_shrink()Theodore Ts'o
When the system is under memory pressure, ext4_es_srhink() will get called very often. So optimize returning the number of items in the file system's extent status cache by keeping a per-filesystem count, instead of calculating it each time by scanning all of the inodes in the extent status cache. Also rename the slab used for the extent status cache to be "ext4_extent_status" so it's obviousl the slab in question is created by ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Zheng Liu <gnehzuil.liu@gmail.com>
2013-02-28Merge branch 'for-3.9/core' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block IO core bits from Jens Axboe: "Below are the core block IO bits for 3.9. It was delayed a few days since my workstation kept crashing every 2-8h after pulling it into current -git, but turns out it is a bug in the new pstate code (divide by zero, will report separately). In any case, it contains: - The big cfq/blkcg update from Tejun and and Vivek. - Additional block and writeback tracepoints from Tejun. - Improvement of the should sort (based on queues) logic in the plug flushing. - _io() variants of the wait_for_completion() interface, using io_schedule() instead of schedule() to contribute to io wait properly. - Various little fixes. You'll get two trivial merge conflicts, which should be easy enough to fix up" Fix up the trivial conflicts due to hlist traversal cleanups (commit b67bfe0d42ca: "hlist: drop the node parameter from iterators"). * 'for-3.9/core' of git://git.kernel.dk/linux-block: (39 commits) block: remove redundant check to bd_openers() block: use i_size_write() in bd_set_size() cfq: fix lock imbalance with failed allocations drivers/block/swim3.c: fix null pointer dereference block: don't select PERCPU_RWSEM block: account iowait time when waiting for completion of IO request sched: add wait_for_completion_io[_timeout] writeback: add more tracepoints block: add block_{touch|dirty}_buffer tracepoint buffer: make touch_buffer() an exported function block: add @req to bio_{front|back}_merge tracepoints block: add missing block_bio_complete() tracepoint block: Remove should_sort judgement when flush blk_plug block,elevator: use new hashtable implementation cfq-iosched: add hierarchical cfq_group statistics cfq-iosched: collect stats from dead cfqgs cfq-iosched: separate out cfqg_stats_reset() from cfq_pd_reset_stats() blkcg: make blkcg_print_blkgs() grab q locks instead of blkcg lock block: RCU free request_queue blkcg: implement blkg_[rw]stat_recursive_sum() and blkg_[rw]stat_merge() ...
2013-02-26Merge tag 'ext4_for_linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Theodore Ts'o: "The one new feature added in this patch series is the ability to use the "punch hole" functionality for inodes that are not using extent maps. In the bug fix category, we fixed some races in the AIO and fstrim code, and some potential NULL pointer dereferences and memory leaks in error handling code paths. In the optimization category, we fixed a performance regression in the jbd2 layer introduced by commit d9b01934d56a ("jbd: fix fsync() tid wraparound bug", introduced in v3.0) which shows up in the AIM7 benchmark. We also further optimized jbd2 by minimize the amount of time that transaction handles are held active. This patch series also features some additional enhancement of the extent status tree, which is now used to cache extent information in a more efficient/compact form than what we use on-disk." * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (65 commits) ext4: fix free clusters calculation in bigalloc filesystem ext4: no need to remove extent if len is 0 in ext4_es_remove_extent() ext4: fix xattr block allocation/release with bigalloc ext4: reclaim extents from extent status tree ext4: adjust some functions for reclaiming extents from extent status tree ext4: remove single extent cache ext4: lookup block mapping in extent status tree ext4: track all extent status in extent status tree ext4: let ext4_ext_map_blocks return EXT4_MAP_UNWRITTEN flag ext4: rename and improbe ext4_es_find_extent() ext4: add physical block and status member into extent status tree ext4: refine extent status tree ext4: use ERR_PTR() abstraction for ext4_append() ext4: refactor code to read directory blocks into ext4_read_dirblock() ext4: add debugging context for warning in ext4_da_update_reserve_space() ext4: use KERN_WARNING for warning messages jbd2: use module parameters instead of debugfs for jbd_debug ext4: use module parameters instead of debugfs for mballoc_debug ext4: start handle at the last possible moment when creating inodes ext4: fix the number of credits needed for acl ops with inline data ...
2013-02-24Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Marcelo Tosatti: "KVM updates for the 3.9 merge window, including x86 real mode emulation fixes, stronger memory slot interface restrictions, mmu_lock spinlock hold time reduction, improved handling of large page faults on shadow, initial APICv HW acceleration support, s390 channel IO based virtio, amongst others" * tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits) Revert "KVM: MMU: lazily drop large spte" x86: pvclock kvm: align allocation size to page size KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr x86 emulator: fix parity calculation for AAD instruction KVM: PPC: BookE: Handle alignment interrupts booke: Added DBCR4 SPR number KVM: PPC: booke: Allow multiple exception types KVM: PPC: booke: use vcpu reference from thread_struct KVM: Remove user_alloc from struct kvm_memory_slot KVM: VMX: disable apicv by default KVM: s390: Fix handling of iscs. KVM: MMU: cleanup __direct_map KVM: MMU: remove pt_access in mmu_set_spte KVM: MMU: cleanup mapping-level KVM: MMU: lazily drop large spte KVM: VMX: cleanup vmx_set_cr0(). KVM: VMX: add missing exit names to VMX_EXIT_REASONS array KVM: VMX: disable SMEP feature when guest is in non-paging mode KVM: Remove duplicate text in api.txt Revert "KVM: MMU: split kvm_mmu_free_page" ...