aboutsummaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2019-07-09 12:34:26 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2019-07-09 12:34:26 -0700
commite9a83bd2322035ed9d7dcf35753d3f984d76c6a5 (patch)
tree66dc466ff9aec0f9bb7f39cba50a47eab6585559
parent7011b7e1b702cc76f9e969b41d9a95969f2aecaa (diff)
parent454f96f2b738374da4b0a703b1e2e7aed82c4486 (diff)
downloadlinux-stericsson-e9a83bd2322035ed9d7dcf35753d3f984d76c6a5.tar.gz
Merge tag 'docs-5.3' of git://git.lwn.net/linux
Pull Documentation updates from Jonathan Corbet: "It's been a relatively busy cycle for docs: - A fair pile of RST conversions, many from Mauro. These create more than the usual number of simple but annoying merge conflicts with other trees, unfortunately. He has a lot more of these waiting on the wings that, I think, will go to you directly later on. - A new document on how to use merges and rebases in kernel repos, and one on Spectre vulnerabilities. - Various improvements to the build system, including automatic markup of function() references because some people, for reasons I will never understand, were of the opinion that :c:func:``function()`` is unattractive and not fun to type. - We now recommend using sphinx 1.7, but still support back to 1.4. - Lots of smaller improvements, warning fixes, typo fixes, etc" * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits) docs: automarkup.py: ignore exceptions when seeking for xrefs docs: Move binderfs to admin-guide Disable Sphinx SmartyPants in HTML output doc: RCU callback locks need only _bh, not necessarily _irq docs: format kernel-parameters -- as code Doc : doc-guide : Fix a typo platform: x86: get rid of a non-existent document Add the RCU docs to the core-api manual Documentation: RCU: Add TOC tree hooks Documentation: RCU: Rename txt files to rst Documentation: RCU: Convert RCU UP systems to reST Documentation: RCU: Convert RCU linked list to reST Documentation: RCU: Convert RCU basic concepts to reST docs: filesystems: Remove uneeded .rst extension on toctables scripts/sphinx-pre-install: fix out-of-tree build docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/ Documentation: PGP: update for newer HW devices Documentation: Add section about CPU vulnerabilities for Spectre Documentation: platform: Delete x86-laptop-drivers.txt docs: Note that :c:func: should no longer be used ...
-rw-r--r--Documentation/ABI/testing/sysfs-devices-system-cpu3
-rw-r--r--Documentation/ABI/testing/sysfs-kernel-uids2
-rw-r--r--Documentation/DMA-API.txt2
-rw-r--r--Documentation/EDID/howto.rst (renamed from Documentation/EDID/HOWTO.txt)35
-rw-r--r--Documentation/Kconfig13
-rw-r--r--Documentation/Makefile14
-rw-r--r--Documentation/RCU/UP.rst (renamed from Documentation/RCU/UP.txt)50
-rw-r--r--Documentation/RCU/index.rst19
-rw-r--r--Documentation/RCU/listRCU.rst (renamed from Documentation/RCU/listRCU.txt)38
-rw-r--r--Documentation/RCU/rcu.rst92
-rw-r--r--Documentation/RCU/rcu.txt89
-rw-r--r--Documentation/accelerators/ocxl.rst2
-rw-r--r--Documentation/acpi/dsd/leds.txt2
-rw-r--r--Documentation/admin-guide/README.rst2
-rw-r--r--Documentation/admin-guide/binderfs.rst (renamed from Documentation/filesystems/binderfs.rst)0
-rw-r--r--Documentation/admin-guide/bug-hunting.rst2
-rw-r--r--Documentation/admin-guide/hw-vuln/index.rst1
-rw-r--r--Documentation/admin-guide/hw-vuln/spectre.rst697
-rw-r--r--Documentation/admin-guide/index.rst1
-rw-r--r--Documentation/admin-guide/kernel-parameters.rst10
-rw-r--r--Documentation/admin-guide/kernel-parameters.txt38
-rw-r--r--Documentation/admin-guide/mm/numaperf.rst5
-rw-r--r--Documentation/admin-guide/ras.rst2
-rw-r--r--Documentation/aoe/aoe.rst (renamed from Documentation/aoe/aoe.txt)65
-rw-r--r--Documentation/aoe/examples.rst23
-rw-r--r--Documentation/aoe/index.rst19
-rw-r--r--Documentation/aoe/todo.rst (renamed from Documentation/aoe/todo.txt)3
-rw-r--r--Documentation/aoe/udev.txt2
-rw-r--r--Documentation/arm/mem_alignment2
-rw-r--r--Documentation/arm/stm32/overview.rst2
-rw-r--r--Documentation/arm/stm32/stm32f429-overview.rst2
-rw-r--r--Documentation/arm/stm32/stm32f746-overview.rst2
-rw-r--r--Documentation/arm/stm32/stm32f769-overview.rst2
-rw-r--r--Documentation/arm/stm32/stm32h743-overview.rst2
-rw-r--r--Documentation/arm/stm32/stm32mp157-overview.rst2
-rw-r--r--Documentation/arm64/acpi_object_usage.rst (renamed from Documentation/arm64/acpi_object_usage.txt)288
-rw-r--r--Documentation/arm64/arm-acpi.rst (renamed from Documentation/arm64/arm-acpi.txt)163
-rw-r--r--Documentation/arm64/booting.rst (renamed from Documentation/arm64/booting.txt)91
-rw-r--r--Documentation/arm64/cpu-feature-registers.rst (renamed from Documentation/arm64/cpu-feature-registers.txt)210
-rw-r--r--Documentation/arm64/elf_hwcaps.rst (renamed from Documentation/arm64/elf_hwcaps.txt)56
-rw-r--r--Documentation/arm64/hugetlbpage.rst (renamed from Documentation/arm64/hugetlbpage.txt)7
-rw-r--r--Documentation/arm64/index.rst28
-rw-r--r--Documentation/arm64/legacy_instructions.rst (renamed from Documentation/arm64/legacy_instructions.txt)43
-rw-r--r--Documentation/arm64/memory.rst98
-rw-r--r--Documentation/arm64/memory.txt97
-rw-r--r--Documentation/arm64/pointer-authentication.rst (renamed from Documentation/arm64/pointer-authentication.txt)2
-rw-r--r--Documentation/arm64/silicon-errata.rst (renamed from Documentation/arm64/silicon-errata.txt)65
-rw-r--r--Documentation/arm64/sve.rst (renamed from Documentation/arm64/sve.txt)12
-rw-r--r--Documentation/arm64/tagged-pointers.rst (renamed from Documentation/arm64/tagged-pointers.txt)6
-rw-r--r--Documentation/bpf/btf.rst2
-rw-r--r--Documentation/cdrom/Makefile21
-rw-r--r--Documentation/cdrom/cdrom-standard.rst1063
-rw-r--r--Documentation/cdrom/cdrom-standard.tex1026
-rw-r--r--Documentation/cdrom/ide-cd.rst (renamed from Documentation/cdrom/ide-cd)202
-rw-r--r--Documentation/cdrom/index.rst19
-rw-r--r--Documentation/cdrom/packet-writing.rst (renamed from Documentation/cdrom/packet-writing.txt)27
-rw-r--r--Documentation/conf.py5
-rw-r--r--Documentation/core-api/index.rst2
-rw-r--r--Documentation/core-api/kernel-api.rst14
-rw-r--r--Documentation/core-api/protection-keys.rst (renamed from Documentation/x86/protection-keys.rst)0
-rw-r--r--Documentation/core-api/timekeeping.rst2
-rw-r--r--Documentation/core-api/xarray.rst270
-rw-r--r--Documentation/device-mapper/cache-policies.rst (renamed from Documentation/device-mapper/cache-policies.txt)24
-rw-r--r--Documentation/device-mapper/cache.rst (renamed from Documentation/device-mapper/cache.txt)214
-rw-r--r--Documentation/device-mapper/delay.rst (renamed from Documentation/device-mapper/delay.txt)29
-rw-r--r--Documentation/device-mapper/dm-crypt.rst (renamed from Documentation/device-mapper/dm-crypt.txt)61
-rw-r--r--Documentation/device-mapper/dm-flakey.rst (renamed from Documentation/device-mapper/dm-flakey.txt)45
-rw-r--r--Documentation/device-mapper/dm-init.rst (renamed from Documentation/device-mapper/dm-init.txt)89
-rw-r--r--Documentation/device-mapper/dm-integrity.rst (renamed from Documentation/device-mapper/dm-integrity.txt)62
-rw-r--r--Documentation/device-mapper/dm-io.rst (renamed from Documentation/device-mapper/dm-io.txt)14
-rw-r--r--Documentation/device-mapper/dm-log.rst (renamed from Documentation/device-mapper/dm-log.txt)5
-rw-r--r--Documentation/device-mapper/dm-queue-length.rst (renamed from Documentation/device-mapper/dm-queue-length.txt)25
-rw-r--r--Documentation/device-mapper/dm-raid.rst (renamed from Documentation/device-mapper/dm-raid.txt)225
-rw-r--r--Documentation/device-mapper/dm-service-time.rst (renamed from Documentation/device-mapper/dm-service-time.txt)76
-rw-r--r--Documentation/device-mapper/dm-uevent.rst110
-rw-r--r--Documentation/device-mapper/dm-uevent.txt97
-rw-r--r--Documentation/device-mapper/dm-zoned.rst (renamed from Documentation/device-mapper/dm-zoned.txt)10
-rw-r--r--Documentation/device-mapper/era.rst (renamed from Documentation/device-mapper/era.txt)36
-rw-r--r--Documentation/device-mapper/index.rst44
-rw-r--r--Documentation/device-mapper/kcopyd.rst (renamed from Documentation/device-mapper/kcopyd.txt)10
-rw-r--r--Documentation/device-mapper/linear.rst63
-rw-r--r--Documentation/device-mapper/linear.txt61
-rw-r--r--Documentation/device-mapper/log-writes.rst (renamed from Documentation/device-mapper/log-writes.txt)105
-rw-r--r--Documentation/device-mapper/persistent-data.rst (renamed from Documentation/device-mapper/persistent-data.txt)4
-rw-r--r--Documentation/device-mapper/snapshot.rst (renamed from Documentation/device-mapper/snapshot.txt)116
-rw-r--r--Documentation/device-mapper/statistics.rst (renamed from Documentation/device-mapper/statistics.txt)62
-rw-r--r--Documentation/device-mapper/striped.rst61
-rw-r--r--Documentation/device-mapper/striped.txt57
-rw-r--r--Documentation/device-mapper/switch.rst (renamed from Documentation/device-mapper/switch.txt)47
-rw-r--r--Documentation/device-mapper/thin-provisioning.rst (renamed from Documentation/device-mapper/thin-provisioning.txt)68
-rw-r--r--Documentation/device-mapper/unstriped.rst (renamed from Documentation/device-mapper/unstriped.txt)93
-rw-r--r--Documentation/device-mapper/verity.rst (renamed from Documentation/device-mapper/verity.txt)20
-rw-r--r--Documentation/device-mapper/writecache.rst (renamed from Documentation/device-mapper/writecache.txt)13
-rw-r--r--Documentation/device-mapper/zero.rst (renamed from Documentation/device-mapper/zero.txt)14
-rw-r--r--Documentation/devicetree/bindings/net/fsl-enetc.txt7
-rw-r--r--Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt2
-rw-r--r--Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt2
-rw-r--r--Documentation/devicetree/booting-without-of.txt2
-rw-r--r--Documentation/doc-guide/kernel-doc.rst2
-rw-r--r--Documentation/doc-guide/sphinx.rst32
-rw-r--r--Documentation/docutils.conf2
-rw-r--r--Documentation/driver-api/basics.rst3
-rw-r--r--Documentation/driver-api/clk.rst6
-rw-r--r--Documentation/driver-api/firmware/other_interfaces.rst2
-rw-r--r--Documentation/driver-api/gpio/board.rst2
-rw-r--r--Documentation/driver-api/gpio/consumer.rst2
-rw-r--r--Documentation/driver-api/iio/hw-consumer.rst1
-rw-r--r--Documentation/driver-api/pps.rst (renamed from Documentation/pps/pps.txt)67
-rw-r--r--Documentation/driver-api/ptp.rst (renamed from Documentation/ptp/ptp.txt)26
-rw-r--r--Documentation/driver-api/target.rst4
-rw-r--r--Documentation/fault-injection/fault-injection.rst (renamed from Documentation/fault-injection/fault-injection.txt)281
-rw-r--r--Documentation/fault-injection/index.rst20
-rw-r--r--Documentation/fault-injection/notifier-error-inject.rst (renamed from Documentation/fault-injection/notifier-error-inject.txt)18
-rw-r--r--Documentation/fault-injection/nvme-fault-injection.rst178
-rw-r--r--Documentation/fault-injection/nvme-fault-injection.txt172
-rw-r--r--Documentation/fault-injection/provoke-crashes.rst48
-rw-r--r--Documentation/fault-injection/provoke-crashes.txt38
-rw-r--r--Documentation/fb/api.rst (renamed from Documentation/fb/api.txt)29
-rw-r--r--Documentation/fb/arkfb.rst (renamed from Documentation/fb/arkfb.txt)8
-rw-r--r--Documentation/fb/aty128fb.rst (renamed from Documentation/fb/aty128fb.txt)35
-rw-r--r--Documentation/fb/cirrusfb.rst (renamed from Documentation/fb/cirrusfb.txt)47
-rw-r--r--Documentation/fb/cmap_xfbdev.rst (renamed from Documentation/fb/cmap_xfbdev.txt)57
-rw-r--r--Documentation/fb/deferred_io.rst (renamed from Documentation/fb/deferred_io.txt)28
-rw-r--r--Documentation/fb/efifb.rst (renamed from Documentation/fb/efifb.txt)18
-rw-r--r--Documentation/fb/ep93xx-fb.rst (renamed from Documentation/fb/ep93xx-fb.txt)27
-rw-r--r--Documentation/fb/fbcon.rst (renamed from Documentation/fb/fbcon.txt)179
-rw-r--r--Documentation/fb/framebuffer.rst (renamed from Documentation/fb/framebuffer.txt)80
-rw-r--r--Documentation/fb/gxfb.rst (renamed from Documentation/fb/gxfb.txt)24
-rw-r--r--Documentation/fb/index.rst50
-rw-r--r--Documentation/fb/intel810.rst (renamed from Documentation/fb/intel810.txt)79
-rw-r--r--Documentation/fb/intelfb.rst (renamed from Documentation/fb/intelfb.txt)62
-rw-r--r--Documentation/fb/internals.rst (renamed from Documentation/fb/internals.txt)24
-rw-r--r--Documentation/fb/lxfb.rst (renamed from Documentation/fb/lxfb.txt)25
-rw-r--r--Documentation/fb/matroxfb.rst443
-rw-r--r--Documentation/fb/matroxfb.txt413
-rw-r--r--Documentation/fb/metronomefb.rst (renamed from Documentation/fb/metronomefb.txt)8
-rw-r--r--Documentation/fb/modedb.rst (renamed from Documentation/fb/modedb.txt)44
-rw-r--r--Documentation/fb/pvr2fb.rst66
-rw-r--r--Documentation/fb/pvr2fb.txt65
-rw-r--r--Documentation/fb/pxafb.rst (renamed from Documentation/fb/pxafb.txt)81
-rw-r--r--Documentation/fb/s3fb.rst (renamed from Documentation/fb/s3fb.txt)8
-rw-r--r--Documentation/fb/sa1100fb.rst (renamed from Documentation/fb/sa1100fb.txt)23
-rw-r--r--Documentation/fb/sh7760fb.rst130
-rw-r--r--Documentation/fb/sh7760fb.txt131
-rw-r--r--Documentation/fb/sisfb.rst (renamed from Documentation/fb/sisfb.txt)40
-rw-r--r--Documentation/fb/sm501.rst (renamed from Documentation/fb/sm501.txt)7
-rw-r--r--Documentation/fb/sm712fb.rst (renamed from Documentation/fb/sm712fb.txt)18
-rw-r--r--Documentation/fb/sstfb.rst207
-rw-r--r--Documentation/fb/sstfb.txt174
-rw-r--r--Documentation/fb/tgafb.rst (renamed from Documentation/fb/tgafb.txt)30
-rw-r--r--Documentation/fb/tridentfb.rst (renamed from Documentation/fb/tridentfb.txt)36
-rw-r--r--Documentation/fb/udlfb.rst (renamed from Documentation/fb/udlfb.txt)55
-rw-r--r--Documentation/fb/uvesafb.rst (renamed from Documentation/fb/uvesafb.txt)142
-rw-r--r--Documentation/fb/vesafb.rst (renamed from Documentation/fb/vesafb.txt)121
-rw-r--r--Documentation/fb/viafb.rst297
-rw-r--r--Documentation/fb/viafb.txt252
-rw-r--r--Documentation/fb/vt8623fb.rst (renamed from Documentation/fb/vt8623fb.txt)10
-rw-r--r--Documentation/features/debug/stackprotector/arch-support.txt2
-rw-r--r--Documentation/filesystems/api-summary.rst3
-rw-r--r--Documentation/filesystems/ext4/index.rst8
-rw-r--r--Documentation/filesystems/index.rst13
-rw-r--r--Documentation/filesystems/porting10
-rw-r--r--Documentation/filesystems/ubifs-authentication.md4
-rw-r--r--Documentation/filesystems/vfs.rst1428
-rw-r--r--Documentation/filesystems/vfs.txt1268
-rw-r--r--Documentation/filesystems/xfs-delayed-logging-design.txt2
-rw-r--r--Documentation/firmware-guide/acpi/enumeration.rst2
-rw-r--r--Documentation/firmware-guide/acpi/method-tracing.rst2
-rw-r--r--Documentation/fpga/dfl.rst (renamed from Documentation/fpga/dfl.txt)58
-rw-r--r--Documentation/fpga/index.rst17
-rw-r--r--Documentation/gpu/msm-crash-dump.rst2
-rw-r--r--Documentation/hid/hid-transport.txt6
-rw-r--r--Documentation/i2c/instantiating-devices4
-rw-r--r--Documentation/i2c/upgrading-clients4
-rw-r--r--Documentation/ide/changelogs.rst17
-rw-r--r--Documentation/ide/ide-tape.rst (renamed from Documentation/ide/ide-tape.txt)23
-rw-r--r--Documentation/ide/ide.rst (renamed from Documentation/ide/ide.txt)147
-rw-r--r--Documentation/ide/index.rst21
-rw-r--r--Documentation/ide/warm-plug-howto.rst (renamed from Documentation/ide/warm-plug-howto.txt)10
-rw-r--r--Documentation/index.rst1
-rw-r--r--Documentation/interconnect/interconnect.rst7
-rw-r--r--Documentation/iostats.txt4
-rw-r--r--Documentation/kbuild/headers_install.rst (renamed from Documentation/kbuild/headers_install.txt)5
-rw-r--r--Documentation/kbuild/index.rst27
-rw-r--r--Documentation/kbuild/issues.rst11
-rw-r--r--Documentation/kbuild/kbuild.rst (renamed from Documentation/kbuild/kbuild.txt)119
-rw-r--r--Documentation/kbuild/kconfig-language.rst (renamed from Documentation/kbuild/kconfig-language.txt)242
-rw-r--r--Documentation/kbuild/kconfig-macro-language.rst (renamed from Documentation/kbuild/kconfig-macro-language.txt)37
-rw-r--r--Documentation/kbuild/kconfig.rst (renamed from Documentation/kbuild/kconfig.txt)136
-rw-r--r--Documentation/kbuild/makefiles.rst (renamed from Documentation/kbuild/makefiles.txt)530
-rw-r--r--Documentation/kbuild/modules.rst (renamed from Documentation/kbuild/modules.txt)170
-rw-r--r--Documentation/kdump/index.rst21
-rw-r--r--Documentation/kdump/kdump.rst (renamed from Documentation/kdump/kdump.txt)131
-rw-r--r--Documentation/kdump/vmcoreinfo.rst (renamed from Documentation/kdump/vmcoreinfo.txt)59
-rw-r--r--Documentation/kernel-hacking/hacking.rst4
-rw-r--r--Documentation/kernel-hacking/locking.rst6
-rw-r--r--Documentation/kernel-per-CPU-kthreads.txt2
-rw-r--r--Documentation/laptops/lg-laptop.rst2
-rw-r--r--Documentation/maintainer/index.rst1
-rw-r--r--Documentation/maintainer/rebasing-and-merging.rst226
-rw-r--r--Documentation/memory-barriers.txt2
-rw-r--r--Documentation/mic/index.rst18
-rw-r--r--Documentation/mic/mic_overview.rst (renamed from Documentation/mic/mic_overview.txt)6
-rw-r--r--Documentation/mic/scif_overview.rst (renamed from Documentation/mic/scif_overview.txt)58
-rw-r--r--Documentation/netlabel/cipso_ipv4.rst (renamed from Documentation/netlabel/cipso_ipv4.txt)19
-rw-r--r--Documentation/netlabel/draft_ietf.rst5
-rw-r--r--Documentation/netlabel/index.rst21
-rw-r--r--Documentation/netlabel/introduction.rst (renamed from Documentation/netlabel/introduction.txt)16
-rw-r--r--Documentation/netlabel/lsm_interface.rst (renamed from Documentation/netlabel/lsm_interface.txt)16
-rw-r--r--Documentation/networking/device_drivers/freescale/dpaa2/dpio-driver.rst4
-rw-r--r--Documentation/networking/dsa/dsa.rst4
-rw-r--r--Documentation/networking/dsa/sja1105.rst6
-rw-r--r--Documentation/networking/timestamping.txt2
-rw-r--r--Documentation/nvdimm/nvdimm.txt4
-rw-r--r--Documentation/pcmcia/devicetable.rst (renamed from Documentation/pcmcia/devicetable.txt)4
-rw-r--r--Documentation/pcmcia/driver-changes.rst (renamed from Documentation/pcmcia/driver-changes.txt)35
-rw-r--r--Documentation/pcmcia/driver.rst (renamed from Documentation/pcmcia/driver.txt)18
-rw-r--r--Documentation/pcmcia/index.rst20
-rw-r--r--Documentation/pcmcia/locking.rst (renamed from Documentation/pcmcia/locking.txt)39
-rw-r--r--Documentation/platform/x86-laptop-drivers.txt18
-rw-r--r--Documentation/powerpc/firmware-assisted-dump.txt2
-rw-r--r--Documentation/powerpc/isa-versions.rst2
-rw-r--r--Documentation/process/4.Coding.rst2
-rw-r--r--Documentation/process/coding-style.rst2
-rw-r--r--Documentation/process/maintainer-pgp-guide.rst31
-rw-r--r--Documentation/process/submit-checklist.rst2
-rw-r--r--Documentation/riscv/index.rst17
-rw-r--r--Documentation/riscv/pmu.rst (renamed from Documentation/riscv/pmu.txt)98
-rw-r--r--Documentation/scheduler/completion.rst (renamed from Documentation/scheduler/completion.txt)38
-rw-r--r--Documentation/scheduler/index.rst29
-rw-r--r--Documentation/scheduler/sched-arch.rst (renamed from Documentation/scheduler/sched-arch.txt)18
-rw-r--r--Documentation/scheduler/sched-bwc.rst (renamed from Documentation/scheduler/sched-bwc.txt)30
-rw-r--r--Documentation/scheduler/sched-deadline.rst (renamed from Documentation/scheduler/sched-deadline.txt)311
-rw-r--r--Documentation/scheduler/sched-design-CFS.rst (renamed from Documentation/scheduler/sched-design-CFS.txt)15
-rw-r--r--Documentation/scheduler/sched-domains.rst (renamed from Documentation/scheduler/sched-domains.txt)8
-rw-r--r--Documentation/scheduler/sched-energy.rst (renamed from Documentation/scheduler/sched-energy.txt)47
-rw-r--r--Documentation/scheduler/sched-nice-design.rst (renamed from Documentation/scheduler/sched-nice-design.txt)6
-rw-r--r--Documentation/scheduler/sched-rt-group.rst (renamed from Documentation/scheduler/sched-rt-group.txt)28
-rw-r--r--Documentation/scheduler/sched-stats.rst (renamed from Documentation/scheduler/sched-stats.txt)35
-rw-r--r--Documentation/scheduler/text_files.rst5
-rw-r--r--Documentation/security/keys/core.rst16
-rw-r--r--Documentation/security/keys/trusted-encrypted.rst4
-rw-r--r--Documentation/sphinx/automarkup.py101
-rw-r--r--Documentation/sphinx/cdomain.py5
-rw-r--r--Documentation/sphinx/requirements.txt4
-rw-r--r--Documentation/sysctl/kernel.txt4
-rw-r--r--Documentation/target/index.rst19
-rw-r--r--Documentation/target/scripts.rst11
-rw-r--r--Documentation/target/tcm_mod_builder.rst149
-rw-r--r--Documentation/target/tcm_mod_builder.txt145
-rw-r--r--Documentation/target/tcmu-design.rst (renamed from Documentation/target/tcmu-design.txt)272
-rw-r--r--Documentation/tee.txt2
-rw-r--r--Documentation/timers/highres.rst (renamed from Documentation/timers/highres.txt)13
-rw-r--r--Documentation/timers/hpet.rst (renamed from Documentation/timers/hpet.txt)4
-rw-r--r--Documentation/timers/hrtimers.rst (renamed from Documentation/timers/hrtimers.txt)6
-rw-r--r--Documentation/timers/index.rst22
-rw-r--r--Documentation/timers/no_hz.rst (renamed from Documentation/timers/NO_HZ.txt)40
-rw-r--r--Documentation/timers/timekeeping.rst (renamed from Documentation/timers/timekeeping.txt)3
-rw-r--r--Documentation/timers/timers-howto.rst (renamed from Documentation/timers/timers-howto.txt)15
-rw-r--r--Documentation/trace/coresight.txt82
-rw-r--r--Documentation/trace/histogram.rst10
-rw-r--r--Documentation/trace/kprobetrace.rst7
-rw-r--r--Documentation/trace/uprobetracer.rst7
-rw-r--r--Documentation/translations/it_IT/admin-guide/kernel-parameters.rst12
-rw-r--r--Documentation/translations/it_IT/doc-guide/sphinx.rst17
-rw-r--r--Documentation/translations/it_IT/kernel-hacking/hacking.rst4
-rw-r--r--Documentation/translations/it_IT/kernel-hacking/locking.rst6
-rw-r--r--Documentation/translations/it_IT/process/4.Coding.rst2
-rw-r--r--Documentation/translations/it_IT/process/adding-syscalls.rst2
-rw-r--r--Documentation/translations/it_IT/process/coding-style.rst2
-rw-r--r--Documentation/translations/it_IT/process/howto.rst2
-rw-r--r--Documentation/translations/it_IT/process/license-rules.rst28
-rw-r--r--Documentation/translations/it_IT/process/magic-number.rst2
-rw-r--r--Documentation/translations/it_IT/process/stable-kernel-rules.rst4
-rw-r--r--Documentation/translations/it_IT/process/submit-checklist.rst2
-rw-r--r--Documentation/translations/ko_KR/memory-barriers.txt2
-rw-r--r--Documentation/translations/zh_CN/arm64/booting.txt4
-rw-r--r--Documentation/translations/zh_CN/arm64/legacy_instructions.txt4
-rw-r--r--Documentation/translations/zh_CN/arm64/memory.txt4
-rw-r--r--Documentation/translations/zh_CN/arm64/silicon-errata.txt4
-rw-r--r--Documentation/translations/zh_CN/arm64/tagged-pointers.txt4
-rw-r--r--Documentation/translations/zh_CN/basic_profiling.txt71
-rw-r--r--Documentation/translations/zh_CN/oops-tracing.txt2
-rw-r--r--Documentation/translations/zh_CN/process/4.Coding.rst4
-rw-r--r--Documentation/translations/zh_CN/process/coding-style.rst2
-rw-r--r--Documentation/translations/zh_CN/process/management-style.rst4
-rw-r--r--Documentation/translations/zh_CN/process/programming-language.rst59
-rw-r--r--Documentation/translations/zh_CN/process/submit-checklist.rst2
-rw-r--r--Documentation/translations/zh_CN/process/submitting-drivers.rst2
-rw-r--r--Documentation/userspace-api/spec_ctrl.rst2
-rw-r--r--Documentation/virtual/kvm/amd-memory-encryption.rst3
-rw-r--r--Documentation/virtual/kvm/api.txt2
-rw-r--r--Documentation/virtual/kvm/devices/arm-vgic-its.txt2
-rw-r--r--Documentation/vm/hwpoison.rst52
-rw-r--r--Documentation/vm/numa.rst2
-rw-r--r--Documentation/watchdog/convert_drivers_to_kernel_api.rst (renamed from Documentation/watchdog/convert_drivers_to_kernel_api.txt)109
-rw-r--r--Documentation/watchdog/hpwdt.rst (renamed from Documentation/watchdog/hpwdt.txt)27
-rw-r--r--Documentation/watchdog/index.rst25
-rw-r--r--Documentation/watchdog/mlx-wdt.rst (renamed from Documentation/watchdog/mlx-wdt.txt)24
-rw-r--r--Documentation/watchdog/pcwd-watchdog.rst (renamed from Documentation/watchdog/pcwd-watchdog.txt)13
-rw-r--r--Documentation/watchdog/watchdog-api.rst (renamed from Documentation/watchdog/watchdog-api.txt)76
-rw-r--r--Documentation/watchdog/watchdog-kernel-api.rst (renamed from Documentation/watchdog/watchdog-kernel-api.txt)91
-rw-r--r--Documentation/watchdog/watchdog-parameters.rst736
-rw-r--r--Documentation/watchdog/watchdog-parameters.txt410
-rw-r--r--Documentation/watchdog/watchdog-pm.rst (renamed from Documentation/watchdog/watchdog-pm.txt)3
-rw-r--r--Documentation/watchdog/wdt.rst (renamed from Documentation/watchdog/wdt.txt)31
-rw-r--r--Documentation/x86/index.rst1
-rw-r--r--Documentation/x86/resctrl_ui.rst30
-rw-r--r--Documentation/x86/x86_64/5level-paging.rst2
-rw-r--r--Documentation/x86/x86_64/boot-options.rst4
-rw-r--r--Documentation/x86/x86_64/fake-numa-for-cpusets.rst2
-rw-r--r--Documentation/xilinx/eemi.rst (renamed from Documentation/xilinx/eemi.txt)8
-rw-r--r--Documentation/xilinx/index.rst17
-rw-r--r--Kconfig4
-rw-r--r--MAINTAINERS26
-rw-r--r--arch/arc/plat-eznps/Kconfig2
-rw-r--r--arch/arm/Kconfig4
-rw-r--r--arch/arm64/Kconfig2
-rw-r--r--arch/arm64/include/asm/efi.h2
-rw-r--r--arch/arm64/include/asm/image.h2
-rw-r--r--arch/arm64/include/uapi/asm/sigcontext.h2
-rw-r--r--arch/arm64/kernel/kexec_image.c2
-rw-r--r--arch/c6x/Kconfig2
-rw-r--r--arch/m68k/q40/README2
-rw-r--r--arch/microblaze/Kconfig.debug2
-rw-r--r--arch/microblaze/Kconfig.platform2
-rw-r--r--arch/nds32/Kconfig2
-rw-r--r--arch/openrisc/Kconfig2
-rw-r--r--arch/powerpc/Kconfig2
-rw-r--r--arch/powerpc/sysdev/Kconfig2
-rw-r--r--arch/riscv/Kconfig2
-rw-r--r--arch/sh/Kconfig2
-rw-r--r--arch/x86/Kconfig20
-rw-r--r--arch/x86/Kconfig.debug2
-rw-r--r--arch/x86/boot/header.S2
-rw-r--r--arch/x86/entry/entry_64.S2
-rw-r--r--arch/x86/include/asm/bootparam_utils.h2
-rw-r--r--arch/x86/include/asm/page_64_types.h2
-rw-r--r--arch/x86/include/asm/pgtable_64_types.h2
-rw-r--r--arch/x86/kernel/cpu/microcode/amd.c2
-rw-r--r--arch/x86/kernel/kexec-bzimage64.c2
-rw-r--r--arch/x86/kernel/kprobes/core.c2
-rw-r--r--arch/x86/kernel/pci-dma.c2
-rw-r--r--arch/x86/mm/tlb.c2
-rw-r--r--arch/x86/platform/pvh/enlighten.c2
-rw-r--r--drivers/acpi/Kconfig10
-rw-r--r--drivers/auxdisplay/Kconfig2
-rw-r--r--drivers/block/Kconfig2
-rw-r--r--drivers/cdrom/cdrom.c2
-rw-r--r--drivers/firmware/Kconfig2
-rw-r--r--drivers/gpu/drm/Kconfig2
-rw-r--r--drivers/ide/Kconfig20
-rw-r--r--drivers/ide/ide-cd.c2
-rw-r--r--drivers/isdn/mISDN/dsp_core.c2
-rw-r--r--drivers/md/Kconfig2
-rw-r--r--drivers/md/dm-init.c2
-rw-r--r--drivers/md/dm-raid.c2
-rw-r--r--drivers/media/usb/dvb-usb-v2/anysee.c2
-rw-r--r--drivers/misc/lkdtm/core.c2
-rw-r--r--drivers/mtd/devices/Kconfig2
-rw-r--r--drivers/net/ethernet/faraday/ftgmac100.c2
-rw-r--r--drivers/net/ethernet/smsc/Kconfig6
-rw-r--r--drivers/net/wireless/intel/iwlegacy/Kconfig4
-rw-r--r--drivers/net/wireless/intel/iwlwifi/Kconfig2
-rw-r--r--drivers/parport/Kconfig2
-rw-r--r--drivers/pcmcia/ds.c2
-rw-r--r--drivers/platform/x86/Kconfig3
-rw-r--r--drivers/regulator/core.c2
-rw-r--r--drivers/scsi/Kconfig4
-rw-r--r--drivers/scsi/hpsa.c4
-rw-r--r--drivers/staging/fieldbus/Documentation/fieldbus_dev.txt4
-rw-r--r--drivers/staging/sm750fb/Kconfig2
-rw-r--r--drivers/tty/Kconfig2
-rw-r--r--drivers/usb/misc/Kconfig4
-rw-r--r--drivers/vhost/vhost.c2
-rw-r--r--drivers/video/fbdev/Kconfig38
-rw-r--r--drivers/video/fbdev/matrox/matroxfb_base.c2
-rw-r--r--drivers/video/fbdev/pxafb.c2
-rw-r--r--drivers/video/fbdev/sh7760fb.c2
-rw-r--r--drivers/watchdog/Kconfig6
-rw-r--r--drivers/watchdog/smsc37b787_wdt.c2
-rw-r--r--include/acpi/acpi_drivers.h2
-rw-r--r--include/linux/dcache.h4
-rw-r--r--include/linux/fault-inject.h2
-rw-r--r--include/linux/fs.h2
-rw-r--r--include/linux/fs_context.h2
-rw-r--r--include/linux/iopoll.h4
-rw-r--r--include/linux/lsm_hooks.h2
-rw-r--r--include/linux/regmap.h4
-rw-r--r--include/pcmcia/ds.h2
-rw-r--r--include/pcmcia/ss.h2
-rw-r--r--init/Kconfig6
-rw-r--r--kernel/sched/deadline.c2
-rw-r--r--lib/Kconfig.debug2
-rw-r--r--lib/list_sort.c2
-rw-r--r--mm/Kconfig2
-rw-r--r--net/bridge/netfilter/Kconfig2
-rw-r--r--net/ipv4/netfilter/Kconfig2
-rw-r--r--net/ipv6/netfilter/Kconfig2
-rw-r--r--net/netfilter/Kconfig16
-rw-r--r--net/tipc/Kconfig2
-rw-r--r--scripts/Kbuild.include4
-rw-r--r--scripts/Makefile.host2
-rwxr-xr-xscripts/checkpatch.pl8
-rwxr-xr-xscripts/documentation-file-ref-check58
-rw-r--r--scripts/kconfig/symbol.c2
-rw-r--r--scripts/kconfig/tests/err_recursive_dep/expected_stderr14
-rwxr-xr-xscripts/kernel-doc18
-rwxr-xr-xscripts/sphinx-pre-install76
-rw-r--r--security/Kconfig2
-rw-r--r--sound/oss/dmasound/Kconfig6
-rw-r--r--sound/soc/sof/ops.h2
-rw-r--r--tools/include/linux/err.h2
-rw-r--r--tools/objtool/Documentation/stack-validation.txt4
-rw-r--r--tools/testing/fault-injection/failcmd.sh2
-rw-r--r--tools/testing/selftests/x86/protection_keys.c2
416 files changed, 12159 insertions, 8581 deletions
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 923fe2001472..d404603c6b52 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -137,7 +137,8 @@ Description: Discover cpuidle policy and mechanism
current_governor: (RW) displays current idle policy. Users can
switch the governor at runtime by writing to this file.
- See files in Documentation/cpuidle/ for more information.
+ See Documentation/admin-guide/pm/cpuidle.rst and
+ Documentation/driver-api/pm/cpuidle.rst for more information.
What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/name
diff --git a/Documentation/ABI/testing/sysfs-kernel-uids b/Documentation/ABI/testing/sysfs-kernel-uids
index 28f14695a852..4182b7061816 100644
--- a/Documentation/ABI/testing/sysfs-kernel-uids
+++ b/Documentation/ABI/testing/sysfs-kernel-uids
@@ -11,4 +11,4 @@ Description:
example would be, if User A has shares = 1024 and user
B has shares = 2048, User B will get twice the CPU
bandwidth user A will. For more details refer
- Documentation/scheduler/sched-design-CFS.txt
+ Documentation/scheduler/sched-design-CFS.rst
diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
index 0076150fdccb..e47c63bd4887 100644
--- a/Documentation/DMA-API.txt
+++ b/Documentation/DMA-API.txt
@@ -198,7 +198,7 @@ call to set the mask to the value returned.
::
size_t
- dma_direct_max_mapping_size(struct device *dev);
+ dma_max_mapping_size(struct device *dev);
Returns the maximum size of a mapping for the device. The size parameter
of the mapping functions like dma_map_single(), dma_map_page() and
diff --git a/Documentation/EDID/HOWTO.txt b/Documentation/EDID/howto.rst
index 539871c3b785..725fd49a88ca 100644
--- a/Documentation/EDID/HOWTO.txt
+++ b/Documentation/EDID/howto.rst
@@ -1,3 +1,9 @@
+:orphan:
+
+====
+EDID
+====
+
In the good old days when graphics parameters were configured explicitly
in a file called xorg.conf, even broken hardware could be managed.
@@ -34,16 +40,19 @@ Makefile. Please note that the EDID data structure expects the timing
values in a different way as compared to the standard X11 format.
X11:
-HTimings: hdisp hsyncstart hsyncend htotal
-VTimings: vdisp vsyncstart vsyncend vtotal
-
-EDID:
-#define XPIX hdisp
-#define XBLANK htotal-hdisp
-#define XOFFSET hsyncstart-hdisp
-#define XPULSE hsyncend-hsyncstart
-
-#define YPIX vdisp
-#define YBLANK vtotal-vdisp
-#define YOFFSET vsyncstart-vdisp
-#define YPULSE vsyncend-vsyncstart
+ HTimings:
+ hdisp hsyncstart hsyncend htotal
+ VTimings:
+ vdisp vsyncstart vsyncend vtotal
+
+EDID::
+
+ #define XPIX hdisp
+ #define XBLANK htotal-hdisp
+ #define XOFFSET hsyncstart-hdisp
+ #define XPULSE hsyncend-hsyncstart
+
+ #define YPIX vdisp
+ #define YBLANK vtotal-vdisp
+ #define YOFFSET vsyncstart-vdisp
+ #define YPULSE vsyncend-vsyncstart
diff --git a/Documentation/Kconfig b/Documentation/Kconfig
new file mode 100644
index 000000000000..66046fa1c341
--- /dev/null
+++ b/Documentation/Kconfig
@@ -0,0 +1,13 @@
+config WARN_MISSING_DOCUMENTS
+
+ bool "Warn if there's a missing documentation file"
+ depends on COMPILE_TEST
+ help
+ It is not uncommon that a document gets renamed.
+ This option makes the Kernel to check for missing dependencies,
+ warning when something is missing. Works only if the Kernel
+ is built from a git tree.
+
+ If unsure, select 'N'.
+
+
diff --git a/Documentation/Makefile b/Documentation/Makefile
index e889e7cb8511..e145e4db508b 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -4,6 +4,11 @@
subdir-y := devicetree/bindings/
+# Check for broken documentation file references
+ifeq ($(CONFIG_WARN_MISSING_DOCUMENTS),y)
+$(shell $(srctree)/scripts/documentation-file-ref-check --warn)
+endif
+
# You can set these variables from the command line.
SPHINXBUILD = sphinx-build
SPHINXOPTS =
@@ -23,11 +28,13 @@ ifeq ($(HAVE_SPHINX),0)
.DEFAULT:
$(warning The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed and in PATH, or set the SPHINXBUILD make variable to point to the full path of the '$(SPHINXBUILD)' executable.)
@echo
- @./scripts/sphinx-pre-install
+ @$(srctree)/scripts/sphinx-pre-install
@echo " SKIP Sphinx $@ target."
else # HAVE_SPHINX
+export SPHINXOPTS = $(shell perl -e 'open IN,"sphinx-build --version 2>&1 |"; while (<IN>) { if (m/([\d\.]+)/) { print "-jauto" if ($$1 >= "1.7") } ;} close IN')
+
# User-friendly check for pdflatex and latexmk
HAVE_PDFLATEX := $(shell if which $(PDFLATEX) >/dev/null 2>&1; then echo 1; else echo 0; fi)
HAVE_LATEXMK := $(shell if which latexmk >/dev/null 2>&1; then echo 1; else echo 0; fi)
@@ -70,12 +77,14 @@ quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
$(abspath $(BUILDDIR)/$3/$4)
htmldocs:
+ @$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
linkcheckdocs:
@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,linkcheck,$(var),,$(var)))
latexdocs:
+ @$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
ifeq ($(HAVE_PDFLATEX),0)
@@ -87,14 +96,17 @@ pdfdocs:
else # HAVE_PDFLATEX
pdfdocs: latexdocs
+ @$(srctree)/scripts/sphinx-pre-install --version-check
$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX="$(PDFLATEX)" LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex || exit;)
endif # HAVE_PDFLATEX
epubdocs:
+ @$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,epub,$(var),epub,$(var)))
xmldocs:
+ @$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,xml,$(var),xml,$(var)))
endif # HAVE_SPHINX
diff --git a/Documentation/RCU/UP.txt b/Documentation/RCU/UP.rst
index 53bde717017b..e26dda27430c 100644
--- a/Documentation/RCU/UP.txt
+++ b/Documentation/RCU/UP.rst
@@ -1,17 +1,19 @@
-RCU on Uniprocessor Systems
+.. _up_doc:
+RCU on Uniprocessor Systems
+===========================
A common misconception is that, on UP systems, the call_rcu() primitive
may immediately invoke its function. The basis of this misconception
is that since there is only one CPU, it should not be necessary to
wait for anything else to get done, since there are no other CPUs for
-anything else to be happening on. Although this approach will -sort- -of-
+anything else to be happening on. Although this approach will *sort of*
work a surprising amount of the time, it is a very bad idea in general.
This document presents three examples that demonstrate exactly how bad
an idea this is.
-
Example 1: softirq Suicide
+--------------------------
Suppose that an RCU-based algorithm scans a linked list containing
elements A, B, and C in process context, and can delete elements from
@@ -28,8 +30,8 @@ your kernel.
This same problem can occur if call_rcu() is invoked from a hardware
interrupt handler.
-
Example 2: Function-Call Fatality
+---------------------------------
Of course, one could avert the suicide described in the preceding example
by having call_rcu() directly invoke its arguments only if it was called
@@ -46,11 +48,13 @@ its arguments would cause it to fail to make the fundamental guarantee
underlying RCU, namely that call_rcu() defers invoking its arguments until
all RCU read-side critical sections currently executing have completed.
-Quick Quiz #1: why is it -not- legal to invoke synchronize_rcu() in
- this case?
+Quick Quiz #1:
+ Why is it *not* legal to invoke synchronize_rcu() in this case?
+:ref:`Answers to Quick Quiz <answer_quick_quiz_up>`
Example 3: Death by Deadlock
+----------------------------
Suppose that call_rcu() is invoked while holding a lock, and that the
callback function must acquire this same lock. In this case, if
@@ -76,25 +80,30 @@ there are cases where this can be quite ugly:
If call_rcu() directly invokes the callback, painful locking restrictions
or API changes would be required.
-Quick Quiz #2: What locking restriction must RCU callbacks respect?
+Quick Quiz #2:
+ What locking restriction must RCU callbacks respect?
+:ref:`Answers to Quick Quiz <answer_quick_quiz_up>`
Summary
+-------
Permitting call_rcu() to immediately invoke its arguments breaks RCU,
even on a UP system. So do not do it! Even on a UP system, the RCU
-infrastructure -must- respect grace periods, and -must- invoke callbacks
+infrastructure *must* respect grace periods, and *must* invoke callbacks
from a known environment in which no locks are held.
-Note that it -is- safe for synchronize_rcu() to return immediately on
-UP systems, including !PREEMPT SMP builds running on UP systems.
+Note that it *is* safe for synchronize_rcu() to return immediately on
+UP systems, including PREEMPT SMP builds running on UP systems.
-Quick Quiz #3: Why can't synchronize_rcu() return immediately on
- UP systems running preemptable RCU?
+Quick Quiz #3:
+ Why can't synchronize_rcu() return immediately on UP systems running
+ preemptable RCU?
+.. _answer_quick_quiz_up:
Answer to Quick Quiz #1:
- Why is it -not- legal to invoke synchronize_rcu() in this case?
+ Why is it *not* legal to invoke synchronize_rcu() in this case?
Because the calling function is scanning an RCU-protected linked
list, and is therefore within an RCU read-side critical section.
@@ -104,12 +113,13 @@ Answer to Quick Quiz #1:
Answer to Quick Quiz #2:
What locking restriction must RCU callbacks respect?
- Any lock that is acquired within an RCU callback must be
- acquired elsewhere using an _irq variant of the spinlock
- primitive. For example, if "mylock" is acquired by an
- RCU callback, then a process-context acquisition of this
- lock must use something like spin_lock_irqsave() to
- acquire the lock.
+ Any lock that is acquired within an RCU callback must be acquired
+ elsewhere using an _bh variant of the spinlock primitive.
+ For example, if "mylock" is acquired by an RCU callback, then
+ a process-context acquisition of this lock must use something
+ like spin_lock_bh() to acquire the lock. Please note that
+ it is also OK to use _irq variants of spinlocks, for example,
+ spin_lock_irqsave().
If the process-context code were to simply use spin_lock(),
then, since RCU callbacks can be invoked from softirq context,
@@ -119,7 +129,7 @@ Answer to Quick Quiz #2:
This restriction might seem gratuitous, since very few RCU
callbacks acquire locks directly. However, a great many RCU
- callbacks do acquire locks -indirectly-, for example, via
+ callbacks do acquire locks *indirectly*, for example, via
the kfree() primitive.
Answer to Quick Quiz #3:
diff --git a/Documentation/RCU/index.rst b/Documentation/RCU/index.rst
new file mode 100644
index 000000000000..340a9725676c
--- /dev/null
+++ b/Documentation/RCU/index.rst
@@ -0,0 +1,19 @@
+.. _rcu_concepts:
+
+============
+RCU concepts
+============
+
+.. toctree::
+ :maxdepth: 1
+
+ rcu
+ listRCU
+ UP
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/RCU/listRCU.txt b/Documentation/RCU/listRCU.rst
index adb5a3782846..7956ff33042b 100644
--- a/Documentation/RCU/listRCU.txt
+++ b/Documentation/RCU/listRCU.rst
@@ -1,5 +1,7 @@
-Using RCU to Protect Read-Mostly Linked Lists
+.. _list_rcu_doc:
+Using RCU to Protect Read-Mostly Linked Lists
+=============================================
One of the best applications of RCU is to protect read-mostly linked lists
("struct list_head" in list.h). One big advantage of this approach
@@ -7,8 +9,8 @@ is that all of the required memory barriers are included for you in
the list macros. This document describes several applications of RCU,
with the best fits first.
-
Example 1: Read-Side Action Taken Outside of Lock, No In-Place Updates
+----------------------------------------------------------------------
The best applications are cases where, if reader-writer locking were
used, the read-side lock would be dropped before taking any action
@@ -24,7 +26,7 @@ added or deleted, rather than being modified in place.
A straightforward example of this use of RCU may be found in the
system-call auditing support. For example, a reader-writer locked
-implementation of audit_filter_task() might be as follows:
+implementation of audit_filter_task() might be as follows::
static enum audit_state audit_filter_task(struct task_struct *tsk)
{
@@ -48,7 +50,7 @@ the corresponding value is returned. By the time that this value is acted
on, the list may well have been modified. This makes sense, since if
you are turning auditing off, it is OK to audit a few extra system calls.
-This means that RCU can be easily applied to the read side, as follows:
+This means that RCU can be easily applied to the read side, as follows::
static enum audit_state audit_filter_task(struct task_struct *tsk)
{
@@ -73,7 +75,7 @@ become list_for_each_entry_rcu(). The _rcu() list-traversal primitives
insert the read-side memory barriers that are required on DEC Alpha CPUs.
The changes to the update side are also straightforward. A reader-writer
-lock might be used as follows for deletion and insertion:
+lock might be used as follows for deletion and insertion::
static inline int audit_del_rule(struct audit_rule *rule,
struct list_head *list)
@@ -106,7 +108,7 @@ lock might be used as follows for deletion and insertion:
return 0;
}
-Following are the RCU equivalents for these two functions:
+Following are the RCU equivalents for these two functions::
static inline int audit_del_rule(struct audit_rule *rule,
struct list_head *list)
@@ -154,13 +156,13 @@ otherwise cause concurrent readers to fail spectacularly.
So, when readers can tolerate stale data and when entries are either added
or deleted, without in-place modification, it is very easy to use RCU!
-
Example 2: Handling In-Place Updates
+------------------------------------
The system-call auditing code does not update auditing rules in place.
However, if it did, reader-writer-locked code to do so might look as
follows (presumably, the field_count is only permitted to decrease,
-otherwise, the added fields would need to be filled in):
+otherwise, the added fields would need to be filled in)::
static inline int audit_upd_rule(struct audit_rule *rule,
struct list_head *list,
@@ -187,7 +189,7 @@ otherwise, the added fields would need to be filled in):
The RCU version creates a copy, updates the copy, then replaces the old
entry with the newly updated entry. This sequence of actions, allowing
concurrent reads while doing a copy to perform an update, is what gives
-RCU ("read-copy update") its name. The RCU code is as follows:
+RCU ("read-copy update") its name. The RCU code is as follows::
static inline int audit_upd_rule(struct audit_rule *rule,
struct list_head *list,
@@ -216,8 +218,8 @@ RCU ("read-copy update") its name. The RCU code is as follows:
Again, this assumes that the caller holds audit_netlink_sem. Normally,
the reader-writer lock would become a spinlock in this sort of code.
-
Example 3: Eliminating Stale Data
+---------------------------------
The auditing examples above tolerate stale data, as do most algorithms
that are tracking external state. Because there is a delay from the
@@ -231,13 +233,16 @@ per-entry spinlock, and, if the "deleted" flag is set, pretends that the
entry does not exist. For this to be helpful, the search function must
return holding the per-entry spinlock, as ipc_lock() does in fact do.
-Quick Quiz: Why does the search function need to return holding the
- per-entry lock for this deleted-flag technique to be helpful?
+Quick Quiz:
+ Why does the search function need to return holding the per-entry lock for
+ this deleted-flag technique to be helpful?
+
+:ref:`Answer to Quick Quiz <answer_quick_quiz_list>`
If the system-call audit module were to ever need to reject stale data,
one way to accomplish this would be to add a "deleted" flag and a "lock"
spinlock to the audit_entry structure, and modify audit_filter_task()
-as follows:
+as follows::
static enum audit_state audit_filter_task(struct task_struct *tsk)
{
@@ -268,7 +273,7 @@ audit_upd_rule() would need additional memory barriers to ensure
that the list_add_rcu() was really executed before the list_del_rcu().
The audit_del_rule() function would need to set the "deleted"
-flag under the spinlock as follows:
+flag under the spinlock as follows::
static inline int audit_del_rule(struct audit_rule *rule,
struct list_head *list)
@@ -290,8 +295,8 @@ flag under the spinlock as follows:
return -EFAULT; /* No matching rule */
}
-
Summary
+-------
Read-mostly list-based data structures that can tolerate stale data are
the most amenable to use of RCU. The simplest case is where entries are
@@ -302,8 +307,9 @@ If stale data cannot be tolerated, then a "deleted" flag may be used
in conjunction with a per-entry spinlock in order to allow the search
function to reject newly deleted data.
+.. _answer_quick_quiz_list:
-Answer to Quick Quiz
+Answer to Quick Quiz:
Why does the search function need to return holding the per-entry
lock for this deleted-flag technique to be helpful?
diff --git a/Documentation/RCU/rcu.rst b/Documentation/RCU/rcu.rst
new file mode 100644
index 000000000000..8dfb437dacc3
--- /dev/null
+++ b/Documentation/RCU/rcu.rst
@@ -0,0 +1,92 @@
+.. _rcu_doc:
+
+RCU Concepts
+============
+
+The basic idea behind RCU (read-copy update) is to split destructive
+operations into two parts, one that prevents anyone from seeing the data
+item being destroyed, and one that actually carries out the destruction.
+A "grace period" must elapse between the two parts, and this grace period
+must be long enough that any readers accessing the item being deleted have
+since dropped their references. For example, an RCU-protected deletion
+from a linked list would first remove the item from the list, wait for
+a grace period to elapse, then free the element. See the
+Documentation/RCU/listRCU.rst file for more information on using RCU with
+linked lists.
+
+Frequently Asked Questions
+--------------------------
+
+- Why would anyone want to use RCU?
+
+ The advantage of RCU's two-part approach is that RCU readers need
+ not acquire any locks, perform any atomic instructions, write to
+ shared memory, or (on CPUs other than Alpha) execute any memory
+ barriers. The fact that these operations are quite expensive
+ on modern CPUs is what gives RCU its performance advantages
+ in read-mostly situations. The fact that RCU readers need not
+ acquire locks can also greatly simplify deadlock-avoidance code.
+
+- How can the updater tell when a grace period has completed
+ if the RCU readers give no indication when they are done?
+
+ Just as with spinlocks, RCU readers are not permitted to
+ block, switch to user-mode execution, or enter the idle loop.
+ Therefore, as soon as a CPU is seen passing through any of these
+ three states, we know that that CPU has exited any previous RCU
+ read-side critical sections. So, if we remove an item from a
+ linked list, and then wait until all CPUs have switched context,
+ executed in user mode, or executed in the idle loop, we can
+ safely free up that item.
+
+ Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the
+ same effect, but require that the readers manipulate CPU-local
+ counters. These counters allow limited types of blocking within
+ RCU read-side critical sections. SRCU also uses CPU-local
+ counters, and permits general blocking within RCU read-side
+ critical sections. These variants of RCU detect grace periods
+ by sampling these counters.
+
+- If I am running on a uniprocessor kernel, which can only do one
+ thing at a time, why should I wait for a grace period?
+
+ See the Documentation/RCU/UP.rst file for more information.
+
+- How can I see where RCU is currently used in the Linux kernel?
+
+ Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
+ "rcu_read_lock_bh", "rcu_read_unlock_bh", "srcu_read_lock",
+ "srcu_read_unlock", "synchronize_rcu", "synchronize_net",
+ "synchronize_srcu", and the other RCU primitives. Or grab one
+ of the cscope databases from:
+
+ (http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html).
+
+- What guidelines should I follow when writing code that uses RCU?
+
+ See the checklist.txt file in this directory.
+
+- Why the name "RCU"?
+
+ "RCU" stands for "read-copy update". The file Documentation/RCU/listRCU.rst
+ has more information on where this name came from, search for
+ "read-copy update" to find it.
+
+- I hear that RCU is patented? What is with that?
+
+ Yes, it is. There are several known patents related to RCU,
+ search for the string "Patent" in RTFP.txt to find them.
+ Of these, one was allowed to lapse by the assignee, and the
+ others have been contributed to the Linux kernel under GPL.
+ There are now also LGPL implementations of user-level RCU
+ available (http://liburcu.org/).
+
+- I hear that RCU needs work in order to support realtime kernels?
+
+ Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
+ kernel configuration parameter.
+
+- Where can I find more information on RCU?
+
+ See the RTFP.txt file in this directory.
+ Or point your browser at (http://www.rdrop.com/users/paulmck/RCU/).
diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt
deleted file mode 100644
index c818cf65c5a9..000000000000
--- a/Documentation/RCU/rcu.txt
+++ /dev/null
@@ -1,89 +0,0 @@
-RCU Concepts
-
-
-The basic idea behind RCU (read-copy update) is to split destructive
-operations into two parts, one that prevents anyone from seeing the data
-item being destroyed, and one that actually carries out the destruction.
-A "grace period" must elapse between the two parts, and this grace period
-must be long enough that any readers accessing the item being deleted have
-since dropped their references. For example, an RCU-protected deletion
-from a linked list would first remove the item from the list, wait for
-a grace period to elapse, then free the element. See the listRCU.txt
-file for more information on using RCU with linked lists.
-
-
-Frequently Asked Questions
-
-o Why would anyone want to use RCU?
-
- The advantage of RCU's two-part approach is that RCU readers need
- not acquire any locks, perform any atomic instructions, write to
- shared memory, or (on CPUs other than Alpha) execute any memory
- barriers. The fact that these operations are quite expensive
- on modern CPUs is what gives RCU its performance advantages
- in read-mostly situations. The fact that RCU readers need not
- acquire locks can also greatly simplify deadlock-avoidance code.
-
-o How can the updater tell when a grace period has completed
- if the RCU readers give no indication when they are done?
-
- Just as with spinlocks, RCU readers are not permitted to
- block, switch to user-mode execution, or enter the idle loop.
- Therefore, as soon as a CPU is seen passing through any of these
- three states, we know that that CPU has exited any previous RCU
- read-side critical sections. So, if we remove an item from a
- linked list, and then wait until all CPUs have switched context,
- executed in user mode, or executed in the idle loop, we can
- safely free up that item.
-
- Preemptible variants of RCU (CONFIG_PREEMPT_RCU) get the
- same effect, but require that the readers manipulate CPU-local
- counters. These counters allow limited types of blocking within
- RCU read-side critical sections. SRCU also uses CPU-local
- counters, and permits general blocking within RCU read-side
- critical sections. These variants of RCU detect grace periods
- by sampling these counters.
-
-o If I am running on a uniprocessor kernel, which can only do one
- thing at a time, why should I wait for a grace period?
-
- See the UP.txt file in this directory.
-
-o How can I see where RCU is currently used in the Linux kernel?
-
- Search for "rcu_read_lock", "rcu_read_unlock", "call_rcu",
- "rcu_read_lock_bh", "rcu_read_unlock_bh", "srcu_read_lock",
- "srcu_read_unlock", "synchronize_rcu", "synchronize_net",
- "synchronize_srcu", and the other RCU primitives. Or grab one
- of the cscope databases from:
-
- http://www.rdrop.com/users/paulmck/RCU/linuxusage/rculocktab.html
-
-o What guidelines should I follow when writing code that uses RCU?
-
- See the checklist.txt file in this directory.
-
-o Why the name "RCU"?
-
- "RCU" stands for "read-copy update". The file listRCU.txt has
- more information on where this name came from, search for
- "read-copy update" to find it.
-
-o I hear that RCU is patented? What is with that?
-
- Yes, it is. There are several known patents related to RCU,
- search for the string "Patent" in RTFP.txt to find them.
- Of these, one was allowed to lapse by the assignee, and the
- others have been contributed to the Linux kernel under GPL.
- There are now also LGPL implementations of user-level RCU
- available (http://liburcu.org/).
-
-o I hear that RCU needs work in order to support realtime kernels?
-
- Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
- kernel configuration parameter.
-
-o Where can I find more information on RCU?
-
- See the RTFP.txt file in this directory.
- Or point your browser at http://www.rdrop.com/users/paulmck/RCU/.
diff --git a/Documentation/accelerators/ocxl.rst b/Documentation/accelerators/ocxl.rst
index 14cefc020e2d..b1cea19a90f5 100644
--- a/Documentation/accelerators/ocxl.rst
+++ b/Documentation/accelerators/ocxl.rst
@@ -1,3 +1,5 @@
+:orphan:
+
========================================================
OpenCAPI (Open Coherent Accelerator Processor Interface)
========================================================
diff --git a/Documentation/acpi/dsd/leds.txt b/Documentation/acpi/dsd/leds.txt
index 81a63af42ed2..cc58b1a574c5 100644
--- a/Documentation/acpi/dsd/leds.txt
+++ b/Documentation/acpi/dsd/leds.txt
@@ -96,4 +96,4 @@ where
<URL:http://www.uefi.org/sites/default/files/resources/_DSD-hierarchical-data-extension-UUID-v1.1.pdf>,
referenced 2019-02-21.
-[7] Documentation/acpi/dsd/data-node-reference.txt
+[7] Documentation/firmware-guide/acpi/dsd/data-node-references.rst
diff --git a/Documentation/admin-guide/README.rst b/Documentation/admin-guide/README.rst
index a582c780c3bd..cc6151fc0845 100644
--- a/Documentation/admin-guide/README.rst
+++ b/Documentation/admin-guide/README.rst
@@ -227,7 +227,7 @@ Configuring the kernel
"make tinyconfig" Configure the tiniest possible kernel.
You can find more information on using the Linux kernel config tools
- in Documentation/kbuild/kconfig.txt.
+ in Documentation/kbuild/kconfig.rst.
- NOTES on ``make config``:
diff --git a/Documentation/filesystems/binderfs.rst b/Documentation/admin-guide/binderfs.rst
index c009671f8434..c009671f8434 100644
--- a/Documentation/filesystems/binderfs.rst
+++ b/Documentation/admin-guide/binderfs.rst
diff --git a/Documentation/admin-guide/bug-hunting.rst b/Documentation/admin-guide/bug-hunting.rst
index f278b289e260..b761aa2a51d2 100644
--- a/Documentation/admin-guide/bug-hunting.rst
+++ b/Documentation/admin-guide/bug-hunting.rst
@@ -90,7 +90,7 @@ the disk is not available then you have three options:
run a null modem to a second machine and capture the output there
using your favourite communication program. Minicom works well.
-(3) Use Kdump (see Documentation/kdump/kdump.txt),
+(3) Use Kdump (see Documentation/kdump/kdump.rst),
extract the kernel ring buffer from old memory with using dmesg
gdbmacro in Documentation/kdump/gdbmacros.txt.
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index ffc064c1ec68..49311f3da6f2 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -9,5 +9,6 @@ are configurable at compile, boot or run time.
.. toctree::
:maxdepth: 1
+ spectre
l1tf
mds
diff --git a/Documentation/admin-guide/hw-vuln/spectre.rst b/Documentation/admin-guide/hw-vuln/spectre.rst
new file mode 100644
index 000000000000..25f3b2532198
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/spectre.rst
@@ -0,0 +1,697 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Spectre Side Channels
+=====================
+
+Spectre is a class of side channel attacks that exploit branch prediction
+and speculative execution on modern CPUs to read memory, possibly
+bypassing access controls. Speculative execution side channel exploits
+do not modify memory but attempt to infer privileged data in the memory.
+
+This document covers Spectre variant 1 and Spectre variant 2.
+
+Affected processors
+-------------------
+
+Speculative execution side channel methods affect a wide range of modern
+high performance processors, since most modern high speed processors
+use branch prediction and speculative execution.
+
+The following CPUs are vulnerable:
+
+ - Intel Core, Atom, Pentium, and Xeon processors
+
+ - AMD Phenom, EPYC, and Zen processors
+
+ - IBM POWER and zSeries processors
+
+ - Higher end ARM processors
+
+ - Apple CPUs
+
+ - Higher end MIPS CPUs
+
+ - Likely most other high performance CPUs. Contact your CPU vendor for details.
+
+Whether a processor is affected or not can be read out from the Spectre
+vulnerability files in sysfs. See :ref:`spectre_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entries describe Spectre variants:
+
+ ============= ======================= =================
+ CVE-2017-5753 Bounds check bypass Spectre variant 1
+ CVE-2017-5715 Branch target injection Spectre variant 2
+ ============= ======================= =================
+
+Problem
+-------
+
+CPUs use speculative operations to improve performance. That may leave
+traces of memory accesses or computations in the processor's caches,
+buffers, and branch predictors. Malicious software may be able to
+influence the speculative execution paths, and then use the side effects
+of the speculative execution in the CPUs' caches and buffers to infer
+privileged data touched during the speculative execution.
+
+Spectre variant 1 attacks take advantage of speculative execution of
+conditional branches, while Spectre variant 2 attacks use speculative
+execution of indirect branches to leak privileged memory.
+See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[7] <spec_ref7>`
+:ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
+
+Spectre variant 1 (Bounds Check Bypass)
+---------------------------------------
+
+The bounds check bypass attack :ref:`[2] <spec_ref2>` takes advantage
+of speculative execution that bypasses conditional branch instructions
+used for memory access bounds check (e.g. checking if the index of an
+array results in memory access within a valid range). This results in
+memory accesses to invalid memory (with out-of-bound index) that are
+done speculatively before validation checks resolve. Such speculative
+memory accesses can leave side effects, creating side channels which
+leak information to the attacker.
+
+There are some extensions of Spectre variant 1 attacks for reading data
+over the network, see :ref:`[12] <spec_ref12>`. However such attacks
+are difficult, low bandwidth, fragile, and are considered low risk.
+
+Spectre variant 2 (Branch Target Injection)
+-------------------------------------------
+
+The branch target injection attack takes advantage of speculative
+execution of indirect branches :ref:`[3] <spec_ref3>`. The indirect
+branch predictors inside the processor used to guess the target of
+indirect branches can be influenced by an attacker, causing gadget code
+to be speculatively executed, thus exposing sensitive data touched by
+the victim. The side effects left in the CPU's caches during speculative
+execution can be measured to infer data values.
+
+.. _poison_btb:
+
+In Spectre variant 2 attacks, the attacker can steer speculative indirect
+branches in the victim to gadget code by poisoning the branch target
+buffer of a CPU used for predicting indirect branch addresses. Such
+poisoning could be done by indirect branching into existing code,
+with the address offset of the indirect branch under the attacker's
+control. Since the branch prediction on impacted hardware does not
+fully disambiguate branch address and uses the offset for prediction,
+this could cause privileged code's indirect branch to jump to a gadget
+code with the same offset.
+
+The most useful gadgets take an attacker-controlled input parameter (such
+as a register value) so that the memory read can be controlled. Gadgets
+without input parameters might be possible, but the attacker would have
+very little control over what memory can be read, reducing the risk of
+the attack revealing useful data.
+
+One other variant 2 attack vector is for the attacker to poison the
+return stack buffer (RSB) :ref:`[13] <spec_ref13>` to cause speculative
+subroutine return instruction execution to go to a gadget. An attacker's
+imbalanced subroutine call instructions might "poison" entries in the
+return stack buffer which are later consumed by a victim's subroutine
+return instructions. This attack can be mitigated by flushing the return
+stack buffer on context switch, or virtual machine (VM) exit.
+
+On systems with simultaneous multi-threading (SMT), attacks are possible
+from the sibling thread, as level 1 cache and branch target buffer
+(BTB) may be shared between hardware threads in a CPU core. A malicious
+program running on the sibling thread may influence its peer's BTB to
+steer its indirect branch speculations to gadget code, and measure the
+speculative execution's side effects left in level 1 cache to infer the
+victim's data.
+
+Attack scenarios
+----------------
+
+The following list of attack scenarios have been anticipated, but may
+not cover all possible attack vectors.
+
+1. A user process attacking the kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The attacker passes a parameter to the kernel via a register or
+ via a known address in memory during a syscall. Such parameter may
+ be used later by the kernel as an index to an array or to derive
+ a pointer for a Spectre variant 1 attack. The index or pointer
+ is invalid, but bound checks are bypassed in the code branch taken
+ for speculative execution. This could cause privileged memory to be
+ accessed and leaked.
+
+ For kernel code that has been identified where data pointers could
+ potentially be influenced for Spectre attacks, new "nospec" accessor
+ macros are used to prevent speculative loading of data.
+
+ Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
+ target buffer (BTB) before issuing syscall to launch an attack.
+ After entering the kernel, the kernel could use the poisoned branch
+ target buffer on indirect jump and jump to gadget code in speculative
+ execution.
+
+ If an attacker tries to control the memory addresses leaked during
+ speculative execution, he would also need to pass a parameter to the
+ gadget, either through a register or a known address in memory. After
+ the gadget has executed, he can measure the side effect.
+
+ The kernel can protect itself against consuming poisoned branch
+ target buffer entries by using return trampolines (also known as
+ "retpoline") :ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` for all
+ indirect branches. Return trampolines trap speculative execution paths
+ to prevent jumping to gadget code during speculative execution.
+ x86 CPUs with Enhanced Indirect Branch Restricted Speculation
+ (Enhanced IBRS) available in hardware should use the feature to
+ mitigate Spectre variant 2 instead of retpoline. Enhanced IBRS is
+ more efficient than retpoline.
+
+ There may be gadget code in firmware which could be exploited with
+ Spectre variant 2 attack by a rogue user process. To mitigate such
+ attacks on x86, Indirect Branch Restricted Speculation (IBRS) feature
+ is turned on before the kernel invokes any firmware code.
+
+2. A user process attacking another user process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ A malicious user process can try to attack another user process,
+ either via a context switch on the same hardware thread, or from the
+ sibling hyperthread sharing a physical processor core on simultaneous
+ multi-threading (SMT) system.
+
+ Spectre variant 1 attacks generally require passing parameters
+ between the processes, which needs a data passing relationship, such
+ as remote procedure calls (RPC). Those parameters are used in gadget
+ code to derive invalid data pointers accessing privileged memory in
+ the attacked process.
+
+ Spectre variant 2 attacks can be launched from a rogue process by
+ :ref:`poisoning <poison_btb>` the branch target buffer. This can
+ influence the indirect branch targets for a victim process that either
+ runs later on the same hardware thread, or running concurrently on
+ a sibling hardware thread sharing the same physical core.
+
+ A user process can protect itself against Spectre variant 2 attacks
+ by using the prctl() syscall to disable indirect branch speculation
+ for itself. An administrator can also cordon off an unsafe process
+ from polluting the branch target buffer by disabling the process's
+ indirect branch speculation. This comes with a performance cost
+ from not using indirect branch speculation and clearing the branch
+ target buffer. When SMT is enabled on x86, for a process that has
+ indirect branch speculation disabled, Single Threaded Indirect Branch
+ Predictors (STIBP) :ref:`[4] <spec_ref4>` are turned on to prevent the
+ sibling thread from controlling branch target buffer. In addition,
+ the Indirect Branch Prediction Barrier (IBPB) is issued to clear the
+ branch target buffer when context switching to and from such process.
+
+ On x86, the return stack buffer is stuffed on context switch.
+ This prevents the branch target buffer from being used for branch
+ prediction when the return stack buffer underflows while switching to
+ a deeper call stack. Any poisoned entries in the return stack buffer
+ left by the previous process will also be cleared.
+
+ User programs should use address space randomization to make attacks
+ more difficult (Set /proc/sys/kernel/randomize_va_space = 1 or 2).
+
+3. A virtualized guest attacking the host
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The attack mechanism is similar to how user processes attack the
+ kernel. The kernel is entered via hyper-calls or other virtualization
+ exit paths.
+
+ For Spectre variant 1 attacks, rogue guests can pass parameters
+ (e.g. in registers) via hyper-calls to derive invalid pointers to
+ speculate into privileged memory after entering the kernel. For places
+ where such kernel code has been identified, nospec accessor macros
+ are used to stop speculative memory access.
+
+ For Spectre variant 2 attacks, rogue guests can :ref:`poison
+ <poison_btb>` the branch target buffer or return stack buffer, causing
+ the kernel to jump to gadget code in the speculative execution paths.
+
+ To mitigate variant 2, the host kernel can use return trampolines
+ for indirect branches to bypass the poisoned branch target buffer,
+ and flushing the return stack buffer on VM exit. This prevents rogue
+ guests from affecting indirect branching in the host kernel.
+
+ To protect host processes from rogue guests, host processes can have
+ indirect branch speculation disabled via prctl(). The branch target
+ buffer is cleared before context switching to such processes.
+
+4. A virtualized guest attacking other guest
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ A rogue guest may attack another guest to get data accessible by the
+ other guest.
+
+ Spectre variant 1 attacks are possible if parameters can be passed
+ between guests. This may be done via mechanisms such as shared memory
+ or message passing. Such parameters could be used to derive data
+ pointers to privileged data in guest. The privileged data could be
+ accessed by gadget code in the victim's speculation paths.
+
+ Spectre variant 2 attacks can be launched from a rogue guest by
+ :ref:`poisoning <poison_btb>` the branch target buffer or the return
+ stack buffer. Such poisoned entries could be used to influence
+ speculation execution paths in the victim guest.
+
+ Linux kernel mitigates attacks to other guests running in the same
+ CPU hardware thread by flushing the return stack buffer on VM exit,
+ and clearing the branch target buffer before switching to a new guest.
+
+ If SMT is used, Spectre variant 2 attacks from an untrusted guest
+ in the sibling hyperthread can be mitigated by the administrator,
+ by turning off the unsafe guest's indirect branch speculation via
+ prctl(). A guest can also protect itself by turning on microcode
+ based mitigations (such as IBPB or STIBP on x86) within the guest.
+
+.. _spectre_sys_info:
+
+Spectre system information
+--------------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current
+mitigation status of the system for Spectre: whether the system is
+vulnerable, and which mitigations are active.
+
+The sysfs file showing Spectre variant 1 mitigation status is:
+
+ /sys/devices/system/cpu/vulnerabilities/spectre_v1
+
+The possible values in this file are:
+
+ ======================================= =================================
+ 'Mitigation: __user pointer sanitation' Protection in kernel on a case by
+ case base with explicit pointer
+ sanitation.
+ ======================================= =================================
+
+However, the protections are put in place on a case by case basis,
+and there is no guarantee that all possible attack vectors for Spectre
+variant 1 are covered.
+
+The spectre_v2 kernel file reports if the kernel has been compiled with
+retpoline mitigation or if the CPU has hardware mitigation, and if the
+CPU has support for additional process-specific mitigation.
+
+This file also reports CPU features enabled by microcode to mitigate
+attack between user processes:
+
+1. Indirect Branch Prediction Barrier (IBPB) to add additional
+ isolation between processes of different users.
+2. Single Thread Indirect Branch Predictors (STIBP) to add additional
+ isolation between CPU threads running on the same core.
+
+These CPU features may impact performance when used and can be enabled
+per process on a case-by-case base.
+
+The sysfs file showing Spectre variant 2 mitigation status is:
+
+ /sys/devices/system/cpu/vulnerabilities/spectre_v2
+
+The possible values in this file are:
+
+ - Kernel status:
+
+ ==================================== =================================
+ 'Not affected' The processor is not vulnerable
+ 'Vulnerable' Vulnerable, no mitigation
+ 'Mitigation: Full generic retpoline' Software-focused mitigation
+ 'Mitigation: Full AMD retpoline' AMD-specific software mitigation
+ 'Mitigation: Enhanced IBRS' Hardware-focused mitigation
+ ==================================== =================================
+
+ - Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is
+ used to protect against Spectre variant 2 attacks when calling firmware (x86 only).
+
+ ========== =============================================================
+ 'IBRS_FW' Protection against user program attacks when calling firmware
+ ========== =============================================================
+
+ - Indirect branch prediction barrier (IBPB) status for protection between
+ processes of different users. This feature can be controlled through
+ prctl() per process, or through kernel command line options. This is
+ an x86 only feature. For more details see below.
+
+ =================== ========================================================
+ 'IBPB: disabled' IBPB unused
+ 'IBPB: always-on' Use IBPB on all tasks
+ 'IBPB: conditional' Use IBPB on SECCOMP or indirect branch restricted tasks
+ =================== ========================================================
+
+ - Single threaded indirect branch prediction (STIBP) status for protection
+ between different hyper threads. This feature can be controlled through
+ prctl per process, or through kernel command line options. This is x86
+ only feature. For more details see below.
+
+ ==================== ========================================================
+ 'STIBP: disabled' STIBP unused
+ 'STIBP: forced' Use STIBP on all tasks
+ 'STIBP: conditional' Use STIBP on SECCOMP or indirect branch restricted tasks
+ ==================== ========================================================
+
+ - Return stack buffer (RSB) protection status:
+
+ ============= ===========================================
+ 'RSB filling' Protection of RSB on context switch enabled
+ ============= ===========================================
+
+Full mitigation might require a microcode update from the CPU
+vendor. When the necessary microcode is not available, the kernel will
+report vulnerability.
+
+Turning on mitigation for Spectre variant 1 and Spectre variant 2
+-----------------------------------------------------------------
+
+1. Kernel mitigation
+^^^^^^^^^^^^^^^^^^^^
+
+ For the Spectre variant 1, vulnerable kernel code (as determined
+ by code audit or scanning tools) is annotated on a case by case
+ basis to use nospec accessor macros for bounds clipping :ref:`[2]
+ <spec_ref2>` to avoid any usable disclosure gadgets. However, it may
+ not cover all attack vectors for Spectre variant 1.
+
+ For Spectre variant 2 mitigation, the compiler turns indirect calls or
+ jumps in the kernel into equivalent return trampolines (retpolines)
+ :ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
+ addresses. Speculative execution paths under retpolines are trapped
+ in an infinite loop to prevent any speculative execution jumping to
+ a gadget.
+
+ To turn on retpoline mitigation on a vulnerable CPU, the kernel
+ needs to be compiled with a gcc compiler that supports the
+ -mindirect-branch=thunk-extern -mindirect-branch-register options.
+ If the kernel is compiled with a Clang compiler, the compiler needs
+ to support -mretpoline-external-thunk option. The kernel config
+ CONFIG_RETPOLINE needs to be turned on, and the CPU needs to run with
+ the latest updated microcode.
+
+ On Intel Skylake-era systems the mitigation covers most, but not all,
+ cases. See :ref:`[3] <spec_ref3>` for more details.
+
+ On CPUs with hardware mitigation for Spectre variant 2 (e.g. Enhanced
+ IBRS on x86), retpoline is automatically disabled at run time.
+
+ The retpoline mitigation is turned on by default on vulnerable
+ CPUs. It can be forced on or off by the administrator
+ via the kernel command line and sysfs control files. See
+ :ref:`spectre_mitigation_control_command_line`.
+
+ On x86, indirect branch restricted speculation is turned on by default
+ before invoking any firmware code to prevent Spectre variant 2 exploits
+ using the firmware.
+
+ Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y
+ and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes
+ attacks on the kernel generally more difficult.
+
+2. User program mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ User programs can mitigate Spectre variant 1 using LFENCE or "bounds
+ clipping". For more details see :ref:`[2] <spec_ref2>`.
+
+ For Spectre variant 2 mitigation, individual user programs
+ can be compiled with return trampolines for indirect branches.
+ This protects them from consuming poisoned entries in the branch
+ target buffer left by malicious software. Alternatively, the
+ programs can disable their indirect branch speculation via prctl()
+ (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
+ On x86, this will turn on STIBP to guard against attacks from the
+ sibling thread when the user program is running, and use IBPB to
+ flush the branch target buffer when switching to/from the program.
+
+ Restricting indirect branch speculation on a user program will
+ also prevent the program from launching a variant 2 attack
+ on x86. All sand-boxed SECCOMP programs have indirect branch
+ speculation restricted by default. Administrators can change
+ that behavior via the kernel command line and sysfs control files.
+ See :ref:`spectre_mitigation_control_command_line`.
+
+ Programs that disable their indirect branch speculation will have
+ more overhead and run slower.
+
+ User programs should use address space randomization
+ (/proc/sys/kernel/randomize_va_space = 1 or 2) to make attacks more
+ difficult.
+
+3. VM mitigation
+^^^^^^^^^^^^^^^^
+
+ Within the kernel, Spectre variant 1 attacks from rogue guests are
+ mitigated on a case by case basis in VM exit paths. Vulnerable code
+ uses nospec accessor macros for "bounds clipping", to avoid any
+ usable disclosure gadgets. However, this may not cover all variant
+ 1 attack vectors.
+
+ For Spectre variant 2 attacks from rogue guests to the kernel, the
+ Linux kernel uses retpoline or Enhanced IBRS to prevent consumption of
+ poisoned entries in branch target buffer left by rogue guests. It also
+ flushes the return stack buffer on every VM exit to prevent a return
+ stack buffer underflow so poisoned branch target buffer could be used,
+ or attacker guests leaving poisoned entries in the return stack buffer.
+
+ To mitigate guest-to-guest attacks in the same CPU hardware thread,
+ the branch target buffer is sanitized by flushing before switching
+ to a new guest on a CPU.
+
+ The above mitigations are turned on by default on vulnerable CPUs.
+
+ To mitigate guest-to-guest attacks from sibling thread when SMT is
+ in use, an untrusted guest running in the sibling thread can have
+ its indirect branch speculation disabled by administrator via prctl().
+
+ The kernel also allows guests to use any microcode based mitigation
+ they choose to use (such as IBPB or STIBP on x86) to protect themselves.
+
+.. _spectre_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+Spectre variant 2 mitigation can be disabled or force enabled at the
+kernel command line.
+
+ nospectre_v2
+
+ [X86] Disable all mitigations for the Spectre variant 2
+ (indirect branch prediction) vulnerability. System may
+ allow data leaks with this option, which is equivalent
+ to spectre_v2=off.
+
+
+ spectre_v2=
+
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability.
+ The default operation protects the kernel from
+ user space attacks.
+
+ on
+ unconditionally enable, implies
+ spectre_v2_user=on
+ off
+ unconditionally disable, implies
+ spectre_v2_user=off
+ auto
+ kernel detects whether your CPU model is
+ vulnerable
+
+ Selecting 'on' will, and 'auto' may, choose a
+ mitigation method at run time according to the
+ CPU, the available microcode, the setting of the
+ CONFIG_RETPOLINE configuration option, and the
+ compiler with which the kernel was built.
+
+ Selecting 'on' will also enable the mitigation
+ against user space to user space task attacks.
+
+ Selecting 'off' will disable both the kernel and
+ the user space protections.
+
+ Specific mitigations can also be selected manually:
+
+ retpoline
+ replace indirect branches
+ retpoline,generic
+ google's original retpoline
+ retpoline,amd
+ AMD-specific minimal thunk
+
+ Not specifying this option is equivalent to
+ spectre_v2=auto.
+
+For user space mitigation:
+
+ spectre_v2_user=
+
+ [X86] Control mitigation of Spectre variant 2
+ (indirect branch speculation) vulnerability between
+ user space tasks
+
+ on
+ Unconditionally enable mitigations. Is
+ enforced by spectre_v2=on
+
+ off
+ Unconditionally disable mitigations. Is
+ enforced by spectre_v2=off
+
+ prctl
+ Indirect branch speculation is enabled,
+ but mitigation can be enabled via prctl
+ per thread. The mitigation control state
+ is inherited on fork.
+
+ prctl,ibpb
+ Like "prctl" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different user
+ space processes.
+
+ seccomp
+ Same as "prctl" above, but all seccomp
+ threads will enable the mitigation unless
+ they explicitly opt out.
+
+ seccomp,ibpb
+ Like "seccomp" above, but only STIBP is
+ controlled per thread. IBPB is issued
+ always when switching between different
+ user space processes.
+
+ auto
+ Kernel selects the mitigation depending on
+ the available CPU features and vulnerability.
+
+ Default mitigation:
+ If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
+
+ Not specifying this option is equivalent to
+ spectre_v2_user=auto.
+
+ In general the kernel by default selects
+ reasonable mitigations for the current CPU. To
+ disable Spectre variant 2 mitigations, boot with
+ spectre_v2=off. Spectre variant 1 mitigations
+ cannot be disabled.
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace
+^^^^^^^^^^^^^^^^^^^^
+
+ If all userspace applications are from trusted sources and do not
+ execute externally supplied untrusted code, then the mitigations can
+ be disabled.
+
+2. Protect sensitive programs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ For security-sensitive programs that have secrets (e.g. crypto
+ keys), protection against Spectre variant 2 can be put in place by
+ disabling indirect branch speculation when the program is running
+ (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
+
+3. Sandbox untrusted programs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ Untrusted programs that could be a source of attacks can be cordoned
+ off by disabling their indirect branch speculation when they are run
+ (See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
+ This prevents untrusted programs from polluting the branch target
+ buffer. All programs running in SECCOMP sandboxes have indirect
+ branch speculation restricted by default. This behavior can be
+ changed via the kernel command line and sysfs control files. See
+ :ref:`spectre_mitigation_control_command_line`.
+
+3. High security mode
+^^^^^^^^^^^^^^^^^^^^^
+
+ All Spectre variant 2 mitigations can be forced on
+ at boot time for all programs (See the "on" option in
+ :ref:`spectre_mitigation_control_command_line`). This will add
+ overhead as indirect branch speculations for all programs will be
+ restricted.
+
+ On x86, branch target buffer will be flushed with IBPB when switching
+ to a new program. STIBP is left on all the time to protect programs
+ against variant 2 attacks originating from programs running on
+ sibling threads.
+
+ Alternatively, STIBP can be used only when running programs
+ whose indirect branch speculation is explicitly disabled,
+ while IBPB is still used all the time when switching to a new
+ program to clear the branch target buffer (See "ibpb" option in
+ :ref:`spectre_mitigation_control_command_line`). This "ibpb" option
+ has less performance cost than the "on" option, which leaves STIBP
+ on all the time.
+
+References on Spectre
+---------------------
+
+Intel white papers:
+
+.. _spec_ref1:
+
+[1] `Intel analysis of speculative execution side channels <https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf>`_.
+
+.. _spec_ref2:
+
+[2] `Bounds check bypass <https://software.intel.com/security-software-guidance/software-guidance/bounds-check-bypass>`_.
+
+.. _spec_ref3:
+
+[3] `Deep dive: Retpoline: A branch target injection mitigation <https://software.intel.com/security-software-guidance/insights/deep-dive-retpoline-branch-target-injection-mitigation>`_.
+
+.. _spec_ref4:
+
+[4] `Deep Dive: Single Thread Indirect Branch Predictors <https://software.intel.com/security-software-guidance/insights/deep-dive-single-thread-indirect-branch-predictors>`_.
+
+AMD white papers:
+
+.. _spec_ref5:
+
+[5] `AMD64 technology indirect branch control extension <https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf>`_.
+
+.. _spec_ref6:
+
+[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesforManagingSpeculation_WP_7-18Update_FNL.pdf>`_.
+
+ARM white papers:
+
+.. _spec_ref7:
+
+[7] `Cache speculation side-channels <https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/download-the-whitepaper>`_.
+
+.. _spec_ref8:
+
+[8] `Cache speculation issues update <https://developer.arm.com/support/arm-security-updates/speculative-processor-vulnerability/latest-updates/cache-speculation-issues-update>`_.
+
+Google white paper:
+
+.. _spec_ref9:
+
+[9] `Retpoline: a software construct for preventing branch-target-injection <https://support.google.com/faqs/answer/7625886>`_.
+
+MIPS white paper:
+
+.. _spec_ref10:
+
+[10] `MIPS: response on speculative execution and side channel vulnerabilities <https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-channel-vulnerabilities/>`_.
+
+Academic papers:
+
+.. _spec_ref11:
+
+[11] `Spectre Attacks: Exploiting Speculative Execution <https://spectreattack.com/spectre.pdf>`_.
+
+.. _spec_ref12:
+
+[12] `NetSpectre: Read Arbitrary Memory over Network <https://arxiv.org/abs/1807.10535>`_.
+
+.. _spec_ref13:
+
+[13] `Spectre Returns! Speculation Attacks using the Return Stack Buffer <https://www.usenix.org/system/files/conference/woot18/woot18-paper-koruyeh.pdf>`_.
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 8001917ee012..24fbe0568eff 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -70,6 +70,7 @@ configure specific aspects of kernel behavior to your liking.
ras
bcache
ext4
+ binderfs
pm/index
thunderbolt
LSM/index
diff --git a/Documentation/admin-guide/kernel-parameters.rst b/Documentation/admin-guide/kernel-parameters.rst
index 0124980dca2d..5d29ba5ad88c 100644
--- a/Documentation/admin-guide/kernel-parameters.rst
+++ b/Documentation/admin-guide/kernel-parameters.rst
@@ -9,11 +9,11 @@ and sorted into English Dictionary order (defined as ignoring all
punctuation and sorting digits before letters in a case insensitive
manner), and with descriptions where known.
-The kernel parses parameters from the kernel command line up to "--";
+The kernel parses parameters from the kernel command line up to "``--``";
if it doesn't recognize a parameter and it doesn't contain a '.', the
parameter gets passed to init: parameters with '=' go into init's
environment, others are passed as command line arguments to init.
-Everything after "--" is passed as an argument to init.
+Everything after "``--``" is passed as an argument to init.
Module parameters can be specified in two ways: via the kernel command
line with a module name prefix, or via modprobe, e.g.::
@@ -167,7 +167,7 @@ parameter is applicable::
X86-32 X86-32, aka i386 architecture is enabled.
X86-64 X86-64 architecture is enabled.
More X86-64 boot options can be found in
- Documentation/x86/x86_64/boot-options.txt .
+ Documentation/x86/x86_64/boot-options.rst.
X86 Either 32-bit or 64-bit x86 (same as X86-32+X86-64)
X86_UV SGI UV support is enabled.
XEN Xen support is enabled
@@ -181,10 +181,10 @@ In addition, the following text indicates that the option::
Parameters denoted with BOOT are actually interpreted by the boot
loader, and have no meaning to the kernel directly.
Do not modify the syntax of boot loader parameters without extreme
-need or coordination with <Documentation/x86/boot.txt>.
+need or coordination with <Documentation/x86/boot.rst>.
There are also arch-specific kernel-parameters not documented here.
-See for example <Documentation/x86/x86_64/boot-options.txt>.
+See for example <Documentation/x86/x86_64/boot-options.rst>.
Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
a trailing = on the name of any parameter states that that parameter will
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 74d28efa1c40..f1c433daef6b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -53,7 +53,7 @@
ACPI_DEBUG_PRINT statements, e.g.,
ACPI_DEBUG_PRINT((ACPI_DB_INFO, ...
The debug_level mask defaults to "info". See
- Documentation/acpi/debug.txt for more information about
+ Documentation/firmware-guide/acpi/debug.rst for more information about
debug layers and levels.
Enable processor driver info messages:
@@ -708,14 +708,14 @@
[KNL, x86_64] select a region under 4G first, and
fall back to reserve region above 4G when '@offset'
hasn't been specified.
- See Documentation/kdump/kdump.txt for further details.
+ See Documentation/kdump/kdump.rst for further details.
crashkernel=range1:size1[,range2:size2,...][@offset]
[KNL] Same as above, but depends on the memory
in the running system. The syntax of range is
start-[end] where start and end are both
a memory unit (amount[KMG]). See also
- Documentation/kdump/kdump.txt for an example.
+ Documentation/kdump/kdump.rst for an example.
crashkernel=size[KMG],high
[KNL, x86_64] range could be above 4G. Allow kernel
@@ -932,7 +932,7 @@
edid/1680x1050.bin, or edid/1920x1080.bin is given
and no file with the same name exists. Details and
instructions how to build your own EDID data are
- available in Documentation/EDID/HOWTO.txt. An EDID
+ available in Documentation/EDID/howto.rst. An EDID
data set will only be used for a particular connector,
if its name and a colon are prepended to the EDID
name. Each connector may use a unique EDID data
@@ -963,7 +963,7 @@
for details.
nompx [X86] Disables Intel Memory Protection Extensions.
- See Documentation/x86/intel_mpx.txt for more
+ See Documentation/x86/intel_mpx.rst for more
information about the feature.
nopku [X86] Disable Memory Protection Keys CPU feature found
@@ -1189,7 +1189,7 @@
that is to be dynamically loaded by Linux. If there are
multiple variables with the same name but with different
vendor GUIDs, all of them will be loaded. See
- Documentation/acpi/ssdt-overlays.txt for details.
+ Documentation/admin-guide/acpi/ssdt-overlays.rst for details.
eisa_irq_edge= [PARISC,HW]
@@ -1209,7 +1209,7 @@
Specifies physical address of start of kernel core
image elf header and optionally the size. Generally
kexec loader will pass this option to capture kernel.
- See Documentation/kdump/kdump.txt for details.
+ See Documentation/kdump/kdump.rst for details.
enable_mtrr_cleanup [X86]
The kernel tries to adjust MTRR layout from continuous
@@ -1388,9 +1388,6 @@
Valid parameters: "on", "off"
Default: "on"
- hisax= [HW,ISDN]
- See Documentation/isdn/README.HiSax.
-
hlt [BUGS=ARM,SH]
hpet= [X86-32,HPET] option to control HPET usage
@@ -1507,7 +1504,7 @@
Format: =0.0 to prevent dma on hda, =0.1 hdb =1.0 hdc
.vlb_clock .pci_clock .noflush .nohpa .noprobe .nowerr
.cdrom .chs .ignore_cable are additional options
- See Documentation/ide/ide.txt.
+ See Documentation/ide/ide.rst.
ide-generic.probe-mask= [HW] (E)IDE subsystem
Format: <int>
@@ -2383,7 +2380,7 @@
mce [X86-32] Machine Check Exception
- mce=option [X86-64] See Documentation/x86/x86_64/boot-options.txt
+ mce=option [X86-64] See Documentation/x86/x86_64/boot-options.rst
md= [HW] RAID subsystems devices and level
See Documentation/admin-guide/md.rst.
@@ -2439,7 +2436,7 @@
set according to the
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
option.
- See Documentation/memory-hotplug.txt.
+ See Documentation/admin-guide/mm/memory-hotplug.rst.
memmap=exactmap [KNL,X86] Enable setting of an exact
E820 memory map, as specified by the user.
@@ -2528,7 +2525,7 @@
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME
- Refer to Documentation/x86/amd-memory-encryption.txt
+ Refer to Documentation/virtual/kvm/amd-memory-encryption.rst
for details on when memory encryption can be activated.
mem_sleep_default= [SUSPEND] Default system suspend mode:
@@ -2836,8 +2833,9 @@
0 - turn hardlockup detector in nmi_watchdog off
1 - turn hardlockup detector in nmi_watchdog on
When panic is specified, panic when an NMI watchdog
- timeout occurs (or 'nopanic' to override the opposite
- default). To disable both hard and soft lockup detectors,
+ timeout occurs (or 'nopanic' to not panic on an NMI
+ watchdog, if CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is set)
+ To disable both hard and soft lockup detectors,
please see 'nowatchdog'.
This is useful when you use a panic=... timeout and
need the box quickly up again.
@@ -3528,7 +3526,7 @@
See Documentation/blockdev/paride.txt.
pirq= [SMP,APIC] Manual mp-table setup
- See Documentation/x86/i386/IO-APIC.txt.
+ See Documentation/x86/i386/IO-APIC.rst.
plip= [PPT,NET] Parallel port network link
Format: { parport<nr> | timid | 0 }
@@ -5032,7 +5030,7 @@
vector=percpu: enable percpu vector domain
video= [FB] Frame buffer configuration
- See Documentation/fb/modedb.txt.
+ See Documentation/fb/modedb.rst.
video.brightness_switch_enabled= [0,1]
If set to 1, on receiving an ACPI notify event
@@ -5060,7 +5058,7 @@
Can be used multiple times for multiple devices.
vga= [BOOT,X86-32] Select a particular video mode
- See Documentation/x86/boot.txt and
+ See Documentation/x86/boot.rst and
Documentation/svga.txt.
Use vga=ask for menu.
This is actually a boot loader parameter; the value is
@@ -5167,7 +5165,7 @@
Default: 3 = cyan.
watchdog timers [HW,WDT] For information on watchdog timers,
- see Documentation/watchdog/watchdog-parameters.txt
+ see Documentation/watchdog/watchdog-parameters.rst
or other driver-specific files in the
Documentation/watchdog/ directory.
diff --git a/Documentation/admin-guide/mm/numaperf.rst b/Documentation/admin-guide/mm/numaperf.rst
index c067ed145158..a80c3c37226e 100644
--- a/Documentation/admin-guide/mm/numaperf.rst
+++ b/Documentation/admin-guide/mm/numaperf.rst
@@ -165,5 +165,6 @@ write-through caching.
========
See Also
========
-.. [1] https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
- Section 5.2.27
+
+[1] https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
+- Section 5.2.27
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/ras.rst
index c7495e42e6f4..2b20f5f7380d 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/ras.rst
@@ -199,7 +199,7 @@ Architecture (MCA)\ [#f3]_.
mode).
.. [#f3] For more details about the Machine Check Architecture (MCA),
- please read Documentation/x86/x86_64/machinecheck at the Kernel tree.
+ please read Documentation/x86/x86_64/machinecheck.rst at the Kernel tree.
EDAC - Error Detection And Correction
*************************************
diff --git a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.rst
index c71487d399d1..58747ecec71d 100644
--- a/Documentation/aoe/aoe.txt
+++ b/Documentation/aoe/aoe.rst
@@ -1,3 +1,6 @@
+Introduction
+============
+
ATA over Ethernet is a network protocol that provides simple access to
block storage on the LAN.
@@ -22,7 +25,8 @@ document the use of the driver and are not necessary if you install
the aoetools.
-CREATING DEVICE NODES
+Creating Device Nodes
+=====================
Users of udev should find the block device nodes created
automatically, but to create all the necessary device nodes, use the
@@ -38,7 +42,8 @@ CREATING DEVICE NODES
confusing when an AoE device is not present the first time the a
command is run but appears a second later.
-USING DEVICE NODES
+Using Device Nodes
+==================
"cat /dev/etherd/err" blocks, waiting for error diagnostic output,
like any retransmitted packets.
@@ -55,7 +60,7 @@ USING DEVICE NODES
by sysfs counterparts. Using the commands in aoetools insulates
users from these implementation details.
- The block devices are named like this:
+ The block devices are named like this::
e{shelf}.{slot}
e{shelf}.{slot}p{part}
@@ -64,7 +69,8 @@ USING DEVICE NODES
first shelf (shelf address zero). That's the whole disk. The first
partition on that disk would be "e0.2p1".
-USING SYSFS
+Using sysfs
+===========
Each aoe block device in /sys/block has the extra attributes of
state, mac, and netif. The state attribute is "up" when the device
@@ -78,29 +84,29 @@ USING SYSFS
There is a script in this directory that formats this information in
a convenient way. Users with aoetools should use the aoe-stat
- command.
-
- root@makki root# sh Documentation/aoe/status.sh
- e10.0 eth3 up
- e10.1 eth3 up
- e10.2 eth3 up
- e10.3 eth3 up
- e10.4 eth3 up
- e10.5 eth3 up
- e10.6 eth3 up
- e10.7 eth3 up
- e10.8 eth3 up
- e10.9 eth3 up
- e4.0 eth1 up
- e4.1 eth1 up
- e4.2 eth1 up
- e4.3 eth1 up
- e4.4 eth1 up
- e4.5 eth1 up
- e4.6 eth1 up
- e4.7 eth1 up
- e4.8 eth1 up
- e4.9 eth1 up
+ command::
+
+ root@makki root# sh Documentation/aoe/status.sh
+ e10.0 eth3 up
+ e10.1 eth3 up
+ e10.2 eth3 up
+ e10.3 eth3 up
+ e10.4 eth3 up
+ e10.5 eth3 up
+ e10.6 eth3 up
+ e10.7 eth3 up
+ e10.8 eth3 up
+ e10.9 eth3 up
+ e4.0 eth1 up
+ e4.1 eth1 up
+ e4.2 eth1 up
+ e4.3 eth1 up
+ e4.4 eth1 up
+ e4.5 eth1 up
+ e4.6 eth1 up
+ e4.7 eth1 up
+ e4.8 eth1 up
+ e4.9 eth1 up
Use /sys/module/aoe/parameters/aoe_iflist (or better, the driver
option discussed below) instead of /dev/etherd/interfaces to limit
@@ -113,12 +119,13 @@ USING SYSFS
for this purpose. You can also directly use the
/dev/etherd/discover special file described above.
-DRIVER OPTIONS
+Driver Options
+==============
There is a boot option for the built-in aoe driver and a
corresponding module parameter, aoe_iflist. Without this option,
all network interfaces may be used for ATA over Ethernet. Here is a
- usage example for the module parameter.
+ usage example for the module parameter::
modprobe aoe_iflist="eth1 eth3"
diff --git a/Documentation/aoe/examples.rst b/Documentation/aoe/examples.rst
new file mode 100644
index 000000000000..91f3198e52c1
--- /dev/null
+++ b/Documentation/aoe/examples.rst
@@ -0,0 +1,23 @@
+Example of udev rules
+---------------------
+
+ .. include:: udev.txt
+ :literal:
+
+Example of udev install rules script
+------------------------------------
+
+ .. literalinclude:: udev-install.sh
+ :language: shell
+
+Example script to get status
+----------------------------
+
+ .. literalinclude:: status.sh
+ :language: shell
+
+Example of AoE autoload script
+------------------------------
+
+ .. literalinclude:: autoload.sh
+ :language: shell
diff --git a/Documentation/aoe/index.rst b/Documentation/aoe/index.rst
new file mode 100644
index 000000000000..4394b9b7913c
--- /dev/null
+++ b/Documentation/aoe/index.rst
@@ -0,0 +1,19 @@
+:orphan:
+
+=======================
+ATA over Ethernet (AoE)
+=======================
+
+.. toctree::
+ :maxdepth: 1
+
+ aoe
+ todo
+ examples
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/aoe/todo.txt b/Documentation/aoe/todo.rst
index c09dfad4aed8..dea8db5a33e1 100644
--- a/Documentation/aoe/todo.txt
+++ b/Documentation/aoe/todo.rst
@@ -1,3 +1,6 @@
+TODO
+====
+
There is a potential for deadlock when allocating a struct sk_buff for
data that needs to be written out to aoe storage. If the data is
being written from a dirty page in order to free that page, and if
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index 1f06daf03f5b..54feda5a0772 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -11,7 +11,7 @@
# udev_rules="/etc/udev/rules.d/"
# bash# ls /etc/udev/rules.d/
# 10-wacom.rules 50-udev.rules
-# bash# cp /path/to/linux-2.6.xx/Documentation/aoe/udev.txt \
+# bash# cp /path/to/linux/Documentation/aoe/udev.txt \
# /etc/udev/rules.d/60-aoe.rules
#
diff --git a/Documentation/arm/mem_alignment b/Documentation/arm/mem_alignment
index 6335fcacbba9..e110e2781039 100644
--- a/Documentation/arm/mem_alignment
+++ b/Documentation/arm/mem_alignment
@@ -1,4 +1,4 @@
-Too many problems poped up because of unnoticed misaligned memory access in
+Too many problems popped up because of unnoticed misaligned memory access in
kernel code lately. Therefore the alignment fixup is now unconditionally
configured in for SA11x0 based targets. According to Alan Cox, this is a
bad idea to configure it out, but Russell King has some good reasons for
diff --git a/Documentation/arm/stm32/overview.rst b/Documentation/arm/stm32/overview.rst
index 85cfc8410798..f7e734153860 100644
--- a/Documentation/arm/stm32/overview.rst
+++ b/Documentation/arm/stm32/overview.rst
@@ -1,3 +1,5 @@
+:orphan:
+
========================
STM32 ARM Linux Overview
========================
diff --git a/Documentation/arm/stm32/stm32f429-overview.rst b/Documentation/arm/stm32/stm32f429-overview.rst
index 18feda97f483..65bbb1c3b423 100644
--- a/Documentation/arm/stm32/stm32f429-overview.rst
+++ b/Documentation/arm/stm32/stm32f429-overview.rst
@@ -1,3 +1,5 @@
+:orphan:
+
STM32F429 Overview
==================
diff --git a/Documentation/arm/stm32/stm32f746-overview.rst b/Documentation/arm/stm32/stm32f746-overview.rst
index b5f4b6ce7656..42d593085015 100644
--- a/Documentation/arm/stm32/stm32f746-overview.rst
+++ b/Documentation/arm/stm32/stm32f746-overview.rst
@@ -1,3 +1,5 @@
+:orphan:
+
STM32F746 Overview
==================
diff --git a/Documentation/arm/stm32/stm32f769-overview.rst b/Documentation/arm/stm32/stm32f769-overview.rst
index 228656ced2fe..f6adac862b17 100644
--- a/Documentation/arm/stm32/stm32f769-overview.rst
+++ b/Documentation/arm/stm32/stm32f769-overview.rst
@@ -1,3 +1,5 @@
+:orphan:
+
STM32F769 Overview
==================
diff --git a/Documentation/arm/stm32/stm32h743-overview.rst b/Documentation/arm/stm32/stm32h743-overview.rst
index 3458dc00095d..c525835e7473 100644
--- a/Documentation/arm/stm32/stm32h743-overview.rst
+++ b/Documentation/arm/stm32/stm32h743-overview.rst
@@ -1,3 +1,5 @@
+:orphan:
+
STM32H743 Overview
==================
diff --git a/Documentation/arm/stm32/stm32mp157-overview.rst b/Documentation/arm/stm32/stm32mp157-overview.rst
index 62e176d47ca7..2c52cd020601 100644
--- a/Documentation/arm/stm32/stm32mp157-overview.rst
+++ b/Documentation/arm/stm32/stm32mp157-overview.rst
@@ -1,3 +1,5 @@
+:orphan:
+
STM32MP157 Overview
===================
diff --git a/Documentation/arm64/acpi_object_usage.txt b/Documentation/arm64/acpi_object_usage.rst
index c77010c5c1f0..d51b69dc624d 100644
--- a/Documentation/arm64/acpi_object_usage.txt
+++ b/Documentation/arm64/acpi_object_usage.rst
@@ -1,5 +1,7 @@
+===========
ACPI Tables
------------
+===========
+
The expectations of individual ACPI tables are discussed in the list that
follows.
@@ -11,54 +13,71 @@ outside of the UEFI Forum (see Section 5.2.6 of the specification).
For ACPI on arm64, tables also fall into the following categories:
- -- Required: DSDT, FADT, GTDT, MADT, MCFG, RSDP, SPCR, XSDT
+ - Required: DSDT, FADT, GTDT, MADT, MCFG, RSDP, SPCR, XSDT
- -- Recommended: BERT, EINJ, ERST, HEST, PCCT, SSDT
+ - Recommended: BERT, EINJ, ERST, HEST, PCCT, SSDT
- -- Optional: BGRT, CPEP, CSRT, DBG2, DRTM, ECDT, FACS, FPDT, IORT,
+ - Optional: BGRT, CPEP, CSRT, DBG2, DRTM, ECDT, FACS, FPDT, IORT,
MCHI, MPST, MSCT, NFIT, PMTT, RASF, SBST, SLIT, SPMI, SRAT, STAO,
TCPA, TPM2, UEFI, XENV
- -- Not supported: BOOT, DBGP, DMAR, ETDT, HPET, IBFT, IVRS, LPIT,
+ - Not supported: BOOT, DBGP, DMAR, ETDT, HPET, IBFT, IVRS, LPIT,
MSDM, OEMx, PSDT, RSDT, SLIC, WAET, WDAT, WDRT, WPBT
+====== ========================================================================
Table Usage for ARMv8 Linux
------ ----------------------------------------------------------------
+====== ========================================================================
BERT Section 18.3 (signature == "BERT")
- == Boot Error Record Table ==
+
+ **Boot Error Record Table**
+
Must be supplied if RAS support is provided by the platform. It
is recommended this table be supplied.
BOOT Signature Reserved (signature == "BOOT")
- == simple BOOT flag table ==
+
+ **simple BOOT flag table**
+
Microsoft only table, will not be supported.
BGRT Section 5.2.22 (signature == "BGRT")
- == Boot Graphics Resource Table ==
+
+ **Boot Graphics Resource Table**
+
Optional, not currently supported, with no real use-case for an
ARM server.
CPEP Section 5.2.18 (signature == "CPEP")
- == Corrected Platform Error Polling table ==
+
+ **Corrected Platform Error Polling table**
+
Optional, not currently supported, and not recommended until such
time as ARM-compatible hardware is available, and the specification
suitably modified.
CSRT Signature Reserved (signature == "CSRT")
- == Core System Resources Table ==
+
+ **Core System Resources Table**
+
Optional, not currently supported.
DBG2 Signature Reserved (signature == "DBG2")
- == DeBuG port table 2 ==
+
+ **DeBuG port table 2**
+
License has changed and should be usable. Optional if used instead
of earlycon=<device> on the command line.
DBGP Signature Reserved (signature == "DBGP")
- == DeBuG Port table ==
+
+ **DeBuG Port table**
+
Microsoft only table, will not be supported.
DSDT Section 5.2.11.1 (signature == "DSDT")
- == Differentiated System Description Table ==
+
+ **Differentiated System Description Table**
+
A DSDT is required; see also SSDT.
ACPI tables contain only one DSDT but can contain one or more SSDTs,
@@ -66,22 +85,30 @@ DSDT Section 5.2.11.1 (signature == "DSDT")
but cannot modify or replace anything in the DSDT.
DMAR Signature Reserved (signature == "DMAR")
- == DMA Remapping table ==
+
+ **DMA Remapping table**
+
x86 only table, will not be supported.
DRTM Signature Reserved (signature == "DRTM")
- == Dynamic Root of Trust for Measurement table ==
+
+ **Dynamic Root of Trust for Measurement table**
+
Optional, not currently supported.
ECDT Section 5.2.16 (signature == "ECDT")
- == Embedded Controller Description Table ==
+
+ **Embedded Controller Description Table**
+
Optional, not currently supported, but could be used on ARM if and
only if one uses the GPE_BIT field to represent an IRQ number, since
there are no GPE blocks defined in hardware reduced mode. This would
need to be modified in the ACPI specification.
EINJ Section 18.6 (signature == "EINJ")
- == Error Injection table ==
+
+ **Error Injection table**
+
This table is very useful for testing platform response to error
conditions; it allows one to inject an error into the system as
if it had actually occurred. However, this table should not be
@@ -89,27 +116,35 @@ EINJ Section 18.6 (signature == "EINJ")
and executed with the ACPICA tools only during testing.
ERST Section 18.5 (signature == "ERST")
- == Error Record Serialization Table ==
+
+ **Error Record Serialization Table**
+
On a platform supports RAS, this table must be supplied if it is not
UEFI-based; if it is UEFI-based, this table may be supplied. When this
table is not present, UEFI run time service will be utilized to save
and retrieve hardware error information to and from a persistent store.
ETDT Signature Reserved (signature == "ETDT")
- == Event Timer Description Table ==
+
+ **Event Timer Description Table**
+
Obsolete table, will not be supported.
FACS Section 5.2.10 (signature == "FACS")
- == Firmware ACPI Control Structure ==
+
+ **Firmware ACPI Control Structure**
+
It is unlikely that this table will be terribly useful. If it is
provided, the Global Lock will NOT be used since it is not part of
the hardware reduced profile, and only 64-bit address fields will
be considered valid.
FADT Section 5.2.9 (signature == "FACP")
- == Fixed ACPI Description Table ==
+
+ **Fixed ACPI Description Table**
Required for arm64.
+
The HW_REDUCED_ACPI flag must be set. All of the fields that are
to be ignored when HW_REDUCED_ACPI is set are expected to be set to
zero.
@@ -118,22 +153,28 @@ FADT Section 5.2.9 (signature == "FACP")
used, not FIRMWARE_CTRL.
If PSCI is used (as is recommended), make sure that ARM_BOOT_ARCH is
- filled in properly -- that the PSCI_COMPLIANT flag is set and that
+ filled in properly - that the PSCI_COMPLIANT flag is set and that
PSCI_USE_HVC is set or unset as needed (see table 5-37).
For the DSDT that is also required, the X_DSDT field is to be used,
not the DSDT field.
FPDT Section 5.2.23 (signature == "FPDT")
- == Firmware Performance Data Table ==
+
+ **Firmware Performance Data Table**
+
Optional, not currently supported.
GTDT Section 5.2.24 (signature == "GTDT")
- == Generic Timer Description Table ==
+
+ **Generic Timer Description Table**
+
Required for arm64.
HEST Section 18.3.2 (signature == "HEST")
- == Hardware Error Source Table ==
+
+ **Hardware Error Source Table**
+
ARM-specific error sources have been defined; please use those or the
PCI types such as type 6 (AER Root Port), 7 (AER Endpoint), or 8 (AER
Bridge), or use type 9 (Generic Hardware Error Source). Firmware first
@@ -144,122 +185,174 @@ HEST Section 18.3.2 (signature == "HEST")
is recommended this table be supplied.
HPET Signature Reserved (signature == "HPET")
- == High Precision Event timer Table ==
+
+ **High Precision Event timer Table**
+
x86 only table, will not be supported.
IBFT Signature Reserved (signature == "IBFT")
- == iSCSI Boot Firmware Table ==
+
+ **iSCSI Boot Firmware Table**
+
Microsoft defined table, support TBD.
IORT Signature Reserved (signature == "IORT")
- == Input Output Remapping Table ==
+
+ **Input Output Remapping Table**
+
arm64 only table, required in order to describe IO topology, SMMUs,
and GIC ITSs, and how those various components are connected together,
such as identifying which components are behind which SMMUs/ITSs.
This table will only be required on certain SBSA platforms (e.g.,
- when using GICv3-ITS and an SMMU); on SBSA Level 0 platforms, it
+ when using GICv3-ITS and an SMMU); on SBSA Level 0 platforms, it
remains optional.
IVRS Signature Reserved (signature == "IVRS")
- == I/O Virtualization Reporting Structure ==
+
+ **I/O Virtualization Reporting Structure**
+
x86_64 (AMD) only table, will not be supported.
LPIT Signature Reserved (signature == "LPIT")
- == Low Power Idle Table ==
+
+ **Low Power Idle Table**
+
x86 only table as of ACPI 5.1; starting with ACPI 6.0, processor
descriptions and power states on ARM platforms should use the DSDT
and define processor container devices (_HID ACPI0010, Section 8.4,
and more specifically 8.4.3 and and 8.4.4).
MADT Section 5.2.12 (signature == "APIC")
- == Multiple APIC Description Table ==
+
+ **Multiple APIC Description Table**
+
Required for arm64. Only the GIC interrupt controller structures
should be used (types 0xA - 0xF).
MCFG Signature Reserved (signature == "MCFG")
- == Memory-mapped ConFiGuration space ==
+
+ **Memory-mapped ConFiGuration space**
+
If the platform supports PCI/PCIe, an MCFG table is required.
MCHI Signature Reserved (signature == "MCHI")
- == Management Controller Host Interface table ==
+
+ **Management Controller Host Interface table**
+
Optional, not currently supported.
MPST Section 5.2.21 (signature == "MPST")
- == Memory Power State Table ==
+
+ **Memory Power State Table**
+
Optional, not currently supported.
MSCT Section 5.2.19 (signature == "MSCT")
- == Maximum System Characteristic Table ==
+
+ **Maximum System Characteristic Table**
+
Optional, not currently supported.
MSDM Signature Reserved (signature == "MSDM")
- == Microsoft Data Management table ==
+
+ **Microsoft Data Management table**
+
Microsoft only table, will not be supported.
NFIT Section 5.2.25 (signature == "NFIT")
- == NVDIMM Firmware Interface Table ==
+
+ **NVDIMM Firmware Interface Table**
+
Optional, not currently supported.
OEMx Signature of "OEMx" only
- == OEM Specific Tables ==
+
+ **OEM Specific Tables**
+
All tables starting with a signature of "OEM" are reserved for OEM
use. Since these are not meant to be of general use but are limited
to very specific end users, they are not recommended for use and are
not supported by the kernel for arm64.
PCCT Section 14.1 (signature == "PCCT)
- == Platform Communications Channel Table ==
+
+ **Platform Communications Channel Table**
+
Recommend for use on arm64; use of PCC is recommended when using CPPC
to control performance and power for platform processors.
PMTT Section 5.2.21.12 (signature == "PMTT")
- == Platform Memory Topology Table ==
+
+ **Platform Memory Topology Table**
+
Optional, not currently supported.
PSDT Section 5.2.11.3 (signature == "PSDT")
- == Persistent System Description Table ==
+
+ **Persistent System Description Table**
+
Obsolete table, will not be supported.
RASF Section 5.2.20 (signature == "RASF")
- == RAS Feature table ==
+
+ **RAS Feature table**
+
Optional, not currently supported.
RSDP Section 5.2.5 (signature == "RSD PTR")
- == Root System Description PoinTeR ==
+
+ **Root System Description PoinTeR**
+
Required for arm64.
RSDT Section 5.2.7 (signature == "RSDT")
- == Root System Description Table ==
+
+ **Root System Description Table**
+
Since this table can only provide 32-bit addresses, it is deprecated
on arm64, and will not be used. If provided, it will be ignored.
SBST Section 5.2.14 (signature == "SBST")
- == Smart Battery Subsystem Table ==
+
+ **Smart Battery Subsystem Table**
+
Optional, not currently supported.
SLIC Signature Reserved (signature == "SLIC")
- == Software LIcensing table ==
+
+ **Software LIcensing table**
+
Microsoft only table, will not be supported.
SLIT Section 5.2.17 (signature == "SLIT")
- == System Locality distance Information Table ==
+
+ **System Locality distance Information Table**
+
Optional in general, but required for NUMA systems.
SPCR Signature Reserved (signature == "SPCR")
- == Serial Port Console Redirection table ==
+
+ **Serial Port Console Redirection table**
+
Required for arm64.
SPMI Signature Reserved (signature == "SPMI")
- == Server Platform Management Interface table ==
+
+ **Server Platform Management Interface table**
+
Optional, not currently supported.
SRAT Section 5.2.16 (signature == "SRAT")
- == System Resource Affinity Table ==
+
+ **System Resource Affinity Table**
+
Optional, but if used, only the GICC Affinity structures are read.
To support arm64 NUMA, this table is required.
SSDT Section 5.2.11.2 (signature == "SSDT")
- == Secondary System Description Table ==
+
+ **Secondary System Description Table**
+
These tables are a continuation of the DSDT; these are recommended
for use with devices that can be added to a running system, but can
also serve the purpose of dividing up device descriptions into more
@@ -272,49 +365,69 @@ SSDT Section 5.2.11.2 (signature == "SSDT")
one DSDT but can contain many SSDTs.
STAO Signature Reserved (signature == "STAO")
- == _STA Override table ==
+
+ **_STA Override table**
+
Optional, but only necessary in virtualized environments in order to
hide devices from guest OSs.
TCPA Signature Reserved (signature == "TCPA")
- == Trusted Computing Platform Alliance table ==
+
+ **Trusted Computing Platform Alliance table**
+
Optional, not currently supported, and may need changes to fully
interoperate with arm64.
TPM2 Signature Reserved (signature == "TPM2")
- == Trusted Platform Module 2 table ==
+
+ **Trusted Platform Module 2 table**
+
Optional, not currently supported, and may need changes to fully
interoperate with arm64.
UEFI Signature Reserved (signature == "UEFI")
- == UEFI ACPI data table ==
+
+ **UEFI ACPI data table**
+
Optional, not currently supported. No known use case for arm64,
at present.
WAET Signature Reserved (signature == "WAET")
- == Windows ACPI Emulated devices Table ==
+
+ **Windows ACPI Emulated devices Table**
+
Microsoft only table, will not be supported.
WDAT Signature Reserved (signature == "WDAT")
- == Watch Dog Action Table ==
+
+ **Watch Dog Action Table**
+
Microsoft only table, will not be supported.
WDRT Signature Reserved (signature == "WDRT")
- == Watch Dog Resource Table ==
+
+ **Watch Dog Resource Table**
+
Microsoft only table, will not be supported.
WPBT Signature Reserved (signature == "WPBT")
- == Windows Platform Binary Table ==
+
+ **Windows Platform Binary Table**
+
Microsoft only table, will not be supported.
XENV Signature Reserved (signature == "XENV")
- == Xen project table ==
+
+ **Xen project table**
+
Optional, used only by Xen at present.
XSDT Section 5.2.8 (signature == "XSDT")
- == eXtended System Description Table ==
- Required for arm64.
+ **eXtended System Description Table**
+
+ Required for arm64.
+====== ========================================================================
ACPI Objects
------------
@@ -323,10 +436,11 @@ shown in the list that follows; any object not explicitly mentioned below
should be used as needed for a particular platform or particular subsystem,
such as power management or PCI.
+===== ================ ========================================================
Name Section Usage for ARMv8 Linux
----- ------------ -------------------------------------------------
+===== ================ ========================================================
_CCA 6.2.17 This method must be defined for all bus masters
- on arm64 -- there are no assumptions made about
+ on arm64 - there are no assumptions made about
whether such devices are cache coherent or not.
The _CCA value is inherited by all descendants of
these devices so it does not need to be repeated.
@@ -422,8 +536,8 @@ _OSC 6.2.11 This method can be a global method in ACPI (i.e.,
by the kernel community, then register it with the
UEFI Forum.
-\_OSI 5.7.2 Deprecated on ARM64. As far as ACPI firmware is
- concerned, _OSI is not to be used to determine what
+\_OSI 5.7.2 Deprecated on ARM64. As far as ACPI firmware is
+ concerned, _OSI is not to be used to determine what
sort of system is being used or what functionality
is provided. The _OSC method is to be used instead.
@@ -447,7 +561,7 @@ _PSx 7.3.2-5 Use as needed; power management specific. If _PS0 is
usage, change them in these methods.
_RDI 8.4.4.4 Recommended for use with processor definitions (_HID
- ACPI0010) on arm64. This should only be used in
+ ACPI0010) on arm64. This should only be used in
conjunction with _LPI.
\_REV 5.7.4 Always returns the latest version of ACPI supported.
@@ -476,6 +590,7 @@ _SWS 7.4.3 Use as needed; power management specific; this may
_UID 6.1.12 Recommended for distinguishing devices of the same
class; define it if at all possible.
+===== ================ ========================================================
@@ -488,7 +603,7 @@ platforms, ACPI events must be signaled differently.
There are two options: GPIO-signaled interrupts (Section 5.6.5), and
interrupt-signaled events (Section 5.6.9). Interrupt-signaled events are a
-new feature in the ACPI 6.1 specification. Either -- or both -- can be used
+new feature in the ACPI 6.1 specification. Either - or both - can be used
on a given platform, and which to use may be dependent of limitations in any
given SoC. If possible, interrupt-signaled events are recommended.
@@ -564,39 +679,40 @@ supported.
The following classes of objects are not supported:
- -- Section 9.2: ambient light sensor devices
+ - Section 9.2: ambient light sensor devices
- -- Section 9.3: battery devices
+ - Section 9.3: battery devices
- -- Section 9.4: lids (e.g., laptop lids)
+ - Section 9.4: lids (e.g., laptop lids)
- -- Section 9.8.2: IDE controllers
+ - Section 9.8.2: IDE controllers
- -- Section 9.9: floppy controllers
+ - Section 9.9: floppy controllers
- -- Section 9.10: GPE block devices
+ - Section 9.10: GPE block devices
- -- Section 9.15: PC/AT RTC/CMOS devices
+ - Section 9.15: PC/AT RTC/CMOS devices
- -- Section 9.16: user presence detection devices
+ - Section 9.16: user presence detection devices
- -- Section 9.17: I/O APIC devices; all GICs must be enumerable via MADT
+ - Section 9.17: I/O APIC devices; all GICs must be enumerable via MADT
- -- Section 9.18: time and alarm devices (see 9.15)
+ - Section 9.18: time and alarm devices (see 9.15)
- -- Section 10: power source and power meter devices
+ - Section 10: power source and power meter devices
- -- Section 11: thermal management
+ - Section 11: thermal management
- -- Section 12: embedded controllers interface
+ - Section 12: embedded controllers interface
- -- Section 13: SMBus interfaces
+ - Section 13: SMBus interfaces
This also means that there is no support for the following objects:
+==== =========================== ==== ==========
Name Section Name Section
----- ------------ ---- ------------
+==== =========================== ==== ==========
_ALC 9.3.4 _FDM 9.10.3
_ALI 9.3.2 _FIX 6.2.7
_ALP 9.3.6 _GAI 10.4.5
@@ -619,4 +735,4 @@ _DCK 6.5.2 _UPD 9.16.1
_EC 12.12 _UPP 9.16.2
_FDE 9.10.1 _WPC 10.5.2
_FDI 9.10.2 _WPP 10.5.3
-
+==== =========================== ==== ==========
diff --git a/Documentation/arm64/arm-acpi.txt b/Documentation/arm64/arm-acpi.rst
index 1a74a041a443..872dbbc73d4a 100644
--- a/Documentation/arm64/arm-acpi.txt
+++ b/Documentation/arm64/arm-acpi.rst
@@ -1,5 +1,7 @@
+=====================
ACPI on ARMv8 Servers
----------------------
+=====================
+
ACPI can be used for ARMv8 general purpose servers designed to follow
the ARM SBSA (Server Base System Architecture) [0] and SBBR (Server
Base Boot Requirements) [1] specifications. Please note that the SBBR
@@ -34,28 +36,28 @@ of the summary text almost directly, to be honest.
The short form of the rationale for ACPI on ARM is:
--- ACPI’s byte code (AML) allows the platform to encode hardware behavior,
+- ACPI’s byte code (AML) allows the platform to encode hardware behavior,
while DT explicitly does not support this. For hardware vendors, being
able to encode behavior is a key tool used in supporting operating
system releases on new hardware.
--- ACPI’s OSPM defines a power management model that constrains what the
+- ACPI’s OSPM defines a power management model that constrains what the
platform is allowed to do into a specific model, while still providing
flexibility in hardware design.
--- In the enterprise server environment, ACPI has established bindings (such
+- In the enterprise server environment, ACPI has established bindings (such
as for RAS) which are currently used in production systems. DT does not.
Such bindings could be defined in DT at some point, but doing so means ARM
and x86 would end up using completely different code paths in both firmware
and the kernel.
--- Choosing a single interface to describe the abstraction between a platform
+- Choosing a single interface to describe the abstraction between a platform
and an OS is important. Hardware vendors would not be required to implement
both DT and ACPI if they want to support multiple operating systems. And,
agreeing on a single interface instead of being fragmented into per OS
interfaces makes for better interoperability overall.
--- The new ACPI governance process works well and Linux is now at the same
+- The new ACPI governance process works well and Linux is now at the same
table as hardware vendors and other OS vendors. In fact, there is no
longer any reason to feel that ACPI only belongs to Windows or that
Linux is in any way secondary to Microsoft in this arena. The move of
@@ -169,31 +171,31 @@ For the ACPI core to operate properly, and in turn provide the information
the kernel needs to configure devices, it expects to find the following
tables (all section numbers refer to the ACPI 6.1 specification):
- -- RSDP (Root System Description Pointer), section 5.2.5
+ - RSDP (Root System Description Pointer), section 5.2.5
- -- XSDT (eXtended System Description Table), section 5.2.8
+ - XSDT (eXtended System Description Table), section 5.2.8
- -- FADT (Fixed ACPI Description Table), section 5.2.9
+ - FADT (Fixed ACPI Description Table), section 5.2.9
- -- DSDT (Differentiated System Description Table), section
+ - DSDT (Differentiated System Description Table), section
5.2.11.1
- -- MADT (Multiple APIC Description Table), section 5.2.12
+ - MADT (Multiple APIC Description Table), section 5.2.12
- -- GTDT (Generic Timer Description Table), section 5.2.24
+ - GTDT (Generic Timer Description Table), section 5.2.24
- -- If PCI is supported, the MCFG (Memory mapped ConFiGuration
+ - If PCI is supported, the MCFG (Memory mapped ConFiGuration
Table), section 5.2.6, specifically Table 5-31.
- -- If booting without a console=<device> kernel parameter is
+ - If booting without a console=<device> kernel parameter is
supported, the SPCR (Serial Port Console Redirection table),
section 5.2.6, specifically Table 5-31.
- -- If necessary to describe the I/O topology, SMMUs and GIC ITSs,
+ - If necessary to describe the I/O topology, SMMUs and GIC ITSs,
the IORT (Input Output Remapping Table, section 5.2.6, specifically
Table 5-31).
- -- If NUMA is supported, the SRAT (System Resource Affinity Table)
+ - If NUMA is supported, the SRAT (System Resource Affinity Table)
and SLIT (System Locality distance Information Table), sections
5.2.16 and 5.2.17, respectively.
@@ -269,9 +271,9 @@ describes how to define the structure of an object returned via _DSD, and
how specific data structures are defined by specific UUIDs. Linux should
only use the _DSD Device Properties UUID [5]:
- -- UUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301
+ - UUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301
- -- http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
+ - http://www.uefi.org/sites/default/files/resources/_DSD-device-properties-UUID.pdf
The UEFI Forum provides a mechanism for registering device properties [4]
so that they may be used across all operating systems supporting ACPI.
@@ -327,10 +329,10 @@ turning a device full off.
There are two options for using those Power Resources. They can:
- -- be managed in a _PSx method which gets called on entry to power
+ - be managed in a _PSx method which gets called on entry to power
state Dx.
- -- be declared separately as power resources with their own _ON and _OFF
+ - be declared separately as power resources with their own _ON and _OFF
methods. They are then tied back to D-states for a particular device
via _PRx which specifies which power resources a device needs to be on
while in Dx. Kernel then tracks number of devices using a power resource
@@ -339,16 +341,16 @@ There are two options for using those Power Resources. They can:
The kernel ACPI code will also assume that the _PSx methods follow the normal
ACPI rules for such methods:
- -- If either _PS0 or _PS3 is implemented, then the other method must also
+ - If either _PS0 or _PS3 is implemented, then the other method must also
be implemented.
- -- If a device requires usage or setup of a power resource when on, the ASL
+ - If a device requires usage or setup of a power resource when on, the ASL
should organize that it is allocated/enabled using the _PS0 method.
- -- Resources allocated or enabled in the _PS0 method should be disabled
+ - Resources allocated or enabled in the _PS0 method should be disabled
or de-allocated in the _PS3 method.
- -- Firmware will leave the resources in a reasonable state before handing
+ - Firmware will leave the resources in a reasonable state before handing
over control to the kernel.
Such code in _PSx methods will of course be very platform specific. But,
@@ -394,52 +396,52 @@ else must be discovered by the driver probe function. Then, have the rest
of the driver operate off of the contents of that struct. Doing so should
allow most divergence between ACPI and DT functionality to be kept local to
the probe function instead of being scattered throughout the driver. For
-example:
-
-static int device_probe_dt(struct platform_device *pdev)
-{
- /* DT specific functionality */
- ...
-}
-
-static int device_probe_acpi(struct platform_device *pdev)
-{
- /* ACPI specific functionality */
- ...
-}
-
-static int device_probe(struct platform_device *pdev)
-{
- ...
- struct device_node node = pdev->dev.of_node;
- ...
-
- if (node)
- ret = device_probe_dt(pdev);
- else if (ACPI_HANDLE(&pdev->dev))
- ret = device_probe_acpi(pdev);
- else
- /* other initialization */
- ...
- /* Continue with any generic probe operations */
- ...
-}
+example::
+
+ static int device_probe_dt(struct platform_device *pdev)
+ {
+ /* DT specific functionality */
+ ...
+ }
+
+ static int device_probe_acpi(struct platform_device *pdev)
+ {
+ /* ACPI specific functionality */
+ ...
+ }
+
+ static int device_probe(struct platform_device *pdev)
+ {
+ ...
+ struct device_node node = pdev->dev.of_node;
+ ...
+
+ if (node)
+ ret = device_probe_dt(pdev);
+ else if (ACPI_HANDLE(&pdev->dev))
+ ret = device_probe_acpi(pdev);
+ else
+ /* other initialization */
+ ...
+ /* Continue with any generic probe operations */
+ ...
+ }
DO keep the MODULE_DEVICE_TABLE entries together in the driver to make it
clear the different names the driver is probed for, both from DT and from
-ACPI:
+ACPI::
-static struct of_device_id virtio_mmio_match[] = {
- { .compatible = "virtio,mmio", },
- { }
-};
-MODULE_DEVICE_TABLE(of, virtio_mmio_match);
+ static struct of_device_id virtio_mmio_match[] = {
+ { .compatible = "virtio,mmio", },
+ { }
+ };
+ MODULE_DEVICE_TABLE(of, virtio_mmio_match);
-static const struct acpi_device_id virtio_mmio_acpi_match[] = {
- { "LNRO0005", },
- { }
-};
-MODULE_DEVICE_TABLE(acpi, virtio_mmio_acpi_match);
+ static const struct acpi_device_id virtio_mmio_acpi_match[] = {
+ { "LNRO0005", },
+ { }
+ };
+ MODULE_DEVICE_TABLE(acpi, virtio_mmio_acpi_match);
ASWG
@@ -471,7 +473,8 @@ Linux Code
Individual items specific to Linux on ARM, contained in the the Linux
source code, are in the list that follows:
-ACPI_OS_NAME This macro defines the string to be returned when
+ACPI_OS_NAME
+ This macro defines the string to be returned when
an ACPI method invokes the _OS method. On ARM64
systems, this macro will be "Linux" by default.
The command line parameter acpi_os=<string>
@@ -482,38 +485,44 @@ ACPI_OS_NAME This macro defines the string to be returned when
ACPI Objects
------------
Detailed expectations for ACPI tables and object are listed in the file
-Documentation/arm64/acpi_object_usage.txt.
+Documentation/arm64/acpi_object_usage.rst.
References
----------
-[0] http://silver.arm.com -- document ARM-DEN-0029, or newer
+[0] http://silver.arm.com
+ document ARM-DEN-0029, or newer:
"Server Base System Architecture", version 2.3, dated 27 Mar 2014
[1] http://infocenter.arm.com/help/topic/com.arm.doc.den0044a/Server_Base_Boot_Requirements.pdf
Document ARM-DEN-0044A, or newer: "Server Base Boot Requirements, System
Software on ARM Platforms", dated 16 Aug 2014
-[2] http://www.secretlab.ca/archives/151, 10 Jan 2015, Copyright (c) 2015,
+[2] http://www.secretlab.ca/archives/151,
+ 10 Jan 2015, Copyright (c) 2015,
Linaro Ltd., written by Grant Likely.
-[3] AMD ACPI for Seattle platform documentation:
+[3] AMD ACPI for Seattle platform documentation
http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/Seattle_ACPI_Guide.pdf
-[4] http://www.uefi.org/acpi -- please see the link for the "ACPI _DSD Device
+
+[4] http://www.uefi.org/acpi
+ please see the link for the "ACPI _DSD Device
Property Registry Instructions"
-[5] http://www.uefi.org/acpi -- please see the link for the "_DSD (Device
+[5] http://www.uefi.org/acpi
+ please see the link for the "_DSD (Device
Specific Data) Implementation Guide"
-[6] Kernel code for the unified device property interface can be found in
+[6] Kernel code for the unified device
+ property interface can be found in
include/linux/property.h and drivers/base/property.c.
Authors
-------
-Al Stone <al.stone@linaro.org>
-Graeme Gregory <graeme.gregory@linaro.org>
-Hanjun Guo <hanjun.guo@linaro.org>
+- Al Stone <al.stone@linaro.org>
+- Graeme Gregory <graeme.gregory@linaro.org>
+- Hanjun Guo <hanjun.guo@linaro.org>
-Grant Likely <grant.likely@linaro.org>, for the "Why ACPI on ARM?" section
+- Grant Likely <grant.likely@linaro.org>, for the "Why ACPI on ARM?" section
diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.rst
index fbab7e21d116..3d041d0d16e8 100644
--- a/Documentation/arm64/booting.txt
+++ b/Documentation/arm64/booting.rst
@@ -1,7 +1,9 @@
- Booting AArch64 Linux
- =====================
+=====================
+Booting AArch64 Linux
+=====================
Author: Will Deacon <will.deacon@arm.com>
+
Date : 07 September 2012
This document is based on the ARM booting document by Russell King and
@@ -12,7 +14,7 @@ The AArch64 exception model is made up of a number of exception levels
counterpart. EL2 is the hypervisor level and exists only in non-secure
mode. EL3 is the highest priority level and exists only in secure mode.
-For the purposes of this document, we will use the term `boot loader'
+For the purposes of this document, we will use the term `boot loader`
simply to define all software that executes on the CPU(s) before control
is passed to the Linux kernel. This may include secure monitor and
hypervisor code, or it may just be a handful of instructions for
@@ -70,7 +72,7 @@ Image target is available instead.
Requirement: MANDATORY
-The decompressed kernel image contains a 64-byte header as follows:
+The decompressed kernel image contains a 64-byte header as follows::
u32 code0; /* Executable code */
u32 code1; /* Executable code */
@@ -103,19 +105,26 @@ Header notes:
- The flags field (introduced in v3.17) is a little-endian 64-bit field
composed as follows:
- Bit 0: Kernel endianness. 1 if BE, 0 if LE.
- Bit 1-2: Kernel Page size.
- 0 - Unspecified.
- 1 - 4K
- 2 - 16K
- 3 - 64K
- Bit 3: Kernel physical placement
- 0 - 2MB aligned base should be as close as possible
- to the base of DRAM, since memory below it is not
- accessible via the linear mapping
- 1 - 2MB aligned base may be anywhere in physical
- memory
- Bits 4-63: Reserved.
+
+ ============= ===============================================================
+ Bit 0 Kernel endianness. 1 if BE, 0 if LE.
+ Bit 1-2 Kernel Page size.
+
+ * 0 - Unspecified.
+ * 1 - 4K
+ * 2 - 16K
+ * 3 - 64K
+ Bit 3 Kernel physical placement
+
+ 0
+ 2MB aligned base should be as close as possible
+ to the base of DRAM, since memory below it is not
+ accessible via the linear mapping
+ 1
+ 2MB aligned base may be anywhere in physical
+ memory
+ Bits 4-63 Reserved.
+ ============= ===============================================================
- When image_size is zero, a bootloader should attempt to keep as much
memory as possible free for use by the kernel immediately after the
@@ -147,19 +156,22 @@ Before jumping into the kernel, the following conditions must be met:
corrupted by bogus network packets or disk data. This will save
you many hours of debug.
-- Primary CPU general-purpose register settings
- x0 = physical address of device tree blob (dtb) in system RAM.
- x1 = 0 (reserved for future use)
- x2 = 0 (reserved for future use)
- x3 = 0 (reserved for future use)
+- Primary CPU general-purpose register settings:
+
+ - x0 = physical address of device tree blob (dtb) in system RAM.
+ - x1 = 0 (reserved for future use)
+ - x2 = 0 (reserved for future use)
+ - x3 = 0 (reserved for future use)
- CPU mode
+
All forms of interrupts must be masked in PSTATE.DAIF (Debug, SError,
IRQ and FIQ).
The CPU must be in either EL2 (RECOMMENDED in order to have access to
the virtualisation extensions) or non-secure EL1.
- Caches, MMUs
+
The MMU must be off.
Instruction cache may be on or off.
The address range corresponding to the loaded kernel image must be
@@ -172,18 +184,21 @@ Before jumping into the kernel, the following conditions must be met:
operations (not recommended) must be configured and disabled.
- Architected timers
+
CNTFRQ must be programmed with the timer frequency and CNTVOFF must
be programmed with a consistent value on all CPUs. If entering the
kernel at EL1, CNTHCTL_EL2 must have EL1PCTEN (bit 0) set where
available.
- Coherency
+
All CPUs to be booted by the kernel must be part of the same coherency
domain on entry to the kernel. This may require IMPLEMENTATION DEFINED
initialisation to enable the receiving of maintenance operations on
each CPU.
- System registers
+
All writable architected system registers at the exception level where
the kernel image will be entered must be initialised by software at a
higher exception level to prevent execution in an UNKNOWN state.
@@ -195,28 +210,40 @@ Before jumping into the kernel, the following conditions must be met:
For systems with a GICv3 interrupt controller to be used in v3 mode:
- If EL3 is present:
- ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
- ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b1.
+
+ - ICC_SRE_EL3.Enable (bit 3) must be initialiased to 0b1.
+ - ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b1.
+
- If the kernel is entered at EL1:
- ICC.SRE_EL2.Enable (bit 3) must be initialised to 0b1
- ICC_SRE_EL2.SRE (bit 0) must be initialised to 0b1.
+
+ - ICC.SRE_EL2.Enable (bit 3) must be initialised to 0b1
+ - ICC_SRE_EL2.SRE (bit 0) must be initialised to 0b1.
+
- The DT or ACPI tables must describe a GICv3 interrupt controller.
For systems with a GICv3 interrupt controller to be used in
compatibility (v2) mode:
+
- If EL3 is present:
- ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b0.
+
+ ICC_SRE_EL3.SRE (bit 0) must be initialised to 0b0.
+
- If the kernel is entered at EL1:
- ICC_SRE_EL2.SRE (bit 0) must be initialised to 0b0.
+
+ ICC_SRE_EL2.SRE (bit 0) must be initialised to 0b0.
+
- The DT or ACPI tables must describe a GICv2 interrupt controller.
For CPUs with pointer authentication functionality:
- If EL3 is present:
- SCR_EL3.APK (bit 16) must be initialised to 0b1
- SCR_EL3.API (bit 17) must be initialised to 0b1
+
+ - SCR_EL3.APK (bit 16) must be initialised to 0b1
+ - SCR_EL3.API (bit 17) must be initialised to 0b1
+
- If the kernel is entered at EL1:
- HCR_EL2.APK (bit 40) must be initialised to 0b1
- HCR_EL2.API (bit 41) must be initialised to 0b1
+
+ - HCR_EL2.APK (bit 40) must be initialised to 0b1
+ - HCR_EL2.API (bit 41) must be initialised to 0b1
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
diff --git a/Documentation/arm64/cpu-feature-registers.txt b/Documentation/arm64/cpu-feature-registers.rst
index 684a0da39378..2955287e9acc 100644
--- a/Documentation/arm64/cpu-feature-registers.txt
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -1,5 +1,6 @@
- ARM64 CPU Feature Registers
- ===========================
+===========================
+ARM64 CPU Feature Registers
+===========================
Author: Suzuki K Poulose <suzuki.poulose@arm.com>
@@ -9,7 +10,7 @@ registers to userspace. The availability of this ABI is advertised
via the HWCAP_CPUID in HWCAPs.
1. Motivation
----------------
+-------------
The ARM architecture defines a set of feature registers, which describe
the capabilities of the CPU/system. Access to these system registers is
@@ -33,9 +34,10 @@ there are some issues with their usage.
2. Requirements
------------------
+---------------
+
+ a) Safety:
- a) Safety :
Applications should be able to use the information provided by the
infrastructure to run safely across the system. This has greater
implications on a system with heterogeneous CPUs.
@@ -47,7 +49,8 @@ there are some issues with their usage.
Otherwise an application could crash when scheduled on the CPU
which doesn't support CRC32.
- b) Security :
+ b) Security:
+
Applications should only be able to receive information that is
relevant to the normal operation in userspace. Hence, some of the
fields are masked out(i.e, made invisible) and their values are set to
@@ -58,10 +61,12 @@ there are some issues with their usage.
(even when the CPU provides it).
c) Implementation Defined Features
+
The infrastructure doesn't expose any register which is
IMPLEMENTATION DEFINED as per ARMv8-A Architecture.
- d) CPU Identification :
+ d) CPU Identification:
+
MIDR_EL1 is exposed to help identify the processor. On a
heterogeneous system, this could be racy (just like getcpu()). The
process could be migrated to another CPU by the time it uses the
@@ -70,7 +75,7 @@ there are some issues with their usage.
currently executing on. The REVIDR is not exposed due to this
constraint, as REVIDR makes sense only in conjunction with the
MIDR. Alternately, MIDR_EL1 and REVIDR_EL1 are exposed via sysfs
- at:
+ at::
/sys/devices/system/cpu/cpu$ID/regs/identification/
\- midr
@@ -85,7 +90,8 @@ exception and ends up in SIGILL being delivered to the process.
The infrastructure hooks into the exception handler and emulates the
operation if the source belongs to the supported system register space.
-The infrastructure emulates only the following system register space:
+The infrastructure emulates only the following system register space::
+
Op0=3, Op1=0, CRn=0, CRm=0,4,5,6,7
(See Table C5-6 'System instruction encodings for non-Debug System
@@ -107,73 +113,76 @@ infrastructure:
-------------------------------------------
1) ID_AA64ISAR0_EL1 - Instruction Set Attribute Register 0
- x--------------------------------------------------x
+
+ +------------------------------+---------+---------+
| Name | bits | visible |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| TS | [55-52] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| FHM | [51-48] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| DP | [47-44] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SM4 | [43-40] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SM3 | [39-36] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SHA3 | [35-32] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| RDM | [31-28] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| ATOMICS | [23-20] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| CRC32 | [19-16] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SHA2 | [15-12] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SHA1 | [11-8] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| AES | [7-4] | y |
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
2) ID_AA64PFR0_EL1 - Processor Feature Register 0
- x--------------------------------------------------x
+
+ +------------------------------+---------+---------+
| Name | bits | visible |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| DIT | [51-48] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SVE | [35-32] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| GIC | [27-24] | n |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| AdvSIMD | [23-20] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| FP | [19-16] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| EL3 | [15-12] | n |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| EL2 | [11-8] | n |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| EL1 | [7-4] | n |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| EL0 | [3-0] | n |
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
3) MIDR_EL1 - Main ID Register
- x--------------------------------------------------x
+
+ +------------------------------+---------+---------+
| Name | bits | visible |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| Implementer | [31-24] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| Variant | [23-20] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| Architecture | [19-16] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| PartNum | [15-4] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| Revision | [3-0] | y |
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
NOTE: The 'visible' fields of MIDR_EL1 will contain the value
as available on the CPU where it is fetched and is not a system
@@ -181,90 +190,92 @@ infrastructure:
4) ID_AA64ISAR1_EL1 - Instruction set attribute register 1
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
| Name | bits | visible |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| GPI | [31-28] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| GPA | [27-24] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| LRCPC | [23-20] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| FCMA | [19-16] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| JSCVT | [15-12] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| API | [11-8] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| APA | [7-4] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| DPB | [3-0] | y |
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
5) ID_AA64MMFR2_EL1 - Memory model feature register 2
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
| Name | bits | visible |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| AT | [35-32] | y |
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
6) ID_AA64ZFR0_EL1 - SVE feature ID register 0
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
| Name | bits | visible |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SM4 | [43-40] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SHA3 | [35-32] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| BitPerm | [19-16] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| AES | [7-4] | y |
- |--------------------------------------------------|
+ +------------------------------+---------+---------+
| SVEVer | [3-0] | y |
- x--------------------------------------------------x
+ +------------------------------+---------+---------+
Appendix I: Example
----------------------------
-
-/*
- * Sample program to demonstrate the MRS emulation ABI.
- *
- * Copyright (C) 2015-2016, ARM Ltd
- *
- * Author: Suzuki K Poulose <suzuki.poulose@arm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- */
-
-#include <asm/hwcap.h>
-#include <stdio.h>
-#include <sys/auxv.h>
-
-#define get_cpu_ftr(id) ({ \
+-------------------
+
+::
+
+ /*
+ * Sample program to demonstrate the MRS emulation ABI.
+ *
+ * Copyright (C) 2015-2016, ARM Ltd
+ *
+ * Author: Suzuki K Poulose <suzuki.poulose@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+ #include <asm/hwcap.h>
+ #include <stdio.h>
+ #include <sys/auxv.h>
+
+ #define get_cpu_ftr(id) ({ \
unsigned long __val; \
asm("mrs %0, "#id : "=r" (__val)); \
printf("%-20s: 0x%016lx\n", #id, __val); \
})
-int main(void)
-{
+ int main(void)
+ {
if (!(getauxval(AT_HWCAP) & HWCAP_CPUID)) {
fputs("CPUID registers unavailable\n", stderr);
@@ -284,13 +295,10 @@ int main(void)
get_cpu_ftr(MPIDR_EL1);
get_cpu_ftr(REVIDR_EL1);
-#if 0
+ #if 0
/* Unexposed register access causes SIGILL */
get_cpu_ftr(ID_MMFR0_EL1);
-#endif
+ #endif
return 0;
-}
-
-
-
+ }
diff --git a/Documentation/arm64/elf_hwcaps.txt b/Documentation/arm64/elf_hwcaps.rst
index 5ae2ef2c12f3..91f79529c58c 100644
--- a/Documentation/arm64/elf_hwcaps.txt
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -1,3 +1,4 @@
+================
ARM64 ELF hwcaps
================
@@ -15,16 +16,16 @@ of flags called hwcaps, exposed in the auxilliary vector.
Userspace software can test for features by acquiring the AT_HWCAP or
AT_HWCAP2 entry of the auxiliary vector, and testing whether the relevant
-flags are set, e.g.
+flags are set, e.g.::
-bool floating_point_is_present(void)
-{
- unsigned long hwcaps = getauxval(AT_HWCAP);
- if (hwcaps & HWCAP_FP)
- return true;
+ bool floating_point_is_present(void)
+ {
+ unsigned long hwcaps = getauxval(AT_HWCAP);
+ if (hwcaps & HWCAP_FP)
+ return true;
- return false;
-}
+ return false;
+ }
Where software relies on a feature described by a hwcap, it should check
the relevant hwcap flag to verify that the feature is present before
@@ -45,7 +46,7 @@ userspace code at EL0. These hwcaps are defined in terms of ID register
fields, and should be interpreted with reference to the definition of
these fields in the ARM Architecture Reference Manual (ARM ARM).
-Such hwcaps are described below in the form:
+Such hwcaps are described below in the form::
Functionality implied by idreg.field == val.
@@ -64,75 +65,58 @@ reference to ID registers, and may refer to other documentation.
---------------------------------
HWCAP_FP
-
Functionality implied by ID_AA64PFR0_EL1.FP == 0b0000.
HWCAP_ASIMD
-
Functionality implied by ID_AA64PFR0_EL1.AdvSIMD == 0b0000.
HWCAP_EVTSTRM
-
The generic timer is configured to generate events at a frequency of
approximately 100KHz.
HWCAP_AES
-
Functionality implied by ID_AA64ISAR0_EL1.AES == 0b0001.
HWCAP_PMULL
-
Functionality implied by ID_AA64ISAR0_EL1.AES == 0b0010.
HWCAP_SHA1
-
Functionality implied by ID_AA64ISAR0_EL1.SHA1 == 0b0001.
HWCAP_SHA2
-
Functionality implied by ID_AA64ISAR0_EL1.SHA2 == 0b0001.
HWCAP_CRC32
-
Functionality implied by ID_AA64ISAR0_EL1.CRC32 == 0b0001.
HWCAP_ATOMICS
-
Functionality implied by ID_AA64ISAR0_EL1.Atomic == 0b0010.
HWCAP_FPHP
-
Functionality implied by ID_AA64PFR0_EL1.FP == 0b0001.
HWCAP_ASIMDHP
-
Functionality implied by ID_AA64PFR0_EL1.AdvSIMD == 0b0001.
HWCAP_CPUID
-
EL0 access to certain ID registers is available, to the extent
- described by Documentation/arm64/cpu-feature-registers.txt.
+ described by Documentation/arm64/cpu-feature-registers.rst.
These ID registers may imply the availability of features.
HWCAP_ASIMDRDM
-
Functionality implied by ID_AA64ISAR0_EL1.RDM == 0b0001.
HWCAP_JSCVT
-
Functionality implied by ID_AA64ISAR1_EL1.JSCVT == 0b0001.
HWCAP_FCMA
-
Functionality implied by ID_AA64ISAR1_EL1.FCMA == 0b0001.
HWCAP_LRCPC
-
Functionality implied by ID_AA64ISAR1_EL1.LRCPC == 0b0001.
HWCAP_DCPOP
-
Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0001.
HWCAP2_DCPODP
@@ -140,27 +124,21 @@ HWCAP2_DCPODP
Functionality implied by ID_AA64ISAR1_EL1.DPB == 0b0010.
HWCAP_SHA3
-
Functionality implied by ID_AA64ISAR0_EL1.SHA3 == 0b0001.
HWCAP_SM3
-
Functionality implied by ID_AA64ISAR0_EL1.SM3 == 0b0001.
HWCAP_SM4
-
Functionality implied by ID_AA64ISAR0_EL1.SM4 == 0b0001.
HWCAP_ASIMDDP
-
Functionality implied by ID_AA64ISAR0_EL1.DP == 0b0001.
HWCAP_SHA512
-
Functionality implied by ID_AA64ISAR0_EL1.SHA2 == 0b0010.
HWCAP_SVE
-
Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001.
HWCAP2_SVE2
@@ -188,23 +166,18 @@ HWCAP2_SVESM4
Functionality implied by ID_AA64ZFR0_EL1.SM4 == 0b0001.
HWCAP_ASIMDFHM
-
Functionality implied by ID_AA64ISAR0_EL1.FHM == 0b0001.
HWCAP_DIT
-
Functionality implied by ID_AA64PFR0_EL1.DIT == 0b0001.
HWCAP_USCAT
-
Functionality implied by ID_AA64MMFR2_EL1.AT == 0b0001.
HWCAP_ILRCPC
-
Functionality implied by ID_AA64ISAR1_EL1.LRCPC == 0b0010.
HWCAP_FLAGM
-
Functionality implied by ID_AA64ISAR0_EL1.TS == 0b0001.
HWCAP2_FLAGM2
@@ -212,20 +185,17 @@ HWCAP2_FLAGM2
Functionality implied by ID_AA64ISAR0_EL1.TS == 0b0010.
HWCAP_SSBS
-
Functionality implied by ID_AA64PFR1_EL1.SSBS == 0b0010.
HWCAP_PACA
-
Functionality implied by ID_AA64ISAR1_EL1.APA == 0b0001 or
ID_AA64ISAR1_EL1.API == 0b0001, as described by
- Documentation/arm64/pointer-authentication.txt.
+ Documentation/arm64/pointer-authentication.rst.
HWCAP_PACG
-
Functionality implied by ID_AA64ISAR1_EL1.GPA == 0b0001 or
ID_AA64ISAR1_EL1.GPI == 0b0001, as described by
- Documentation/arm64/pointer-authentication.txt.
+ Documentation/arm64/pointer-authentication.rst.
HWCAP2_FRINT
diff --git a/Documentation/arm64/hugetlbpage.txt b/Documentation/arm64/hugetlbpage.rst
index cfae87dc653b..b44f939e5210 100644
--- a/Documentation/arm64/hugetlbpage.txt
+++ b/Documentation/arm64/hugetlbpage.rst
@@ -1,3 +1,4 @@
+====================
HugeTLBpage on ARM64
====================
@@ -31,8 +32,10 @@ and level of the page table.
The following hugepage sizes are supported -
- CONT PTE PMD CONT PMD PUD
- -------- --- -------- ---
+ ====== ======== ==== ======== ===
+ - CONT PTE PMD CONT PMD PUD
+ ====== ======== ==== ======== ===
4K: 64K 2M 32M 1G
16K: 2M 32M 1G
64K: 2M 512M 16G
+ ====== ======== ==== ======== ===
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
new file mode 100644
index 000000000000..018b7836ecb7
--- /dev/null
+++ b/Documentation/arm64/index.rst
@@ -0,0 +1,28 @@
+:orphan:
+
+==================
+ARM64 Architecture
+==================
+
+.. toctree::
+ :maxdepth: 1
+
+ acpi_object_usage
+ arm-acpi
+ booting
+ cpu-feature-registers
+ elf_hwcaps
+ hugetlbpage
+ legacy_instructions
+ memory
+ pointer-authentication
+ silicon-errata
+ sve
+ tagged-pointers
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/arm64/legacy_instructions.txt b/Documentation/arm64/legacy_instructions.rst
index 01bf3d9fac85..54401b22cb8f 100644
--- a/Documentation/arm64/legacy_instructions.txt
+++ b/Documentation/arm64/legacy_instructions.rst
@@ -1,3 +1,7 @@
+===================
+Legacy instructions
+===================
+
The arm64 port of the Linux kernel provides infrastructure to support
emulation of instructions which have been deprecated, or obsoleted in
the architecture. The infrastructure code uses undefined instruction
@@ -9,19 +13,22 @@ The emulation mode can be controlled by writing to sysctl nodes
behaviours and the corresponding values of the sysctl nodes -
* Undef
- Value: 0
+ Value: 0
+
Generates undefined instruction abort. Default for instructions that
have been obsoleted in the architecture, e.g., SWP
* Emulate
- Value: 1
+ Value: 1
+
Uses software emulation. To aid migration of software, in this mode
usage of emulated instruction is traced as well as rate limited
warnings are issued. This is the default for deprecated
instructions, .e.g., CP15 barriers
* Hardware Execution
- Value: 2
+ Value: 2
+
Although marked as deprecated, some implementations may support the
enabling/disabling of hardware support for the execution of these
instructions. Using hardware execution generally provides better
@@ -38,20 +45,24 @@ individual instruction notes for further information.
Supported legacy instructions
-----------------------------
* SWP{B}
-Node: /proc/sys/abi/swp
-Status: Obsolete
-Default: Undef (0)
+
+:Node: /proc/sys/abi/swp
+:Status: Obsolete
+:Default: Undef (0)
* CP15 Barriers
-Node: /proc/sys/abi/cp15_barrier
-Status: Deprecated
-Default: Emulate (1)
+
+:Node: /proc/sys/abi/cp15_barrier
+:Status: Deprecated
+:Default: Emulate (1)
* SETEND
-Node: /proc/sys/abi/setend
-Status: Deprecated
-Default: Emulate (1)*
-Note: All the cpus on the system must have mixed endian support at EL0
-for this feature to be enabled. If a new CPU - which doesn't support mixed
-endian - is hotplugged in after this feature has been enabled, there could
-be unexpected results in the application.
+
+:Node: /proc/sys/abi/setend
+:Status: Deprecated
+:Default: Emulate (1)*
+
+ Note: All the cpus on the system must have mixed endian support at EL0
+ for this feature to be enabled. If a new CPU - which doesn't support mixed
+ endian - is hotplugged in after this feature has been enabled, there could
+ be unexpected results in the application.
diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst
new file mode 100644
index 000000000000..464b880fc4b7
--- /dev/null
+++ b/Documentation/arm64/memory.rst
@@ -0,0 +1,98 @@
+==============================
+Memory Layout on AArch64 Linux
+==============================
+
+Author: Catalin Marinas <catalin.marinas@arm.com>
+
+This document describes the virtual memory layout used by the AArch64
+Linux kernel. The architecture allows up to 4 levels of translation
+tables with a 4KB page size and up to 3 levels with a 64KB page size.
+
+AArch64 Linux uses either 3 levels or 4 levels of translation tables
+with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
+(256TB) virtual addresses, respectively, for both user and kernel. With
+64KB pages, only 2 levels of translation tables, allowing 42-bit (4TB)
+virtual address, are used but the memory layout is the same.
+
+User addresses have bits 63:48 set to 0 while the kernel addresses have
+the same bits set to 1. TTBRx selection is given by bit 63 of the
+virtual address. The swapper_pg_dir contains only kernel (global)
+mappings while the user pgd contains only user (non-global) mappings.
+The swapper_pg_dir address is written to TTBR1 and never written to
+TTBR0.
+
+
+AArch64 Linux memory layout with 4KB pages + 3 levels::
+
+ Start End Size Use
+ -----------------------------------------------------------------------
+ 0000000000000000 0000007fffffffff 512GB user
+ ffffff8000000000 ffffffffffffffff 512GB kernel
+
+
+AArch64 Linux memory layout with 4KB pages + 4 levels::
+
+ Start End Size Use
+ -----------------------------------------------------------------------
+ 0000000000000000 0000ffffffffffff 256TB user
+ ffff000000000000 ffffffffffffffff 256TB kernel
+
+
+AArch64 Linux memory layout with 64KB pages + 2 levels::
+
+ Start End Size Use
+ -----------------------------------------------------------------------
+ 0000000000000000 000003ffffffffff 4TB user
+ fffffc0000000000 ffffffffffffffff 4TB kernel
+
+
+AArch64 Linux memory layout with 64KB pages + 3 levels::
+
+ Start End Size Use
+ -----------------------------------------------------------------------
+ 0000000000000000 0000ffffffffffff 256TB user
+ ffff000000000000 ffffffffffffffff 256TB kernel
+
+
+For details of the virtual kernel memory layout please see the kernel
+booting log.
+
+
+Translation table lookup with 4KB pages::
+
+ +--------+--------+--------+--------+--------+--------+--------+--------+
+ |63 56|55 48|47 40|39 32|31 24|23 16|15 8|7 0|
+ +--------+--------+--------+--------+--------+--------+--------+--------+
+ | | | | | |
+ | | | | | v
+ | | | | | [11:0] in-page offset
+ | | | | +-> [20:12] L3 index
+ | | | +-----------> [29:21] L2 index
+ | | +---------------------> [38:30] L1 index
+ | +-------------------------------> [47:39] L0 index
+ +-------------------------------------------------> [63] TTBR0/1
+
+
+Translation table lookup with 64KB pages::
+
+ +--------+--------+--------+--------+--------+--------+--------+--------+
+ |63 56|55 48|47 40|39 32|31 24|23 16|15 8|7 0|
+ +--------+--------+--------+--------+--------+--------+--------+--------+
+ | | | | |
+ | | | | v
+ | | | | [15:0] in-page offset
+ | | | +----------> [28:16] L3 index
+ | | +--------------------------> [41:29] L2 index
+ | +-------------------------------> [47:42] L1 index
+ +-------------------------------------------------> [63] TTBR0/1
+
+
+When using KVM without the Virtualization Host Extensions, the
+hypervisor maps kernel pages in EL2 at a fixed (and potentially
+random) offset from the linear mapping. See the kern_hyp_va macro and
+kvm_update_va_mask function for more details. MMIO devices such as
+GICv2 gets mapped next to the HYP idmap page, as do vectors when
+ARM64_HARDEN_EL2_VECTORS is selected for particular CPUs.
+
+When using KVM with the Virtualization Host Extensions, no additional
+mappings are created, since the host kernel runs directly in EL2.
diff --git a/Documentation/arm64/memory.txt b/Documentation/arm64/memory.txt
deleted file mode 100644
index c5dab30d3389..000000000000
--- a/Documentation/arm64/memory.txt
+++ /dev/null
@@ -1,97 +0,0 @@
- Memory Layout on AArch64 Linux
- ==============================
-
-Author: Catalin Marinas <catalin.marinas@arm.com>
-
-This document describes the virtual memory layout used by the AArch64
-Linux kernel. The architecture allows up to 4 levels of translation
-tables with a 4KB page size and up to 3 levels with a 64KB page size.
-
-AArch64 Linux uses either 3 levels or 4 levels of translation tables
-with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
-(256TB) virtual addresses, respectively, for both user and kernel. With
-64KB pages, only 2 levels of translation tables, allowing 42-bit (4TB)
-virtual address, are used but the memory layout is the same.
-
-User addresses have bits 63:48 set to 0 while the kernel addresses have
-the same bits set to 1. TTBRx selection is given by bit 63 of the
-virtual address. The swapper_pg_dir contains only kernel (global)
-mappings while the user pgd contains only user (non-global) mappings.
-The swapper_pg_dir address is written to TTBR1 and never written to
-TTBR0.
-
-
-AArch64 Linux memory layout with 4KB pages + 3 levels:
-
-Start End Size Use
------------------------------------------------------------------------
-0000000000000000 0000007fffffffff 512GB user
-ffffff8000000000 ffffffffffffffff 512GB kernel
-
-
-AArch64 Linux memory layout with 4KB pages + 4 levels:
-
-Start End Size Use
------------------------------------------------------------------------
-0000000000000000 0000ffffffffffff 256TB user
-ffff000000000000 ffffffffffffffff 256TB kernel
-
-
-AArch64 Linux memory layout with 64KB pages + 2 levels:
-
-Start End Size Use
------------------------------------------------------------------------
-0000000000000000 000003ffffffffff 4TB user
-fffffc0000000000 ffffffffffffffff 4TB kernel
-
-
-AArch64 Linux memory layout with 64KB pages + 3 levels:
-
-Start End Size Use
------------------------------------------------------------------------
-0000000000000000 0000ffffffffffff 256TB user
-ffff000000000000 ffffffffffffffff 256TB kernel
-
-
-For details of the virtual kernel memory layout please see the kernel
-booting log.
-
-
-Translation table lookup with 4KB pages:
-
-+--------+--------+--------+--------+--------+--------+--------+--------+
-|63 56|55 48|47 40|39 32|31 24|23 16|15 8|7 0|
-+--------+--------+--------+--------+--------+--------+--------+--------+
- | | | | | |
- | | | | | v
- | | | | | [11:0] in-page offset
- | | | | +-> [20:12] L3 index
- | | | +-----------> [29:21] L2 index
- | | +---------------------> [38:30] L1 index
- | +-------------------------------> [47:39] L0 index
- +-------------------------------------------------> [63] TTBR0/1
-
-
-Translation table lookup with 64KB pages:
-
-+--------+--------+--------+--------+--------+--------+--------+--------+
-|63 56|55 48|47 40|39 32|31 24|23 16|15 8|7 0|
-+--------+--------+--------+--------+--------+--------+--------+--------+
- | | | | |
- | | | | v
- | | | | [15:0] in-page offset
- | | | +----------> [28:16] L3 index
- | | +--------------------------> [41:29] L2 index
- | +-------------------------------> [47:42] L1 index
- +-------------------------------------------------> [63] TTBR0/1
-
-
-When using KVM without the Virtualization Host Extensions, the
-hypervisor maps kernel pages in EL2 at a fixed (and potentially
-random) offset from the linear mapping. See the kern_hyp_va macro and
-kvm_update_va_mask function for more details. MMIO devices such as
-GICv2 gets mapped next to the HYP idmap page, as do vectors when
-ARM64_HARDEN_EL2_VECTORS is selected for particular CPUs.
-
-When using KVM with the Virtualization Host Extensions, no additional
-mappings are created, since the host kernel runs directly in EL2.
diff --git a/Documentation/arm64/pointer-authentication.txt b/Documentation/arm64/pointer-authentication.rst
index fc71b33de87e..30b2ab06526b 100644
--- a/Documentation/arm64/pointer-authentication.txt
+++ b/Documentation/arm64/pointer-authentication.rst
@@ -1,7 +1,9 @@
+=======================================
Pointer authentication in AArch64 Linux
=======================================
Author: Mark Rutland <mark.rutland@arm.com>
+
Date: 2017-07-19
This document briefly describes the provision of pointer authentication
diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.rst
index 2735462d5958..c792774be59e 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.rst
@@ -1,7 +1,9 @@
- Silicon Errata and Software Workarounds
- =======================================
+=======================================
+Silicon Errata and Software Workarounds
+=======================================
Author: Will Deacon <will.deacon@arm.com>
+
Date : 27 November 2015
It is an unfortunate fact of life that hardware is often produced with
@@ -9,11 +11,13 @@ so-called "errata", which can cause it to deviate from the architecture
under specific circumstances. For hardware produced by ARM, these
errata are broadly classified into the following categories:
- Category A: A critical error without a viable workaround.
- Category B: A significant or critical error with an acceptable
+ ========== ========================================================
+ Category A A critical error without a viable workaround.
+ Category B A significant or critical error with an acceptable
workaround.
- Category C: A minor error that is not expected to occur under normal
+ Category C A minor error that is not expected to occur under normal
operation.
+ ========== ========================================================
For more information, consult one of the "Software Developers Errata
Notice" documents available on infocenter.arm.com (registration
@@ -42,47 +46,86 @@ file acts as a registry of software workarounds in the Linux Kernel and
will be updated when new workarounds are committed and backported to
stable kernels.
-| Implementor | Component | Erratum ID | Kconfig |
+----------------+-----------------+-----------------+-----------------------------+
+| Implementor | Component | Erratum ID | Kconfig |
++================+=================+=================+=============================+
| Allwinner | A64/R18 | UNKNOWN1 | SUN50I_ERRATUM_UNKNOWN1 |
-| | | | |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #826319 | ARM64_ERRATUM_826319 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #827319 | ARM64_ERRATUM_827319 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #824069 | ARM64_ERRATUM_824069 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #819472 | ARM64_ERRATUM_819472 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #845719 | ARM64_ERRATUM_845719 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #843419 | ARM64_ERRATUM_843419 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #832075 | ARM64_ERRATUM_832075 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #852523 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #834220 | ARM64_ERRATUM_834220 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A72 | #853709 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A55 | #1024718 | ARM64_ERRATUM_1024718 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1188873,1418040| ARM64_ERRATUM_1418040 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1165522 | ARM64_ERRATUM_1165522 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1286807 | ARM64_ERRATUM_1286807 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1463225 | ARM64_ERRATUM_1463225 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | MMU-500 | #841119,826419 | N/A |
-| | | | |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX ITS | #22375,24313 | CAVIUM_ERRATUM_22375 |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX ITS | #23144 | CAVIUM_ERRATUM_23144 |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX GICv3 | #23154 | CAVIUM_ERRATUM_23154 |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX Core | #27456 | CAVIUM_ERRATUM_27456 |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX Core | #30115 | CAVIUM_ERRATUM_30115 |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX SMMUv2 | #27704 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX2 SMMUv3| #74 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
| Cavium | ThunderX2 SMMUv3| #126 | N/A |
-| | | | |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
-| | | | |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip0{5,6,7} | #161010101 | HISILICON_ERRATUM_161010101 |
++----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip0{6,7} | #161010701 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip07 | #161600802 | HISILICON_ERRATUM_161600802 |
++----------------+-----------------+-----------------+-----------------------------+
| Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A |
-| | | | |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 |
++----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Falkor v1 | E1009 | QCOM_FALKOR_ERRATUM_1009 |
++----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | QDF2400 ITS | E0065 | QCOM_QDF2400_ERRATUM_0065 |
++----------------+-----------------+-----------------+-----------------------------+
| Qualcomm Tech. | Falkor v{1,2} | E1041 | QCOM_FALKOR_ERRATUM_1041 |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Fujitsu | A64FX | E#010001 | FUJITSU_ERRATUM_010001 |
++----------------+-----------------+-----------------+-----------------------------+
diff --git a/Documentation/arm64/sve.txt b/Documentation/arm64/sve.rst
index 5689fc9a976a..5689c74c8082 100644
--- a/Documentation/arm64/sve.txt
+++ b/Documentation/arm64/sve.rst
@@ -1,7 +1,9 @@
- Scalable Vector Extension support for AArch64 Linux
- ===================================================
+===================================================
+Scalable Vector Extension support for AArch64 Linux
+===================================================
Author: Dave Martin <Dave.Martin@arm.com>
+
Date: 4 August 2017
This document outlines briefly the interface provided to userspace by Linux in
@@ -442,7 +444,7 @@ In A64 state, SVE adds the following:
* FPSR and FPCR are retained from ARMv8-A, and interact with SVE floating-point
operations in a similar way to the way in which they interact with ARMv8
- floating-point operations.
+ floating-point operations::
8VL-1 128 0 bit index
+---- //// -----------------+
@@ -499,6 +501,8 @@ ARMv8-A defines the following floating-point / SIMD register state:
* 32 128-bit vector registers V0..V31
* 2 32-bit status/control registers FPSR, FPCR
+::
+
127 0 bit index
+---------------+
V0 | |
@@ -533,7 +537,7 @@ References
[2] arch/arm64/include/uapi/asm/ptrace.h
AArch64 Linux ptrace ABI definitions
-[3] Documentation/arm64/cpu-feature-registers.txt
+[3] Documentation/arm64/cpu-feature-registers.rst
[4] ARM IHI0055C
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055c/IHI0055C_beta_aapcs64.pdf
diff --git a/Documentation/arm64/tagged-pointers.txt b/Documentation/arm64/tagged-pointers.rst
index a25a99e82bb1..2acdec3ebbeb 100644
--- a/Documentation/arm64/tagged-pointers.txt
+++ b/Documentation/arm64/tagged-pointers.rst
@@ -1,7 +1,9 @@
- Tagged virtual addresses in AArch64 Linux
- =========================================
+=========================================
+Tagged virtual addresses in AArch64 Linux
+=========================================
Author: Will Deacon <will.deacon@arm.com>
+
Date : 12 June 2013
This document briefly describes the provision of tagged virtual
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst
index 35d83e24dbdb..4d565d202ce3 100644
--- a/Documentation/bpf/btf.rst
+++ b/Documentation/bpf/btf.rst
@@ -151,6 +151,7 @@ for the type. The maximum value of ``BTF_INT_BITS()`` is 128.
The ``BTF_INT_OFFSET()`` specifies the starting bit offset to calculate values
for this int. For example, a bitfield struct member has:
+
* btf member bit offset 100 from the start of the structure,
* btf member pointing to an int type,
* the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
@@ -160,6 +161,7 @@ from bits ``100 + 2 = 102``.
Alternatively, the bitfield struct member can be the following to access the
same bits as the above:
+
* btf member bit offset 102,
* btf member pointing to an int type,
* the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
diff --git a/Documentation/cdrom/Makefile b/Documentation/cdrom/Makefile
deleted file mode 100644
index a19e321928e1..000000000000
--- a/Documentation/cdrom/Makefile
+++ /dev/null
@@ -1,21 +0,0 @@
-LATEXFILE = cdrom-standard
-
-all:
- make clean
- latex $(LATEXFILE)
- latex $(LATEXFILE)
- @if [ -x `which gv` ]; then \
- `dvips -q -t letter -o $(LATEXFILE).ps $(LATEXFILE).dvi` ;\
- `gv -antialias -media letter -nocenter $(LATEXFILE).ps` ;\
- else \
- `xdvi $(LATEXFILE).dvi &` ;\
- fi
- make sortofclean
-
-clean:
- rm -f $(LATEXFILE).ps $(LATEXFILE).dvi $(LATEXFILE).aux $(LATEXFILE).log
-
-sortofclean:
- rm -f $(LATEXFILE).aux $(LATEXFILE).log
-
-
diff --git a/Documentation/cdrom/cdrom-standard.rst b/Documentation/cdrom/cdrom-standard.rst
new file mode 100644
index 000000000000..dde4f7f7fdbf
--- /dev/null
+++ b/Documentation/cdrom/cdrom-standard.rst
@@ -0,0 +1,1063 @@
+=======================
+A Linux CD-ROM standard
+=======================
+
+:Author: David van Leeuwen <david@ElseWare.cistron.nl>
+:Date: 12 March 1999
+:Updated by: Erik Andersen (andersee@debian.org)
+:Updated by: Jens Axboe (axboe@image.dk)
+
+
+Introduction
+============
+
+Linux is probably the Unix-like operating system that supports
+the widest variety of hardware devices. The reasons for this are
+presumably
+
+- The large list of hardware devices available for the many platforms
+ that Linux now supports (i.e., i386-PCs, Sparc Suns, etc.)
+- The open design of the operating system, such that anybody can write a
+ driver for Linux.
+- There is plenty of source code around as examples of how to write a driver.
+
+The openness of Linux, and the many different types of available
+hardware has allowed Linux to support many different hardware devices.
+Unfortunately, the very openness that has allowed Linux to support
+all these different devices has also allowed the behavior of each
+device driver to differ significantly from one device to another.
+This divergence of behavior has been very significant for CD-ROM
+devices; the way a particular drive reacts to a `standard` *ioctl()*
+call varies greatly from one device driver to another. To avoid making
+their drivers totally inconsistent, the writers of Linux CD-ROM
+drivers generally created new device drivers by understanding, copying,
+and then changing an existing one. Unfortunately, this practice did not
+maintain uniform behavior across all the Linux CD-ROM drivers.
+
+This document describes an effort to establish Uniform behavior across
+all the different CD-ROM device drivers for Linux. This document also
+defines the various *ioctl()'s*, and how the low-level CD-ROM device
+drivers should implement them. Currently (as of the Linux 2.1.\ *x*
+development kernels) several low-level CD-ROM device drivers, including
+both IDE/ATAPI and SCSI, now use this Uniform interface.
+
+When the CD-ROM was developed, the interface between the CD-ROM drive
+and the computer was not specified in the standards. As a result, many
+different CD-ROM interfaces were developed. Some of them had their
+own proprietary design (Sony, Mitsumi, Panasonic, Philips), other
+manufacturers adopted an existing electrical interface and changed
+the functionality (CreativeLabs/SoundBlaster, Teac, Funai) or simply
+adapted their drives to one or more of the already existing electrical
+interfaces (Aztech, Sanyo, Funai, Vertos, Longshine, Optics Storage and
+most of the `NoName` manufacturers). In cases where a new drive really
+brought its own interface or used its own command set and flow control
+scheme, either a separate driver had to be written, or an existing
+driver had to be enhanced. History has delivered us CD-ROM support for
+many of these different interfaces. Nowadays, almost all new CD-ROM
+drives are either IDE/ATAPI or SCSI, and it is very unlikely that any
+manufacturer will create a new interface. Even finding drives for the
+old proprietary interfaces is getting difficult.
+
+When (in the 1.3.70's) I looked at the existing software interface,
+which was expressed through `cdrom.h`, it appeared to be a rather wild
+set of commands and data formats [#f1]_. It seemed that many
+features of the software interface had been added to accommodate the
+capabilities of a particular drive, in an *ad hoc* manner. More
+importantly, it appeared that the behavior of the `standard` commands
+was different for most of the different drivers: e. g., some drivers
+close the tray if an *open()* call occurs when the tray is open, while
+others do not. Some drivers lock the door upon opening the device, to
+prevent an incoherent file system, but others don't, to allow software
+ejection. Undoubtedly, the capabilities of the different drives vary,
+but even when two drives have the same capability their drivers'
+behavior was usually different.
+
+.. [#f1]
+ I cannot recollect what kernel version I looked at, then,
+ presumably 1.2.13 and 1.3.34 --- the latest kernel that I was
+ indirectly involved in.
+
+I decided to start a discussion on how to make all the Linux CD-ROM
+drivers behave more uniformly. I began by contacting the developers of
+the many CD-ROM drivers found in the Linux kernel. Their reactions
+encouraged me to write the Uniform CD-ROM Driver which this document is
+intended to describe. The implementation of the Uniform CD-ROM Driver is
+in the file `cdrom.c`. This driver is intended to be an additional software
+layer that sits on top of the low-level device drivers for each CD-ROM drive.
+By adding this additional layer, it is possible to have all the different
+CD-ROM devices behave **exactly** the same (insofar as the underlying
+hardware will allow).
+
+The goal of the Uniform CD-ROM Driver is **not** to alienate driver developers
+whohave not yet taken steps to support this effort. The goal of Uniform CD-ROM
+Driver is simply to give people writing application programs for CD-ROM drives
+**one** Linux CD-ROM interface with consistent behavior for all
+CD-ROM devices. In addition, this also provides a consistent interface
+between the low-level device driver code and the Linux kernel. Care
+is taken that 100% compatibility exists with the data structures and
+programmer's interface defined in `cdrom.h`. This guide was written to
+help CD-ROM driver developers adapt their code to use the Uniform CD-ROM
+Driver code defined in `cdrom.c`.
+
+Personally, I think that the most important hardware interfaces are
+the IDE/ATAPI drives and, of course, the SCSI drives, but as prices
+of hardware drop continuously, it is also likely that people may have
+more than one CD-ROM drive, possibly of mixed types. It is important
+that these drives behave in the same way. In December 1994, one of the
+cheapest CD-ROM drives was a Philips cm206, a double-speed proprietary
+drive. In the months that I was busy writing a Linux driver for it,
+proprietary drives became obsolete and IDE/ATAPI drives became the
+standard. At the time of the last update to this document (November
+1997) it is becoming difficult to even **find** anything less than a
+16 speed CD-ROM drive, and 24 speed drives are common.
+
+.. _cdrom_api:
+
+Standardizing through another software level
+============================================
+
+At the time this document was conceived, all drivers directly
+implemented the CD-ROM *ioctl()* calls through their own routines. This
+led to the danger of different drivers forgetting to do important things
+like checking that the user was giving the driver valid data. More
+importantly, this led to the divergence of behavior, which has already
+been discussed.
+
+For this reason, the Uniform CD-ROM Driver was created to enforce consistent
+CD-ROM drive behavior, and to provide a common set of services to the various
+low-level CD-ROM device drivers. The Uniform CD-ROM Driver now provides another
+software-level, that separates the *ioctl()* and *open()* implementation
+from the actual hardware implementation. Note that this effort has
+made few changes which will affect a user's application programs. The
+greatest change involved moving the contents of the various low-level
+CD-ROM drivers\' header files to the kernel's cdrom directory. This was
+done to help ensure that the user is only presented with only one cdrom
+interface, the interface defined in `cdrom.h`.
+
+CD-ROM drives are specific enough (i. e., different from other
+block-devices such as floppy or hard disc drives), to define a set
+of common **CD-ROM device operations**, *<cdrom-device>_dops*.
+These operations are different from the classical block-device file
+operations, *<block-device>_fops*.
+
+The routines for the Uniform CD-ROM Driver interface level are implemented
+in the file `cdrom.c`. In this file, the Uniform CD-ROM Driver interfaces
+with the kernel as a block device by registering the following general
+*struct file_operations*::
+
+ struct file_operations cdrom_fops = {
+ NULL, /∗ lseek ∗/
+ block _read , /∗ read—general block-dev read ∗/
+ block _write, /∗ write—general block-dev write ∗/
+ NULL, /∗ readdir ∗/
+ NULL, /∗ select ∗/
+ cdrom_ioctl, /∗ ioctl ∗/
+ NULL, /∗ mmap ∗/
+ cdrom_open, /∗ open ∗/
+ cdrom_release, /∗ release ∗/
+ NULL, /∗ fsync ∗/
+ NULL, /∗ fasync ∗/
+ cdrom_media_changed, /∗ media change ∗/
+ NULL /∗ revalidate ∗/
+ };
+
+Every active CD-ROM device shares this *struct*. The routines
+declared above are all implemented in `cdrom.c`, since this file is the
+place where the behavior of all CD-ROM-devices is defined and
+standardized. The actual interface to the various types of CD-ROM
+hardware is still performed by various low-level CD-ROM-device
+drivers. These routines simply implement certain **capabilities**
+that are common to all CD-ROM (and really, all removable-media
+devices).
+
+Registration of a low-level CD-ROM device driver is now done through
+the general routines in `cdrom.c`, not through the Virtual File System
+(VFS) any more. The interface implemented in `cdrom.c` is carried out
+through two general structures that contain information about the
+capabilities of the driver, and the specific drives on which the
+driver operates. The structures are:
+
+cdrom_device_ops
+ This structure contains information about the low-level driver for a
+ CD-ROM device. This structure is conceptually connected to the major
+ number of the device (although some drivers may have different
+ major numbers, as is the case for the IDE driver).
+
+cdrom_device_info
+ This structure contains information about a particular CD-ROM drive,
+ such as its device name, speed, etc. This structure is conceptually
+ connected to the minor number of the device.
+
+Registering a particular CD-ROM drive with the Uniform CD-ROM Driver
+is done by the low-level device driver though a call to::
+
+ register_cdrom(struct cdrom_device_info * <device>_info)
+
+The device information structure, *<device>_info*, contains all the
+information needed for the kernel to interface with the low-level
+CD-ROM device driver. One of the most important entries in this
+structure is a pointer to the *cdrom_device_ops* structure of the
+low-level driver.
+
+The device operations structure, *cdrom_device_ops*, contains a list
+of pointers to the functions which are implemented in the low-level
+device driver. When `cdrom.c` accesses a CD-ROM device, it does it
+through the functions in this structure. It is impossible to know all
+the capabilities of future CD-ROM drives, so it is expected that this
+list may need to be expanded from time to time as new technologies are
+developed. For example, CD-R and CD-R/W drives are beginning to become
+popular, and support will soon need to be added for them. For now, the
+current *struct* is::
+
+ struct cdrom_device_ops {
+ int (*open)(struct cdrom_device_info *, int)
+ void (*release)(struct cdrom_device_info *);
+ int (*drive_status)(struct cdrom_device_info *, int);
+ unsigned int (*check_events)(struct cdrom_device_info *,
+ unsigned int, int);
+ int (*media_changed)(struct cdrom_device_info *, int);
+ int (*tray_move)(struct cdrom_device_info *, int);
+ int (*lock_door)(struct cdrom_device_info *, int);
+ int (*select_speed)(struct cdrom_device_info *, int);
+ int (*select_disc)(struct cdrom_device_info *, int);
+ int (*get_last_session) (struct cdrom_device_info *,
+ struct cdrom_multisession *);
+ int (*get_mcn)(struct cdrom_device_info *, struct cdrom_mcn *);
+ int (*reset)(struct cdrom_device_info *);
+ int (*audio_ioctl)(struct cdrom_device_info *,
+ unsigned int, void *);
+ const int capability; /* capability flags */
+ int (*generic_packet)(struct cdrom_device_info *,
+ struct packet_command *);
+ };
+
+When a low-level device driver implements one of these capabilities,
+it should add a function pointer to this *struct*. When a particular
+function is not implemented, however, this *struct* should contain a
+NULL instead. The *capability* flags specify the capabilities of the
+CD-ROM hardware and/or low-level CD-ROM driver when a CD-ROM drive
+is registered with the Uniform CD-ROM Driver.
+
+Note that most functions have fewer parameters than their
+*blkdev_fops* counterparts. This is because very little of the
+information in the structures *inode* and *file* is used. For most
+drivers, the main parameter is the *struct* *cdrom_device_info*, from
+which the major and minor number can be extracted. (Most low-level
+CD-ROM drivers don't even look at the major and minor number though,
+since many of them only support one device.) This will be available
+through *dev* in *cdrom_device_info* described below.
+
+The drive-specific, minor-like information that is registered with
+`cdrom.c`, currently contains the following fields::
+
+ struct cdrom_device_info {
+ const struct cdrom_device_ops * ops; /* device operations for this major */
+ struct list_head list; /* linked list of all device_info */
+ struct gendisk * disk; /* matching block layer disk */
+ void * handle; /* driver-dependent data */
+
+ int mask; /* mask of capability: disables them */
+ int speed; /* maximum speed for reading data */
+ int capacity; /* number of discs in a jukebox */
+
+ unsigned int options:30; /* options flags */
+ unsigned mc_flags:2; /* media-change buffer flags */
+ unsigned int vfs_events; /* cached events for vfs path */
+ unsigned int ioctl_events; /* cached events for ioctl path */
+ int use_count; /* number of times device is opened */
+ char name[20]; /* name of the device type */
+
+ __u8 sanyo_slot : 2; /* Sanyo 3-CD changer support */
+ __u8 keeplocked : 1; /* CDROM_LOCKDOOR status */
+ __u8 reserved : 5; /* not used yet */
+ int cdda_method; /* see CDDA_* flags */
+ __u8 last_sense; /* saves last sense key */
+ __u8 media_written; /* dirty flag, DVD+RW bookkeeping */
+ unsigned short mmc3_profile; /* current MMC3 profile */
+ int for_data; /* unknown:TBD */
+ int (*exit)(struct cdrom_device_info *);/* unknown:TBD */
+ int mrw_mode_page; /* which MRW mode page is in use */
+ };
+
+Using this *struct*, a linked list of the registered minor devices is
+built, using the *next* field. The device number, the device operations
+struct and specifications of properties of the drive are stored in this
+structure.
+
+The *mask* flags can be used to mask out some of the capabilities listed
+in *ops->capability*, if a specific drive doesn't support a feature
+of the driver. The value *speed* specifies the maximum head-rate of the
+drive, measured in units of normal audio speed (176kB/sec raw data or
+150kB/sec file system data). The parameters are declared *const*
+because they describe properties of the drive, which don't change after
+registration.
+
+A few registers contain variables local to the CD-ROM drive. The
+flags *options* are used to specify how the general CD-ROM routines
+should behave. These various flags registers should provide enough
+flexibility to adapt to the different users' wishes (and **not** the
+`arbitrary` wishes of the author of the low-level device driver, as is
+the case in the old scheme). The register *mc_flags* is used to buffer
+the information from *media_changed()* to two separate queues. Other
+data that is specific to a minor drive, can be accessed through *handle*,
+which can point to a data structure specific to the low-level driver.
+The fields *use_count*, *next*, *options* and *mc_flags* need not be
+initialized.
+
+The intermediate software layer that `cdrom.c` forms will perform some
+additional bookkeeping. The use count of the device (the number of
+processes that have the device opened) is registered in *use_count*. The
+function *cdrom_ioctl()* will verify the appropriate user-memory regions
+for read and write, and in case a location on the CD is transferred,
+it will `sanitize` the format by making requests to the low-level
+drivers in a standard format, and translating all formats between the
+user-software and low level drivers. This relieves much of the drivers'
+memory checking and format checking and translation. Also, the necessary
+structures will be declared on the program stack.
+
+The implementation of the functions should be as defined in the
+following sections. Two functions **must** be implemented, namely
+*open()* and *release()*. Other functions may be omitted, their
+corresponding capability flags will be cleared upon registration.
+Generally, a function returns zero on success and negative on error. A
+function call should return only after the command has completed, but of
+course waiting for the device should not use processor time.
+
+::
+
+ int open(struct cdrom_device_info *cdi, int purpose)
+
+*Open()* should try to open the device for a specific *purpose*, which
+can be either:
+
+- Open for reading data, as done by `mount()` (2), or the
+ user commands `dd` or `cat`.
+- Open for *ioctl* commands, as done by audio-CD playing programs.
+
+Notice that any strategic code (closing tray upon *open()*, etc.) is
+done by the calling routine in `cdrom.c`, so the low-level routine
+should only be concerned with proper initialization, such as spinning
+up the disc, etc.
+
+::
+
+ void release(struct cdrom_device_info *cdi)
+
+Device-specific actions should be taken such as spinning down the device.
+However, strategic actions such as ejection of the tray, or unlocking
+the door, should be left over to the general routine *cdrom_release()*.
+This is the only function returning type *void*.
+
+.. _cdrom_drive_status:
+
+::
+
+ int drive_status(struct cdrom_device_info *cdi, int slot_nr)
+
+The function *drive_status*, if implemented, should provide
+information on the status of the drive (not the status of the disc,
+which may or may not be in the drive). If the drive is not a changer,
+*slot_nr* should be ignored. In `cdrom.h` the possibilities are listed::
+
+
+ CDS_NO_INFO /* no information available */
+ CDS_NO_DISC /* no disc is inserted, tray is closed */
+ CDS_TRAY_OPEN /* tray is opened */
+ CDS_DRIVE_NOT_READY /* something is wrong, tray is moving? */
+ CDS_DISC_OK /* a disc is loaded and everything is fine */
+
+::
+
+ int media_changed(struct cdrom_device_info *cdi, int disc_nr)
+
+This function is very similar to the original function in $struct
+file_operations*. It returns 1 if the medium of the device *cdi->dev*
+has changed since the last call, and 0 otherwise. The parameter
+*disc_nr* identifies a specific slot in a juke-box, it should be
+ignored for single-disc drives. Note that by `re-routing` this
+function through *cdrom_media_changed()*, we can implement separate
+queues for the VFS and a new *ioctl()* function that can report device
+changes to software (e. g., an auto-mounting daemon).
+
+::
+
+ int tray_move(struct cdrom_device_info *cdi, int position)
+
+This function, if implemented, should control the tray movement. (No
+other function should control this.) The parameter *position* controls
+the desired direction of movement:
+
+- 0 Close tray
+- 1 Open tray
+
+This function returns 0 upon success, and a non-zero value upon
+error. Note that if the tray is already in the desired position, no
+action need be taken, and the return value should be 0.
+
+::
+
+ int lock_door(struct cdrom_device_info *cdi, int lock)
+
+This function (and no other code) controls locking of the door, if the
+drive allows this. The value of *lock* controls the desired locking
+state:
+
+- 0 Unlock door, manual opening is allowed
+- 1 Lock door, tray cannot be ejected manually
+
+This function returns 0 upon success, and a non-zero value upon
+error. Note that if the door is already in the requested state, no
+action need be taken, and the return value should be 0.
+
+::
+
+ int select_speed(struct cdrom_device_info *cdi, int speed)
+
+Some CD-ROM drives are capable of changing their head-speed. There
+are several reasons for changing the speed of a CD-ROM drive. Badly
+pressed CD-ROM s may benefit from less-than-maximum head rate. Modern
+CD-ROM drives can obtain very high head rates (up to *24x* is
+common). It has been reported that these drives can make reading
+errors at these high speeds, reducing the speed can prevent data loss
+in these circumstances. Finally, some of these drives can
+make an annoyingly loud noise, which a lower speed may reduce.
+
+This function specifies the speed at which data is read or audio is
+played back. The value of *speed* specifies the head-speed of the
+drive, measured in units of standard cdrom speed (176kB/sec raw data
+or 150kB/sec file system data). So to request that a CD-ROM drive
+operate at 300kB/sec you would call the CDROM_SELECT_SPEED *ioctl*
+with *speed=2*. The special value `0` means `auto-selection`, i. e.,
+maximum data-rate or real-time audio rate. If the drive doesn't have
+this `auto-selection` capability, the decision should be made on the
+current disc loaded and the return value should be positive. A negative
+return value indicates an error.
+
+::
+
+ int select_disc(struct cdrom_device_info *cdi, int number)
+
+If the drive can store multiple discs (a juke-box) this function
+will perform disc selection. It should return the number of the
+selected disc on success, a negative value on error. Currently, only
+the ide-cd driver supports this functionality.
+
+::
+
+ int get_last_session(struct cdrom_device_info *cdi,
+ struct cdrom_multisession *ms_info)
+
+This function should implement the old corresponding *ioctl()*. For
+device *cdi->dev*, the start of the last session of the current disc
+should be returned in the pointer argument *ms_info*. Note that
+routines in `cdrom.c` have sanitized this argument: its requested
+format will **always** be of the type *CDROM_LBA* (linear block
+addressing mode), whatever the calling software requested. But
+sanitization goes even further: the low-level implementation may
+return the requested information in *CDROM_MSF* format if it wishes so
+(setting the *ms_info->addr_format* field appropriately, of
+course) and the routines in `cdrom.c` will make the transformation if
+necessary. The return value is 0 upon success.
+
+::
+
+ int get_mcn(struct cdrom_device_info *cdi,
+ struct cdrom_mcn *mcn)
+
+Some discs carry a `Media Catalog Number` (MCN), also called
+`Universal Product Code` (UPC). This number should reflect the number
+that is generally found in the bar-code on the product. Unfortunately,
+the few discs that carry such a number on the disc don't even use the
+same format. The return argument to this function is a pointer to a
+pre-declared memory region of type *struct cdrom_mcn*. The MCN is
+expected as a 13-character string, terminated by a null-character.
+
+::
+
+ int reset(struct cdrom_device_info *cdi)
+
+This call should perform a hard-reset on the drive (although in
+circumstances that a hard-reset is necessary, a drive may very well not
+listen to commands anymore). Preferably, control is returned to the
+caller only after the drive has finished resetting. If the drive is no
+longer listening, it may be wise for the underlying low-level cdrom
+driver to time out.
+
+::
+
+ int audio_ioctl(struct cdrom_device_info *cdi,
+ unsigned int cmd, void *arg)
+
+Some of the CD-ROM-\ *ioctl()*\ 's defined in `cdrom.h` can be
+implemented by the routines described above, and hence the function
+*cdrom_ioctl* will use those. However, most *ioctl()*\ 's deal with
+audio-control. We have decided to leave these to be accessed through a
+single function, repeating the arguments *cmd* and *arg*. Note that
+the latter is of type *void*, rather than *unsigned long int*.
+The routine *cdrom_ioctl()* does do some useful things,
+though. It sanitizes the address format type to *CDROM_MSF* (Minutes,
+Seconds, Frames) for all audio calls. It also verifies the memory
+location of *arg*, and reserves stack-memory for the argument. This
+makes implementation of the *audio_ioctl()* much simpler than in the
+old driver scheme. For example, you may look up the function
+*cm206_audio_ioctl()* `cm206.c` that should be updated with
+this documentation.
+
+An unimplemented ioctl should return *-ENOSYS*, but a harmless request
+(e. g., *CDROMSTART*) may be ignored by returning 0 (success). Other
+errors should be according to the standards, whatever they are. When
+an error is returned by the low-level driver, the Uniform CD-ROM Driver
+tries whenever possible to return the error code to the calling program.
+(We may decide to sanitize the return value in *cdrom_ioctl()* though, in
+order to guarantee a uniform interface to the audio-player software.)
+
+::
+
+ int dev_ioctl(struct cdrom_device_info *cdi,
+ unsigned int cmd, unsigned long arg)
+
+Some *ioctl()'s* seem to be specific to certain CD-ROM drives. That is,
+they are introduced to service some capabilities of certain drives. In
+fact, there are 6 different *ioctl()'s* for reading data, either in some
+particular kind of format, or audio data. Not many drives support
+reading audio tracks as data, I believe this is because of protection
+of copyrights of artists. Moreover, I think that if audio-tracks are
+supported, it should be done through the VFS and not via *ioctl()'s*. A
+problem here could be the fact that audio-frames are 2352 bytes long,
+so either the audio-file-system should ask for 75264 bytes at once
+(the least common multiple of 512 and 2352), or the drivers should
+bend their backs to cope with this incoherence (to which I would be
+opposed). Furthermore, it is very difficult for the hardware to find
+the exact frame boundaries, since there are no synchronization headers
+in audio frames. Once these issues are resolved, this code should be
+standardized in `cdrom.c`.
+
+Because there are so many *ioctl()'s* that seem to be introduced to
+satisfy certain drivers [#f2]_, any non-standard *ioctl()*\ s
+are routed through the call *dev_ioctl()*. In principle, `private`
+*ioctl()*\ 's should be numbered after the device's major number, and not
+the general CD-ROM *ioctl* number, `0x53`. Currently the
+non-supported *ioctl()'s* are:
+
+ CDROMREADMODE1, CDROMREADMODE2, CDROMREADAUDIO, CDROMREADRAW,
+ CDROMREADCOOKED, CDROMSEEK, CDROMPLAY-BLK and CDROM-READALL
+
+.. [#f2]
+
+ Is there software around that actually uses these? I'd be interested!
+
+.. _cdrom_capabilities:
+
+CD-ROM capabilities
+-------------------
+
+Instead of just implementing some *ioctl* calls, the interface in
+`cdrom.c` supplies the possibility to indicate the **capabilities**
+of a CD-ROM drive. This can be done by ORing any number of
+capability-constants that are defined in `cdrom.h` at the registration
+phase. Currently, the capabilities are any of::
+
+ CDC_CLOSE_TRAY /* can close tray by software control */
+ CDC_OPEN_TRAY /* can open tray */
+ CDC_LOCK /* can lock and unlock the door */
+ CDC_SELECT_SPEED /* can select speed, in units of * sim*150 ,kB/s */
+ CDC_SELECT_DISC /* drive is juke-box */
+ CDC_MULTI_SESSION /* can read sessions *> rm1* */
+ CDC_MCN /* can read Media Catalog Number */
+ CDC_MEDIA_CHANGED /* can report if disc has changed */
+ CDC_PLAY_AUDIO /* can perform audio-functions (play, pause, etc) */
+ CDC_RESET /* hard reset device */
+ CDC_IOCTLS /* driver has non-standard ioctls */
+ CDC_DRIVE_STATUS /* driver implements drive status */
+
+The capability flag is declared *const*, to prevent drivers from
+accidentally tampering with the contents. The capability fags actually
+inform `cdrom.c` of what the driver can do. If the drive found
+by the driver does not have the capability, is can be masked out by
+the *cdrom_device_info* variable *mask*. For instance, the SCSI CD-ROM
+driver has implemented the code for loading and ejecting CD-ROM's, and
+hence its corresponding flags in *capability* will be set. But a SCSI
+CD-ROM drive might be a caddy system, which can't load the tray, and
+hence for this drive the *cdrom_device_info* struct will have set
+the *CDC_CLOSE_TRAY* bit in *mask*.
+
+In the file `cdrom.c` you will encounter many constructions of the type::
+
+ if (cdo->capability & ∼cdi->mask & CDC _⟨capability⟩) ...
+
+There is no *ioctl* to set the mask... The reason is that
+I think it is better to control the **behavior** rather than the
+**capabilities**.
+
+Options
+-------
+
+A final flag register controls the **behavior** of the CD-ROM
+drives, in order to satisfy different users' wishes, hopefully
+independently of the ideas of the respective author who happened to
+have made the drive's support available to the Linux community. The
+current behavior options are::
+
+ CDO_AUTO_CLOSE /* try to close tray upon device open() */
+ CDO_AUTO_EJECT /* try to open tray on last device close() */
+ CDO_USE_FFLAGS /* use file_pointer->f_flags to indicate purpose for open() */
+ CDO_LOCK /* try to lock door if device is opened */
+ CDO_CHECK_TYPE /* ensure disc type is data if opened for data */
+
+The initial value of this register is
+`CDO_AUTO_CLOSE | CDO_USE_FFLAGS | CDO_LOCK`, reflecting my own view on user
+interface and software standards. Before you protest, there are two
+new *ioctl()'s* implemented in `cdrom.c`, that allow you to control the
+behavior by software. These are::
+
+ CDROM_SET_OPTIONS /* set options specified in (int)arg */
+ CDROM_CLEAR_OPTIONS /* clear options specified in (int)arg */
+
+One option needs some more explanation: *CDO_USE_FFLAGS*. In the next
+newsection we explain what the need for this option is.
+
+A software package `setcd`, available from the Debian distribution
+and `sunsite.unc.edu`, allows user level control of these flags.
+
+
+The need to know the purpose of opening the CD-ROM device
+=========================================================
+
+Traditionally, Unix devices can be used in two different `modes`,
+either by reading/writing to the device file, or by issuing
+controlling commands to the device, by the device's *ioctl()*
+call. The problem with CD-ROM drives, is that they can be used for
+two entirely different purposes. One is to mount removable
+file systems, CD-ROM's, the other is to play audio CD's. Audio commands
+are implemented entirely through *ioctl()\'s*, presumably because the
+first implementation (SUN?) has been such. In principle there is
+nothing wrong with this, but a good control of the `CD player` demands
+that the device can **always** be opened in order to give the
+*ioctl* commands, regardless of the state the drive is in.
+
+On the other hand, when used as a removable-media disc drive (what the
+original purpose of CD-ROM s is) we would like to make sure that the
+disc drive is ready for operation upon opening the device. In the old
+scheme, some CD-ROM drivers don't do any integrity checking, resulting
+in a number of i/o errors reported by the VFS to the kernel when an
+attempt for mounting a CD-ROM on an empty drive occurs. This is not a
+particularly elegant way to find out that there is no CD-ROM inserted;
+it more-or-less looks like the old IBM-PC trying to read an empty floppy
+drive for a couple of seconds, after which the system complains it
+can't read from it. Nowadays we can **sense** the existence of a
+removable medium in a drive, and we believe we should exploit that
+fact. An integrity check on opening of the device, that verifies the
+availability of a CD-ROM and its correct type (data), would be
+desirable.
+
+These two ways of using a CD-ROM drive, principally for data and
+secondarily for playing audio discs, have different demands for the
+behavior of the *open()* call. Audio use simply wants to open the
+device in order to get a file handle which is needed for issuing
+*ioctl* commands, while data use wants to open for correct and
+reliable data transfer. The only way user programs can indicate what
+their *purpose* of opening the device is, is through the *flags*
+parameter (see `open(2)`). For CD-ROM devices, these flags aren't
+implemented (some drivers implement checking for write-related flags,
+but this is not strictly necessary if the device file has correct
+permission flags). Most option flags simply don't make sense to
+CD-ROM devices: *O_CREAT*, *O_NOCTTY*, *O_TRUNC*, *O_APPEND*, and
+*O_SYNC* have no meaning to a CD-ROM.
+
+We therefore propose to use the flag *O_NONBLOCK* to indicate
+that the device is opened just for issuing *ioctl*
+commands. Strictly, the meaning of *O_NONBLOCK* is that opening and
+subsequent calls to the device don't cause the calling process to
+wait. We could interpret this as don't wait until someone has
+inserted some valid data-CD-ROM. Thus, our proposal of the
+implementation for the *open()* call for CD-ROM s is:
+
+- If no other flags are set than *O_RDONLY*, the device is opened
+ for data transfer, and the return value will be 0 only upon successful
+ initialization of the transfer. The call may even induce some actions
+ on the CD-ROM, such as closing the tray.
+- If the option flag *O_NONBLOCK* is set, opening will always be
+ successful, unless the whole device doesn't exist. The drive will take
+ no actions whatsoever.
+
+And what about standards?
+-------------------------
+
+You might hesitate to accept this proposal as it comes from the
+Linux community, and not from some standardizing institute. What
+about SUN, SGI, HP and all those other Unix and hardware vendors?
+Well, these companies are in the lucky position that they generally
+control both the hardware and software of their supported products,
+and are large enough to set their own standard. They do not have to
+deal with a dozen or more different, competing hardware
+configurations\ [#f3]_.
+
+.. [#f3]
+
+ Incidentally, I think that SUN's approach to mounting CD-ROM s is very
+ good in origin: under Solaris a volume-daemon automatically mounts a
+ newly inserted CD-ROM under `/cdrom/*<volume-name>*`.
+
+ In my opinion they should have pushed this
+ further and have **every** CD-ROM on the local area network be
+ mounted at the similar location, i. e., no matter in which particular
+ machine you insert a CD-ROM, it will always appear at the same
+ position in the directory tree, on every system. When I wanted to
+ implement such a user-program for Linux, I came across the
+ differences in behavior of the various drivers, and the need for an
+ *ioctl* informing about media changes.
+
+We believe that using *O_NONBLOCK* to indicate that a device is being opened
+for *ioctl* commands only can be easily introduced in the Linux
+community. All the CD-player authors will have to be informed, we can
+even send in our own patches to the programs. The use of *O_NONBLOCK*
+has most likely no influence on the behavior of the CD-players on
+other operating systems than Linux. Finally, a user can always revert
+to old behavior by a call to
+*ioctl(file_descriptor, CDROM_CLEAR_OPTIONS, CDO_USE_FFLAGS)*.
+
+The preferred strategy of *open()*
+----------------------------------
+
+The routines in `cdrom.c` are designed in such a way that run-time
+configuration of the behavior of CD-ROM devices (of **any** type)
+can be carried out, by the *CDROM_SET/CLEAR_OPTIONS* *ioctls*. Thus, various
+modes of operation can be set:
+
+`CDO_AUTO_CLOSE | CDO_USE_FFLAGS | CDO_LOCK`
+ This is the default setting. (With *CDO_CHECK_TYPE* it will be better, in
+ the future.) If the device is not yet opened by any other process, and if
+ the device is being opened for data (*O_NONBLOCK* is not set) and the
+ tray is found to be open, an attempt to close the tray is made. Then,
+ it is verified that a disc is in the drive and, if *CDO_CHECK_TYPE* is
+ set, that it contains tracks of type `data mode 1`. Only if all tests
+ are passed is the return value zero. The door is locked to prevent file
+ system corruption. If the drive is opened for audio (*O_NONBLOCK* is
+ set), no actions are taken and a value of 0 will be returned.
+
+`CDO_AUTO_CLOSE | CDO_AUTO_EJECT | CDO_LOCK`
+ This mimics the behavior of the current sbpcd-driver. The option flags are
+ ignored, the tray is closed on the first open, if necessary. Similarly,
+ the tray is opened on the last release, i. e., if a CD-ROM is unmounted,
+ it is automatically ejected, such that the user can replace it.
+
+We hope that these option can convince everybody (both driver
+maintainers and user program developers) to adopt the new CD-ROM
+driver scheme and option flag interpretation.
+
+Description of routines in `cdrom.c`
+====================================
+
+Only a few routines in `cdrom.c` are exported to the drivers. In this
+new section we will discuss these, as well as the functions that `take
+over' the CD-ROM interface to the kernel. The header file belonging
+to `cdrom.c` is called `cdrom.h`. Formerly, some of the contents of this
+file were placed in the file `ucdrom.h`, but this file has now been
+merged back into `cdrom.h`.
+
+::
+
+ struct file_operations cdrom_fops
+
+The contents of this structure were described in cdrom_api_.
+A pointer to this structure is assigned to the *fops* field
+of the *struct gendisk*.
+
+::
+
+ int register_cdrom(struct cdrom_device_info *cdi)
+
+This function is used in about the same way one registers *cdrom_fops*
+with the kernel, the device operations and information structures,
+as described in cdrom_api_, should be registered with the
+Uniform CD-ROM Driver::
+
+ register_cdrom(&<device>_info);
+
+
+This function returns zero upon success, and non-zero upon
+failure. The structure *<device>_info* should have a pointer to the
+driver's *<device>_dops*, as in::
+
+ struct cdrom_device_info <device>_info = {
+ <device>_dops;
+ ...
+ }
+
+Note that a driver must have one static structure, *<device>_dops*, while
+it may have as many structures *<device>_info* as there are minor devices
+active. *Register_cdrom()* builds a linked list from these.
+
+
+::
+
+ void unregister_cdrom(struct cdrom_device_info *cdi)
+
+Unregistering device *cdi* with minor number *MINOR(cdi->dev)* removes
+the minor device from the list. If it was the last registered minor for
+the low-level driver, this disconnects the registered device-operation
+routines from the CD-ROM interface. This function returns zero upon
+success, and non-zero upon failure.
+
+::
+
+ int cdrom_open(struct inode * ip, struct file * fp)
+
+This function is not called directly by the low-level drivers, it is
+listed in the standard *cdrom_fops*. If the VFS opens a file, this
+function becomes active. A strategy is implemented in this routine,
+taking care of all capabilities and options that are set in the
+*cdrom_device_ops* connected to the device. Then, the program flow is
+transferred to the device_dependent *open()* call.
+
+::
+
+ void cdrom_release(struct inode *ip, struct file *fp)
+
+This function implements the reverse-logic of *cdrom_open()*, and then
+calls the device-dependent *release()* routine. When the use-count has
+reached 0, the allocated buffers are flushed by calls to *sync_dev(dev)*
+and *invalidate_buffers(dev)*.
+
+
+.. _cdrom_ioctl:
+
+::
+
+ int cdrom_ioctl(struct inode *ip, struct file *fp,
+ unsigned int cmd, unsigned long arg)
+
+This function handles all the standard *ioctl* requests for CD-ROM
+devices in a uniform way. The different calls fall into three
+categories: *ioctl()'s* that can be directly implemented by device
+operations, ones that are routed through the call *audio_ioctl()*, and
+the remaining ones, that are presumable device-dependent. Generally, a
+negative return value indicates an error.
+
+Directly implemented *ioctl()'s*
+--------------------------------
+
+The following `old` CD-ROM *ioctl()*\ 's are implemented by directly
+calling device-operations in *cdrom_device_ops*, if implemented and
+not masked:
+
+`CDROMMULTISESSION`
+ Requests the last session on a CD-ROM.
+`CDROMEJECT`
+ Open tray.
+`CDROMCLOSETRAY`
+ Close tray.
+`CDROMEJECT_SW`
+ If *arg\not=0*, set behavior to auto-close (close
+ tray on first open) and auto-eject (eject on last release), otherwise
+ set behavior to non-moving on *open()* and *release()* calls.
+`CDROM_GET_MCN`
+ Get the Media Catalog Number from a CD.
+
+*Ioctl*s routed through *audio_ioctl()*
+---------------------------------------
+
+The following set of *ioctl()'s* are all implemented through a call to
+the *cdrom_fops* function *audio_ioctl()*. Memory checks and
+allocation are performed in *cdrom_ioctl()*, and also sanitization of
+address format (*CDROM_LBA*/*CDROM_MSF*) is done.
+
+`CDROMSUBCHNL`
+ Get sub-channel data in argument *arg* of type
+ `struct cdrom_subchnl *`.
+`CDROMREADTOCHDR`
+ Read Table of Contents header, in *arg* of type
+ `struct cdrom_tochdr *`.
+`CDROMREADTOCENTRY`
+ Read a Table of Contents entry in *arg* and specified by *arg*
+ of type `struct cdrom_tocentry *`.
+`CDROMPLAYMSF`
+ Play audio fragment specified in Minute, Second, Frame format,
+ delimited by *arg* of type `struct cdrom_msf *`.
+`CDROMPLAYTRKIND`
+ Play audio fragment in track-index format delimited by *arg*
+ of type `struct cdrom_ti *`.
+`CDROMVOLCTRL`
+ Set volume specified by *arg* of type `struct cdrom_volctrl *`.
+`CDROMVOLREAD`
+ Read volume into by *arg* of type `struct cdrom_volctrl *`.
+`CDROMSTART`
+ Spin up disc.
+`CDROMSTOP`
+ Stop playback of audio fragment.
+`CDROMPAUSE`
+ Pause playback of audio fragment.
+`CDROMRESUME`
+ Resume playing.
+
+New *ioctl()'s* in `cdrom.c`
+----------------------------
+
+The following *ioctl()'s* have been introduced to allow user programs to
+control the behavior of individual CD-ROM devices. New *ioctl*
+commands can be identified by the underscores in their names.
+
+`CDROM_SET_OPTIONS`
+ Set options specified by *arg*. Returns the option flag register
+ after modification. Use *arg = \rm0* for reading the current flags.
+`CDROM_CLEAR_OPTIONS`
+ Clear options specified by *arg*. Returns the option flag register
+ after modification.
+`CDROM_SELECT_SPEED`
+ Select head-rate speed of disc specified as by *arg* in units
+ of standard cdrom speed (176\,kB/sec raw data or
+ 150kB/sec file system data). The value 0 means `auto-select`,
+ i. e., play audio discs at real time and data discs at maximum speed.
+ The value *arg* is checked against the maximum head rate of the
+ drive found in the *cdrom_dops*.
+`CDROM_SELECT_DISC`
+ Select disc numbered *arg* from a juke-box.
+
+ First disc is numbered 0. The number *arg* is checked against the
+ maximum number of discs in the juke-box found in the *cdrom_dops*.
+`CDROM_MEDIA_CHANGED`
+ Returns 1 if a disc has been changed since the last call.
+ Note that calls to *cdrom_media_changed* by the VFS are treated
+ by an independent queue, so both mechanisms will detect a
+ media change once. For juke-boxes, an extra argument *arg*
+ specifies the slot for which the information is given. The special
+ value *CDSL_CURRENT* requests that information about the currently
+ selected slot be returned.
+`CDROM_DRIVE_STATUS`
+ Returns the status of the drive by a call to
+ *drive_status()*. Return values are defined in cdrom_drive_status_.
+ Note that this call doesn't return information on the
+ current playing activity of the drive; this can be polled through
+ an *ioctl* call to *CDROMSUBCHNL*. For juke-boxes, an extra argument
+ *arg* specifies the slot for which (possibly limited) information is
+ given. The special value *CDSL_CURRENT* requests that information
+ about the currently selected slot be returned.
+`CDROM_DISC_STATUS`
+ Returns the type of the disc currently in the drive.
+ It should be viewed as a complement to *CDROM_DRIVE_STATUS*.
+ This *ioctl* can provide *some* information about the current
+ disc that is inserted in the drive. This functionality used to be
+ implemented in the low level drivers, but is now carried out
+ entirely in Uniform CD-ROM Driver.
+
+ The history of development of the CD's use as a carrier medium for
+ various digital information has lead to many different disc types.
+ This *ioctl* is useful only in the case that CDs have \emph {only
+ one} type of data on them. While this is often the case, it is
+ also very common for CDs to have some tracks with data, and some
+ tracks with audio. Because this is an existing interface, rather
+ than fixing this interface by changing the assumptions it was made
+ under, thereby breaking all user applications that use this
+ function, the Uniform CD-ROM Driver implements this *ioctl* as
+ follows: If the CD in question has audio tracks on it, and it has
+ absolutely no CD-I, XA, or data tracks on it, it will be reported
+ as *CDS_AUDIO*. If it has both audio and data tracks, it will
+ return *CDS_MIXED*. If there are no audio tracks on the disc, and
+ if the CD in question has any CD-I tracks on it, it will be
+ reported as *CDS_XA_2_2*. Failing that, if the CD in question
+ has any XA tracks on it, it will be reported as *CDS_XA_2_1*.
+ Finally, if the CD in question has any data tracks on it,
+ it will be reported as a data CD (*CDS_DATA_1*).
+
+ This *ioctl* can return::
+
+ CDS_NO_INFO /* no information available */
+ CDS_NO_DISC /* no disc is inserted, or tray is opened */
+ CDS_AUDIO /* Audio disc (2352 audio bytes/frame) */
+ CDS_DATA_1 /* data disc, mode 1 (2048 user bytes/frame) */
+ CDS_XA_2_1 /* mixed data (XA), mode 2, form 1 (2048 user bytes) */
+ CDS_XA_2_2 /* mixed data (XA), mode 2, form 1 (2324 user bytes) */
+ CDS_MIXED /* mixed audio/data disc */
+
+ For some information concerning frame layout of the various disc
+ types, see a recent version of `cdrom.h`.
+
+`CDROM_CHANGER_NSLOTS`
+ Returns the number of slots in a juke-box.
+`CDROMRESET`
+ Reset the drive.
+`CDROM_GET_CAPABILITY`
+ Returns the *capability* flags for the drive. Refer to section
+ cdrom_capabilities_ for more information on these flags.
+`CDROM_LOCKDOOR`
+ Locks the door of the drive. `arg == 0` unlocks the door,
+ any other value locks it.
+`CDROM_DEBUG`
+ Turns on debugging info. Only root is allowed to do this.
+ Same semantics as CDROM_LOCKDOOR.
+
+
+Device dependent *ioctl()'s*
+----------------------------
+
+Finally, all other *ioctl()'s* are passed to the function *dev_ioctl()*,
+if implemented. No memory allocation or verification is carried out.
+
+How to update your driver
+=========================
+
+- Make a backup of your current driver.
+- Get hold of the files `cdrom.c` and `cdrom.h`, they should be in
+ the directory tree that came with this documentation.
+- Make sure you include `cdrom.h`.
+- Change the 3rd argument of *register_blkdev* from `&<your-drive>_fops`
+ to `&cdrom_fops`.
+- Just after that line, add the following to register with the Uniform
+ CD-ROM Driver::
+
+ register_cdrom(&<your-drive>_info);*
+
+ Similarly, add a call to *unregister_cdrom()* at the appropriate place.
+- Copy an example of the device-operations *struct* to your
+ source, e. g., from `cm206.c` *cm206_dops*, and change all
+ entries to names corresponding to your driver, or names you just
+ happen to like. If your driver doesn't support a certain function,
+ make the entry *NULL*. At the entry *capability* you should list all
+ capabilities your driver currently supports. If your driver
+ has a capability that is not listed, please send me a message.
+- Copy the *cdrom_device_info* declaration from the same example
+ driver, and modify the entries according to your needs. If your
+ driver dynamically determines the capabilities of the hardware, this
+ structure should also be declared dynamically.
+- Implement all functions in your `<device>_dops` structure,
+ according to prototypes listed in `cdrom.h`, and specifications given
+ in cdrom_api_. Most likely you have already implemented
+ the code in a large part, and you will almost certainly need to adapt the
+ prototype and return values.
+- Rename your `<device>_ioctl()` function to *audio_ioctl* and
+ change the prototype a little. Remove entries listed in the first
+ part in cdrom_ioctl_, if your code was OK, these are
+ just calls to the routines you adapted in the previous step.
+- You may remove all remaining memory checking code in the
+ *audio_ioctl()* function that deals with audio commands (these are
+ listed in the second part of cdrom_ioctl_. There is no
+ need for memory allocation either, so most *case*s in the *switch*
+ statement look similar to::
+
+ case CDROMREADTOCENTRY:
+ get_toc_entry\bigl((struct cdrom_tocentry *) arg);
+
+- All remaining *ioctl* cases must be moved to a separate
+ function, *<device>_ioctl*, the device-dependent *ioctl()'s*. Note that
+ memory checking and allocation must be kept in this code!
+- Change the prototypes of *<device>_open()* and
+ *<device>_release()*, and remove any strategic code (i. e., tray
+ movement, door locking, etc.).
+- Try to recompile the drivers. We advise you to use modules, both
+ for `cdrom.o` and your driver, as debugging is much easier this
+ way.
+
+Thanks
+======
+
+Thanks to all the people involved. First, Erik Andersen, who has
+taken over the torch in maintaining `cdrom.c` and integrating much
+CD-ROM-related code in the 2.1-kernel. Thanks to Scott Snyder and
+Gerd Knorr, who were the first to implement this interface for SCSI
+and IDE-CD drivers and added many ideas for extension of the data
+structures relative to kernel~2.0. Further thanks to Heiko Eißfeldt,
+Thomas Quinot, Jon Tombs, Ken Pizzini, Eberhard Mönkeberg and Andrew Kroll,
+the Linux CD-ROM device driver developers who were kind
+enough to give suggestions and criticisms during the writing. Finally
+of course, I want to thank Linus Torvalds for making this possible in
+the first place.
diff --git a/Documentation/cdrom/cdrom-standard.tex b/Documentation/cdrom/cdrom-standard.tex
deleted file mode 100644
index f7cd455973f7..000000000000
--- a/Documentation/cdrom/cdrom-standard.tex
+++ /dev/null
@@ -1,1026 +0,0 @@
-\documentclass{article}
-\def\version{$Id: cdrom-standard.tex,v 1.9 1997/12/28 15:42:49 david Exp $}
-\newcommand{\newsection}[1]{\newpage\section{#1}}
-
-\evensidemargin=0pt
-\oddsidemargin=0pt
-\topmargin=-\headheight \advance\topmargin by -\headsep
-\textwidth=15.99cm \textheight=24.62cm % normal A4, 1'' margin
-
-\def\linux{{\sc Linux}}
-\def\cdrom{{\sc cd-rom}}
-\def\UCD{{\sc Uniform cd-rom Driver}}
-\def\cdromc{{\tt {cdrom.c}}}
-\def\cdromh{{\tt {cdrom.h}}}
-\def\fo{\sl} % foreign words
-\def\ie{{\fo i.e.}}
-\def\eg{{\fo e.g.}}
-
-\everymath{\it} \everydisplay{\it}
-\catcode `\_=\active \def_{\_\penalty100 }
-\catcode`\<=\active \def<#1>{{\langle\hbox{\rm#1}\rangle}}
-
-\begin{document}
-\title{A \linux\ \cdrom\ standard}
-\author{David van Leeuwen\\{\normalsize\tt david@ElseWare.cistron.nl}
-\\{\footnotesize updated by Erik Andersen {\tt(andersee@debian.org)}}
-\\{\footnotesize updated by Jens Axboe {\tt(axboe@image.dk)}}}
-\date{12 March 1999}
-
-\maketitle
-
-\newsection{Introduction}
-
-\linux\ is probably the Unix-like operating system that supports
-the widest variety of hardware devices. The reasons for this are
-presumably
-\begin{itemize}
-\item
- The large list of hardware devices available for the many platforms
- that \linux\ now supports (\ie, i386-PCs, Sparc Suns, etc.)
-\item
- The open design of the operating system, such that anybody can write a
- driver for \linux.
-\item
- There is plenty of source code around as examples of how to write a driver.
-\end{itemize}
-The openness of \linux, and the many different types of available
-hardware has allowed \linux\ to support many different hardware devices.
-Unfortunately, the very openness that has allowed \linux\ to support
-all these different devices has also allowed the behavior of each
-device driver to differ significantly from one device to another.
-This divergence of behavior has been very significant for \cdrom\
-devices; the way a particular drive reacts to a `standard' $ioctl()$
-call varies greatly from one device driver to another. To avoid making
-their drivers totally inconsistent, the writers of \linux\ \cdrom\
-drivers generally created new device drivers by understanding, copying,
-and then changing an existing one. Unfortunately, this practice did not
-maintain uniform behavior across all the \linux\ \cdrom\ drivers.
-
-This document describes an effort to establish Uniform behavior across
-all the different \cdrom\ device drivers for \linux. This document also
-defines the various $ioctl$s, and how the low-level \cdrom\ device
-drivers should implement them. Currently (as of the \linux\ 2.1.$x$
-development kernels) several low-level \cdrom\ device drivers, including
-both IDE/ATAPI and SCSI, now use this Uniform interface.
-
-When the \cdrom\ was developed, the interface between the \cdrom\ drive
-and the computer was not specified in the standards. As a result, many
-different \cdrom\ interfaces were developed. Some of them had their
-own proprietary design (Sony, Mitsumi, Panasonic, Philips), other
-manufacturers adopted an existing electrical interface and changed
-the functionality (CreativeLabs/SoundBlaster, Teac, Funai) or simply
-adapted their drives to one or more of the already existing electrical
-interfaces (Aztech, Sanyo, Funai, Vertos, Longshine, Optics Storage and
-most of the `NoName' manufacturers). In cases where a new drive really
-brought its own interface or used its own command set and flow control
-scheme, either a separate driver had to be written, or an existing
-driver had to be enhanced. History has delivered us \cdrom\ support for
-many of these different interfaces. Nowadays, almost all new \cdrom\
-drives are either IDE/ATAPI or SCSI, and it is very unlikely that any
-manufacturer will create a new interface. Even finding drives for the
-old proprietary interfaces is getting difficult.
-
-When (in the 1.3.70's) I looked at the existing software interface,
-which was expressed through \cdromh, it appeared to be a rather wild
-set of commands and data formats.\footnote{I cannot recollect what
-kernel version I looked at, then, presumably 1.2.13 and 1.3.34---the
-latest kernel that I was indirectly involved in.} It seemed that many
-features of the software interface had been added to accommodate the
-capabilities of a particular drive, in an {\fo ad hoc\/} manner. More
-importantly, it appeared that the behavior of the `standard' commands
-was different for most of the different drivers: \eg, some drivers
-close the tray if an $open()$ call occurs when the tray is open, while
-others do not. Some drivers lock the door upon opening the device, to
-prevent an incoherent file system, but others don't, to allow software
-ejection. Undoubtedly, the capabilities of the different drives vary,
-but even when two drives have the same capability their drivers'
-behavior was usually different.
-
-I decided to start a discussion on how to make all the \linux\ \cdrom\
-drivers behave more uniformly. I began by contacting the developers of
-the many \cdrom\ drivers found in the \linux\ kernel. Their reactions
-encouraged me to write the \UCD\ which this document is intended to
-describe. The implementation of the \UCD\ is in the file \cdromc. This
-driver is intended to be an additional software layer that sits on top
-of the low-level device drivers for each \cdrom\ drive. By adding this
-additional layer, it is possible to have all the different \cdrom\
-devices behave {\em exactly\/} the same (insofar as the underlying
-hardware will allow).
-
-The goal of the \UCD\ is {\em not\/} to alienate driver developers who
-have not yet taken steps to support this effort. The goal of \UCD\ is
-simply to give people writing application programs for \cdrom\ drives
-{\em one\/} \linux\ \cdrom\ interface with consistent behavior for all
-\cdrom\ devices. In addition, this also provides a consistent interface
-between the low-level device driver code and the \linux\ kernel. Care
-is taken that 100\,\% compatibility exists with the data structures and
-programmer's interface defined in \cdromh. This guide was written to
-help \cdrom\ driver developers adapt their code to use the \UCD\ code
-defined in \cdromc.
-
-Personally, I think that the most important hardware interfaces are
-the IDE/ATAPI drives and, of course, the SCSI drives, but as prices
-of hardware drop continuously, it is also likely that people may have
-more than one \cdrom\ drive, possibly of mixed types. It is important
-that these drives behave in the same way. In December 1994, one of the
-cheapest \cdrom\ drives was a Philips cm206, a double-speed proprietary
-drive. In the months that I was busy writing a \linux\ driver for it,
-proprietary drives became obsolete and IDE/ATAPI drives became the
-standard. At the time of the last update to this document (November
-1997) it is becoming difficult to even {\em find} anything less than a
-16 speed \cdrom\ drive, and 24 speed drives are common.
-
-\newsection{Standardizing through another software level}
-\label{cdrom.c}
-
-At the time this document was conceived, all drivers directly
-implemented the \cdrom\ $ioctl()$ calls through their own routines. This
-led to the danger of different drivers forgetting to do important things
-like checking that the user was giving the driver valid data. More
-importantly, this led to the divergence of behavior, which has already
-been discussed.
-
-For this reason, the \UCD\ was created to enforce consistent \cdrom\
-drive behavior, and to provide a common set of services to the various
-low-level \cdrom\ device drivers. The \UCD\ now provides another
-software-level, that separates the $ioctl()$ and $open()$ implementation
-from the actual hardware implementation. Note that this effort has
-made few changes which will affect a user's application programs. The
-greatest change involved moving the contents of the various low-level
-\cdrom\ drivers' header files to the kernel's cdrom directory. This was
-done to help ensure that the user is only presented with only one cdrom
-interface, the interface defined in \cdromh.
-
-\cdrom\ drives are specific enough (\ie, different from other
-block-devices such as floppy or hard disc drives), to define a set
-of common {\em \cdrom\ device operations}, $<cdrom-device>_dops$.
-These operations are different from the classical block-device file
-operations, $<block-device>_fops$.
-
-The routines for the \UCD\ interface level are implemented in the file
-\cdromc. In this file, the \UCD\ interfaces with the kernel as a block
-device by registering the following general $struct\ file_operations$:
-$$
-\halign{$#$\ \hfil&$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr
-struct& file_operations\ cdrom_fops = \{\hidewidth\cr
- &NULL, & lseek \cr
- &block_read, & read---general block-dev read \cr
- &block_write, & write---general block-dev write \cr
- &NULL, & readdir \cr
- &NULL, & select \cr
- &cdrom_ioctl, & ioctl \cr
- &NULL, & mmap \cr
- &cdrom_open, & open \cr
- &cdrom_release, & release \cr
- &NULL, & fsync \cr
- &NULL, & fasync \cr
- &cdrom_media_changed, & media change \cr
- &NULL & revalidate \cr
-\};\cr
-}
-$$
-
-Every active \cdrom\ device shares this $struct$. The routines
-declared above are all implemented in \cdromc, since this file is the
-place where the behavior of all \cdrom-devices is defined and
-standardized. The actual interface to the various types of \cdrom\
-hardware is still performed by various low-level \cdrom-device
-drivers. These routines simply implement certain {\em capabilities\/}
-that are common to all \cdrom\ (and really, all removable-media
-devices).
-
-Registration of a low-level \cdrom\ device driver is now done through
-the general routines in \cdromc, not through the Virtual File System
-(VFS) any more. The interface implemented in \cdromc\ is carried out
-through two general structures that contain information about the
-capabilities of the driver, and the specific drives on which the
-driver operates. The structures are:
-\begin{description}
-\item[$cdrom_device_ops$]
- This structure contains information about the low-level driver for a
- \cdrom\ device. This structure is conceptually connected to the major
- number of the device (although some drivers may have different
- major numbers, as is the case for the IDE driver).
-\item[$cdrom_device_info$]
- This structure contains information about a particular \cdrom\ drive,
- such as its device name, speed, etc. This structure is conceptually
- connected to the minor number of the device.
-\end{description}
-
-Registering a particular \cdrom\ drive with the \UCD\ is done by the
-low-level device driver though a call to:
-$$register_cdrom(struct\ cdrom_device_info * <device>_info)
-$$
-The device information structure, $<device>_info$, contains all the
-information needed for the kernel to interface with the low-level
-\cdrom\ device driver. One of the most important entries in this
-structure is a pointer to the $cdrom_device_ops$ structure of the
-low-level driver.
-
-The device operations structure, $cdrom_device_ops$, contains a list
-of pointers to the functions which are implemented in the low-level
-device driver. When \cdromc\ accesses a \cdrom\ device, it does it
-through the functions in this structure. It is impossible to know all
-the capabilities of future \cdrom\ drives, so it is expected that this
-list may need to be expanded from time to time as new technologies are
-developed. For example, CD-R and CD-R/W drives are beginning to become
-popular, and support will soon need to be added for them. For now, the
-current $struct$ is:
-$$
-\halign{$#$\ \hfil&$#$\ \hfil&\hbox to 10em{$#$\hss}&
- $/*$ \rm# $*/$\hfil\cr
-struct& cdrom_device_ops\ \{ \hidewidth\cr
- &int& (* open)(struct\ cdrom_device_info *, int)\cr
- &void& (* release)(struct\ cdrom_device_info *);\cr
- &int& (* drive_status)(struct\ cdrom_device_info *, int);\cr
- &unsigned\ int& (* check_events)(struct\ cdrom_device_info *, unsigned\ int, int);\cr
- &int& (* media_changed)(struct\ cdrom_device_info *, int);\cr
- &int& (* tray_move)(struct\ cdrom_device_info *, int);\cr
- &int& (* lock_door)(struct\ cdrom_device_info *, int);\cr
- &int& (* select_speed)(struct\ cdrom_device_info *, int);\cr
- &int& (* select_disc)(struct\ cdrom_device_info *, int);\cr
- &int& (* get_last_session) (struct\ cdrom_device_info *,
- struct\ cdrom_multisession *{});\cr
- &int& (* get_mcn)(struct\ cdrom_device_info *, struct\ cdrom_mcn *{});\cr
- &int& (* reset)(struct\ cdrom_device_info *);\cr
- &int& (* audio_ioctl)(struct\ cdrom_device_info *, unsigned\ int,
- void *{});\cr
-\noalign{\medskip}
- &const\ int& capability;& capability flags \cr
- &int& (* generic_packet)(struct\ cdrom_device_info *, struct\ packet_command *{});\cr
-\};\cr
-}
-$$
-When a low-level device driver implements one of these capabilities,
-it should add a function pointer to this $struct$. When a particular
-function is not implemented, however, this $struct$ should contain a
-NULL instead. The $capability$ flags specify the capabilities of the
-\cdrom\ hardware and/or low-level \cdrom\ driver when a \cdrom\ drive
-is registered with the \UCD.
-
-Note that most functions have fewer parameters than their
-$blkdev_fops$ counterparts. This is because very little of the
-information in the structures $inode$ and $file$ is used. For most
-drivers, the main parameter is the $struct$ $cdrom_device_info$, from
-which the major and minor number can be extracted. (Most low-level
-\cdrom\ drivers don't even look at the major and minor number though,
-since many of them only support one device.) This will be available
-through $dev$ in $cdrom_device_info$ described below.
-
-The drive-specific, minor-like information that is registered with
-\cdromc, currently contains the following fields:
-$$
-\halign{$#$\ \hfil&$#$\ \hfil&\hbox to 10em{$#$\hss}&
- $/*$ \rm# $*/$\hfil\cr
-struct& cdrom_device_info\ \{ \hidewidth\cr
- & const\ struct\ cdrom_device_ops *& ops;& device operations for this major\cr
- & struct\ list_head& list;& linked list of all device_info\cr
- & struct\ gendisk *& disk;& matching block layer disk\cr
- & void *& handle;& driver-dependent data\cr
-\noalign{\medskip}
- & int& mask;& mask of capability: disables them \cr
- & int& speed;& maximum speed for reading data \cr
- & int& capacity;& number of discs in a jukebox \cr
-\noalign{\medskip}
- &unsigned\ int& options : 30;& options flags \cr
- &unsigned& mc_flags : 2;& media-change buffer flags \cr
- &unsigned\ int& vfs_events;& cached events for vfs path\cr
- &unsigned\ int& ioctl_events;& cached events for ioctl path\cr
- & int& use_count;& number of times device is opened\cr
- & char& name[20];& name of the device type\cr
-\noalign{\medskip}
- &__u8& sanyo_slot : 2;& Sanyo 3-CD changer support\cr
- &__u8& keeplocked : 1;& CDROM_LOCKDOOR status\cr
- &__u8& reserved : 5;& not used yet\cr
- & int& cdda_method;& see CDDA_* flags\cr
- &__u8& last_sense;& saves last sense key\cr
- &__u8& media_written;& dirty flag, DVD+RW bookkeeping\cr
- &unsigned\ short& mmc3_profile;& current MMC3 profile\cr
- & int& for_data;& unknown:TBD\cr
- & int\ (* exit)\ (struct\ cdrom_device_info *);&& unknown:TBD\cr
- & int& mrw_mode_page;& which MRW mode page is in use\cr
-\}\cr
-}$$
-Using this $struct$, a linked list of the registered minor devices is
-built, using the $next$ field. The device number, the device operations
-struct and specifications of properties of the drive are stored in this
-structure.
-
-The $mask$ flags can be used to mask out some of the capabilities listed
-in $ops\to capability$, if a specific drive doesn't support a feature
-of the driver. The value $speed$ specifies the maximum head-rate of the
-drive, measured in units of normal audio speed (176\,kB/sec raw data or
-150\,kB/sec file system data). The parameters are declared $const$
-because they describe properties of the drive, which don't change after
-registration.
-
-A few registers contain variables local to the \cdrom\ drive. The
-flags $options$ are used to specify how the general \cdrom\ routines
-should behave. These various flags registers should provide enough
-flexibility to adapt to the different users' wishes (and {\em not\/} the
-`arbitrary' wishes of the author of the low-level device driver, as is
-the case in the old scheme). The register $mc_flags$ is used to buffer
-the information from $media_changed()$ to two separate queues. Other
-data that is specific to a minor drive, can be accessed through $handle$,
-which can point to a data structure specific to the low-level driver.
-The fields $use_count$, $next$, $options$ and $mc_flags$ need not be
-initialized.
-
-The intermediate software layer that \cdromc\ forms will perform some
-additional bookkeeping. The use count of the device (the number of
-processes that have the device opened) is registered in $use_count$. The
-function $cdrom_ioctl()$ will verify the appropriate user-memory regions
-for read and write, and in case a location on the CD is transferred,
-it will `sanitize' the format by making requests to the low-level
-drivers in a standard format, and translating all formats between the
-user-software and low level drivers. This relieves much of the drivers'
-memory checking and format checking and translation. Also, the necessary
-structures will be declared on the program stack.
-
-The implementation of the functions should be as defined in the
-following sections. Two functions {\em must\/} be implemented, namely
-$open()$ and $release()$. Other functions may be omitted, their
-corresponding capability flags will be cleared upon registration.
-Generally, a function returns zero on success and negative on error. A
-function call should return only after the command has completed, but of
-course waiting for the device should not use processor time.
-
-\subsection{$Int\ open(struct\ cdrom_device_info * cdi, int\ purpose)$}
-
-$Open()$ should try to open the device for a specific $purpose$, which
-can be either:
-\begin{itemize}
-\item[0] Open for reading data, as done by {\tt {mount()}} (2), or the
-user commands {\tt {dd}} or {\tt {cat}}.
-\item[1] Open for $ioctl$ commands, as done by audio-CD playing
-programs.
-\end{itemize}
-Notice that any strategic code (closing tray upon $open()$, etc.)\ is
-done by the calling routine in \cdromc, so the low-level routine
-should only be concerned with proper initialization, such as spinning
-up the disc, etc. % and device-use count
-
-
-\subsection{$Void\ release(struct\ cdrom_device_info * cdi)$}
-
-
-Device-specific actions should be taken such as spinning down the device.
-However, strategic actions such as ejection of the tray, or unlocking
-the door, should be left over to the general routine $cdrom_release()$.
-This is the only function returning type $void$.
-
-\subsection{$Int\ drive_status(struct\ cdrom_device_info * cdi, int\ slot_nr)$}
-\label{drive status}
-
-The function $drive_status$, if implemented, should provide
-information on the status of the drive (not the status of the disc,
-which may or may not be in the drive). If the drive is not a changer,
-$slot_nr$ should be ignored. In \cdromh\ the possibilities are listed:
-$$
-\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr
-CDS_NO_INFO& no information available\cr
-CDS_NO_DISC& no disc is inserted, tray is closed\cr
-CDS_TRAY_OPEN& tray is opened\cr
-CDS_DRIVE_NOT_READY& something is wrong, tray is moving?\cr
-CDS_DISC_OK& a disc is loaded and everything is fine\cr
-}
-$$
-
-\subsection{$Int\ media_changed(struct\ cdrom_device_info * cdi, int\ disc_nr)$}
-
-This function is very similar to the original function in $struct\
-file_operations$. It returns 1 if the medium of the device $cdi\to
-dev$ has changed since the last call, and 0 otherwise. The parameter
-$disc_nr$ identifies a specific slot in a juke-box, it should be
-ignored for single-disc drives. Note that by `re-routing' this
-function through $cdrom_media_changed()$, we can implement separate
-queues for the VFS and a new $ioctl()$ function that can report device
-changes to software (\eg, an auto-mounting daemon).
-
-\subsection{$Int\ tray_move(struct\ cdrom_device_info * cdi, int\ position)$}
-
-This function, if implemented, should control the tray movement. (No
-other function should control this.) The parameter $position$ controls
-the desired direction of movement:
-\begin{itemize}
-\item[0] Close tray
-\item[1] Open tray
-\end{itemize}
-This function returns 0 upon success, and a non-zero value upon
-error. Note that if the tray is already in the desired position, no
-action need be taken, and the return value should be 0.
-
-\subsection{$Int\ lock_door(struct\ cdrom_device_info * cdi, int\ lock)$}
-
-This function (and no other code) controls locking of the door, if the
-drive allows this. The value of $lock$ controls the desired locking
-state:
-\begin{itemize}
-\item[0] Unlock door, manual opening is allowed
-\item[1] Lock door, tray cannot be ejected manually
-\end{itemize}
-This function returns 0 upon success, and a non-zero value upon
-error. Note that if the door is already in the requested state, no
-action need be taken, and the return value should be 0.
-
-\subsection{$Int\ select_speed(struct\ cdrom_device_info * cdi, int\ speed)$}
-
-Some \cdrom\ drives are capable of changing their head-speed. There
-are several reasons for changing the speed of a \cdrom\ drive. Badly
-pressed \cdrom s may benefit from less-than-maximum head rate. Modern
-\cdrom\ drives can obtain very high head rates (up to $24\times$ is
-common). It has been reported that these drives can make reading
-errors at these high speeds, reducing the speed can prevent data loss
-in these circumstances. Finally, some of these drives can
-make an annoyingly loud noise, which a lower speed may reduce. %Finally,
-%although the audio-low-pass filters probably aren't designed for it,
-%more than real-time playback of audio might be used for high-speed
-%copying of audio tracks.
-
-This function specifies the speed at which data is read or audio is
-played back. The value of $speed$ specifies the head-speed of the
-drive, measured in units of standard cdrom speed (176\,kB/sec raw data
-or 150\,kB/sec file system data). So to request that a \cdrom\ drive
-operate at 300\,kB/sec you would call the CDROM_SELECT_SPEED $ioctl$
-with $speed=2$. The special value `0' means `auto-selection', \ie,
-maximum data-rate or real-time audio rate. If the drive doesn't have
-this `auto-selection' capability, the decision should be made on the
-current disc loaded and the return value should be positive. A negative
-return value indicates an error.
-
-\subsection{$Int\ select_disc(struct\ cdrom_device_info * cdi, int\ number)$}
-
-If the drive can store multiple discs (a juke-box) this function
-will perform disc selection. It should return the number of the
-selected disc on success, a negative value on error. Currently, only
-the ide-cd driver supports this functionality.
-
-\subsection{$Int\ get_last_session(struct\ cdrom_device_info * cdi, struct\
- cdrom_multisession * ms_info)$}
-
-This function should implement the old corresponding $ioctl()$. For
-device $cdi\to dev$, the start of the last session of the current disc
-should be returned in the pointer argument $ms_info$. Note that
-routines in \cdromc\ have sanitized this argument: its requested
-format will {\em always\/} be of the type $CDROM_LBA$ (linear block
-addressing mode), whatever the calling software requested. But
-sanitization goes even further: the low-level implementation may
-return the requested information in $CDROM_MSF$ format if it wishes so
-(setting the $ms_info\rightarrow addr_format$ field appropriately, of
-course) and the routines in \cdromc\ will make the transformation if
-necessary. The return value is 0 upon success.
-
-\subsection{$Int\ get_mcn(struct\ cdrom_device_info * cdi, struct\
- cdrom_mcn * mcn)$}
-
-Some discs carry a `Media Catalog Number' (MCN), also called
-`Universal Product Code' (UPC). This number should reflect the number
-that is generally found in the bar-code on the product. Unfortunately,
-the few discs that carry such a number on the disc don't even use the
-same format. The return argument to this function is a pointer to a
-pre-declared memory region of type $struct\ cdrom_mcn$. The MCN is
-expected as a 13-character string, terminated by a null-character.
-
-\subsection{$Int\ reset(struct\ cdrom_device_info * cdi)$}
-
-This call should perform a hard-reset on the drive (although in
-circumstances that a hard-reset is necessary, a drive may very well not
-listen to commands anymore). Preferably, control is returned to the
-caller only after the drive has finished resetting. If the drive is no
-longer listening, it may be wise for the underlying low-level cdrom
-driver to time out.
-
-\subsection{$Int\ audio_ioctl(struct\ cdrom_device_info * cdi, unsigned\
- int\ cmd, void * arg)$}
-
-Some of the \cdrom-$ioctl$s defined in \cdromh\ can be
-implemented by the routines described above, and hence the function
-$cdrom_ioctl$ will use those. However, most $ioctl$s deal with
-audio-control. We have decided to leave these to be accessed through a
-single function, repeating the arguments $cmd$ and $arg$. Note that
-the latter is of type $void*{}$, rather than $unsigned\ long\
-int$. The routine $cdrom_ioctl()$ does do some useful things,
-though. It sanitizes the address format type to $CDROM_MSF$ (Minutes,
-Seconds, Frames) for all audio calls. It also verifies the memory
-location of $arg$, and reserves stack-memory for the argument. This
-makes implementation of the $audio_ioctl()$ much simpler than in the
-old driver scheme. For example, you may look up the function
-$cm206_audio_ioctl()$ in {\tt {cm206.c}} that should be updated with
-this documentation.
-
-An unimplemented ioctl should return $-ENOSYS$, but a harmless request
-(\eg, $CDROMSTART$) may be ignored by returning 0 (success). Other
-errors should be according to the standards, whatever they are. When
-an error is returned by the low-level driver, the \UCD\ tries whenever
-possible to return the error code to the calling program. (We may decide
-to sanitize the return value in $cdrom_ioctl()$ though, in order to
-guarantee a uniform interface to the audio-player software.)
-
-\subsection{$Int\ dev_ioctl(struct\ cdrom_device_info * cdi, unsigned\ int\
- cmd, unsigned\ long\ arg)$}
-
-Some $ioctl$s seem to be specific to certain \cdrom\ drives. That is,
-they are introduced to service some capabilities of certain drives. In
-fact, there are 6 different $ioctl$s for reading data, either in some
-particular kind of format, or audio data. Not many drives support
-reading audio tracks as data, I believe this is because of protection
-of copyrights of artists. Moreover, I think that if audio-tracks are
-supported, it should be done through the VFS and not via $ioctl$s. A
-problem here could be the fact that audio-frames are 2352 bytes long,
-so either the audio-file-system should ask for 75264 bytes at once
-(the least common multiple of 512 and 2352), or the drivers should
-bend their backs to cope with this incoherence (to which I would be
-opposed). Furthermore, it is very difficult for the hardware to find
-the exact frame boundaries, since there are no synchronization headers
-in audio frames. Once these issues are resolved, this code should be
-standardized in \cdromc.
-
-Because there are so many $ioctl$s that seem to be introduced to
-satisfy certain drivers,\footnote{Is there software around that
- actually uses these? I'd be interested!} any `non-standard' $ioctl$s
-are routed through the call $dev_ioctl()$. In principle, `private'
-$ioctl$s should be numbered after the device's major number, and not
-the general \cdrom\ $ioctl$ number, {\tt {0x53}}. Currently the
-non-supported $ioctl$s are: {\it CDROMREADMODE1, CDROMREADMODE2,
- CDROMREADAUDIO, CDROMREADRAW, CDROMREADCOOKED, CDROMSEEK,
- CDROMPLAY\-BLK and CDROM\-READALL}.
-
-
-\subsection{\cdrom\ capabilities}
-\label{capability}
-
-Instead of just implementing some $ioctl$ calls, the interface in
-\cdromc\ supplies the possibility to indicate the {\em capabilities\/}
-of a \cdrom\ drive. This can be done by ORing any number of
-capability-constants that are defined in \cdromh\ at the registration
-phase. Currently, the capabilities are any of:
-$$
-\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr
-CDC_CLOSE_TRAY& can close tray by software control\cr
-CDC_OPEN_TRAY& can open tray\cr
-CDC_LOCK& can lock and unlock the door\cr
-CDC_SELECT_SPEED& can select speed, in units of $\sim$150\,kB/s\cr
-CDC_SELECT_DISC& drive is juke-box\cr
-CDC_MULTI_SESSION& can read sessions $>\rm1$\cr
-CDC_MCN& can read Media Catalog Number\cr
-CDC_MEDIA_CHANGED& can report if disc has changed\cr
-CDC_PLAY_AUDIO& can perform audio-functions (play, pause, etc)\cr
-CDC_RESET& hard reset device\cr
-CDC_IOCTLS& driver has non-standard ioctls\cr
-CDC_DRIVE_STATUS& driver implements drive status\cr
-}
-$$
-The capability flag is declared $const$, to prevent drivers from
-accidentally tampering with the contents. The capability fags actually
-inform \cdromc\ of what the driver can do. If the drive found
-by the driver does not have the capability, is can be masked out by
-the $cdrom_device_info$ variable $mask$. For instance, the SCSI \cdrom\
-driver has implemented the code for loading and ejecting \cdrom's, and
-hence its corresponding flags in $capability$ will be set. But a SCSI
-\cdrom\ drive might be a caddy system, which can't load the tray, and
-hence for this drive the $cdrom_device_info$ struct will have set
-the $CDC_CLOSE_TRAY$ bit in $mask$.
-
-In the file \cdromc\ you will encounter many constructions of the type
-$$\it
-if\ (cdo\rightarrow capability \mathrel\& \mathord{\sim} cdi\rightarrow mask
- \mathrel{\&} CDC_<capability>) \ldots
-$$
-There is no $ioctl$ to set the mask\dots The reason is that
-I think it is better to control the {\em behavior\/} rather than the
-{\em capabilities}.
-
-\subsection{Options}
-
-A final flag register controls the {\em behavior\/} of the \cdrom\
-drives, in order to satisfy different users' wishes, hopefully
-independently of the ideas of the respective author who happened to
-have made the drive's support available to the \linux\ community. The
-current behavior options are:
-$$
-\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr
-CDO_AUTO_CLOSE& try to close tray upon device $open()$\cr
-CDO_AUTO_EJECT& try to open tray on last device $close()$\cr
-CDO_USE_FFLAGS& use $file_pointer\rightarrow f_flags$ to indicate
- purpose for $open()$\cr
-CDO_LOCK& try to lock door if device is opened\cr
-CDO_CHECK_TYPE& ensure disc type is data if opened for data\cr
-}
-$$
-
-The initial value of this register is $CDO_AUTO_CLOSE \mathrel|
-CDO_USE_FFLAGS \mathrel| CDO_LOCK$, reflecting my own view on user
-interface and software standards. Before you protest, there are two
-new $ioctl$s implemented in \cdromc, that allow you to control the
-behavior by software. These are:
-$$
-\halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr
-CDROM_SET_OPTIONS& set options specified in $(int)\ arg$\cr
-CDROM_CLEAR_OPTIONS& clear options specified in $(int)\ arg$\cr
-}
-$$
-One option needs some more explanation: $CDO_USE_FFLAGS$. In the next
-newsection we explain what the need for this option is.
-
-A software package {\tt setcd}, available from the Debian distribution
-and {\tt sunsite.unc.edu}, allows user level control of these flags.
-
-\newsection{The need to know the purpose of opening the \cdrom\ device}
-
-Traditionally, Unix devices can be used in two different `modes',
-either by reading/writing to the device file, or by issuing
-controlling commands to the device, by the device's $ioctl()$
-call. The problem with \cdrom\ drives, is that they can be used for
-two entirely different purposes. One is to mount removable
-file systems, \cdrom s, the other is to play audio CD's. Audio commands
-are implemented entirely through $ioctl$s, presumably because the
-first implementation (SUN?) has been such. In principle there is
-nothing wrong with this, but a good control of the `CD player' demands
-that the device can {\em always\/} be opened in order to give the
-$ioctl$ commands, regardless of the state the drive is in.
-
-On the other hand, when used as a removable-media disc drive (what the
-original purpose of \cdrom s is) we would like to make sure that the
-disc drive is ready for operation upon opening the device. In the old
-scheme, some \cdrom\ drivers don't do any integrity checking, resulting
-in a number of i/o errors reported by the VFS to the kernel when an
-attempt for mounting a \cdrom\ on an empty drive occurs. This is not a
-particularly elegant way to find out that there is no \cdrom\ inserted;
-it more-or-less looks like the old IBM-PC trying to read an empty floppy
-drive for a couple of seconds, after which the system complains it
-can't read from it. Nowadays we can {\em sense\/} the existence of a
-removable medium in a drive, and we believe we should exploit that
-fact. An integrity check on opening of the device, that verifies the
-availability of a \cdrom\ and its correct type (data), would be
-desirable.
-
-These two ways of using a \cdrom\ drive, principally for data and
-secondarily for playing audio discs, have different demands for the
-behavior of the $open()$ call. Audio use simply wants to open the
-device in order to get a file handle which is needed for issuing
-$ioctl$ commands, while data use wants to open for correct and
-reliable data transfer. The only way user programs can indicate what
-their {\em purpose\/} of opening the device is, is through the $flags$
-parameter (see {\tt {open(2)}}). For \cdrom\ devices, these flags aren't
-implemented (some drivers implement checking for write-related flags,
-but this is not strictly necessary if the device file has correct
-permission flags). Most option flags simply don't make sense to
-\cdrom\ devices: $O_CREAT$, $O_NOCTTY$, $O_TRUNC$, $O_APPEND$, and
-$O_SYNC$ have no meaning to a \cdrom.
-
-We therefore propose to use the flag $O_NONBLOCK$ to indicate
-that the device is opened just for issuing $ioctl$
-commands. Strictly, the meaning of $O_NONBLOCK$ is that opening and
-subsequent calls to the device don't cause the calling process to
-wait. We could interpret this as ``don't wait until someone has
-inserted some valid data-\cdrom.'' Thus, our proposal of the
-implementation for the $open()$ call for \cdrom s is:
-\begin{itemize}
-\item If no other flags are set than $O_RDONLY$, the device is opened
-for data transfer, and the return value will be 0 only upon successful
-initialization of the transfer. The call may even induce some actions
-on the \cdrom, such as closing the tray.
-\item If the option flag $O_NONBLOCK$ is set, opening will always be
-successful, unless the whole device doesn't exist. The drive will take
-no actions whatsoever.
-\end{itemize}
-
-\subsection{And what about standards?}
-
-You might hesitate to accept this proposal as it comes from the
-\linux\ community, and not from some standardizing institute. What
-about SUN, SGI, HP and all those other Unix and hardware vendors?
-Well, these companies are in the lucky position that they generally
-control both the hardware and software of their supported products,
-and are large enough to set their own standard. They do not have to
-deal with a dozen or more different, competing hardware
-configurations.\footnote{Incidentally, I think that SUN's approach to
-mounting \cdrom s is very good in origin: under Solaris a
-volume-daemon automatically mounts a newly inserted \cdrom\ under {\tt
-{/cdrom/$<volume-name>$/}}. In my opinion they should have pushed this
-further and have {\em every\/} \cdrom\ on the local area network be
-mounted at the similar location, \ie, no matter in which particular
-machine you insert a \cdrom, it will always appear at the same
-position in the directory tree, on every system. When I wanted to
-implement such a user-program for \linux, I came across the
-differences in behavior of the various drivers, and the need for an
-$ioctl$ informing about media changes.}
-
-We believe that using $O_NONBLOCK$ to indicate that a device is being opened
-for $ioctl$ commands only can be easily introduced in the \linux\
-community. All the CD-player authors will have to be informed, we can
-even send in our own patches to the programs. The use of $O_NONBLOCK$
-has most likely no influence on the behavior of the CD-players on
-other operating systems than \linux. Finally, a user can always revert
-to old behavior by a call to $ioctl(file_descriptor, CDROM_CLEAR_OPTIONS,
-CDO_USE_FFLAGS)$.
-
-\subsection{The preferred strategy of $open()$}
-
-The routines in \cdromc\ are designed in such a way that run-time
-configuration of the behavior of \cdrom\ devices (of {\em any\/} type)
-can be carried out, by the $CDROM_SET/CLEAR_OPTIONS$ $ioctls$. Thus, various
-modes of operation can be set:
-\begin{description}
-\item[$CDO_AUTO_CLOSE \mathrel| CDO_USE_FFLAGS \mathrel| CDO_LOCK$] This
-is the default setting. (With $CDO_CHECK_TYPE$ it will be better, in the
-future.) If the device is not yet opened by any other process, and if
-the device is being opened for data ($O_NONBLOCK$ is not set) and the
-tray is found to be open, an attempt to close the tray is made. Then,
-it is verified that a disc is in the drive and, if $CDO_CHECK_TYPE$ is
-set, that it contains tracks of type `data mode 1.' Only if all tests
-are passed is the return value zero. The door is locked to prevent file
-system corruption. If the drive is opened for audio ($O_NONBLOCK$ is
-set), no actions are taken and a value of 0 will be returned.
-\item[$CDO_AUTO_CLOSE \mathrel| CDO_AUTO_EJECT \mathrel| CDO_LOCK$] This
-mimics the behavior of the current sbpcd-driver. The option flags are
-ignored, the tray is closed on the first open, if necessary. Similarly,
-the tray is opened on the last release, \ie, if a \cdrom\ is unmounted,
-it is automatically ejected, such that the user can replace it.
-\end{description}
-We hope that these option can convince everybody (both driver
-maintainers and user program developers) to adopt the new \cdrom\
-driver scheme and option flag interpretation.
-
-\newsection{Description of routines in \cdromc}
-
-Only a few routines in \cdromc\ are exported to the drivers. In this
-new section we will discuss these, as well as the functions that `take
-over' the \cdrom\ interface to the kernel. The header file belonging
-to \cdromc\ is called \cdromh. Formerly, some of the contents of this
-file were placed in the file {\tt {ucdrom.h}}, but this file has now been
-merged back into \cdromh.
-
-\subsection{$Struct\ file_operations\ cdrom_fops$}
-
-The contents of this structure were described in section~\ref{cdrom.c}.
-A pointer to this structure is assigned to the $fops$ field
-of the $struct gendisk$.
-
-\subsection{$Int\ register_cdrom( struct\ cdrom_device_info\ * cdi)$}
-
-This function is used in about the same way one registers $cdrom_fops$
-with the kernel, the device operations and information structures,
-as described in section~\ref{cdrom.c}, should be registered with the
-\UCD:
-$$
-register_cdrom(\&<device>_info));
-$$
-This function returns zero upon success, and non-zero upon
-failure. The structure $<device>_info$ should have a pointer to the
-driver's $<device>_dops$, as in
-$$
-\vbox{\halign{&$#$\hfil\cr
-struct\ &cdrom_device_info\ <device>_info = \{\cr
-& <device>_dops;\cr
-&\ldots\cr
-\}\cr
-}}$$
-Note that a driver must have one static structure, $<device>_dops$, while
-it may have as many structures $<device>_info$ as there are minor devices
-active. $Register_cdrom()$ builds a linked list from these.
-
-\subsection{$Void\ unregister_cdrom(struct\ cdrom_device_info * cdi)$}
-
-Unregistering device $cdi$ with minor number $MINOR(cdi\to dev)$ removes
-the minor device from the list. If it was the last registered minor for
-the low-level driver, this disconnects the registered device-operation
-routines from the \cdrom\ interface. This function returns zero upon
-success, and non-zero upon failure.
-
-\subsection{$Int\ cdrom_open(struct\ inode * ip, struct\ file * fp)$}
-
-This function is not called directly by the low-level drivers, it is
-listed in the standard $cdrom_fops$. If the VFS opens a file, this
-function becomes active. A strategy is implemented in this routine,
-taking care of all capabilities and options that are set in the
-$cdrom_device_ops$ connected to the device. Then, the program flow is
-transferred to the device_dependent $open()$ call.
-
-\subsection{$Void\ cdrom_release(struct\ inode *ip, struct\ file
-*fp)$}
-
-This function implements the reverse-logic of $cdrom_open()$, and then
-calls the device-dependent $release()$ routine. When the use-count has
-reached 0, the allocated buffers are flushed by calls to $sync_dev(dev)$
-and $invalidate_buffers(dev)$.
-
-
-\subsection{$Int\ cdrom_ioctl(struct\ inode *ip, struct\ file *fp,
-unsigned\ int\ cmd, unsigned\ long\ arg)$}
-\label{cdrom-ioctl}
-
-This function handles all the standard $ioctl$ requests for \cdrom\
-devices in a uniform way. The different calls fall into three
-categories: $ioctl$s that can be directly implemented by device
-operations, ones that are routed through the call $audio_ioctl()$, and
-the remaining ones, that are presumable device-dependent. Generally, a
-negative return value indicates an error.
-
-\subsubsection{Directly implemented $ioctl$s}
-\label{ioctl-direct}
-
-The following `old' \cdrom-$ioctl$s are implemented by directly
-calling device-operations in $cdrom_device_ops$, if implemented and
-not masked:
-\begin{description}
-\item[CDROMMULTISESSION] Requests the last session on a \cdrom.
-\item[CDROMEJECT] Open tray.
-\item[CDROMCLOSETRAY] Close tray.
-\item[CDROMEJECT_SW] If $arg\not=0$, set behavior to auto-close (close
-tray on first open) and auto-eject (eject on last release), otherwise
-set behavior to non-moving on $open()$ and $release()$ calls.
-\item[CDROM_GET_MCN] Get the Media Catalog Number from a CD.
-\end{description}
-
-\subsubsection{$Ioctl$s routed through $audio_ioctl()$}
-\label{ioctl-audio}
-
-The following set of $ioctl$s are all implemented through a call to
-the $cdrom_fops$ function $audio_ioctl()$. Memory checks and
-allocation are performed in $cdrom_ioctl()$, and also sanitization of
-address format ($CDROM_LBA$/$CDROM_MSF$) is done.
-\begin{description}
-\item[CDROMSUBCHNL] Get sub-channel data in argument $arg$ of type $struct\
-cdrom_subchnl *{}$.
-\item[CDROMREADTOCHDR] Read Table of Contents header, in $arg$ of type
-$struct\ cdrom_tochdr *{}$.
-\item[CDROMREADTOCENTRY] Read a Table of Contents entry in $arg$ and
-specified by $arg$ of type $struct\ cdrom_tocentry *{}$.
-\item[CDROMPLAYMSF] Play audio fragment specified in Minute, Second,
-Frame format, delimited by $arg$ of type $struct\ cdrom_msf *{}$.
-\item[CDROMPLAYTRKIND] Play audio fragment in track-index format
-delimited by $arg$ of type $struct\ \penalty-1000 cdrom_ti *{}$.
-\item[CDROMVOLCTRL] Set volume specified by $arg$ of type $struct\
-cdrom_volctrl *{}$.
-\item[CDROMVOLREAD] Read volume into by $arg$ of type $struct\
-cdrom_volctrl *{}$.
-\item[CDROMSTART] Spin up disc.
-\item[CDROMSTOP] Stop playback of audio fragment.
-\item[CDROMPAUSE] Pause playback of audio fragment.
-\item[CDROMRESUME] Resume playing.
-\end{description}
-
-\subsubsection{New $ioctl$s in \cdromc}
-
-The following $ioctl$s have been introduced to allow user programs to
-control the behavior of individual \cdrom\ devices. New $ioctl$
-commands can be identified by the underscores in their names.
-\begin{description}
-\item[CDROM_SET_OPTIONS] Set options specified by $arg$. Returns the
-option flag register after modification. Use $arg = \rm0$ for reading
-the current flags.
-\item[CDROM_CLEAR_OPTIONS] Clear options specified by $arg$. Returns
- the option flag register after modification.
-\item[CDROM_SELECT_SPEED] Select head-rate speed of disc specified as
- by $arg$ in units of standard cdrom speed (176\,kB/sec raw data or
- 150\,kB/sec file system data). The value 0 means `auto-select', \ie,
- play audio discs at real time and data discs at maximum speed. The value
- $arg$ is checked against the maximum head rate of the drive found in the
- $cdrom_dops$.
-\item[CDROM_SELECT_DISC] Select disc numbered $arg$ from a juke-box.
- First disc is numbered 0. The number $arg$ is checked against the
- maximum number of discs in the juke-box found in the $cdrom_dops$.
-\item[CDROM_MEDIA_CHANGED] Returns 1 if a disc has been changed since
- the last call. Note that calls to $cdrom_media_changed$ by the VFS
- are treated by an independent queue, so both mechanisms will detect
- a media change once. For juke-boxes, an extra argument $arg$
- specifies the slot for which the information is given. The special
- value $CDSL_CURRENT$ requests that information about the currently
- selected slot be returned.
-\item[CDROM_DRIVE_STATUS] Returns the status of the drive by a call to
- $drive_status()$. Return values are defined in section~\ref{drive
- status}. Note that this call doesn't return information on the
- current playing activity of the drive; this can be polled through an
- $ioctl$ call to $CDROMSUBCHNL$. For juke-boxes, an extra argument
- $arg$ specifies the slot for which (possibly limited) information is
- given. The special value $CDSL_CURRENT$ requests that information
- about the currently selected slot be returned.
-\item[CDROM_DISC_STATUS] Returns the type of the disc currently in the
- drive. It should be viewed as a complement to $CDROM_DRIVE_STATUS$.
- This $ioctl$ can provide \emph {some} information about the current
- disc that is inserted in the drive. This functionality used to be
- implemented in the low level drivers, but is now carried out
- entirely in \UCD.
-
- The history of development of the CD's use as a carrier medium for
- various digital information has lead to many different disc types.
- This $ioctl$ is useful only in the case that CDs have \emph {only
- one} type of data on them. While this is often the case, it is
- also very common for CDs to have some tracks with data, and some
- tracks with audio. Because this is an existing interface, rather
- than fixing this interface by changing the assumptions it was made
- under, thereby breaking all user applications that use this
- function, the \UCD\ implements this $ioctl$ as follows: If the CD in
- question has audio tracks on it, and it has absolutely no CD-I, XA,
- or data tracks on it, it will be reported as $CDS_AUDIO$. If it has
- both audio and data tracks, it will return $CDS_MIXED$. If there
- are no audio tracks on the disc, and if the CD in question has any
- CD-I tracks on it, it will be reported as $CDS_XA_2_2$. Failing
- that, if the CD in question has any XA tracks on it, it will be
- reported as $CDS_XA_2_1$. Finally, if the CD in question has any
- data tracks on it, it will be reported as a data CD ($CDS_DATA_1$).
-
- This $ioctl$ can return:
- $$
- \halign{$#$\ \hfil&$/*$ \rm# $*/$\hfil\cr
- CDS_NO_INFO& no information available\cr
- CDS_NO_DISC& no disc is inserted, or tray is opened\cr
- CDS_AUDIO& Audio disc (2352 audio bytes/frame)\cr
- CDS_DATA_1& data disc, mode 1 (2048 user bytes/frame)\cr
- CDS_XA_2_1& mixed data (XA), mode 2, form 1 (2048 user bytes)\cr
- CDS_XA_2_2& mixed data (XA), mode 2, form 1 (2324 user bytes)\cr
- CDS_MIXED& mixed audio/data disc\cr
- }
- $$
- For some information concerning frame layout of the various disc
- types, see a recent version of \cdromh.
-
-\item[CDROM_CHANGER_NSLOTS] Returns the number of slots in a
- juke-box.
-\item[CDROMRESET] Reset the drive.
-\item[CDROM_GET_CAPABILITY] Returns the $capability$ flags for the
- drive. Refer to section \ref{capability} for more information on
- these flags.
-\item[CDROM_LOCKDOOR] Locks the door of the drive. $arg == \rm0$
- unlocks the door, any other value locks it.
-\item[CDROM_DEBUG] Turns on debugging info. Only root is allowed
- to do this. Same semantics as CDROM_LOCKDOOR.
-\end{description}
-
-\subsubsection{Device dependent $ioctl$s}
-
-Finally, all other $ioctl$s are passed to the function $dev_ioctl()$,
-if implemented. No memory allocation or verification is carried out.
-
-\newsection{How to update your driver}
-
-\begin{enumerate}
-\item Make a backup of your current driver.
-\item Get hold of the files \cdromc\ and \cdromh, they should be in
- the directory tree that came with this documentation.
-\item Make sure you include \cdromh.
-\item Change the 3rd argument of $register_blkdev$ from
-$\&<your-drive>_fops$ to $\&cdrom_fops$.
-\item Just after that line, add the following to register with the \UCD:
- $$register_cdrom(\&<your-drive>_info);$$
- Similarly, add a call to $unregister_cdrom()$ at the appropriate place.
-\item Copy an example of the device-operations $struct$ to your
- source, \eg, from {\tt {cm206.c}} $cm206_dops$, and change all
- entries to names corresponding to your driver, or names you just
- happen to like. If your driver doesn't support a certain function,
- make the entry $NULL$. At the entry $capability$ you should list all
- capabilities your driver currently supports. If your driver
- has a capability that is not listed, please send me a message.
-\item Copy the $cdrom_device_info$ declaration from the same example
- driver, and modify the entries according to your needs. If your
- driver dynamically determines the capabilities of the hardware, this
- structure should also be declared dynamically.
-\item Implement all functions in your $<device>_dops$ structure,
- according to prototypes listed in \cdromh, and specifications given
- in section~\ref{cdrom.c}. Most likely you have already implemented
- the code in a large part, and you will almost certainly need to adapt the
- prototype and return values.
-\item Rename your $<device>_ioctl()$ function to $audio_ioctl$ and
- change the prototype a little. Remove entries listed in the first
- part in section~\ref{cdrom-ioctl}, if your code was OK, these are
- just calls to the routines you adapted in the previous step.
-\item You may remove all remaining memory checking code in the
- $audio_ioctl()$ function that deals with audio commands (these are
- listed in the second part of section~\ref{cdrom-ioctl}). There is no
- need for memory allocation either, so most $case$s in the $switch$
- statement look similar to:
- $$
- case\ CDROMREADTOCENTRY\colon get_toc_entry\bigl((struct\
- cdrom_tocentry *{})\ arg\bigr);
- $$
-\item All remaining $ioctl$ cases must be moved to a separate
- function, $<device>_ioctl$, the device-dependent $ioctl$s. Note that
- memory checking and allocation must be kept in this code!
-\item Change the prototypes of $<device>_open()$ and
- $<device>_release()$, and remove any strategic code (\ie, tray
- movement, door locking, etc.).
-\item Try to recompile the drivers. We advise you to use modules, both
- for {\tt {cdrom.o}} and your driver, as debugging is much easier this
- way.
-\end{enumerate}
-
-\newsection{Thanks}
-
-Thanks to all the people involved. First, Erik Andersen, who has
-taken over the torch in maintaining \cdromc\ and integrating much
-\cdrom-related code in the 2.1-kernel. Thanks to Scott Snyder and
-Gerd Knorr, who were the first to implement this interface for SCSI
-and IDE-CD drivers and added many ideas for extension of the data
-structures relative to kernel~2.0. Further thanks to Heiko Ei{\ss}feldt,
-Thomas Quinot, Jon Tombs, Ken Pizzini, Eberhard M\"onkeberg and Andrew
-Kroll, the \linux\ \cdrom\ device driver developers who were kind
-enough to give suggestions and criticisms during the writing. Finally
-of course, I want to thank Linus Torvalds for making this possible in
-the first place.
-
-\vfill
-$ \version\ $
-\eject
-\end{document}
diff --git a/Documentation/cdrom/ide-cd b/Documentation/cdrom/ide-cd.rst
index a5f2a7f1ff46..bdccb74fc92d 100644
--- a/Documentation/cdrom/ide-cd
+++ b/Documentation/cdrom/ide-cd.rst
@@ -1,18 +1,20 @@
IDE-CD driver documentation
-Originally by scott snyder <snyder@fnald0.fnal.gov> (19 May 1996)
-Carrying on the torch is: Erik Andersen <andersee@debian.org>
-New maintainers (19 Oct 1998): Jens Axboe <axboe@image.dk>
+===========================
+
+:Originally by: scott snyder <snyder@fnald0.fnal.gov> (19 May 1996)
+:Carrying on the torch is: Erik Andersen <andersee@debian.org>
+:New maintainers (19 Oct 1998): Jens Axboe <axboe@image.dk>
1. Introduction
---------------
-The ide-cd driver should work with all ATAPI ver 1.2 to ATAPI 2.6 compliant
+The ide-cd driver should work with all ATAPI ver 1.2 to ATAPI 2.6 compliant
CDROM drives which attach to an IDE interface. Note that some CDROM vendors
(including Mitsumi, Sony, Creative, Aztech, and Goldstar) have made
both ATAPI-compliant drives and drives which use a proprietary
interface. If your drive uses one of those proprietary interfaces,
this driver will not work with it (but one of the other CDROM drivers
-probably will). This driver will not work with `ATAPI' drives which
+probably will). This driver will not work with `ATAPI` drives which
attach to the parallel port. In addition, there is at least one drive
(CyCDROM CR520ie) which attaches to the IDE port but is not ATAPI;
this driver will not work with drives like that either (but see the
@@ -31,7 +33,7 @@ This driver provides the following features:
from audio tracks. The program cdda2wav can be used for this.
Note, however, that only some drives actually support this.
- - There is now support for CDROM changers which comply with the
+ - There is now support for CDROM changers which comply with the
ATAPI 2.6 draft standard (such as the NEC CDR-251). This additional
functionality includes a function call to query which slot is the
currently selected slot, a function call to query which slots contain
@@ -45,22 +47,22 @@ This driver provides the following features:
---------------
0. The ide-cd relies on the ide disk driver. See
- Documentation/ide/ide.txt for up-to-date information on the ide
+ Documentation/ide/ide.rst for up-to-date information on the ide
driver.
1. Make sure that the ide and ide-cd drivers are compiled into the
- kernel you're using. When configuring the kernel, in the section
- entitled "Floppy, IDE, and other block devices", say either `Y'
- (which will compile the support directly into the kernel) or `M'
+ kernel you're using. When configuring the kernel, in the section
+ entitled "Floppy, IDE, and other block devices", say either `Y`
+ (which will compile the support directly into the kernel) or `M`
(to compile support as a module which can be loaded and unloaded)
- to the options:
+ to the options::
ATA/ATAPI/MFM/RLL support
Include IDE/ATAPI CDROM support
Depending on what type of IDE interface you have, you may need to
specify additional configuration options. See
- Documentation/ide/ide.txt.
+ Documentation/ide/ide.rst.
2. You should also ensure that the iso9660 filesystem is either
compiled into the kernel or available as a loadable module. You
@@ -72,35 +74,35 @@ This driver provides the following features:
address and an IRQ number, the standard assignments being
0x1f0 and 14 for the primary interface and 0x170 and 15 for the
secondary interface. Each interface can control up to two devices,
- where each device can be a hard drive, a CDROM drive, a floppy drive,
- or a tape drive. The two devices on an interface are called `master'
- and `slave'; this is usually selectable via a jumper on the drive.
+ where each device can be a hard drive, a CDROM drive, a floppy drive,
+ or a tape drive. The two devices on an interface are called `master`
+ and `slave`; this is usually selectable via a jumper on the drive.
Linux names these devices as follows. The master and slave devices
- on the primary IDE interface are called `hda' and `hdb',
+ on the primary IDE interface are called `hda` and `hdb`,
respectively. The drives on the secondary interface are called
- `hdc' and `hdd'. (Interfaces at other locations get other letters
- in the third position; see Documentation/ide/ide.txt.)
+ `hdc` and `hdd`. (Interfaces at other locations get other letters
+ in the third position; see Documentation/ide/ide.rst.)
If you want your CDROM drive to be found automatically by the
driver, you should make sure your IDE interface uses either the
primary or secondary addresses mentioned above. In addition, if
the CDROM drive is the only device on the IDE interface, it should
- be jumpered as `master'. (If for some reason you cannot configure
+ be jumpered as `master`. (If for some reason you cannot configure
your system in this manner, you can probably still use the driver.
You may have to pass extra configuration information to the kernel
- when you boot, however. See Documentation/ide/ide.txt for more
+ when you boot, however. See Documentation/ide/ide.rst for more
information.)
4. Boot the system. If the drive is recognized, you should see a
- message which looks like
+ message which looks like::
hdb: NEC CD-ROM DRIVE:260, ATAPI CDROM drive
If you do not see this, see section 5 below.
5. You may want to create a symbolic link /dev/cdrom pointing to the
- actual device. You can do this with the command
+ actual device. You can do this with the command::
ln -s /dev/hdX /dev/cdrom
@@ -108,14 +110,14 @@ This driver provides the following features:
drive is installed.
6. You should be able to see any error messages from the driver with
- the `dmesg' command.
+ the `dmesg` command.
3. Basic usage
--------------
-An ISO 9660 CDROM can be mounted by putting the disc in the drive and
-typing (as root)
+An ISO 9660 CDROM can be mounted by putting the disc in the drive and
+typing (as root)::
mount -t iso9660 /dev/cdrom /mnt/cdrom
@@ -123,7 +125,7 @@ where it is assumed that /dev/cdrom is a link pointing to the actual
device (as described in step 5 of the last section) and /mnt/cdrom is
an empty directory. You should now be able to see the contents of the
CDROM under the /mnt/cdrom directory. If you want to eject the CDROM,
-you must first dismount it with a command like
+you must first dismount it with a command like::
umount /mnt/cdrom
@@ -148,7 +150,7 @@ such as cdda2wav. The only types of drive which I've heard support
this are Sony and Toshiba drives. You will get errors if you try to
use this function on a drive which does not support it.
-For supported changers, you can use the `cdchange' program (appended to
+For supported changers, you can use the `cdchange` program (appended to
the end of this file) to switch between changer slots. Note that the
drive should be unmounted before attempting this. The program takes
two arguments: the CDROM device, and the slot number to which you wish
@@ -161,17 +163,17 @@ to change. If the slot number is -1, the drive is unloaded.
This section discusses some common problems encountered when trying to
use the driver, and some possible solutions. Note that if you are
experiencing problems, you should probably also review
-Documentation/ide/ide.txt for current information about the underlying
+Documentation/ide/ide.rst for current information about the underlying
IDE support code. Some of these items apply only to earlier versions
of the driver, but are mentioned here for completeness.
-In most cases, you should probably check with `dmesg' for any errors
+In most cases, you should probably check with `dmesg` for any errors
from the driver.
a. Drive is not detected during booting.
- Review the configuration instructions above and in
- Documentation/ide/ide.txt, and check how your hardware is
+ Documentation/ide/ide.rst, and check how your hardware is
configured.
- If your drive is the only device on an IDE interface, it should
@@ -179,14 +181,14 @@ a. Drive is not detected during booting.
- If your IDE interface is not at the standard addresses of 0x170
or 0x1f0, you'll need to explicitly inform the driver using a
- lilo option. See Documentation/ide/ide.txt. (This feature was
+ lilo option. See Documentation/ide/ide.rst. (This feature was
added around kernel version 1.3.30.)
- If the autoprobing is not finding your drive, you can tell the
driver to assume that one exists by using a lilo option of the
- form `hdX=cdrom', where X is the drive letter corresponding to
- where your drive is installed. Note that if you do this and you
- see a boot message like
+ form `hdX=cdrom`, where X is the drive letter corresponding to
+ where your drive is installed. Note that if you do this and you
+ see a boot message like::
hdX: ATAPI cdrom (?)
@@ -205,7 +207,7 @@ a. Drive is not detected during booting.
Support for some interfaces needing extra initialization is
provided in later 1.3.x kernels. You may need to turn on
additional kernel configuration options to get them to work;
- see Documentation/ide/ide.txt.
+ see Documentation/ide/ide.rst.
Even if support is not available for your interface, you may be
able to get it to work with the following procedure. First boot
@@ -220,7 +222,7 @@ b. Timeout/IRQ errors.
probably not making it to the host.
- IRQ problems may also be indicated by the message
- `IRQ probe failed (<n>)' while booting. If <n> is zero, that
+ `IRQ probe failed (<n>)` while booting. If <n> is zero, that
means that the system did not see an interrupt from the drive when
it was expecting one (on any feasible IRQ). If <n> is negative,
that means the system saw interrupts on multiple IRQ lines, when
@@ -240,27 +242,27 @@ b. Timeout/IRQ errors.
there are hardware problems with the interrupt setup; they
apparently don't use interrupts.
- - If you own a Pioneer DR-A24X, you _will_ get nasty error messages
+ - If you own a Pioneer DR-A24X, you _will_ get nasty error messages
on boot such as "irq timeout: status=0x50 { DriveReady SeekComplete }"
The Pioneer DR-A24X CDROM drives are fairly popular these days.
Unfortunately, these drives seem to become very confused when we perform
the standard Linux ATA disk drive probe. If you own one of these drives,
- you can bypass the ATA probing which confuses these CDROM drives, by
- adding `append="hdX=noprobe hdX=cdrom"' to your lilo.conf file and running
- lilo (again where X is the drive letter corresponding to where your drive
+ you can bypass the ATA probing which confuses these CDROM drives, by
+ adding `append="hdX=noprobe hdX=cdrom"` to your lilo.conf file and running
+ lilo (again where X is the drive letter corresponding to where your drive
is installed.)
-
+
c. System hangups.
- If the system locks up when you try to access the CDROM, the most
likely cause is that you have a buggy IDE adapter which doesn't
properly handle simultaneous transactions on multiple interfaces.
The most notorious of these is the CMD640B chip. This problem can
- be worked around by specifying the `serialize' option when
+ be worked around by specifying the `serialize` option when
booting. Recent kernels should be able to detect the need for
this automatically in most cases, but the detection is not
- foolproof. See Documentation/ide/ide.txt for more information
- about the `serialize' option and the CMD640B.
+ foolproof. See Documentation/ide/ide.rst for more information
+ about the `serialize` option and the CMD640B.
- Note that many MS-DOS CDROM drivers will work with such buggy
hardware, apparently because they never attempt to overlap CDROM
@@ -269,14 +271,14 @@ c. System hangups.
d. Can't mount a CDROM.
- - If you get errors from mount, it may help to check `dmesg' to see
+ - If you get errors from mount, it may help to check `dmesg` to see
if there are any more specific errors from the driver or from the
filesystem.
- Make sure there's a CDROM loaded in the drive, and that's it's an
ISO 9660 disc. You can't mount an audio CD.
- - With the CDROM in the drive and unmounted, try something like
+ - With the CDROM in the drive and unmounted, try something like::
cat /dev/cdrom | od | more
@@ -284,9 +286,9 @@ d. Can't mount a CDROM.
OK, and the problem is at the filesystem level (i.e., the CDROM is
not ISO 9660 or has errors in the filesystem structure).
- - If you see `not a block device' errors, check that the definitions
+ - If you see `not a block device` errors, check that the definitions
of the device special files are correct. They should be as
- follows:
+ follows::
brw-rw---- 1 root disk 3, 0 Nov 11 18:48 /dev/hda
brw-rw---- 1 root disk 3, 64 Nov 11 18:48 /dev/hdb
@@ -301,7 +303,7 @@ d. Can't mount a CDROM.
If you have a /dev/cdrom symbolic link, check that it is pointing
to the correct device file.
- If you hear people talking of the devices `hd1a' and `hd1b', these
+ If you hear people talking of the devices `hd1a` and `hd1b`, these
were old names for what are now called hdc and hdd. Those names
should be considered obsolete.
@@ -311,8 +313,8 @@ d. Can't mount a CDROM.
always give meaningful error messages.
-e. Directory listings are unpredictably truncated, and `dmesg' shows
- `buffer botch' error messages from the driver.
+e. Directory listings are unpredictably truncated, and `dmesg` shows
+ `buffer botch` error messages from the driver.
- There was a bug in the version of the driver in 1.2.x kernels
which could cause this. It was fixed in 1.3.0. If you can't
@@ -335,34 +337,36 @@ f. Data corruption.
5. cdchange.c
-------------
-/*
- * cdchange.c [-v] <device> [<slot>]
- *
- * This loads a CDROM from a specified slot in a changer, and displays
- * information about the changer status. The drive should be unmounted before
- * using this program.
- *
- * Changer information is displayed if either the -v flag is specified
- * or no slot was specified.
- *
- * Based on code originally from Gerhard Zuber <zuber@berlin.snafu.de>.
- * Changer status information, and rewrite for the new Uniform CDROM driver
- * interface by Erik Andersen <andersee@debian.org>.
- */
-
-#include <stdio.h>
-#include <stdlib.h>
-#include <errno.h>
-#include <string.h>
-#include <unistd.h>
-#include <fcntl.h>
-#include <sys/ioctl.h>
-#include <linux/cdrom.h>
-
-
-int
-main (int argc, char **argv)
-{
+::
+
+ /*
+ * cdchange.c [-v] <device> [<slot>]
+ *
+ * This loads a CDROM from a specified slot in a changer, and displays
+ * information about the changer status. The drive should be unmounted before
+ * using this program.
+ *
+ * Changer information is displayed if either the -v flag is specified
+ * or no slot was specified.
+ *
+ * Based on code originally from Gerhard Zuber <zuber@berlin.snafu.de>.
+ * Changer status information, and rewrite for the new Uniform CDROM driver
+ * interface by Erik Andersen <andersee@debian.org>.
+ */
+
+ #include <stdio.h>
+ #include <stdlib.h>
+ #include <errno.h>
+ #include <string.h>
+ #include <unistd.h>
+ #include <fcntl.h>
+ #include <sys/ioctl.h>
+ #include <linux/cdrom.h>
+
+
+ int
+ main (int argc, char **argv)
+ {
char *program;
char *device;
int fd; /* file descriptor for CD-ROM device */
@@ -382,30 +386,30 @@ main (int argc, char **argv)
fprintf (stderr, " Slots are numbered 1 -- n.\n");
exit (1);
}
-
+
if (strcmp (argv[0], "-v") == 0) {
verbose = 1;
++argv;
--argc;
}
-
+
device = argv[0];
-
+
if (argc == 2)
slot = atoi (argv[1]) - 1;
- /* open device */
+ /* open device */
fd = open(device, O_RDONLY | O_NONBLOCK);
if (fd < 0) {
- fprintf (stderr, "%s: open failed for `%s': %s\n",
+ fprintf (stderr, "%s: open failed for `%s`: %s\n",
program, device, strerror (errno));
exit (1);
}
- /* Check CD player status */
+ /* Check CD player status */
total_slots_available = ioctl (fd, CDROM_CHANGER_NSLOTS);
if (total_slots_available <= 1 ) {
- fprintf (stderr, "%s: Device `%s' is not an ATAPI "
+ fprintf (stderr, "%s: Device `%s` is not an ATAPI "
"compliant CD changer.\n", program, device);
exit (1);
}
@@ -418,7 +422,7 @@ main (int argc, char **argv)
exit (1);
}
- /* load */
+ /* load */
slot=ioctl (fd, CDROM_SELECT_DISC, slot);
if (slot<0) {
fflush(stdout);
@@ -462,14 +466,14 @@ main (int argc, char **argv)
for (x_slot=0; x_slot<total_slots_available; x_slot++) {
printf ("Slot %2d: ", x_slot+1);
- status = ioctl (fd, CDROM_DRIVE_STATUS, x_slot);
- if (status<0) {
- perror(" CDROM_DRIVE_STATUS");
- } else switch(status) {
+ status = ioctl (fd, CDROM_DRIVE_STATUS, x_slot);
+ if (status<0) {
+ perror(" CDROM_DRIVE_STATUS");
+ } else switch(status) {
case CDS_DISC_OK:
printf ("Disc present.");
break;
- case CDS_NO_DISC:
+ case CDS_NO_DISC:
printf ("Empty slot.");
break;
case CDS_TRAY_OPEN:
@@ -507,11 +511,11 @@ main (int argc, char **argv)
break;
}
}
- status = ioctl (fd, CDROM_MEDIA_CHANGED, x_slot);
- if (status<0) {
+ status = ioctl (fd, CDROM_MEDIA_CHANGED, x_slot);
+ if (status<0) {
perror(" CDROM_MEDIA_CHANGED");
- }
- switch (status) {
+ }
+ switch (status) {
case 1:
printf ("Changed.\n");
break;
@@ -525,10 +529,10 @@ main (int argc, char **argv)
/* close device */
status = close (fd);
if (status != 0) {
- fprintf (stderr, "%s: close failed for `%s': %s\n",
+ fprintf (stderr, "%s: close failed for `%s`: %s\n",
program, device, strerror (errno));
exit (1);
}
-
+
exit (0);
-}
+ }
diff --git a/Documentation/cdrom/index.rst b/Documentation/cdrom/index.rst
new file mode 100644
index 000000000000..efbd5d111825
--- /dev/null
+++ b/Documentation/cdrom/index.rst
@@ -0,0 +1,19 @@
+:orphan:
+
+=====
+cdrom
+=====
+
+.. toctree::
+ :maxdepth: 1
+
+ cdrom-standard
+ ide-cd
+ packet-writing
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/cdrom/packet-writing.txt b/Documentation/cdrom/packet-writing.rst
index 2834170d821e..c5c957195a5a 100644
--- a/Documentation/cdrom/packet-writing.txt
+++ b/Documentation/cdrom/packet-writing.rst
@@ -1,3 +1,7 @@
+==============
+Packet writing
+==============
+
Getting started quick
---------------------
@@ -10,13 +14,16 @@ Getting started quick
Download from http://sourceforge.net/projects/linux-udf/
- Grab a new CD-RW disc and format it (assuming CD-RW is hdc, substitute
- as appropriate):
+ as appropriate)::
+
# cdrwtool -d /dev/hdc -q
-- Setup your writer
+- Setup your writer::
+
# pktsetup dev_name /dev/hdc
-- Now you can mount /dev/pktcdvd/dev_name and copy files to it. Enjoy!
+- Now you can mount /dev/pktcdvd/dev_name and copy files to it. Enjoy::
+
# mount /dev/pktcdvd/dev_name /cdrom -t udf -o rw,noatime
@@ -25,11 +32,11 @@ Packet writing for DVD-RW media
DVD-RW discs can be written to much like CD-RW discs if they are in
the so called "restricted overwrite" mode. To put a disc in restricted
-overwrite mode, run:
+overwrite mode, run::
# dvd+rw-format /dev/hdc
-You can then use the disc the same way you would use a CD-RW disc:
+You can then use the disc the same way you would use a CD-RW disc::
# pktsetup dev_name /dev/hdc
# mount /dev/pktcdvd/dev_name /cdrom -t udf -o rw,noatime
@@ -41,7 +48,7 @@ Packet writing for DVD+RW media
According to the DVD+RW specification, a drive supporting DVD+RW discs
shall implement "true random writes with 2KB granularity", which means
that it should be possible to put any filesystem with a block size >=
-2KB on such a disc. For example, it should be possible to do:
+2KB on such a disc. For example, it should be possible to do::
# dvd+rw-format /dev/hdc (only needed if the disc has never
been formatted)
@@ -54,7 +61,7 @@ follow the specification, but suffer bad performance problems if the
writes are not 32KB aligned.
Both problems can be solved by using the pktcdvd driver, which always
-generates aligned writes.
+generates aligned writes::
# dvd+rw-format /dev/hdc
# pktsetup dev_name /dev/hdc
@@ -83,7 +90,7 @@ Notes
- Since the pktcdvd driver makes the disc appear as a regular block
device with a 2KB block size, you can put any filesystem you like on
- the disc. For example, run:
+ the disc. For example, run::
# /sbin/mke2fs /dev/pktcdvd/dev_name
@@ -97,7 +104,7 @@ Since Linux 2.6.20, the pktcdvd module has a sysfs interface
and can be controlled by it. For example the "pktcdvd" tool uses
this interface. (see http://tom.ist-im-web.de/download/pktcdvd )
-"pktcdvd" works similar to "pktsetup", e.g.:
+"pktcdvd" works similar to "pktsetup", e.g.::
# pktcdvd -a dev_name /dev/hdc
# mkudffs /dev/pktcdvd/dev_name
@@ -115,7 +122,7 @@ For a description of the sysfs interface look into the file:
Using the pktcdvd debugfs interface
-----------------------------------
-To read pktcdvd device infos in human readable form, do:
+To read pktcdvd device infos in human readable form, do::
# cat /sys/kernel/debug/pktcdvd/pktcdvd[0-7]/info
diff --git a/Documentation/conf.py b/Documentation/conf.py
index 7ace3f8852bd..3b2397bcb565 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -34,7 +34,8 @@ needs_sphinx = '1.3'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
-extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain', 'kfigure', 'sphinx.ext.ifconfig']
+extensions = ['kerneldoc', 'rstFlatTable', 'kernel_include', 'cdomain',
+ 'kfigure', 'sphinx.ext.ifconfig', 'automarkup']
# The name of the math extension changed on Sphinx 1.4
if (major == 1 and minor > 3) or (major > 1):
@@ -200,7 +201,7 @@ html_context = {
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
-#html_use_smartypants = True
+html_use_smartypants = False
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index ee1bb8983a88..322ac954b390 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -34,6 +34,8 @@ Core utilities
timekeeping
boot-time-mm
memory-hotplug
+ protection-keys
+ ../RCU/index
Interfaces for kernel debugging
diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst
index a29c99d13331..824f24ccf401 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -33,6 +33,9 @@ String Conversions
.. kernel-doc:: lib/kstrtox.c
:export:
+.. kernel-doc:: lib/string_helpers.c
+ :export:
+
String Manipulation
-------------------
@@ -138,6 +141,15 @@ Base 2 log and power Functions
.. kernel-doc:: include/linux/log2.h
:internal:
+Integer power Functions
+-----------------------
+
+.. kernel-doc:: lib/math/int_pow.c
+ :export:
+
+.. kernel-doc:: lib/math/int_sqrt.c
+ :export:
+
Division Functions
------------------
@@ -358,8 +370,6 @@ Read-Copy Update (RCU)
.. kernel-doc:: kernel/rcu/tree.c
-.. kernel-doc:: kernel/rcu/tree_plugin.h
-
.. kernel-doc:: kernel/rcu/tree_exp.h
.. kernel-doc:: kernel/rcu/update.c
diff --git a/Documentation/x86/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 49d9833af871..49d9833af871 100644
--- a/Documentation/x86/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
diff --git a/Documentation/core-api/timekeeping.rst b/Documentation/core-api/timekeeping.rst
index 20ee447a50f3..c0ffa30c7c37 100644
--- a/Documentation/core-api/timekeeping.rst
+++ b/Documentation/core-api/timekeeping.rst
@@ -115,7 +115,7 @@ Some additional variants exist for more specialized cases:
void ktime_get_coarse_clocktai_ts64( struct timespec64 * )
These are quicker than the non-coarse versions, but less accurate,
- corresponding to CLOCK_MONONOTNIC_COARSE and CLOCK_REALTIME_COARSE
+ corresponding to CLOCK_MONOTONIC_COARSE and CLOCK_REALTIME_COARSE
in user space, along with the equivalent boottime/tai/raw
timebase not available in user space.
diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst
index ef6f9f98f595..fcedc5349ace 100644
--- a/Documentation/core-api/xarray.rst
+++ b/Documentation/core-api/xarray.rst
@@ -30,27 +30,27 @@ it called marks. Each mark may be set or cleared independently of
the others. You can iterate over entries which are marked.
Normal pointers may be stored in the XArray directly. They must be 4-byte
-aligned, which is true for any pointer returned from :c:func:`kmalloc` and
-:c:func:`alloc_page`. It isn't true for arbitrary user-space pointers,
+aligned, which is true for any pointer returned from kmalloc() and
+alloc_page(). It isn't true for arbitrary user-space pointers,
nor for function pointers. You can store pointers to statically allocated
objects, as long as those objects have an alignment of at least 4.
You can also store integers between 0 and ``LONG_MAX`` in the XArray.
-You must first convert it into an entry using :c:func:`xa_mk_value`.
+You must first convert it into an entry using xa_mk_value().
When you retrieve an entry from the XArray, you can check whether it is
-a value entry by calling :c:func:`xa_is_value`, and convert it back to
-an integer by calling :c:func:`xa_to_value`.
+a value entry by calling xa_is_value(), and convert it back to
+an integer by calling xa_to_value().
Some users want to store tagged pointers instead of using the marks
-described above. They can call :c:func:`xa_tag_pointer` to create an
-entry with a tag, :c:func:`xa_untag_pointer` to turn a tagged entry
-back into an untagged pointer and :c:func:`xa_pointer_tag` to retrieve
+described above. They can call xa_tag_pointer() to create an
+entry with a tag, xa_untag_pointer() to turn a tagged entry
+back into an untagged pointer and xa_pointer_tag() to retrieve
the tag of an entry. Tagged pointers use the same bits that are used
to distinguish value entries from normal pointers, so each user must
decide whether they want to store value entries or tagged pointers in
any particular XArray.
-The XArray does not support storing :c:func:`IS_ERR` pointers as some
+The XArray does not support storing IS_ERR() pointers as some
conflict with value entries or internal entries.
An unusual feature of the XArray is the ability to create entries which
@@ -64,89 +64,89 @@ entry will cause the XArray to forget about the range.
Normal API
==========
-Start by initialising an XArray, either with :c:func:`DEFINE_XARRAY`
-for statically allocated XArrays or :c:func:`xa_init` for dynamically
+Start by initialising an XArray, either with DEFINE_XARRAY()
+for statically allocated XArrays or xa_init() for dynamically
allocated ones. A freshly-initialised XArray contains a ``NULL``
pointer at every index.
-You can then set entries using :c:func:`xa_store` and get entries
-using :c:func:`xa_load`. xa_store will overwrite any entry with the
+You can then set entries using xa_store() and get entries
+using xa_load(). xa_store will overwrite any entry with the
new entry and return the previous entry stored at that index. You can
-use :c:func:`xa_erase` instead of calling :c:func:`xa_store` with a
+use xa_erase() instead of calling xa_store() with a
``NULL`` entry. There is no difference between an entry that has never
been stored to, one that has been erased and one that has most recently
had ``NULL`` stored to it.
You can conditionally replace an entry at an index by using
-:c:func:`xa_cmpxchg`. Like :c:func:`cmpxchg`, it will only succeed if
+xa_cmpxchg(). Like cmpxchg(), it will only succeed if
the entry at that index has the 'old' value. It also returns the entry
which was at that index; if it returns the same entry which was passed as
-'old', then :c:func:`xa_cmpxchg` succeeded.
+'old', then xa_cmpxchg() succeeded.
If you want to only store a new entry to an index if the current entry
-at that index is ``NULL``, you can use :c:func:`xa_insert` which
+at that index is ``NULL``, you can use xa_insert() which
returns ``-EBUSY`` if the entry is not empty.
You can enquire whether a mark is set on an entry by using
-:c:func:`xa_get_mark`. If the entry is not ``NULL``, you can set a mark
-on it by using :c:func:`xa_set_mark` and remove the mark from an entry by
-calling :c:func:`xa_clear_mark`. You can ask whether any entry in the
-XArray has a particular mark set by calling :c:func:`xa_marked`.
+xa_get_mark(). If the entry is not ``NULL``, you can set a mark
+on it by using xa_set_mark() and remove the mark from an entry by
+calling xa_clear_mark(). You can ask whether any entry in the
+XArray has a particular mark set by calling xa_marked().
You can copy entries out of the XArray into a plain array by calling
-:c:func:`xa_extract`. Or you can iterate over the present entries in
-the XArray by calling :c:func:`xa_for_each`. You may prefer to use
-:c:func:`xa_find` or :c:func:`xa_find_after` to move to the next present
+xa_extract(). Or you can iterate over the present entries in
+the XArray by calling xa_for_each(). You may prefer to use
+xa_find() or xa_find_after() to move to the next present
entry in the XArray.
-Calling :c:func:`xa_store_range` stores the same entry in a range
+Calling xa_store_range() stores the same entry in a range
of indices. If you do this, some of the other operations will behave
in a slightly odd way. For example, marking the entry at one index
may result in the entry being marked at some, but not all of the other
indices. Storing into one index may result in the entry retrieved by
some, but not all of the other indices changing.
-Sometimes you need to ensure that a subsequent call to :c:func:`xa_store`
-will not need to allocate memory. The :c:func:`xa_reserve` function
+Sometimes you need to ensure that a subsequent call to xa_store()
+will not need to allocate memory. The xa_reserve() function
will store a reserved entry at the indicated index. Users of the
normal API will see this entry as containing ``NULL``. If you do
-not need to use the reserved entry, you can call :c:func:`xa_release`
+not need to use the reserved entry, you can call xa_release()
to remove the unused entry. If another user has stored to the entry
-in the meantime, :c:func:`xa_release` will do nothing; if instead you
-want the entry to become ``NULL``, you should use :c:func:`xa_erase`.
-Using :c:func:`xa_insert` on a reserved entry will fail.
+in the meantime, xa_release() will do nothing; if instead you
+want the entry to become ``NULL``, you should use xa_erase().
+Using xa_insert() on a reserved entry will fail.
-If all entries in the array are ``NULL``, the :c:func:`xa_empty` function
+If all entries in the array are ``NULL``, the xa_empty() function
will return ``true``.
Finally, you can remove all entries from an XArray by calling
-:c:func:`xa_destroy`. If the XArray entries are pointers, you may wish
+xa_destroy(). If the XArray entries are pointers, you may wish
to free the entries first. You can do this by iterating over all present
-entries in the XArray using the :c:func:`xa_for_each` iterator.
+entries in the XArray using the xa_for_each() iterator.
Allocating XArrays
------------------
-If you use :c:func:`DEFINE_XARRAY_ALLOC` to define the XArray, or
-initialise it by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`,
+If you use DEFINE_XARRAY_ALLOC() to define the XArray, or
+initialise it by passing ``XA_FLAGS_ALLOC`` to xa_init_flags(),
the XArray changes to track whether entries are in use or not.
-You can call :c:func:`xa_alloc` to store the entry at an unused index
+You can call xa_alloc() to store the entry at an unused index
in the XArray. If you need to modify the array from interrupt context,
-you can use :c:func:`xa_alloc_bh` or :c:func:`xa_alloc_irq` to disable
+you can use xa_alloc_bh() or xa_alloc_irq() to disable
interrupts while allocating the ID.
-Using :c:func:`xa_store`, :c:func:`xa_cmpxchg` or :c:func:`xa_insert` will
+Using xa_store(), xa_cmpxchg() or xa_insert() will
also mark the entry as being allocated. Unlike a normal XArray, storing
-``NULL`` will mark the entry as being in use, like :c:func:`xa_reserve`.
-To free an entry, use :c:func:`xa_erase` (or :c:func:`xa_release` if
+``NULL`` will mark the entry as being in use, like xa_reserve().
+To free an entry, use xa_erase() (or xa_release() if
you only want to free the entry if it's ``NULL``).
By default, the lowest free entry is allocated starting from 0. If you
want to allocate entries starting at 1, it is more efficient to use
-:c:func:`DEFINE_XARRAY_ALLOC1` or ``XA_FLAGS_ALLOC1``. If you want to
+DEFINE_XARRAY_ALLOC1() or ``XA_FLAGS_ALLOC1``. If you want to
allocate IDs up to a maximum, then wrap back around to the lowest free
-ID, you can use :c:func:`xa_alloc_cyclic`.
+ID, you can use xa_alloc_cyclic().
You cannot use ``XA_MARK_0`` with an allocating XArray as this mark
is used to track whether an entry is free or not. The other marks are
@@ -155,17 +155,17 @@ available for your use.
Memory allocation
-----------------
-The :c:func:`xa_store`, :c:func:`xa_cmpxchg`, :c:func:`xa_alloc`,
-:c:func:`xa_reserve` and :c:func:`xa_insert` functions take a gfp_t
+The xa_store(), xa_cmpxchg(), xa_alloc(),
+xa_reserve() and xa_insert() functions take a gfp_t
parameter in case the XArray needs to allocate memory to store this entry.
If the entry is being deleted, no memory allocation needs to be performed,
and the GFP flags specified will be ignored.
It is possible for no memory to be allocatable, particularly if you pass
a restrictive set of GFP flags. In that case, the functions return a
-special value which can be turned into an errno using :c:func:`xa_err`.
+special value which can be turned into an errno using xa_err().
If you don't need to know exactly which error occurred, using
-:c:func:`xa_is_err` is slightly more efficient.
+xa_is_err() is slightly more efficient.
Locking
-------
@@ -174,54 +174,54 @@ When using the Normal API, you do not have to worry about locking.
The XArray uses RCU and an internal spinlock to synchronise access:
No lock needed:
- * :c:func:`xa_empty`
- * :c:func:`xa_marked`
+ * xa_empty()
+ * xa_marked()
Takes RCU read lock:
- * :c:func:`xa_load`
- * :c:func:`xa_for_each`
- * :c:func:`xa_find`
- * :c:func:`xa_find_after`
- * :c:func:`xa_extract`
- * :c:func:`xa_get_mark`
+ * xa_load()
+ * xa_for_each()
+ * xa_find()
+ * xa_find_after()
+ * xa_extract()
+ * xa_get_mark()
Takes xa_lock internally:
- * :c:func:`xa_store`
- * :c:func:`xa_store_bh`
- * :c:func:`xa_store_irq`
- * :c:func:`xa_insert`
- * :c:func:`xa_insert_bh`
- * :c:func:`xa_insert_irq`
- * :c:func:`xa_erase`
- * :c:func:`xa_erase_bh`
- * :c:func:`xa_erase_irq`
- * :c:func:`xa_cmpxchg`
- * :c:func:`xa_cmpxchg_bh`
- * :c:func:`xa_cmpxchg_irq`
- * :c:func:`xa_store_range`
- * :c:func:`xa_alloc`
- * :c:func:`xa_alloc_bh`
- * :c:func:`xa_alloc_irq`
- * :c:func:`xa_reserve`
- * :c:func:`xa_reserve_bh`
- * :c:func:`xa_reserve_irq`
- * :c:func:`xa_destroy`
- * :c:func:`xa_set_mark`
- * :c:func:`xa_clear_mark`
+ * xa_store()
+ * xa_store_bh()
+ * xa_store_irq()
+ * xa_insert()
+ * xa_insert_bh()
+ * xa_insert_irq()
+ * xa_erase()
+ * xa_erase_bh()
+ * xa_erase_irq()
+ * xa_cmpxchg()
+ * xa_cmpxchg_bh()
+ * xa_cmpxchg_irq()
+ * xa_store_range()
+ * xa_alloc()
+ * xa_alloc_bh()
+ * xa_alloc_irq()
+ * xa_reserve()
+ * xa_reserve_bh()
+ * xa_reserve_irq()
+ * xa_destroy()
+ * xa_set_mark()
+ * xa_clear_mark()
Assumes xa_lock held on entry:
- * :c:func:`__xa_store`
- * :c:func:`__xa_insert`
- * :c:func:`__xa_erase`
- * :c:func:`__xa_cmpxchg`
- * :c:func:`__xa_alloc`
- * :c:func:`__xa_set_mark`
- * :c:func:`__xa_clear_mark`
+ * __xa_store()
+ * __xa_insert()
+ * __xa_erase()
+ * __xa_cmpxchg()
+ * __xa_alloc()
+ * __xa_set_mark()
+ * __xa_clear_mark()
If you want to take advantage of the lock to protect the data structures
-that you are storing in the XArray, you can call :c:func:`xa_lock`
-before calling :c:func:`xa_load`, then take a reference count on the
-object you have found before calling :c:func:`xa_unlock`. This will
+that you are storing in the XArray, you can call xa_lock()
+before calling xa_load(), then take a reference count on the
+object you have found before calling xa_unlock(). This will
prevent stores from removing the object from the array between looking
up the object and incrementing the refcount. You can also use RCU to
avoid dereferencing freed memory, but an explanation of that is beyond
@@ -261,7 +261,7 @@ context and then erase them in softirq context, you can do that this way::
}
If you are going to modify the XArray from interrupt or softirq context,
-you need to initialise the array using :c:func:`xa_init_flags`, passing
+you need to initialise the array using xa_init_flags(), passing
``XA_FLAGS_LOCK_IRQ`` or ``XA_FLAGS_LOCK_BH``.
The above example also shows a common pattern of wanting to extend the
@@ -269,20 +269,20 @@ coverage of the xa_lock on the store side to protect some statistics
associated with the array.
Sharing the XArray with interrupt context is also possible, either
-using :c:func:`xa_lock_irqsave` in both the interrupt handler and process
-context, or :c:func:`xa_lock_irq` in process context and :c:func:`xa_lock`
+using xa_lock_irqsave() in both the interrupt handler and process
+context, or xa_lock_irq() in process context and xa_lock()
in the interrupt handler. Some of the more common patterns have helper
-functions such as :c:func:`xa_store_bh`, :c:func:`xa_store_irq`,
-:c:func:`xa_erase_bh`, :c:func:`xa_erase_irq`, :c:func:`xa_cmpxchg_bh`
-and :c:func:`xa_cmpxchg_irq`.
+functions such as xa_store_bh(), xa_store_irq(),
+xa_erase_bh(), xa_erase_irq(), xa_cmpxchg_bh()
+and xa_cmpxchg_irq().
Sometimes you need to protect access to the XArray with a mutex because
that lock sits above another mutex in the locking hierarchy. That does
-not entitle you to use functions like :c:func:`__xa_erase` without taking
+not entitle you to use functions like __xa_erase() without taking
the xa_lock; the xa_lock is used for lockdep validation and will be used
for other purposes in the future.
-The :c:func:`__xa_set_mark` and :c:func:`__xa_clear_mark` functions are also
+The __xa_set_mark() and __xa_clear_mark() functions are also
available for situations where you look up an entry and want to atomically
set or clear a mark. It may be more efficient to use the advanced API
in this case, as it will save you from walking the tree twice.
@@ -300,27 +300,27 @@ indeed the normal API is implemented in terms of the advanced API. The
advanced API is only available to modules with a GPL-compatible license.
The advanced API is based around the xa_state. This is an opaque data
-structure which you declare on the stack using the :c:func:`XA_STATE`
+structure which you declare on the stack using the XA_STATE()
macro. This macro initialises the xa_state ready to start walking
around the XArray. It is used as a cursor to maintain the position
in the XArray and let you compose various operations together without
having to restart from the top every time.
The xa_state is also used to store errors. You can call
-:c:func:`xas_error` to retrieve the error. All operations check whether
+xas_error() to retrieve the error. All operations check whether
the xa_state is in an error state before proceeding, so there's no need
for you to check for an error after each call; you can make multiple
calls in succession and only check at a convenient point. The only
errors currently generated by the XArray code itself are ``ENOMEM`` and
``EINVAL``, but it supports arbitrary errors in case you want to call
-:c:func:`xas_set_err` yourself.
+xas_set_err() yourself.
-If the xa_state is holding an ``ENOMEM`` error, calling :c:func:`xas_nomem`
+If the xa_state is holding an ``ENOMEM`` error, calling xas_nomem()
will attempt to allocate more memory using the specified gfp flags and
cache it in the xa_state for the next attempt. The idea is that you take
the xa_lock, attempt the operation and drop the lock. The operation
attempts to allocate memory while holding the lock, but it is more
-likely to fail. Once you have dropped the lock, :c:func:`xas_nomem`
+likely to fail. Once you have dropped the lock, xas_nomem()
can try harder to allocate more memory. It will return ``true`` if it
is worth retrying the operation (i.e. that there was a memory error *and*
more memory was allocated). If it has previously allocated memory, and
@@ -333,7 +333,7 @@ Internal Entries
The XArray reserves some entries for its own purposes. These are never
exposed through the normal API, but when using the advanced API, it's
possible to see them. Usually the best way to handle them is to pass them
-to :c:func:`xas_retry`, and retry the operation if it returns ``true``.
+to xas_retry(), and retry the operation if it returns ``true``.
.. flat-table::
:widths: 1 1 6
@@ -343,89 +343,89 @@ to :c:func:`xas_retry`, and retry the operation if it returns ``true``.
- Usage
* - Node
- - :c:func:`xa_is_node`
+ - xa_is_node()
- An XArray node. May be visible when using a multi-index xa_state.
* - Sibling
- - :c:func:`xa_is_sibling`
+ - xa_is_sibling()
- A non-canonical entry for a multi-index entry. The value indicates
which slot in this node has the canonical entry.
* - Retry
- - :c:func:`xa_is_retry`
+ - xa_is_retry()
- This entry is currently being modified by a thread which has the
xa_lock. The node containing this entry may be freed at the end
of this RCU period. You should restart the lookup from the head
of the array.
* - Zero
- - :c:func:`xa_is_zero`
+ - xa_is_zero()
- Zero entries appear as ``NULL`` through the Normal API, but occupy
an entry in the XArray which can be used to reserve the index for
future use. This is used by allocating XArrays for allocated entries
which are ``NULL``.
Other internal entries may be added in the future. As far as possible, they
-will be handled by :c:func:`xas_retry`.
+will be handled by xas_retry().
Additional functionality
------------------------
-The :c:func:`xas_create_range` function allocates all the necessary memory
+The xas_create_range() function allocates all the necessary memory
to store every entry in a range. It will set ENOMEM in the xa_state if
it cannot allocate memory.
-You can use :c:func:`xas_init_marks` to reset the marks on an entry
+You can use xas_init_marks() to reset the marks on an entry
to their default state. This is usually all marks clear, unless the
XArray is marked with ``XA_FLAGS_TRACK_FREE``, in which case mark 0 is set
and all other marks are clear. Replacing one entry with another using
-:c:func:`xas_store` will not reset the marks on that entry; if you want
+xas_store() will not reset the marks on that entry; if you want
the marks reset, you should do that explicitly.
-The :c:func:`xas_load` will walk the xa_state as close to the entry
+The xas_load() will walk the xa_state as close to the entry
as it can. If you know the xa_state has already been walked to the
entry and need to check that the entry hasn't changed, you can use
-:c:func:`xas_reload` to save a function call.
+xas_reload() to save a function call.
If you need to move to a different index in the XArray, call
-:c:func:`xas_set`. This resets the cursor to the top of the tree, which
+xas_set(). This resets the cursor to the top of the tree, which
will generally make the next operation walk the cursor to the desired
spot in the tree. If you want to move to the next or previous index,
-call :c:func:`xas_next` or :c:func:`xas_prev`. Setting the index does
+call xas_next() or xas_prev(). Setting the index does
not walk the cursor around the array so does not require a lock to be
held, while moving to the next or previous index does.
-You can search for the next present entry using :c:func:`xas_find`. This
-is the equivalent of both :c:func:`xa_find` and :c:func:`xa_find_after`;
+You can search for the next present entry using xas_find(). This
+is the equivalent of both xa_find() and xa_find_after();
if the cursor has been walked to an entry, then it will find the next
entry after the one currently referenced. If not, it will return the
-entry at the index of the xa_state. Using :c:func:`xas_next_entry` to
-move to the next present entry instead of :c:func:`xas_find` will save
+entry at the index of the xa_state. Using xas_next_entry() to
+move to the next present entry instead of xas_find() will save
a function call in the majority of cases at the expense of emitting more
inline code.
-The :c:func:`xas_find_marked` function is similar. If the xa_state has
+The xas_find_marked() function is similar. If the xa_state has
not been walked, it will return the entry at the index of the xa_state,
if it is marked. Otherwise, it will return the first marked entry after
-the entry referenced by the xa_state. The :c:func:`xas_next_marked`
-function is the equivalent of :c:func:`xas_next_entry`.
+the entry referenced by the xa_state. The xas_next_marked()
+function is the equivalent of xas_next_entry().
-When iterating over a range of the XArray using :c:func:`xas_for_each`
-or :c:func:`xas_for_each_marked`, it may be necessary to temporarily stop
-the iteration. The :c:func:`xas_pause` function exists for this purpose.
+When iterating over a range of the XArray using xas_for_each()
+or xas_for_each_marked(), it may be necessary to temporarily stop
+the iteration. The xas_pause() function exists for this purpose.
After you have done the necessary work and wish to resume, the xa_state
is in an appropriate state to continue the iteration after the entry
you last processed. If you have interrupts disabled while iterating,
then it is good manners to pause the iteration and reenable interrupts
every ``XA_CHECK_SCHED`` entries.
-The :c:func:`xas_get_mark`, :c:func:`xas_set_mark` and
-:c:func:`xas_clear_mark` functions require the xa_state cursor to have
+The xas_get_mark(), xas_set_mark() and
+xas_clear_mark() functions require the xa_state cursor to have
been moved to the appropriate location in the xarray; they will do
-nothing if you have called :c:func:`xas_pause` or :c:func:`xas_set`
+nothing if you have called xas_pause() or xas_set()
immediately before.
-You can call :c:func:`xas_set_update` to have a callback function
+You can call xas_set_update() to have a callback function
called each time the XArray updates a node. This is used by the page
cache workingset code to maintain its list of nodes which contain only
shadow entries.
@@ -443,25 +443,25 @@ eg indices 64-127 may be tied together, but 2-6 may not be. This may
save substantial quantities of memory; for example tying 512 entries
together will save over 4kB.
-You can create a multi-index entry by using :c:func:`XA_STATE_ORDER`
-or :c:func:`xas_set_order` followed by a call to :c:func:`xas_store`.
-Calling :c:func:`xas_load` with a multi-index xa_state will walk the
+You can create a multi-index entry by using XA_STATE_ORDER()
+or xas_set_order() followed by a call to xas_store().
+Calling xas_load() with a multi-index xa_state will walk the
xa_state to the right location in the tree, but the return value is not
meaningful, potentially being an internal entry or ``NULL`` even when there
-is an entry stored within the range. Calling :c:func:`xas_find_conflict`
+is an entry stored within the range. Calling xas_find_conflict()
will return the first entry within the range or ``NULL`` if there are no
-entries in the range. The :c:func:`xas_for_each_conflict` iterator will
+entries in the range. The xas_for_each_conflict() iterator will
iterate over every entry which overlaps the specified range.
-If :c:func:`xas_load` encounters a multi-index entry, the xa_index
+If xas_load() encounters a multi-index entry, the xa_index
in the xa_state will not be changed. When iterating over an XArray
-or calling :c:func:`xas_find`, if the initial index is in the middle
+or calling xas_find(), if the initial index is in the middle
of a multi-index entry, it will not be altered. Subsequent calls
or iterations will move the index to the first index in the range.
Each entry will only be returned once, no matter how many indices it
occupies.
-Using :c:func:`xas_next` or :c:func:`xas_prev` with a multi-index xa_state
+Using xas_next() or xas_prev() with a multi-index xa_state
is not supported. Using either of these functions on a multi-index entry
will reveal sibling entries; these should be skipped over by the caller.
diff --git a/Documentation/device-mapper/cache-policies.txt b/Documentation/device-mapper/cache-policies.rst
index 86786d87d9a8..b17fe352fc41 100644
--- a/Documentation/device-mapper/cache-policies.txt
+++ b/Documentation/device-mapper/cache-policies.rst
@@ -1,3 +1,4 @@
+=============================
Guidance for writing policies
=============================
@@ -30,7 +31,7 @@ multiqueue (mq)
This policy is now an alias for smq (see below).
-The following tunables are accepted, but have no effect:
+The following tunables are accepted, but have no effect::
'sequential_threshold <#nr_sequential_ios>'
'random_threshold <#nr_random_ios>'
@@ -56,7 +57,9 @@ mq policy's hints to be dropped. Also, performance of the cache may
degrade slightly until smq recalculates the origin device's hotspots
that should be cached.
-Memory usage:
+Memory usage
+^^^^^^^^^^^^
+
The mq policy used a lot of memory; 88 bytes per cache block on a 64
bit machine.
@@ -69,7 +72,9 @@ cache block).
All this means smq uses ~25bytes per cache block. Still a lot of
memory, but a substantial improvement nontheless.
-Level balancing:
+Level balancing
+^^^^^^^^^^^^^^^
+
mq placed entries in different levels of the multiqueue structures
based on their hit count (~ln(hit count)). This meant the bottom
levels generally had the most entries, and the top ones had very
@@ -94,7 +99,9 @@ is used to decide which blocks to promote. If the hotspot queue is
performing badly then it starts moving entries more quickly between
levels. This lets it adapt to new IO patterns very quickly.
-Performance:
+Performance
+^^^^^^^^^^^
+
Testing smq shows substantially better performance than mq.
cleaner
@@ -105,16 +112,19 @@ The cleaner writes back all dirty blocks in a cache to decommission it.
Examples
========
-The syntax for a table is:
+The syntax for a table is::
+
cache <metadata dev> <cache dev> <origin dev> <block size>
<#feature_args> [<feature arg>]*
<policy> <#policy_args> [<policy arg>]*
-The syntax to send a message using the dmsetup command is:
+The syntax to send a message using the dmsetup command is::
+
dmsetup message <mapped device> 0 sequential_threshold 1024
dmsetup message <mapped device> 0 random_threshold 8
-Using dmsetup:
+Using dmsetup::
+
dmsetup create blah --table "0 268435456 cache /dev/sdb /dev/sdc \
/dev/sdd 512 0 mq 4 sequential_threshold 1024 random_threshold 8"
creates a 128GB large mapped device named 'blah' with the
diff --git a/Documentation/device-mapper/cache.txt b/Documentation/device-mapper/cache.rst
index 8ae1cf8e94da..f15e5254d05b 100644
--- a/Documentation/device-mapper/cache.txt
+++ b/Documentation/device-mapper/cache.rst
@@ -1,3 +1,7 @@
+=====
+Cache
+=====
+
Introduction
============
@@ -24,10 +28,13 @@ scenarios (eg. a vm image server).
Glossary
========
- Migration - Movement of the primary copy of a logical block from one
+ Migration
+ Movement of the primary copy of a logical block from one
device to the other.
- Promotion - Migration from slow device to fast device.
- Demotion - Migration from fast device to slow device.
+ Promotion
+ Migration from slow device to fast device.
+ Demotion
+ Migration from fast device to slow device.
The origin device always contains a copy of the logical block, which
may be out of date or kept in sync with the copy on the cache device
@@ -169,45 +176,53 @@ Target interface
Constructor
-----------
- cache <metadata dev> <cache dev> <origin dev> <block size>
- <#feature args> [<feature arg>]*
- <policy> <#policy args> [policy args]*
+ ::
+
+ cache <metadata dev> <cache dev> <origin dev> <block size>
+ <#feature args> [<feature arg>]*
+ <policy> <#policy args> [policy args]*
- metadata dev : fast device holding the persistent metadata
- cache dev : fast device holding cached data blocks
- origin dev : slow device holding original data blocks
- block size : cache unit size in sectors
+ ================ =======================================================
+ metadata dev fast device holding the persistent metadata
+ cache dev fast device holding cached data blocks
+ origin dev slow device holding original data blocks
+ block size cache unit size in sectors
- #feature args : number of feature arguments passed
- feature args : writethrough or passthrough (The default is writeback.)
+ #feature args number of feature arguments passed
+ feature args writethrough or passthrough (The default is writeback.)
- policy : the replacement policy to use
- #policy args : an even number of arguments corresponding to
- key/value pairs passed to the policy
- policy args : key/value pairs passed to the policy
- E.g. 'sequential_threshold 1024'
- See cache-policies.txt for details.
+ policy the replacement policy to use
+ #policy args an even number of arguments corresponding to
+ key/value pairs passed to the policy
+ policy args key/value pairs passed to the policy
+ E.g. 'sequential_threshold 1024'
+ See cache-policies.txt for details.
+ ================ =======================================================
Optional feature arguments are:
- writethrough : write through caching that prohibits cache block
- content from being different from origin block content.
- Without this argument, the default behaviour is to write
- back cache block contents later for performance reasons,
- so they may differ from the corresponding origin blocks.
-
- passthrough : a degraded mode useful for various cache coherency
- situations (e.g., rolling back snapshots of
- underlying storage). Reads and writes always go to
- the origin. If a write goes to a cached origin
- block, then the cache block is invalidated.
- To enable passthrough mode the cache must be clean.
-
- metadata2 : use version 2 of the metadata. This stores the dirty bits
- in a separate btree, which improves speed of shutting
- down the cache.
-
- no_discard_passdown : disable passing down discards from the cache
- to the origin's data device.
+
+
+ ==================== ========================================================
+ writethrough write through caching that prohibits cache block
+ content from being different from origin block content.
+ Without this argument, the default behaviour is to write
+ back cache block contents later for performance reasons,
+ so they may differ from the corresponding origin blocks.
+
+ passthrough a degraded mode useful for various cache coherency
+ situations (e.g., rolling back snapshots of
+ underlying storage). Reads and writes always go to
+ the origin. If a write goes to a cached origin
+ block, then the cache block is invalidated.
+ To enable passthrough mode the cache must be clean.
+
+ metadata2 use version 2 of the metadata. This stores the dirty
+ bits in a separate btree, which improves speed of
+ shutting down the cache.
+
+ no_discard_passdown disable passing down discards from the cache
+ to the origin's data device.
+ ==================== ========================================================
A policy called 'default' is always registered. This is an alias for
the policy we currently think is giving best all round performance.
@@ -218,54 +233,61 @@ the characteristics of a specific policy, always request it by name.
Status
------
-<metadata block size> <#used metadata blocks>/<#total metadata blocks>
-<cache block size> <#used cache blocks>/<#total cache blocks>
-<#read hits> <#read misses> <#write hits> <#write misses>
-<#demotions> <#promotions> <#dirty> <#features> <features>*
-<#core args> <core args>* <policy name> <#policy args> <policy args>*
-<cache metadata mode>
-
-metadata block size : Fixed block size for each metadata block in
- sectors
-#used metadata blocks : Number of metadata blocks used
-#total metadata blocks : Total number of metadata blocks
-cache block size : Configurable block size for the cache device
- in sectors
-#used cache blocks : Number of blocks resident in the cache
-#total cache blocks : Total number of cache blocks
-#read hits : Number of times a READ bio has been mapped
- to the cache
-#read misses : Number of times a READ bio has been mapped
- to the origin
-#write hits : Number of times a WRITE bio has been mapped
- to the cache
-#write misses : Number of times a WRITE bio has been
- mapped to the origin
-#demotions : Number of times a block has been removed
- from the cache
-#promotions : Number of times a block has been moved to
- the cache
-#dirty : Number of blocks in the cache that differ
- from the origin
-#feature args : Number of feature args to follow
-feature args : 'writethrough' (optional)
-#core args : Number of core arguments (must be even)
-core args : Key/value pairs for tuning the core
- e.g. migration_threshold
-policy name : Name of the policy
-#policy args : Number of policy arguments to follow (must be even)
-policy args : Key/value pairs e.g. sequential_threshold
-cache metadata mode : ro if read-only, rw if read-write
- In serious cases where even a read-only mode is deemed unsafe
- no further I/O will be permitted and the status will just
- contain the string 'Fail'. The userspace recovery tools
- should then be used.
-needs_check : 'needs_check' if set, '-' if not set
- A metadata operation has failed, resulting in the needs_check
- flag being set in the metadata's superblock. The metadata
- device must be deactivated and checked/repaired before the
- cache can be made fully operational again. '-' indicates
- needs_check is not set.
+::
+
+ <metadata block size> <#used metadata blocks>/<#total metadata blocks>
+ <cache block size> <#used cache blocks>/<#total cache blocks>
+ <#read hits> <#read misses> <#write hits> <#write misses>
+ <#demotions> <#promotions> <#dirty> <#features> <features>*
+ <#core args> <core args>* <policy name> <#policy args> <policy args>*
+ <cache metadata mode>
+
+
+========================= =====================================================
+metadata block size Fixed block size for each metadata block in
+ sectors
+#used metadata blocks Number of metadata blocks used
+#total metadata blocks Total number of metadata blocks
+cache block size Configurable block size for the cache device
+ in sectors
+#used cache blocks Number of blocks resident in the cache
+#total cache blocks Total number of cache blocks
+#read hits Number of times a READ bio has been mapped
+ to the cache
+#read misses Number of times a READ bio has been mapped
+ to the origin
+#write hits Number of times a WRITE bio has been mapped
+ to the cache
+#write misses Number of times a WRITE bio has been
+ mapped to the origin
+#demotions Number of times a block has been removed
+ from the cache
+#promotions Number of times a block has been moved to
+ the cache
+#dirty Number of blocks in the cache that differ
+ from the origin
+#feature args Number of feature args to follow
+feature args 'writethrough' (optional)
+#core args Number of core arguments (must be even)
+core args Key/value pairs for tuning the core
+ e.g. migration_threshold
+policy name Name of the policy
+#policy args Number of policy arguments to follow (must be even)
+policy args Key/value pairs e.g. sequential_threshold
+cache metadata mode ro if read-only, rw if read-write
+
+ In serious cases where even a read-only mode is
+ deemed unsafe no further I/O will be permitted and
+ the status will just contain the string 'Fail'.
+ The userspace recovery tools should then be used.
+needs_check 'needs_check' if set, '-' if not set
+ A metadata operation has failed, resulting in the
+ needs_check flag being set in the metadata's
+ superblock. The metadata device must be
+ deactivated and checked/repaired before the
+ cache can be made fully operational again.
+ '-' indicates needs_check is not set.
+========================= =====================================================
Messages
--------
@@ -274,11 +296,12 @@ Policies will have different tunables, specific to each one, so we
need a generic way of getting and setting these. Device-mapper
messages are used. (A sysfs interface would also be possible.)
-The message format is:
+The message format is::
<key> <value>
-E.g.
+E.g.::
+
dmsetup message my_cache 0 sequential_threshold 1024
@@ -290,11 +313,12 @@ of values from 5 to 9. Each cblock must be expressed as a decimal
value, in the future a variant message that takes cblock ranges
expressed in hexadecimal may be needed to better support efficient
invalidation of larger caches. The cache must be in passthrough mode
-when invalidate_cblocks is used.
+when invalidate_cblocks is used::
invalidate_cblocks [<cblock>|<cblock begin>-<cblock end>]*
-E.g.
+E.g.::
+
dmsetup message my_cache 0 invalidate_cblocks 2345 3456-4567 5678-6789
Examples
@@ -304,8 +328,10 @@ The test suite can be found here:
https://github.com/jthornber/device-mapper-test-suite
-dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
- /dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0'
-dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
- /dev/mapper/ssd /dev/mapper/origin 1024 1 writeback \
- mq 4 sequential_threshold 1024 random_threshold 8'
+::
+
+ dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
+ /dev/mapper/ssd /dev/mapper/origin 512 1 writeback default 0'
+ dmsetup create my_cache --table '0 41943040 cache /dev/mapper/metadata \
+ /dev/mapper/ssd /dev/mapper/origin 1024 1 writeback \
+ mq 4 sequential_threshold 1024 random_threshold 8'
diff --git a/Documentation/device-mapper/delay.txt b/Documentation/device-mapper/delay.rst
index 6426c45273cb..917ba8c33359 100644
--- a/Documentation/device-mapper/delay.txt
+++ b/Documentation/device-mapper/delay.rst
@@ -1,10 +1,12 @@
+========
dm-delay
========
Device-Mapper's "delay" target delays reads and/or writes
and maps them to different devices.
-Parameters:
+Parameters::
+
<device> <offset> <delay> [<write_device> <write_offset> <write_delay>
[<flush_device> <flush_offset> <flush_delay>]]
@@ -14,15 +16,16 @@ Delays are specified in milliseconds.
Example scripts
===============
-[[
-#!/bin/sh
-# Create device delaying rw operation for 500ms
-echo "0 `blockdev --getsz $1` delay $1 0 500" | dmsetup create delayed
-]]
-
-[[
-#!/bin/sh
-# Create device delaying only write operation for 500ms and
-# splitting reads and writes to different devices $1 $2
-echo "0 `blockdev --getsz $1` delay $1 0 0 $2 0 500" | dmsetup create delayed
-]]
+
+::
+
+ #!/bin/sh
+ # Create device delaying rw operation for 500ms
+ echo "0 `blockdev --getsz $1` delay $1 0 500" | dmsetup create delayed
+
+::
+
+ #!/bin/sh
+ # Create device delaying only write operation for 500ms and
+ # splitting reads and writes to different devices $1 $2
+ echo "0 `blockdev --getsz $1` delay $1 0 0 $2 0 500" | dmsetup create delayed
diff --git a/Documentation/device-mapper/dm-crypt.txt b/Documentation/device-mapper/dm-crypt.rst
index 3b3e1de21c9c..8f4a3f889d43 100644
--- a/Documentation/device-mapper/dm-crypt.txt
+++ b/Documentation/device-mapper/dm-crypt.rst
@@ -1,5 +1,6 @@
+========
dm-crypt
-=========
+========
Device-Mapper's "crypt" target provides transparent encryption of block devices
using the kernel crypto API.
@@ -7,15 +8,20 @@ using the kernel crypto API.
For a more detailed description of supported parameters see:
https://gitlab.com/cryptsetup/cryptsetup/wikis/DMCrypt
-Parameters: <cipher> <key> <iv_offset> <device path> \
+Parameters::
+
+ <cipher> <key> <iv_offset> <device path> \
<offset> [<#opt_params> <opt_params>]
<cipher>
Encryption cipher, encryption mode and Initial Vector (IV) generator.
- The cipher specifications format is:
+ The cipher specifications format is::
+
cipher[:keycount]-chainmode-ivmode[:ivopts]
- Examples:
+
+ Examples::
+
aes-cbc-essiv:sha256
aes-xts-plain64
serpent-xts-plain64
@@ -25,12 +31,17 @@ Parameters: <cipher> <key> <iv_offset> <device path> \
as for the first format type.
This format is mainly used for specification of authenticated modes.
- The crypto API cipher specifications format is:
+ The crypto API cipher specifications format is::
+
capi:cipher_api_spec-ivmode[:ivopts]
- Examples:
+
+ Examples::
+
capi:cbc(aes)-essiv:sha256
capi:xts(aes)-plain64
- Examples of authenticated modes:
+
+ Examples of authenticated modes::
+
capi:gcm(aes)-random
capi:authenc(hmac(sha256),xts(aes))-random
capi:rfc7539(chacha20,poly1305)-random
@@ -142,21 +153,21 @@ LUKS (Linux Unified Key Setup) is now the preferred way to set up disk
encryption with dm-crypt using the 'cryptsetup' utility, see
https://gitlab.com/cryptsetup/cryptsetup
-[[
-#!/bin/sh
-# Create a crypt device using dmsetup
-dmsetup create crypt1 --table "0 `blockdev --getsz $1` crypt aes-cbc-essiv:sha256 babebabebabebabebabebabebabebabe 0 $1 0"
-]]
-
-[[
-#!/bin/sh
-# Create a crypt device using dmsetup when encryption key is stored in keyring service
-dmsetup create crypt2 --table "0 `blockdev --getsize $1` crypt aes-cbc-essiv:sha256 :32:logon:my_prefix:my_key 0 $1 0"
-]]
-
-[[
-#!/bin/sh
-# Create a crypt device using cryptsetup and LUKS header with default cipher
-cryptsetup luksFormat $1
-cryptsetup luksOpen $1 crypt1
-]]
+::
+
+ #!/bin/sh
+ # Create a crypt device using dmsetup
+ dmsetup create crypt1 --table "0 `blockdev --getsz $1` crypt aes-cbc-essiv:sha256 babebabebabebabebabebabebabebabe 0 $1 0"
+
+::
+
+ #!/bin/sh
+ # Create a crypt device using dmsetup when encryption key is stored in keyring service
+ dmsetup create crypt2 --table "0 `blockdev --getsize $1` crypt aes-cbc-essiv:sha256 :32:logon:my_prefix:my_key 0 $1 0"
+
+::
+
+ #!/bin/sh
+ # Create a crypt device using cryptsetup and LUKS header with default cipher
+ cryptsetup luksFormat $1
+ cryptsetup luksOpen $1 crypt1
diff --git a/Documentation/device-mapper/dm-flakey.txt b/Documentation/device-mapper/dm-flakey.rst
index 9f0e247d0877..86138735879d 100644
--- a/Documentation/device-mapper/dm-flakey.txt
+++ b/Documentation/device-mapper/dm-flakey.rst
@@ -1,3 +1,4 @@
+=========
dm-flakey
=========
@@ -15,17 +16,26 @@ underlying devices.
Table parameters
----------------
+
+::
+
<dev path> <offset> <up interval> <down interval> \
[<num_features> [<feature arguments>]]
Mandatory parameters:
- <dev path>: Full pathname to the underlying block-device, or a
- "major:minor" device-number.
- <offset>: Starting sector within the device.
- <up interval>: Number of seconds device is available.
- <down interval>: Number of seconds device returns errors.
+
+ <dev path>:
+ Full pathname to the underlying block-device, or a
+ "major:minor" device-number.
+ <offset>:
+ Starting sector within the device.
+ <up interval>:
+ Number of seconds device is available.
+ <down interval>:
+ Number of seconds device returns errors.
Optional feature parameters:
+
If no feature parameters are present, during the periods of
unreliability, all I/O returns errors.
@@ -41,17 +51,24 @@ Optional feature parameters:
During <down interval>, replace <Nth_byte> of the data of
each matching bio with <value>.
- <Nth_byte>: The offset of the byte to replace.
- Counting starts at 1, to replace the first byte.
- <direction>: Either 'r' to corrupt reads or 'w' to corrupt writes.
- 'w' is incompatible with drop_writes.
- <value>: The value (from 0-255) to write.
- <flags>: Perform the replacement only if bio->bi_opf has all the
- selected flags set.
+ <Nth_byte>:
+ The offset of the byte to replace.
+ Counting starts at 1, to replace the first byte.
+ <direction>:
+ Either 'r' to corrupt reads or 'w' to corrupt writes.
+ 'w' is incompatible with drop_writes.
+ <value>:
+ The value (from 0-255) to write.
+ <flags>:
+ Perform the replacement only if bio->bi_opf has all the
+ selected flags set.
Examples:
+
+Replaces the 32nd byte of READ bios with the value 1::
+
corrupt_bio_byte 32 r 1 0
- - replaces the 32nd byte of READ bios with the value 1
+
+Replaces the 224th byte of REQ_META (=32) bios with the value 0::
corrupt_bio_byte 224 w 0 32
- - replaces the 224th byte of REQ_META (=32) bios with the value 0
diff --git a/Documentation/device-mapper/dm-init.txt b/Documentation/device-mapper/dm-init.rst
index 8464ee7c01b8..e5242ff17e9b 100644
--- a/Documentation/device-mapper/dm-init.txt
+++ b/Documentation/device-mapper/dm-init.rst
@@ -1,5 +1,6 @@
+================================
Early creation of mapped devices
-====================================
+================================
It is possible to configure a device-mapper device to act as the root device for
your system in two ways.
@@ -12,15 +13,17 @@ The second is to create one or more device-mappers using the module parameter
The format is specified as a string of data separated by commas and optionally
semi-colons, where:
+
- a comma is used to separate fields like name, uuid, flags and table
(specifies one device)
- a semi-colon is used to separate devices.
-So the format will look like this:
+So the format will look like this::
dm-mod.create=<name>,<uuid>,<minor>,<flags>,<table>[,<table>+][;<name>,<uuid>,<minor>,<flags>,<table>[,<table>+]+]
-Where,
+Where::
+
<name> ::= The device name.
<uuid> ::= xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | ""
<minor> ::= The device minor number | ""
@@ -29,7 +32,7 @@ Where,
<target_type> ::= "verity" | "linear" | ... (see list below)
The dm line should be equivalent to the one used by the dmsetup tool with the
---concise argument.
+`--concise` argument.
Target types
============
@@ -38,32 +41,34 @@ Not all target types are available as there are serious risks in allowing
activation of certain DM targets without first using userspace tools to check
the validity of associated metadata.
- "cache": constrained, userspace should verify cache device
- "crypt": allowed
- "delay": allowed
- "era": constrained, userspace should verify metadata device
- "flakey": constrained, meant for test
- "linear": allowed
- "log-writes": constrained, userspace should verify metadata device
- "mirror": constrained, userspace should verify main/mirror device
- "raid": constrained, userspace should verify metadata device
- "snapshot": constrained, userspace should verify src/dst device
- "snapshot-origin": allowed
- "snapshot-merge": constrained, userspace should verify src/dst device
- "striped": allowed
- "switch": constrained, userspace should verify dev path
- "thin": constrained, requires dm target message from userspace
- "thin-pool": constrained, requires dm target message from userspace
- "verity": allowed
- "writecache": constrained, userspace should verify cache device
- "zero": constrained, not meant for rootfs
+======================= =======================================================
+`cache` constrained, userspace should verify cache device
+`crypt` allowed
+`delay` allowed
+`era` constrained, userspace should verify metadata device
+`flakey` constrained, meant for test
+`linear` allowed
+`log-writes` constrained, userspace should verify metadata device
+`mirror` constrained, userspace should verify main/mirror device
+`raid` constrained, userspace should verify metadata device
+`snapshot` constrained, userspace should verify src/dst device
+`snapshot-origin` allowed
+`snapshot-merge` constrained, userspace should verify src/dst device
+`striped` allowed
+`switch` constrained, userspace should verify dev path
+`thin` constrained, requires dm target message from userspace
+`thin-pool` constrained, requires dm target message from userspace
+`verity` allowed
+`writecache` constrained, userspace should verify cache device
+`zero` constrained, not meant for rootfs
+======================= =======================================================
If the target is not listed above, it is constrained by default (not tested).
Examples
========
An example of booting to a linear array made up of user-mode linux block
-devices:
+devices::
dm-mod.create="lroot,,,rw, 0 4096 linear 98:16 0, 4096 4096 linear 98:32 0" root=/dev/dm-0
@@ -71,43 +76,49 @@ This will boot to a rw dm-linear target of 8192 sectors split across two block
devices identified by their major:minor numbers. After boot, udev will rename
this target to /dev/mapper/lroot (depending on the rules). No uuid was assigned.
-An example of multiple device-mappers, with the dm-mod.create="..." contents is shown here
-split on multiple lines for readability:
+An example of multiple device-mappers, with the dm-mod.create="..." contents
+is shown here split on multiple lines for readability::
- vroot,,,ro,
- 0 1740800 verity 254:0 254:0 1740800 sha1
- 76e9be054b15884a9fa85973e9cb274c93afadb6
- 5b3549d54d6c7a3837b9b81ed72e49463a64c03680c47835bef94d768e5646fe;
- vram,,,rw,
- 0 32768 linear 1:0 0,
- 32768 32768 linear 1:1 0
+ dm-linear,,1,rw,
+ 0 32768 linear 8:1 0,
+ 32768 1024000 linear 8:2 0;
+ dm-verity,,3,ro,
+ 0 1638400 verity 1 /dev/sdc1 /dev/sdc2 4096 4096 204800 1 sha256
+ ac87db56303c9c1da433d7209b5a6ef3e4779df141200cbd7c157dcb8dd89c42
+ 5ebfe87f7df3235b80a117ebc4078e44f55045487ad4a96581d1adb564615b51
Other examples (per target):
-"crypt":
+"crypt"::
+
dm-crypt,,8,ro,
0 1048576 crypt aes-xts-plain64
babebabebabebabebabebabebabebabebabebabebabebabebabebabebabebabe 0
/dev/sda 0 1 allow_discards
-"delay":
+"delay"::
+
dm-delay,,4,ro,0 409600 delay /dev/sda1 0 500
-"linear":
+"linear"::
+
dm-linear,,,rw,
0 32768 linear /dev/sda1 0,
32768 1024000 linear /dev/sda2 0,
1056768 204800 linear /dev/sda3 0,
1261568 512000 linear /dev/sda4 0
-"snapshot-origin":
+"snapshot-origin"::
+
dm-snap-orig,,4,ro,0 409600 snapshot-origin 8:2
-"striped":
+"striped"::
+
dm-striped,,4,ro,0 1638400 striped 4 4096
/dev/sda1 0 /dev/sda2 0 /dev/sda3 0 /dev/sda4 0
-"verity":
+"verity"::
+
dm-verity,,4,ro,
0 1638400 verity 1 8:1 8:2 4096 4096 204800 1 sha256
fb1a5a0f00deb908d8b53cb270858975e76cf64105d412ce764225d53b8f3cfd
diff --git a/Documentation/device-mapper/dm-integrity.txt b/Documentation/device-mapper/dm-integrity.rst
index d63d78ffeb73..a30aa91b5fbe 100644
--- a/Documentation/device-mapper/dm-integrity.txt
+++ b/Documentation/device-mapper/dm-integrity.rst
@@ -1,3 +1,7 @@
+============
+dm-integrity
+============
+
The dm-integrity target emulates a block device that has additional
per-sector tags that can be used for storing integrity information.
@@ -35,15 +39,16 @@ zeroes. If the superblock is neither valid nor zeroed, the dm-integrity
target can't be loaded.
To use the target for the first time:
+
1. overwrite the superblock with zeroes
2. load the dm-integrity target with one-sector size, the kernel driver
- will format the device
+ will format the device
3. unload the dm-integrity target
4. read the "provided_data_sectors" value from the superblock
5. load the dm-integrity target with the the target size
- "provided_data_sectors"
+ "provided_data_sectors"
6. if you want to use dm-integrity with dm-crypt, load the dm-crypt target
- with the size "provided_data_sectors"
+ with the size "provided_data_sectors"
Target arguments:
@@ -51,17 +56,20 @@ Target arguments:
1. the underlying block device
2. the number of reserved sector at the beginning of the device - the
- dm-integrity won't read of write these sectors
+ dm-integrity won't read of write these sectors
3. the size of the integrity tag (if "-" is used, the size is taken from
- the internal-hash algorithm)
+ the internal-hash algorithm)
4. mode:
- D - direct writes (without journal) - in this mode, journaling is
+
+ D - direct writes (without journal)
+ in this mode, journaling is
not used and data sectors and integrity tags are written
separately. In case of crash, it is possible that the data
and integrity tag doesn't match.
- J - journaled writes - data and integrity tags are written to the
+ J - journaled writes
+ data and integrity tags are written to the
journal and atomicity is guaranteed. In case of crash,
either both data and tag or none of them are written. The
journaled mode degrades write throughput twice because the
@@ -178,9 +186,12 @@ and the reloaded target would be non-functional.
The layout of the formatted block device:
-* reserved sectors (they are not used by this target, they can be used for
- storing LUKS metadata or for other purpose), the size of the reserved
- area is specified in the target arguments
+
+* reserved sectors
+ (they are not used by this target, they can be used for
+ storing LUKS metadata or for other purpose), the size of the reserved
+ area is specified in the target arguments
+
* superblock (4kiB)
* magic string - identifies that the device was formatted
* version
@@ -192,40 +203,55 @@ The layout of the formatted block device:
metadata and padding). The user of this target should not send
bios that access data beyond the "provided data sectors" limit.
* flags
- SB_FLAG_HAVE_JOURNAL_MAC - a flag is set if journal_mac is used
- SB_FLAG_RECALCULATING - recalculating is in progress
- SB_FLAG_DIRTY_BITMAP - journal area contains the bitmap of dirty
- blocks
+ SB_FLAG_HAVE_JOURNAL_MAC
+ - a flag is set if journal_mac is used
+ SB_FLAG_RECALCULATING
+ - recalculating is in progress
+ SB_FLAG_DIRTY_BITMAP
+ - journal area contains the bitmap of dirty
+ blocks
* log2(sectors per block)
* a position where recalculating finished
* journal
The journal is divided into sections, each section contains:
+
* metadata area (4kiB), it contains journal entries
- every journal entry contains:
+
+ - every journal entry contains:
+
* logical sector (specifies where the data and tag should
be written)
* last 8 bytes of data
* integrity tag (the size is specified in the superblock)
- every metadata sector ends with
+
+ - every metadata sector ends with
+
* mac (8-bytes), all the macs in 8 metadata sectors form a
64-byte value. It is used to store hmac of sector
numbers in the journal section, to protect against a
possibility that the attacker tampers with sector
numbers in the journal.
* commit id
+
* data area (the size is variable; it depends on how many journal
entries fit into the metadata area)
- every sector in the data area contains:
+
+ - every sector in the data area contains:
+
* data (504 bytes of data, the last 8 bytes are stored in
the journal entry)
* commit id
+
To test if the whole journal section was written correctly, every
512-byte sector of the journal ends with 8-byte commit id. If the
commit id matches on all sectors in a journal section, then it is
assumed that the section was written correctly. If the commit id
doesn't match, the section was written partially and it should not
be replayed.
-* one or more runs of interleaved tags and data. Each run contains:
+
+* one or more runs of interleaved tags and data.
+ Each run contains:
+
* tag area - it contains integrity tags. There is one tag for each
sector in the data area
* data area - it contains data sectors. The number of data sectors
diff --git a/Documentation/device-mapper/dm-io.txt b/Documentation/device-mapper/dm-io.rst
index 3b5d9a52cdcf..d2492917a1f5 100644
--- a/Documentation/device-mapper/dm-io.txt
+++ b/Documentation/device-mapper/dm-io.rst
@@ -1,3 +1,4 @@
+=====
dm-io
=====
@@ -7,7 +8,7 @@ version.
The user must set up an io_region structure to describe the desired location
of the I/O. Each io_region indicates a block-device along with the starting
-sector and size of the region.
+sector and size of the region::
struct io_region {
struct block_device *bdev;
@@ -19,7 +20,7 @@ Dm-io can read from one io_region or write to one or more io_regions. Writes
to multiple regions are specified by an array of io_region structures.
The first I/O service type takes a list of memory pages as the data buffer for
-the I/O, along with an offset into the first page.
+the I/O, along with an offset into the first page::
struct page_list {
struct page_list *next;
@@ -35,7 +36,7 @@ the I/O, along with an offset into the first page.
The second I/O service type takes an array of bio vectors as the data buffer
for the I/O. This service can be handy if the caller has a pre-assembled bio,
-but wants to direct different portions of the bio to different devices.
+but wants to direct different portions of the bio to different devices::
int dm_io_sync_bvec(unsigned int num_regions, struct io_region *where,
int rw, struct bio_vec *bvec,
@@ -47,7 +48,7 @@ but wants to direct different portions of the bio to different devices.
The third I/O service type takes a pointer to a vmalloc'd memory buffer as the
data buffer for the I/O. This service can be handy if the caller needs to do
I/O to a large region but doesn't want to allocate a large number of individual
-memory pages.
+memory pages::
int dm_io_sync_vm(unsigned int num_regions, struct io_region *where, int rw,
void *data, unsigned long *error_bits);
@@ -55,11 +56,11 @@ memory pages.
void *data, io_notify_fn fn, void *context);
Callers of the asynchronous I/O services must include the name of a completion
-callback routine and a pointer to some context data for the I/O.
+callback routine and a pointer to some context data for the I/O::
typedef void (*io_notify_fn)(unsigned long error, void *context);
-The "error" parameter in this callback, as well as the "*error" parameter in
+The "error" parameter in this callback, as well as the `*error` parameter in
all of the synchronous versions, is a bitset (instead of a simple error value).
In the case of an write-I/O to multiple regions, this bitset allows dm-io to
indicate success or failure on each individual region.
@@ -72,4 +73,3 @@ always available in order to avoid unnecessary waiting while performing I/O.
When the user is finished using the dm-io services, they should call
dm_io_put() and specify the same number of pages that were given on the
dm_io_get() call.
-
diff --git a/Documentation/device-mapper/dm-log.txt b/Documentation/device-mapper/dm-log.rst
index c155ac569c44..ba4fce39bc27 100644
--- a/Documentation/device-mapper/dm-log.txt
+++ b/Documentation/device-mapper/dm-log.rst
@@ -1,3 +1,4 @@
+=====================
Device-Mapper Logging
=====================
The device-mapper logging code is used by some of the device-mapper
@@ -16,11 +17,13 @@ dm_dirty_log_type in include/linux/dm-dirty-log.h). Various different
logging implementations are available and provide different
capabilities. The list includes:
+============== ==============================================================
Type Files
-==== =====
+============== ==============================================================
disk drivers/md/dm-log.c
core drivers/md/dm-log.c
userspace drivers/md/dm-log-userspace* include/linux/dm-log-userspace.h
+============== ==============================================================
The "disk" log type
-------------------
diff --git a/Documentation/device-mapper/dm-queue-length.txt b/Documentation/device-mapper/dm-queue-length.rst
index f4db2562175c..d8e381c1cb02 100644
--- a/Documentation/device-mapper/dm-queue-length.txt
+++ b/Documentation/device-mapper/dm-queue-length.rst
@@ -1,3 +1,4 @@
+===============
dm-queue-length
===============
@@ -6,12 +7,18 @@ which selects a path with the least number of in-flight I/Os.
The path selector name is 'queue-length'.
Table parameters for each path: [<repeat_count>]
+
+::
+
<repeat_count>: The number of I/Os to dispatch using the selected
path before switching to the next path.
If not given, internal default is used. To check
the default value, see the activated table.
Status for each path: <status> <fail-count> <in-flight>
+
+::
+
<status>: 'A' if the path is active, 'F' if the path is failed.
<fail-count>: The number of path failures.
<in-flight>: The number of in-flight I/Os on the path.
@@ -29,11 +36,13 @@ Examples
========
In case that 2 paths (sda and sdb) are used with repeat_count == 128.
-# echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \
- dmsetup create test
-#
-# dmsetup table
-test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128
-#
-# dmsetup status
-test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0
+::
+
+ # echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \
+ dmsetup create test
+ #
+ # dmsetup table
+ test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128
+ #
+ # dmsetup status
+ test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0
diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.rst
index 2355bef14653..2fe255b130fb 100644
--- a/Documentation/device-mapper/dm-raid.txt
+++ b/Documentation/device-mapper/dm-raid.rst
@@ -1,3 +1,4 @@
+=======
dm-raid
=======
@@ -8,49 +9,66 @@ interface.
Mapping Table Interface
-----------------------
-The target is named "raid" and it accepts the following parameters:
+The target is named "raid" and it accepts the following parameters::
<raid_type> <#raid_params> <raid_params> \
<#raid_devs> <metadata_dev0> <dev0> [.. <metadata_devN> <devN>]
<raid_type>:
+
+ ============= ===============================================================
raid0 RAID0 striping (no resilience)
raid1 RAID1 mirroring
raid4 RAID4 with dedicated last parity disk
raid5_n RAID5 with dedicated last parity disk supporting takeover
Same as raid4
- -Transitory layout
+
+ - Transitory layout
raid5_la RAID5 left asymmetric
+
- rotating parity 0 with data continuation
raid5_ra RAID5 right asymmetric
+
- rotating parity N with data continuation
raid5_ls RAID5 left symmetric
+
- rotating parity 0 with data restart
raid5_rs RAID5 right symmetric
+
- rotating parity N with data restart
raid6_zr RAID6 zero restart
+
- rotating parity zero (left-to-right) with data restart
raid6_nr RAID6 N restart
+
- rotating parity N (right-to-left) with data restart
raid6_nc RAID6 N continue
+
- rotating parity N (right-to-left) with data continuation
raid6_n_6 RAID6 with dedicate parity disks
+
- parity and Q-syndrome on the last 2 disks;
layout for takeover from/to raid4/raid5_n
raid6_la_6 Same as "raid_la" plus dedicated last Q-syndrome disk
+
- layout for takeover from raid5_la from/to raid6
raid6_ra_6 Same as "raid5_ra" dedicated last Q-syndrome disk
+
- layout for takeover from raid5_ra from/to raid6
raid6_ls_6 Same as "raid5_ls" dedicated last Q-syndrome disk
+
- layout for takeover from raid5_ls from/to raid6
raid6_rs_6 Same as "raid5_rs" dedicated last Q-syndrome disk
+
- layout for takeover from raid5_rs from/to raid6
raid10 Various RAID10 inspired algorithms chosen by additional params
(see raid10_format and raid10_copies below)
+
- RAID10: Striped Mirrors (aka 'Striping on top of mirrors')
- RAID1E: Integrated Adjacent Stripe Mirroring
- RAID1E: Integrated Offset Stripe Mirroring
- - and other similar RAID10 variants
+ - and other similar RAID10 variants
+ ============= ===============================================================
Reference: Chapter 4 of
http://www.snia.org/sites/default/files/SNIA_DDF_Technical_Position_v2.0.pdf
@@ -58,33 +76,41 @@ The target is named "raid" and it accepts the following parameters:
<#raid_params>: The number of parameters that follow.
<raid_params> consists of
+
Mandatory parameters:
- <chunk_size>: Chunk size in sectors. This parameter is often known as
+ <chunk_size>:
+ Chunk size in sectors. This parameter is often known as
"stripe size". It is the only mandatory parameter and
is placed first.
followed by optional parameters (in any order):
- [sync|nosync] Force or prevent RAID initialization.
+ [sync|nosync]
+ Force or prevent RAID initialization.
- [rebuild <idx>] Rebuild drive number 'idx' (first drive is 0).
+ [rebuild <idx>]
+ Rebuild drive number 'idx' (first drive is 0).
[daemon_sleep <ms>]
Interval between runs of the bitmap daemon that
clear bits. A longer interval means less bitmap I/O but
resyncing after a failure is likely to take longer.
- [min_recovery_rate <kB/sec/disk>] Throttle RAID initialization
- [max_recovery_rate <kB/sec/disk>] Throttle RAID initialization
- [write_mostly <idx>] Mark drive index 'idx' write-mostly.
- [max_write_behind <sectors>] See '--write-behind=' (man mdadm)
- [stripe_cache <sectors>] Stripe cache size (RAID 4/5/6 only)
+ [min_recovery_rate <kB/sec/disk>]
+ Throttle RAID initialization
+ [max_recovery_rate <kB/sec/disk>]
+ Throttle RAID initialization
+ [write_mostly <idx>]
+ Mark drive index 'idx' write-mostly.
+ [max_write_behind <sectors>]
+ See '--write-behind=' (man mdadm)
+ [stripe_cache <sectors>]
+ Stripe cache size (RAID 4/5/6 only)
[region_size <sectors>]
The region_size multiplied by the number of regions is the
logical size of the array. The bitmap records the device
synchronisation state for each region.
- [raid10_copies <# copies>]
- [raid10_format <near|far|offset>]
+ [raid10_copies <# copies>], [raid10_format <near|far|offset>]
These two options are used to alter the default layout of
a RAID10 configuration. The number of copies is can be
specified, but the default is 2. There are also three
@@ -93,13 +119,17 @@ The target is named "raid" and it accepts the following parameters:
respect to mirroring. If these options are left unspecified,
or 'raid10_copies 2' and/or 'raid10_format near' are given,
then the layouts for 2, 3 and 4 devices are:
+
+ ======== ========== ==============
2 drives 3 drives 4 drives
- -------- ---------- --------------
+ ======== ========== ==============
A1 A1 A1 A1 A2 A1 A1 A2 A2
A2 A2 A2 A3 A3 A3 A3 A4 A4
A3 A3 A4 A4 A5 A5 A5 A6 A6
A4 A4 A5 A6 A6 A7 A7 A8 A8
.. .. .. .. .. .. .. .. ..
+ ======== ========== ==============
+
The 2-device layout is equivalent 2-way RAID1. The 4-device
layout is what a traditional RAID10 would look like. The
3-device layout is what might be called a 'RAID1E - Integrated
@@ -107,8 +137,10 @@ The target is named "raid" and it accepts the following parameters:
If 'raid10_copies 2' and 'raid10_format far', then the layouts
for 2, 3 and 4 devices are:
+
+ ======== ============ ===================
2 drives 3 drives 4 drives
- -------- -------------- --------------------
+ ======== ============ ===================
A1 A2 A1 A2 A3 A1 A2 A3 A4
A3 A4 A4 A5 A6 A5 A6 A7 A8
A5 A6 A7 A8 A9 A9 A10 A11 A12
@@ -117,11 +149,14 @@ The target is named "raid" and it accepts the following parameters:
A4 A3 A6 A4 A5 A6 A5 A8 A7
A6 A5 A9 A7 A8 A10 A9 A12 A11
.. .. .. .. .. .. .. .. ..
+ ======== ============ ===================
If 'raid10_copies 2' and 'raid10_format offset', then the
layouts for 2, 3 and 4 devices are:
+
+ ======== ========== ================
2 drives 3 drives 4 drives
- -------- ------------ -----------------
+ ======== ========== ================
A1 A2 A1 A2 A3 A1 A2 A3 A4
A2 A1 A3 A1 A2 A2 A1 A4 A3
A3 A4 A4 A5 A6 A5 A6 A7 A8
@@ -129,6 +164,8 @@ The target is named "raid" and it accepts the following parameters:
A5 A6 A7 A8 A9 A9 A10 A11 A12
A6 A5 A9 A7 A8 A10 A9 A12 A11
.. .. .. .. .. .. .. .. ..
+ ======== ========== ================
+
Here we see layouts closely akin to 'RAID1E - Integrated
Offset Stripe Mirroring'.
@@ -190,22 +227,25 @@ The target is named "raid" and it accepts the following parameters:
Example Tables
--------------
-# RAID4 - 4 data drives, 1 parity (no metadata devices)
-# No metadata devices specified to hold superblock/bitmap info
-# Chunk size of 1MiB
-# (Lines separated for easy reading)
-0 1960893648 raid \
- raid4 1 2048 \
- 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81
+::
-# RAID4 - 4 data drives, 1 parity (with metadata devices)
-# Chunk size of 1MiB, force RAID initialization,
-# min recovery rate at 20 kiB/sec/disk
+ # RAID4 - 4 data drives, 1 parity (no metadata devices)
+ # No metadata devices specified to hold superblock/bitmap info
+ # Chunk size of 1MiB
+ # (Lines separated for easy reading)
-0 1960893648 raid \
- raid4 4 2048 sync min_recovery_rate 20 \
- 5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
+ 0 1960893648 raid \
+ raid4 1 2048 \
+ 5 - 8:17 - 8:33 - 8:49 - 8:65 - 8:81
+
+ # RAID4 - 4 data drives, 1 parity (with metadata devices)
+ # Chunk size of 1MiB, force RAID initialization,
+ # min recovery rate at 20 kiB/sec/disk
+
+ 0 1960893648 raid \
+ raid4 4 2048 sync min_recovery_rate 20 \
+ 5 8:17 8:18 8:33 8:34 8:49 8:50 8:65 8:66 8:81 8:82
Status Output
@@ -219,41 +259,58 @@ Arguments that can be repeated are ordered by value.
'dmsetup status' yields information on the state and health of the array.
The output is as follows (normally a single line, but expanded here for
-clarity):
-1: <s> <l> raid \
-2: <raid_type> <#devices> <health_chars> \
-3: <sync_ratio> <sync_action> <mismatch_cnt>
+clarity)::
+
+ 1: <s> <l> raid \
+ 2: <raid_type> <#devices> <health_chars> \
+ 3: <sync_ratio> <sync_action> <mismatch_cnt>
Line 1 is the standard output produced by device-mapper.
-Line 2 & 3 are produced by the raid target and are best explained by example:
+
+Line 2 & 3 are produced by the raid target and are best explained by example::
+
0 1960893648 raid raid4 5 AAAAA 2/490221568 init 0
+
Here we can see the RAID type is raid4, there are 5 devices - all of
which are 'A'live, and the array is 2/490221568 complete with its initial
recovery. Here is a fuller description of the individual fields:
+
+ =============== =========================================================
<raid_type> Same as the <raid_type> used to create the array.
- <health_chars> One char for each device, indicating: 'A' = alive and
- in-sync, 'a' = alive but not in-sync, 'D' = dead/failed.
+ <health_chars> One char for each device, indicating:
+
+ - 'A' = alive and in-sync
+ - 'a' = alive but not in-sync
+ - 'D' = dead/failed.
<sync_ratio> The ratio indicating how much of the array has undergone
the process described by 'sync_action'. If the
'sync_action' is "check" or "repair", then the process
of "resync" or "recover" can be considered complete.
<sync_action> One of the following possible states:
- idle - No synchronization action is being performed.
- frozen - The current action has been halted.
- resync - Array is undergoing its initial synchronization
+
+ idle
+ - No synchronization action is being performed.
+ frozen
+ - The current action has been halted.
+ resync
+ - Array is undergoing its initial synchronization
or is resynchronizing after an unclean shutdown
(possibly aided by a bitmap).
- recover - A device in the array is being rebuilt or
+ recover
+ - A device in the array is being rebuilt or
replaced.
- check - A user-initiated full check of the array is
+ check
+ - A user-initiated full check of the array is
being performed. All blocks are read and
checked for consistency. The number of
discrepancies found are recorded in
<mismatch_cnt>. No changes are made to the
array by this action.
- repair - The same as "check", but discrepancies are
+ repair
+ - The same as "check", but discrepancies are
corrected.
- reshape - The array is undergoing a reshape.
+ reshape
+ - The array is undergoing a reshape.
<mismatch_cnt> The number of discrepancies found between mirror copies
in RAID1/10 or wrong parity values found in RAID4/5/6.
This value is valid only after a "check" of the array
@@ -261,10 +318,11 @@ recovery. Here is a fuller description of the individual fields:
<data_offset> The current data offset to the start of the user data on
each component device of a raid set (see the respective
raid parameter to support out-of-place reshaping).
- <journal_char> 'A' - active write-through journal device.
- 'a' - active write-back journal device.
- 'D' - dead journal device.
- '-' - no journal device.
+ <journal_char> - 'A' - active write-through journal device.
+ - 'a' - active write-back journal device.
+ - 'D' - dead journal device.
+ - '-' - no journal device.
+ =============== =========================================================
Message Interface
@@ -272,12 +330,15 @@ Message Interface
The dm-raid target will accept certain actions through the 'message' interface.
('man dmsetup' for more information on the message interface.) These actions
include:
- "idle" - Halt the current sync action.
- "frozen" - Freeze the current sync action.
- "resync" - Initiate/continue a resync.
- "recover"- Initiate/continue a recover process.
- "check" - Initiate a check (i.e. a "scrub") of the array.
- "repair" - Initiate a repair of the array.
+
+ ========= ================================================
+ "idle" Halt the current sync action.
+ "frozen" Freeze the current sync action.
+ "resync" Initiate/continue a resync.
+ "recover" Initiate/continue a recover process.
+ "check" Initiate a check (i.e. a "scrub") of the array.
+ "repair" Initiate a repair of the array.
+ ========= ================================================
Discard Support
@@ -307,48 +368,52 @@ increasingly whitelisted in the kernel and can thus be trusted.
For trusted devices, the following dm-raid module parameter can be set
to safely enable discard support for RAID 4/5/6:
+
'devices_handle_discards_safely'
Version History
---------------
-1.0.0 Initial version. Support for RAID 4/5/6
-1.1.0 Added support for RAID 1
-1.2.0 Handle creation of arrays that contain failed devices.
-1.3.0 Added support for RAID 10
-1.3.1 Allow device replacement/rebuild for RAID 10
-1.3.2 Fix/improve redundancy checking for RAID10
-1.4.0 Non-functional change. Removes arg from mapping function.
-1.4.1 RAID10 fix redundancy validation checks (commit 55ebbb5).
-1.4.2 Add RAID10 "far" and "offset" algorithm support.
-1.5.0 Add message interface to allow manipulation of the sync_action.
+
+::
+
+ 1.0.0 Initial version. Support for RAID 4/5/6
+ 1.1.0 Added support for RAID 1
+ 1.2.0 Handle creation of arrays that contain failed devices.
+ 1.3.0 Added support for RAID 10
+ 1.3.1 Allow device replacement/rebuild for RAID 10
+ 1.3.2 Fix/improve redundancy checking for RAID10
+ 1.4.0 Non-functional change. Removes arg from mapping function.
+ 1.4.1 RAID10 fix redundancy validation checks (commit 55ebbb5).
+ 1.4.2 Add RAID10 "far" and "offset" algorithm support.
+ 1.5.0 Add message interface to allow manipulation of the sync_action.
New status (STATUSTYPE_INFO) fields: sync_action and mismatch_cnt.
-1.5.1 Add ability to restore transiently failed devices on resume.
-1.5.2 'mismatch_cnt' is zero unless [last_]sync_action is "check".
-1.6.0 Add discard support (and devices_handle_discard_safely module param).
-1.7.0 Add support for MD RAID0 mappings.
-1.8.0 Explicitly check for compatible flags in the superblock metadata
+ 1.5.1 Add ability to restore transiently failed devices on resume.
+ 1.5.2 'mismatch_cnt' is zero unless [last_]sync_action is "check".
+ 1.6.0 Add discard support (and devices_handle_discard_safely module param).
+ 1.7.0 Add support for MD RAID0 mappings.
+ 1.8.0 Explicitly check for compatible flags in the superblock metadata
and reject to start the raid set if any are set by a newer
target version, thus avoiding data corruption on a raid set
with a reshape in progress.
-1.9.0 Add support for RAID level takeover/reshape/region size
+ 1.9.0 Add support for RAID level takeover/reshape/region size
and set size reduction.
-1.9.1 Fix activation of existing RAID 4/10 mapped devices
-1.9.2 Don't emit '- -' on the status table line in case the constructor
+ 1.9.1 Fix activation of existing RAID 4/10 mapped devices
+ 1.9.2 Don't emit '- -' on the status table line in case the constructor
fails reading a superblock. Correctly emit 'maj:min1 maj:min2' and
'D' on the status line. If '- -' is passed into the constructor, emit
'- -' on the table line and '-' as the status line health character.
-1.10.0 Add support for raid4/5/6 journal device
-1.10.1 Fix data corruption on reshape request
-1.11.0 Fix table line argument order
+ 1.10.0 Add support for raid4/5/6 journal device
+ 1.10.1 Fix data corruption on reshape request
+ 1.11.0 Fix table line argument order
(wrong raid10_copies/raid10_format sequence)
-1.11.1 Add raid4/5/6 journal write-back support via journal_mode option
-1.12.1 Fix for MD deadlock between mddev_suspend() and md_write_start() available
-1.13.0 Fix dev_health status at end of "recover" (was 'a', now 'A')
-1.13.1 Fix deadlock caused by early md_stop_writes(). Also fix size an
+ 1.11.1 Add raid4/5/6 journal write-back support via journal_mode option
+ 1.12.1 Fix for MD deadlock between mddev_suspend() and md_write_start() available
+ 1.13.0 Fix dev_health status at end of "recover" (was 'a', now 'A')
+ 1.13.1 Fix deadlock caused by early md_stop_writes(). Also fix size an
state races.
-1.13.2 Fix raid redundancy validation and avoid keeping raid set frozen
-1.14.0 Fix reshape race on small devices. Fix stripe adding reshape
+ 1.13.2 Fix raid redundancy validation and avoid keeping raid set frozen
+ 1.14.0 Fix reshape race on small devices. Fix stripe adding reshape
deadlock/potential data corruption. Update superblock when
specific devices are requested via rebuild. Fix RAID leg
rebuild errors.
diff --git a/Documentation/device-mapper/dm-service-time.txt b/Documentation/device-mapper/dm-service-time.rst
index fb1d4a0cf122..facf277fc13c 100644
--- a/Documentation/device-mapper/dm-service-time.txt
+++ b/Documentation/device-mapper/dm-service-time.rst
@@ -1,3 +1,4 @@
+===============
dm-service-time
===============
@@ -12,25 +13,34 @@ in a path-group, and it can be specified as a table argument.
The path selector name is 'service-time'.
-Table parameters for each path: [<repeat_count> [<relative_throughput>]]
- <repeat_count>: The number of I/Os to dispatch using the selected
+Table parameters for each path:
+
+ [<repeat_count> [<relative_throughput>]]
+ <repeat_count>:
+ The number of I/Os to dispatch using the selected
path before switching to the next path.
If not given, internal default is used. To check
the default value, see the activated table.
- <relative_throughput>: The relative throughput value of the path
+ <relative_throughput>:
+ The relative throughput value of the path
among all paths in the path-group.
The valid range is 0-100.
If not given, minimum value '1' is used.
If '0' is given, the path isn't selected while
other paths having a positive value are available.
-Status for each path: <status> <fail-count> <in-flight-size> \
- <relative_throughput>
- <status>: 'A' if the path is active, 'F' if the path is failed.
- <fail-count>: The number of path failures.
- <in-flight-size>: The size of in-flight I/Os on the path.
- <relative_throughput>: The relative throughput value of the path
- among all paths in the path-group.
+Status for each path:
+
+ <status> <fail-count> <in-flight-size> <relative_throughput>
+ <status>:
+ 'A' if the path is active, 'F' if the path is failed.
+ <fail-count>:
+ The number of path failures.
+ <in-flight-size>:
+ The size of in-flight I/Os on the path.
+ <relative_throughput>:
+ The relative throughput value of the path
+ among all paths in the path-group.
Algorithm
@@ -39,7 +49,7 @@ Algorithm
dm-service-time adds the I/O size to 'in-flight-size' when the I/O is
dispatched and subtracts when completed.
Basically, dm-service-time selects a path having minimum service time
-which is calculated by:
+which is calculated by::
('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput'
@@ -67,25 +77,25 @@ Examples
========
In case that 2 paths (sda and sdb) are used with repeat_count == 128
and sda has an average throughput 1GB/s and sdb has 4GB/s,
-'relative_throughput' value may be '1' for sda and '4' for sdb.
-
-# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \
- dmsetup create test
-#
-# dmsetup table
-test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4
-#
-# dmsetup status
-test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4
-
-
-Or '2' for sda and '8' for sdb would be also true.
-
-# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \
- dmsetup create test
-#
-# dmsetup table
-test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8
-#
-# dmsetup status
-test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8
+'relative_throughput' value may be '1' for sda and '4' for sdb::
+
+ # echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \
+ dmsetup create test
+ #
+ # dmsetup table
+ test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4
+ #
+ # dmsetup status
+ test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4
+
+
+Or '2' for sda and '8' for sdb would be also true::
+
+ # echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \
+ dmsetup create test
+ #
+ # dmsetup table
+ test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8
+ #
+ # dmsetup status
+ test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8
diff --git a/Documentation/device-mapper/dm-uevent.rst b/Documentation/device-mapper/dm-uevent.rst
new file mode 100644
index 000000000000..4a8ee8d069c9
--- /dev/null
+++ b/Documentation/device-mapper/dm-uevent.rst
@@ -0,0 +1,110 @@
+====================
+device-mapper uevent
+====================
+
+The device-mapper uevent code adds the capability to device-mapper to create
+and send kobject uevents (uevents). Previously device-mapper events were only
+available through the ioctl interface. The advantage of the uevents interface
+is the event contains environment attributes providing increased context for
+the event avoiding the need to query the state of the device-mapper device after
+the event is received.
+
+There are two functions currently for device-mapper events. The first function
+listed creates the event and the second function sends the event(s)::
+
+ void dm_path_uevent(enum dm_uevent_type event_type, struct dm_target *ti,
+ const char *path, unsigned nr_valid_paths)
+
+ void dm_send_uevents(struct list_head *events, struct kobject *kobj)
+
+
+The variables added to the uevent environment are:
+
+Variable Name: DM_TARGET
+------------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: string
+:Description:
+:Value: Name of device-mapper target that generated the event.
+
+Variable Name: DM_ACTION
+------------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: string
+:Description:
+:Value: Device-mapper specific action that caused the uevent action.
+ PATH_FAILED - A path has failed;
+ PATH_REINSTATED - A path has been reinstated.
+
+Variable Name: DM_SEQNUM
+------------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: unsigned integer
+:Description: A sequence number for this specific device-mapper device.
+:Value: Valid unsigned integer range.
+
+Variable Name: DM_PATH
+----------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: string
+:Description: Major and minor number of the path device pertaining to this
+ event.
+:Value: Path name in the form of "Major:Minor"
+
+Variable Name: DM_NR_VALID_PATHS
+--------------------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: unsigned integer
+:Description:
+:Value: Valid unsigned integer range.
+
+Variable Name: DM_NAME
+----------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: string
+:Description: Name of the device-mapper device.
+:Value: Name
+
+Variable Name: DM_UUID
+----------------------
+:Uevent Action(s): KOBJ_CHANGE
+:Type: string
+:Description: UUID of the device-mapper device.
+:Value: UUID. (Empty string if there isn't one.)
+
+An example of the uevents generated as captured by udevmonitor is shown
+below
+
+1.) Path failure::
+
+ UEVENT[1192521009.711215] change@/block/dm-3
+ ACTION=change
+ DEVPATH=/block/dm-3
+ SUBSYSTEM=block
+ DM_TARGET=multipath
+ DM_ACTION=PATH_FAILED
+ DM_SEQNUM=1
+ DM_PATH=8:32
+ DM_NR_VALID_PATHS=0
+ DM_NAME=mpath2
+ DM_UUID=mpath-35333333000002328
+ MINOR=3
+ MAJOR=253
+ SEQNUM=1130
+
+2.) Path reinstate::
+
+ UEVENT[1192521132.989927] change@/block/dm-3
+ ACTION=change
+ DEVPATH=/block/dm-3
+ SUBSYSTEM=block
+ DM_TARGET=multipath
+ DM_ACTION=PATH_REINSTATED
+ DM_SEQNUM=2
+ DM_PATH=8:32
+ DM_NR_VALID_PATHS=1
+ DM_NAME=mpath2
+ DM_UUID=mpath-35333333000002328
+ MINOR=3
+ MAJOR=253
+ SEQNUM=1131
diff --git a/Documentation/device-mapper/dm-uevent.txt b/Documentation/device-mapper/dm-uevent.txt
deleted file mode 100644
index 07edbd85c714..000000000000
--- a/Documentation/device-mapper/dm-uevent.txt
+++ /dev/null
@@ -1,97 +0,0 @@
-The device-mapper uevent code adds the capability to device-mapper to create
-and send kobject uevents (uevents). Previously device-mapper events were only
-available through the ioctl interface. The advantage of the uevents interface
-is the event contains environment attributes providing increased context for
-the event avoiding the need to query the state of the device-mapper device after
-the event is received.
-
-There are two functions currently for device-mapper events. The first function
-listed creates the event and the second function sends the event(s).
-
-void dm_path_uevent(enum dm_uevent_type event_type, struct dm_target *ti,
- const char *path, unsigned nr_valid_paths)
-
-void dm_send_uevents(struct list_head *events, struct kobject *kobj)
-
-
-The variables added to the uevent environment are:
-
-Variable Name: DM_TARGET
-Uevent Action(s): KOBJ_CHANGE
-Type: string
-Description:
-Value: Name of device-mapper target that generated the event.
-
-Variable Name: DM_ACTION
-Uevent Action(s): KOBJ_CHANGE
-Type: string
-Description:
-Value: Device-mapper specific action that caused the uevent action.
- PATH_FAILED - A path has failed.
- PATH_REINSTATED - A path has been reinstated.
-
-Variable Name: DM_SEQNUM
-Uevent Action(s): KOBJ_CHANGE
-Type: unsigned integer
-Description: A sequence number for this specific device-mapper device.
-Value: Valid unsigned integer range.
-
-Variable Name: DM_PATH
-Uevent Action(s): KOBJ_CHANGE
-Type: string
-Description: Major and minor number of the path device pertaining to this
-event.
-Value: Path name in the form of "Major:Minor"
-
-Variable Name: DM_NR_VALID_PATHS
-Uevent Action(s): KOBJ_CHANGE
-Type: unsigned integer
-Description:
-Value: Valid unsigned integer range.
-
-Variable Name: DM_NAME
-Uevent Action(s): KOBJ_CHANGE
-Type: string
-Description: Name of the device-mapper device.
-Value: Name
-
-Variable Name: DM_UUID
-Uevent Action(s): KOBJ_CHANGE
-Type: string
-Description: UUID of the device-mapper device.
-Value: UUID. (Empty string if there isn't one.)
-
-An example of the uevents generated as captured by udevmonitor is shown
-below.
-
-1.) Path failure.
-UEVENT[1192521009.711215] change@/block/dm-3
-ACTION=change
-DEVPATH=/block/dm-3
-SUBSYSTEM=block
-DM_TARGET=multipath
-DM_ACTION=PATH_FAILED
-DM_SEQNUM=1
-DM_PATH=8:32
-DM_NR_VALID_PATHS=0
-DM_NAME=mpath2
-DM_UUID=mpath-35333333000002328
-MINOR=3
-MAJOR=253
-SEQNUM=1130
-
-2.) Path reinstate.
-UEVENT[1192521132.989927] change@/block/dm-3
-ACTION=change
-DEVPATH=/block/dm-3
-SUBSYSTEM=block
-DM_TARGET=multipath
-DM_ACTION=PATH_REINSTATED
-DM_SEQNUM=2
-DM_PATH=8:32
-DM_NR_VALID_PATHS=1
-DM_NAME=mpath2
-DM_UUID=mpath-35333333000002328
-MINOR=3
-MAJOR=253
-SEQNUM=1131
diff --git a/Documentation/device-mapper/dm-zoned.txt b/Documentation/device-mapper/dm-zoned.rst
index 736fcc78d193..07f56ebc1730 100644
--- a/Documentation/device-mapper/dm-zoned.txt
+++ b/Documentation/device-mapper/dm-zoned.rst
@@ -1,3 +1,4 @@
+========
dm-zoned
========
@@ -133,12 +134,13 @@ A zoned block device must first be formatted using the dmzadm tool. This
will analyze the device zone configuration, determine where to place the
metadata sets on the device and initialize the metadata sets.
-Ex:
+Ex::
-dmzadm --format /dev/sdxx
+ dmzadm --format /dev/sdxx
For a formatted device, the target can be created normally with the
dmsetup utility. The only parameter that dm-zoned requires is the
-underlying zoned block device name. Ex:
+underlying zoned block device name. Ex::
-echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | dmsetup create dmz-`basename ${dev}`
+ echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
+ dmsetup create dmz-`basename ${dev}`
diff --git a/Documentation/device-mapper/era.txt b/Documentation/device-mapper/era.rst
index 3c6d01be3560..90dd5c670b9f 100644
--- a/Documentation/device-mapper/era.txt
+++ b/Documentation/device-mapper/era.rst
@@ -1,3 +1,7 @@
+======
+dm-era
+======
+
Introduction
============
@@ -14,12 +18,14 @@ coherency after rolling back a vendor snapshot.
Constructor
===========
- era <metadata dev> <origin dev> <block size>
+era <metadata dev> <origin dev> <block size>
- metadata dev : fast device holding the persistent metadata
- origin dev : device holding data blocks that may change
- block size : block size of origin data device, granularity that is
- tracked by the target
+ ================ ======================================================
+ metadata dev fast device holding the persistent metadata
+ origin dev device holding data blocks that may change
+ block size block size of origin data device, granularity that is
+ tracked by the target
+ ================ ======================================================
Messages
========
@@ -49,14 +55,16 @@ Status
<metadata block size> <#used metadata blocks>/<#total metadata blocks>
<current era> <held metadata root | '-'>
-metadata block size : Fixed block size for each metadata block in
- sectors
-#used metadata blocks : Number of metadata blocks used
-#total metadata blocks : Total number of metadata blocks
-current era : The current era
-held metadata root : The location, in blocks, of the metadata root
- that has been 'held' for userspace read
- access. '-' indicates there is no held root
+========================= ==============================================
+metadata block size Fixed block size for each metadata block in
+ sectors
+#used metadata blocks Number of metadata blocks used
+#total metadata blocks Total number of metadata blocks
+current era The current era
+held metadata root The location, in blocks, of the metadata root
+ that has been 'held' for userspace read
+ access. '-' indicates there is no held root
+========================= ==============================================
Detailed use case
=================
@@ -88,7 +96,7 @@ Memory usage
The target uses a bitset to record writes in the current era. It also
has a spare bitset ready for switching over to a new era. Other than
-that it uses a few 4k blocks for updating metadata.
+that it uses a few 4k blocks for updating metadata::
(4 * nr_blocks) bytes + buffers
diff --git a/Documentation/device-mapper/index.rst b/Documentation/device-mapper/index.rst
new file mode 100644
index 000000000000..105e253bc231
--- /dev/null
+++ b/Documentation/device-mapper/index.rst
@@ -0,0 +1,44 @@
+:orphan:
+
+=============
+Device Mapper
+=============
+
+.. toctree::
+ :maxdepth: 1
+
+ cache-policies
+ cache
+ delay
+ dm-crypt
+ dm-flakey
+ dm-init
+ dm-integrity
+ dm-io
+ dm-log
+ dm-queue-length
+ dm-raid
+ dm-service-time
+ dm-uevent
+ dm-zoned
+ era
+ kcopyd
+ linear
+ log-writes
+ persistent-data
+ snapshot
+ statistics
+ striped
+ switch
+ thin-provisioning
+ unstriped
+ verity
+ writecache
+ zero
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/device-mapper/kcopyd.txt b/Documentation/device-mapper/kcopyd.rst
index 820382c4cecf..7651d395127f 100644
--- a/Documentation/device-mapper/kcopyd.txt
+++ b/Documentation/device-mapper/kcopyd.rst
@@ -1,3 +1,4 @@
+======
kcopyd
======
@@ -7,7 +8,7 @@ notification. It is used by dm-snapshot and dm-mirror.
Users of kcopyd must first create a client and indicate how many memory pages
to set aside for their copy jobs. This is done with a call to
-kcopyd_client_create().
+kcopyd_client_create()::
int kcopyd_client_create(unsigned int num_pages,
struct kcopyd_client **result);
@@ -16,7 +17,7 @@ To start a copy job, the user must set up io_region structures to describe
the source and destinations of the copy. Each io_region indicates a
block-device along with the starting sector and size of the region. The source
of the copy is given as one io_region structure, and the destinations of the
-copy are given as an array of io_region structures.
+copy are given as an array of io_region structures::
struct io_region {
struct block_device *bdev;
@@ -26,7 +27,7 @@ copy are given as an array of io_region structures.
To start the copy, the user calls kcopyd_copy(), passing in the client
pointer, pointers to the source and destination io_regions, the name of a
-completion callback routine, and a pointer to some context data for the copy.
+completion callback routine, and a pointer to some context data for the copy::
int kcopyd_copy(struct kcopyd_client *kc, struct io_region *from,
unsigned int num_dests, struct io_region *dests,
@@ -41,7 +42,6 @@ write error occurred during the copy.
When a user is done with all their copy jobs, they should call
kcopyd_client_destroy() to delete the kcopyd client, which will release the
-associated memory pages.
+associated memory pages::
void kcopyd_client_destroy(struct kcopyd_client *kc);
-
diff --git a/Documentation/device-mapper/linear.rst b/Documentation/device-mapper/linear.rst
new file mode 100644
index 000000000000..9d17fc6e64a9
--- /dev/null
+++ b/Documentation/device-mapper/linear.rst
@@ -0,0 +1,63 @@
+=========
+dm-linear
+=========
+
+Device-Mapper's "linear" target maps a linear range of the Device-Mapper
+device onto a linear range of another device. This is the basic building
+block of logical volume managers.
+
+Parameters: <dev path> <offset>
+ <dev path>:
+ Full pathname to the underlying block-device, or a
+ "major:minor" device-number.
+ <offset>:
+ Starting sector within the device.
+
+
+Example scripts
+===============
+
+::
+
+ #!/bin/sh
+ # Create an identity mapping for a device
+ echo "0 `blockdev --getsz $1` linear $1 0" | dmsetup create identity
+
+::
+
+ #!/bin/sh
+ # Join 2 devices together
+ size1=`blockdev --getsz $1`
+ size2=`blockdev --getsz $2`
+ echo "0 $size1 linear $1 0
+ $size1 $size2 linear $2 0" | dmsetup create joined
+
+::
+
+ #!/usr/bin/perl -w
+ # Split a device into 4M chunks and then join them together in reverse order.
+
+ my $name = "reverse";
+ my $extent_size = 4 * 1024 * 2;
+ my $dev = $ARGV[0];
+ my $table = "";
+ my $count = 0;
+
+ if (!defined($dev)) {
+ die("Please specify a device.\n");
+ }
+
+ my $dev_size = `blockdev --getsz $dev`;
+ my $extents = int($dev_size / $extent_size) -
+ (($dev_size % $extent_size) ? 1 : 0);
+
+ while ($extents > 0) {
+ my $this_start = $count * $extent_size;
+ $extents--;
+ $count++;
+ my $this_offset = $extents * $extent_size;
+
+ $table .= "$this_start $extent_size linear $dev $this_offset\n";
+ }
+
+ `echo \"$table\" | dmsetup create $name`;
diff --git a/Documentation/device-mapper/linear.txt b/Documentation/device-mapper/linear.txt
deleted file mode 100644
index 7cb98d89d3f8..000000000000
--- a/Documentation/device-mapper/linear.txt
+++ /dev/null
@@ -1,61 +0,0 @@
-dm-linear
-=========
-
-Device-Mapper's "linear" target maps a linear range of the Device-Mapper
-device onto a linear range of another device. This is the basic building
-block of logical volume managers.
-
-Parameters: <dev path> <offset>
- <dev path>: Full pathname to the underlying block-device, or a
- "major:minor" device-number.
- <offset>: Starting sector within the device.
-
-
-Example scripts
-===============
-[[
-#!/bin/sh
-# Create an identity mapping for a device
-echo "0 `blockdev --getsz $1` linear $1 0" | dmsetup create identity
-]]
-
-
-[[
-#!/bin/sh
-# Join 2 devices together
-size1=`blockdev --getsz $1`
-size2=`blockdev --getsz $2`
-echo "0 $size1 linear $1 0
-$size1 $size2 linear $2 0" | dmsetup create joined
-]]
-
-
-[[
-#!/usr/bin/perl -w
-# Split a device into 4M chunks and then join them together in reverse order.
-
-my $name = "reverse";
-my $extent_size = 4 * 1024 * 2;
-my $dev = $ARGV[0];
-my $table = "";
-my $count = 0;
-
-if (!defined($dev)) {
- die("Please specify a device.\n");
-}
-
-my $dev_size = `blockdev --getsz $dev`;
-my $extents = int($dev_size / $extent_size) -
- (($dev_size % $extent_size) ? 1 : 0);
-
-while ($extents > 0) {
- my $this_start = $count * $extent_size;
- $extents--;
- $count++;
- my $this_offset = $extents * $extent_size;
-
- $table .= "$this_start $extent_size linear $dev $this_offset\n";
-}
-
-`echo \"$table\" | dmsetup create $name`;
-]]
diff --git a/Documentation/device-mapper/log-writes.txt b/Documentation/device-mapper/log-writes.rst
index b638d124be6a..23141f2ffb7c 100644
--- a/Documentation/device-mapper/log-writes.txt
+++ b/Documentation/device-mapper/log-writes.rst
@@ -1,3 +1,4 @@
+=============
dm-log-writes
=============
@@ -25,11 +26,11 @@ completed WRITEs, at the time the REQ_PREFLUSH is issued, are added in order to
simulate the worst case scenario with regard to power failures. Consider the
following example (W means write, C means complete):
-W1,W2,W3,C3,C2,Wflush,C1,Cflush
+ W1,W2,W3,C3,C2,Wflush,C1,Cflush
-The log would show the following
+The log would show the following:
-W3,W2,flush,W1....
+ W3,W2,flush,W1....
Again this is to simulate what is actually on disk, this allows us to detect
cases where a power failure at a particular point in time would create an
@@ -42,11 +43,11 @@ Any REQ_OP_DISCARD requests are treated like WRITE requests. Otherwise we would
have all the DISCARD requests, and then the WRITE requests and then the FLUSH
request. Consider the following example:
-WRITE block 1, DISCARD block 1, FLUSH
+ WRITE block 1, DISCARD block 1, FLUSH
-If we logged DISCARD when it completed, the replay would look like this
+If we logged DISCARD when it completed, the replay would look like this:
-DISCARD 1, WRITE 1, FLUSH
+ DISCARD 1, WRITE 1, FLUSH
which isn't quite what happened and wouldn't be caught during the log replay.
@@ -57,15 +58,19 @@ i) Constructor
log-writes <dev_path> <log_dev_path>
- dev_path : Device that all of the IO will go to normally.
- log_dev_path : Device where the log entries are written to.
+ ============= ==============================================
+ dev_path Device that all of the IO will go to normally.
+ log_dev_path Device where the log entries are written to.
+ ============= ==============================================
ii) Status
<#logged entries> <highest allocated sector>
- #logged entries : Number of logged entries
- highest allocated sector : Highest allocated sector
+ =========================== ========================
+ #logged entries Number of logged entries
+ highest allocated sector Highest allocated sector
+ =========================== ========================
iii) Messages
@@ -75,15 +80,15 @@ iii) Messages
For example say you want to fsck a file system after every
write, but first you need to replay up to the mkfs to make sure
we're fsck'ing something reasonable, you would do something like
- this:
+ this::
mkfs.btrfs -f /dev/mapper/log
dmsetup message log 0 mark mkfs
<run test>
- This would allow you to replay the log up to the mkfs mark and
- then replay from that point on doing the fsck check in the
- interval that you want.
+ This would allow you to replay the log up to the mkfs mark and
+ then replay from that point on doing the fsck check in the
+ interval that you want.
Every log has a mark at the end labeled "dm-log-writes-end".
@@ -97,42 +102,42 @@ Example usage
=============
Say you want to test fsync on your file system. You would do something like
-this:
-
-TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
-dmsetup create log --table "$TABLE"
-mkfs.btrfs -f /dev/mapper/log
-dmsetup message log 0 mark mkfs
-
-mount /dev/mapper/log /mnt/btrfs-test
-<some test that does fsync at the end>
-dmsetup message log 0 mark fsync
-md5sum /mnt/btrfs-test/foo
-umount /mnt/btrfs-test
-
-dmsetup remove log
-replay-log --log /dev/sdc --replay /dev/sdb --end-mark fsync
-mount /dev/sdb /mnt/btrfs-test
-md5sum /mnt/btrfs-test/foo
-<verify md5sum's are correct>
-
-Another option is to do a complicated file system operation and verify the file
-system is consistent during the entire operation. You could do this with:
-
-TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
-dmsetup create log --table "$TABLE"
-mkfs.btrfs -f /dev/mapper/log
-dmsetup message log 0 mark mkfs
-
-mount /dev/mapper/log /mnt/btrfs-test
-<fsstress to dirty the fs>
-btrfs filesystem balance /mnt/btrfs-test
-umount /mnt/btrfs-test
-dmsetup remove log
-
-replay-log --log /dev/sdc --replay /dev/sdb --end-mark mkfs
-btrfsck /dev/sdb
-replay-log --log /dev/sdc --replay /dev/sdb --start-mark mkfs \
+this::
+
+ TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
+ dmsetup create log --table "$TABLE"
+ mkfs.btrfs -f /dev/mapper/log
+ dmsetup message log 0 mark mkfs
+
+ mount /dev/mapper/log /mnt/btrfs-test
+ <some test that does fsync at the end>
+ dmsetup message log 0 mark fsync
+ md5sum /mnt/btrfs-test/foo
+ umount /mnt/btrfs-test
+
+ dmsetup remove log
+ replay-log --log /dev/sdc --replay /dev/sdb --end-mark fsync
+ mount /dev/sdb /mnt/btrfs-test
+ md5sum /mnt/btrfs-test/foo
+ <verify md5sum's are correct>
+
+ Another option is to do a complicated file system operation and verify the file
+ system is consistent during the entire operation. You could do this with:
+
+ TABLE="0 $(blockdev --getsz /dev/sdb) log-writes /dev/sdb /dev/sdc"
+ dmsetup create log --table "$TABLE"
+ mkfs.btrfs -f /dev/mapper/log
+ dmsetup message log 0 mark mkfs
+
+ mount /dev/mapper/log /mnt/btrfs-test
+ <fsstress to dirty the fs>
+ btrfs filesystem balance /mnt/btrfs-test
+ umount /mnt/btrfs-test
+ dmsetup remove log
+
+ replay-log --log /dev/sdc --replay /dev/sdb --end-mark mkfs
+ btrfsck /dev/sdb
+ replay-log --log /dev/sdc --replay /dev/sdb --start-mark mkfs \
--fsck "btrfsck /dev/sdb" --check fua
And that will replay the log until it sees a FUA request, run the fsck command
diff --git a/Documentation/device-mapper/persistent-data.txt b/Documentation/device-mapper/persistent-data.rst
index a333bcb3a6c2..2065c3c5a091 100644
--- a/Documentation/device-mapper/persistent-data.txt
+++ b/Documentation/device-mapper/persistent-data.rst
@@ -1,3 +1,7 @@
+===============
+Persistent data
+===============
+
Introduction
============
diff --git a/Documentation/device-mapper/snapshot.txt b/Documentation/device-mapper/snapshot.rst
index b8bbb516f989..4c53304e72f1 100644
--- a/Documentation/device-mapper/snapshot.txt
+++ b/Documentation/device-mapper/snapshot.rst
@@ -1,15 +1,16 @@
+==============================
Device-mapper snapshot support
==============================
Device-mapper allows you, without massive data copying:
-*) To create snapshots of any block device i.e. mountable, saved states of
-the block device which are also writable without interfering with the
-original content;
-*) To create device "forks", i.e. multiple different versions of the
-same data stream.
-*) To merge a snapshot of a block device back into the snapshot's origin
-device.
+- To create snapshots of any block device i.e. mountable, saved states of
+ the block device which are also writable without interfering with the
+ original content;
+- To create device "forks", i.e. multiple different versions of the
+ same data stream.
+- To merge a snapshot of a block device back into the snapshot's origin
+ device.
In the first two cases, dm copies only the chunks of data that get
changed and uses a separate copy-on-write (COW) block device for
@@ -22,7 +23,7 @@ the origin device.
There are three dm targets available:
snapshot, snapshot-origin, and snapshot-merge.
-*) snapshot-origin <origin>
+- snapshot-origin <origin>
which will normally have one or more snapshots based on it.
Reads will be mapped directly to the backing device. For each write, the
@@ -30,7 +31,7 @@ original data will be saved in the <COW device> of each snapshot to keep
its visible content unchanged, at least until the <COW device> fills up.
-*) snapshot <origin> <COW device> <persistent?> <chunksize>
+- snapshot <origin> <COW device> <persistent?> <chunksize>
A snapshot of the <origin> block device is created. Changed chunks of
<chunksize> sectors will be stored on the <COW device>. Writes will
@@ -83,25 +84,25 @@ When you create the first LVM2 snapshot of a volume, four dm devices are used:
source volume), whose table is replaced by a "snapshot-origin" mapping
from device #1.
-A fixed naming scheme is used, so with the following commands:
+A fixed naming scheme is used, so with the following commands::
-lvcreate -L 1G -n base volumeGroup
-lvcreate -L 100M --snapshot -n snap volumeGroup/base
+ lvcreate -L 1G -n base volumeGroup
+ lvcreate -L 100M --snapshot -n snap volumeGroup/base
-we'll have this situation (with volumes in above order):
+we'll have this situation (with volumes in above order)::
-# dmsetup table|grep volumeGroup
+ # dmsetup table|grep volumeGroup
-volumeGroup-base-real: 0 2097152 linear 8:19 384
-volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
-volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
-volumeGroup-base: 0 2097152 snapshot-origin 254:11
+ volumeGroup-base-real: 0 2097152 linear 8:19 384
+ volumeGroup-snap-cow: 0 204800 linear 8:19 2097536
+ volumeGroup-snap: 0 2097152 snapshot 254:11 254:12 P 16
+ volumeGroup-base: 0 2097152 snapshot-origin 254:11
-# ls -lL /dev/mapper/volumeGroup-*
-brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
-brw------- 1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
-brw------- 1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
-brw------- 1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base
+ # ls -lL /dev/mapper/volumeGroup-*
+ brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
+ brw------- 1 root root 254, 12 29 ago 18:15 /dev/mapper/volumeGroup-snap-cow
+ brw------- 1 root root 254, 13 29 ago 18:15 /dev/mapper/volumeGroup-snap
+ brw------- 1 root root 254, 10 29 ago 18:14 /dev/mapper/volumeGroup-base
How snapshot-merge is used by LVM2
@@ -114,27 +115,28 @@ merging snapshot after it completes. The "snapshot" that hands over its
COW device to the "snapshot-merge" is deactivated (unless using lvchange
--refresh); but if it is left active it will simply return I/O errors.
-A snapshot will merge into its origin with the following command:
+A snapshot will merge into its origin with the following command::
-lvconvert --merge volumeGroup/snap
+ lvconvert --merge volumeGroup/snap
-we'll now have this situation:
+we'll now have this situation::
-# dmsetup table|grep volumeGroup
+ # dmsetup table|grep volumeGroup
-volumeGroup-base-real: 0 2097152 linear 8:19 384
-volumeGroup-base-cow: 0 204800 linear 8:19 2097536
-volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16
+ volumeGroup-base-real: 0 2097152 linear 8:19 384
+ volumeGroup-base-cow: 0 204800 linear 8:19 2097536
+ volumeGroup-base: 0 2097152 snapshot-merge 254:11 254:12 P 16
-# ls -lL /dev/mapper/volumeGroup-*
-brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
-brw------- 1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
-brw------- 1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base
+ # ls -lL /dev/mapper/volumeGroup-*
+ brw------- 1 root root 254, 11 29 ago 18:15 /dev/mapper/volumeGroup-base-real
+ brw------- 1 root root 254, 12 29 ago 18:16 /dev/mapper/volumeGroup-base-cow
+ brw------- 1 root root 254, 10 29 ago 18:16 /dev/mapper/volumeGroup-base
How to determine when a merging is complete
===========================================
The snapshot-merge and snapshot status lines end with:
+
<sectors_allocated>/<total_sectors> <metadata_sectors>
Both <sectors_allocated> and <total_sectors> include both data and metadata.
@@ -142,35 +144,37 @@ During merging, the number of sectors allocated gets smaller and
smaller. Merging has finished when the number of sectors holding data
is zero, in other words <sectors_allocated> == <metadata_sectors>.
-Here is a practical example (using a hybrid of lvm and dmsetup commands):
+Here is a practical example (using a hybrid of lvm and dmsetup commands)::
-# lvs
- LV VG Attr LSize Origin Snap% Move Log Copy% Convert
- base volumeGroup owi-a- 4.00g
- snap volumeGroup swi-a- 1.00g base 18.97
+ # lvs
+ LV VG Attr LSize Origin Snap% Move Log Copy% Convert
+ base volumeGroup owi-a- 4.00g
+ snap volumeGroup swi-a- 1.00g base 18.97
-# dmsetup status volumeGroup-snap
-0 8388608 snapshot 397896/2097152 1560
- ^^^^ metadata sectors
+ # dmsetup status volumeGroup-snap
+ 0 8388608 snapshot 397896/2097152 1560
+ ^^^^ metadata sectors
-# lvconvert --merge -b volumeGroup/snap
- Merging of volume snap started.
+ # lvconvert --merge -b volumeGroup/snap
+ Merging of volume snap started.
-# lvs volumeGroup/snap
- LV VG Attr LSize Origin Snap% Move Log Copy% Convert
- base volumeGroup Owi-a- 4.00g 17.23
+ # lvs volumeGroup/snap
+ LV VG Attr LSize Origin Snap% Move Log Copy% Convert
+ base volumeGroup Owi-a- 4.00g 17.23
-# dmsetup status volumeGroup-base
-0 8388608 snapshot-merge 281688/2097152 1104
+ # dmsetup status volumeGroup-base
+ 0 8388608 snapshot-merge 281688/2097152 1104
-# dmsetup status volumeGroup-base
-0 8388608 snapshot-merge 180480/2097152 712
+ # dmsetup status volumeGroup-base
+ 0 8388608 snapshot-merge 180480/2097152 712
-# dmsetup status volumeGroup-base
-0 8388608 snapshot-merge 16/2097152 16
+ # dmsetup status volumeGroup-base
+ 0 8388608 snapshot-merge 16/2097152 16
Merging has finished.
-# lvs
- LV VG Attr LSize Origin Snap% Move Log Copy% Convert
- base volumeGroup owi-a- 4.00g
+::
+
+ # lvs
+ LV VG Attr LSize Origin Snap% Move Log Copy% Convert
+ base volumeGroup owi-a- 4.00g
diff --git a/Documentation/device-mapper/statistics.txt b/Documentation/device-mapper/statistics.rst
index 170ac02a1f50..3d80a9f850cc 100644
--- a/Documentation/device-mapper/statistics.txt
+++ b/Documentation/device-mapper/statistics.rst
@@ -1,3 +1,4 @@
+=============
DM statistics
=============
@@ -11,7 +12,7 @@ Individual statistics will be collected for each step-sized area within
the range specified.
The I/O statistics counters for each step-sized area of a region are
-in the same format as /sys/block/*/stat or /proc/diskstats (see:
+in the same format as `/sys/block/*/stat` or `/proc/diskstats` (see:
Documentation/iostats.txt). But two extra counters (12 and 13) are
provided: total time spent reading and writing. When the histogram
argument is used, the 14th parameter is reported that represents the
@@ -32,40 +33,45 @@ on each other's data.
The creation of DM statistics will allocate memory via kmalloc or
fallback to using vmalloc space. At most, 1/4 of the overall system
memory may be allocated by DM statistics. The admin can see how much
-memory is used by reading
-/sys/module/dm_mod/parameters/stats_current_allocated_bytes
+memory is used by reading:
+
+ /sys/module/dm_mod/parameters/stats_current_allocated_bytes
Messages
========
- @stats_create <range> <step>
- [<number_of_optional_arguments> <optional_arguments>...]
- [<program_id> [<aux_data>]]
-
+ @stats_create <range> <step> [<number_of_optional_arguments> <optional_arguments>...] [<program_id> [<aux_data>]]
Create a new region and return the region_id.
<range>
- "-" - whole device
- "<start_sector>+<length>" - a range of <length> 512-byte sectors
- starting with <start_sector>.
+ "-"
+ whole device
+ "<start_sector>+<length>"
+ a range of <length> 512-byte sectors
+ starting with <start_sector>.
<step>
- "<area_size>" - the range is subdivided into areas each containing
- <area_size> sectors.
- "/<number_of_areas>" - the range is subdivided into the specified
- number of areas.
+ "<area_size>"
+ the range is subdivided into areas each containing
+ <area_size> sectors.
+ "/<number_of_areas>"
+ the range is subdivided into the specified
+ number of areas.
<number_of_optional_arguments>
The number of optional arguments
<optional_arguments>
- The following optional arguments are supported
- precise_timestamps - use precise timer with nanosecond resolution
+ The following optional arguments are supported:
+
+ precise_timestamps
+ use precise timer with nanosecond resolution
instead of the "jiffies" variable. When this argument is
used, the resulting times are in nanoseconds instead of
milliseconds. Precise timestamps are a little bit slower
to obtain than jiffies-based timestamps.
- histogram:n1,n2,n3,n4,... - collect histogram of latencies. The
+ histogram:n1,n2,n3,n4,...
+ collect histogram of latencies. The
numbers n1, n2, etc are times that represent the boundaries
of the histogram. If precise_timestamps is not used, the
times are in milliseconds, otherwise they are in
@@ -96,21 +102,18 @@ Messages
@stats_list message, but it doesn't use this value for anything.
@stats_delete <region_id>
-
Delete the region with the specified id.
<region_id>
region_id returned from @stats_create
@stats_clear <region_id>
-
Clear all the counters except the in-flight i/o counters.
<region_id>
region_id returned from @stats_create
@stats_list [<program_id>]
-
List all regions registered with @stats_create.
<program_id>
@@ -127,7 +130,6 @@ Messages
if they were specified when creating the region.
@stats_print <region_id> [<starting_line> <number_of_lines>]
-
Print counters for each step-sized area of a region.
<region_id>
@@ -143,10 +145,11 @@ Messages
Output format for each step-sized area of a region:
- <start_sector>+<length> counters
+ <start_sector>+<length>
+ counters
The first 11 counters have the same meaning as
- /sys/block/*/stat or /proc/diskstats.
+ `/sys/block/*/stat or /proc/diskstats`.
Please refer to Documentation/iostats.txt for details.
@@ -163,11 +166,11 @@ Messages
11. the weighted number of milliseconds spent doing I/Os
Additional counters:
+
12. the total time spent reading in milliseconds
13. the total time spent writing in milliseconds
@stats_print_clear <region_id> [<starting_line> <number_of_lines>]
-
Atomically print and then clear all the counters except the
in-flight i/o counters. Useful when the client consuming the
statistics does not want to lose any statistics (those updated
@@ -185,7 +188,6 @@ Messages
If omitted, all lines are printed and then cleared.
@stats_set_aux <region_id> <aux_data>
-
Store auxiliary data aux_data for the specified region.
<region_id>
@@ -201,23 +203,23 @@ Examples
========
Subdivide the DM device 'vol' into 100 pieces and start collecting
-statistics on them:
+statistics on them::
dmsetup message vol 0 @stats_create - /100
Set the auxiliary data string to "foo bar baz" (the escape for each
-space must also be escaped, otherwise the shell will consume them):
+space must also be escaped, otherwise the shell will consume them)::
dmsetup message vol 0 @stats_set_aux 0 foo\\ bar\\ baz
-List the statistics:
+List the statistics::
dmsetup message vol 0 @stats_list
-Print the statistics:
+Print the statistics::
dmsetup message vol 0 @stats_print 0
-Delete the statistics:
+Delete the statistics::
dmsetup message vol 0 @stats_delete 0
diff --git a/Documentation/device-mapper/striped.rst b/Documentation/device-mapper/striped.rst
new file mode 100644
index 000000000000..e9a8da192ae1
--- /dev/null
+++ b/Documentation/device-mapper/striped.rst
@@ -0,0 +1,61 @@
+=========
+dm-stripe
+=========
+
+Device-Mapper's "striped" target is used to create a striped (i.e. RAID-0)
+device across one or more underlying devices. Data is written in "chunks",
+with consecutive chunks rotating among the underlying devices. This can
+potentially provide improved I/O throughput by utilizing several physical
+devices in parallel.
+
+Parameters: <num devs> <chunk size> [<dev path> <offset>]+
+ <num devs>:
+ Number of underlying devices.
+ <chunk size>:
+ Size of each chunk of data. Must be at least as
+ large as the system's PAGE_SIZE.
+ <dev path>:
+ Full pathname to the underlying block-device, or a
+ "major:minor" device-number.
+ <offset>:
+ Starting sector within the device.
+
+One or more underlying devices can be specified. The striped device size must
+be a multiple of the chunk size multiplied by the number of underlying devices.
+
+
+Example scripts
+===============
+
+::
+
+ #!/usr/bin/perl -w
+ # Create a striped device across any number of underlying devices. The device
+ # will be called "stripe_dev" and have a chunk-size of 128k.
+
+ my $chunk_size = 128 * 2;
+ my $dev_name = "stripe_dev";
+ my $num_devs = @ARGV;
+ my @devs = @ARGV;
+ my ($min_dev_size, $stripe_dev_size, $i);
+
+ if (!$num_devs) {
+ die("Specify at least one device\n");
+ }
+
+ $min_dev_size = `blockdev --getsz $devs[0]`;
+ for ($i = 1; $i < $num_devs; $i++) {
+ my $this_size = `blockdev --getsz $devs[$i]`;
+ $min_dev_size = ($min_dev_size < $this_size) ?
+ $min_dev_size : $this_size;
+ }
+
+ $stripe_dev_size = $min_dev_size * $num_devs;
+ $stripe_dev_size -= $stripe_dev_size % ($chunk_size * $num_devs);
+
+ $table = "0 $stripe_dev_size striped $num_devs $chunk_size";
+ for ($i = 0; $i < $num_devs; $i++) {
+ $table .= " $devs[$i] 0";
+ }
+
+ `echo $table | dmsetup create $dev_name`;
diff --git a/Documentation/device-mapper/striped.txt b/Documentation/device-mapper/striped.txt
deleted file mode 100644
index 07ec492cceee..000000000000
--- a/Documentation/device-mapper/striped.txt
+++ /dev/null
@@ -1,57 +0,0 @@
-dm-stripe
-=========
-
-Device-Mapper's "striped" target is used to create a striped (i.e. RAID-0)
-device across one or more underlying devices. Data is written in "chunks",
-with consecutive chunks rotating among the underlying devices. This can
-potentially provide improved I/O throughput by utilizing several physical
-devices in parallel.
-
-Parameters: <num devs> <chunk size> [<dev path> <offset>]+
- <num devs>: Number of underlying devices.
- <chunk size>: Size of each chunk of data. Must be at least as
- large as the system's PAGE_SIZE.
- <dev path>: Full pathname to the underlying block-device, or a
- "major:minor" device-number.
- <offset>: Starting sector within the device.
-
-One or more underlying devices can be specified. The striped device size must
-be a multiple of the chunk size multiplied by the number of underlying devices.
-
-
-Example scripts
-===============
-
-[[
-#!/usr/bin/perl -w
-# Create a striped device across any number of underlying devices. The device
-# will be called "stripe_dev" and have a chunk-size of 128k.
-
-my $chunk_size = 128 * 2;
-my $dev_name = "stripe_dev";
-my $num_devs = @ARGV;
-my @devs = @ARGV;
-my ($min_dev_size, $stripe_dev_size, $i);
-
-if (!$num_devs) {
- die("Specify at least one device\n");
-}
-
-$min_dev_size = `blockdev --getsz $devs[0]`;
-for ($i = 1; $i < $num_devs; $i++) {
- my $this_size = `blockdev --getsz $devs[$i]`;
- $min_dev_size = ($min_dev_size < $this_size) ?
- $min_dev_size : $this_size;
-}
-
-$stripe_dev_size = $min_dev_size * $num_devs;
-$stripe_dev_size -= $stripe_dev_size % ($chunk_size * $num_devs);
-
-$table = "0 $stripe_dev_size striped $num_devs $chunk_size";
-for ($i = 0; $i < $num_devs; $i++) {
- $table .= " $devs[$i] 0";
-}
-
-`echo $table | dmsetup create $dev_name`;
-]]
-
diff --git a/Documentation/device-mapper/switch.txt b/Documentation/device-mapper/switch.rst
index 5bd4831db4a8..7dde06be1a4f 100644
--- a/Documentation/device-mapper/switch.txt
+++ b/Documentation/device-mapper/switch.rst
@@ -1,3 +1,4 @@
+=========
dm-switch
=========
@@ -67,27 +68,25 @@ b-tree can achieve.
Construction Parameters
=======================
- <num_paths> <region_size> <num_optional_args> [<optional_args>...]
- [<dev_path> <offset>]+
-
-<num_paths>
- The number of paths across which to distribute the I/O.
+ <num_paths> <region_size> <num_optional_args> [<optional_args>...] [<dev_path> <offset>]+
+ <num_paths>
+ The number of paths across which to distribute the I/O.
-<region_size>
- The number of 512-byte sectors in a region. Each region can be redirected
- to any of the available paths.
+ <region_size>
+ The number of 512-byte sectors in a region. Each region can be redirected
+ to any of the available paths.
-<num_optional_args>
- The number of optional arguments. Currently, no optional arguments
- are supported and so this must be zero.
+ <num_optional_args>
+ The number of optional arguments. Currently, no optional arguments
+ are supported and so this must be zero.
-<dev_path>
- The block device that represents a specific path to the device.
+ <dev_path>
+ The block device that represents a specific path to the device.
-<offset>
- The offset of the start of data on the specific <dev_path> (in units
- of 512-byte sectors). This number is added to the sector number when
- forwarding the request to the specific path. Typically it is zero.
+ <offset>
+ The offset of the start of data on the specific <dev_path> (in units
+ of 512-byte sectors). This number is added to the sector number when
+ forwarding the request to the specific path. Typically it is zero.
Messages
========
@@ -122,17 +121,21 @@ Example
Assume that you have volumes vg1/switch0 vg1/switch1 vg1/switch2 with
the same size.
-Create a switch device with 64kB region size:
+Create a switch device with 64kB region size::
+
dmsetup create switch --table "0 `blockdev --getsz /dev/vg1/switch0`
switch 3 128 0 /dev/vg1/switch0 0 /dev/vg1/switch1 0 /dev/vg1/switch2 0"
Set mappings for the first 7 entries to point to devices switch0, switch1,
-switch2, switch0, switch1, switch2, switch1:
+switch2, switch0, switch1, switch2, switch1::
+
dmsetup message switch 0 set_region_mappings 0:0 :1 :2 :0 :1 :2 :1
-Set repetitive mapping. This command:
+Set repetitive mapping. This command::
+
dmsetup message switch 0 set_region_mappings 1000:1 :2 R2,10
-is equivalent to:
+
+is equivalent to::
+
dmsetup message switch 0 set_region_mappings 1000:1 :2 :1 :2 :1 :2 :1 :2 \
:1 :2 :1 :2 :1 :2 :1 :2 :1 :2
-
diff --git a/Documentation/device-mapper/thin-provisioning.txt b/Documentation/device-mapper/thin-provisioning.rst
index 883e7ca5f745..bafebf79da4b 100644
--- a/Documentation/device-mapper/thin-provisioning.txt
+++ b/Documentation/device-mapper/thin-provisioning.rst
@@ -1,3 +1,7 @@
+=================
+Thin provisioning
+=================
+
Introduction
============
@@ -95,6 +99,8 @@ previously.)
Using an existing pool device
-----------------------------
+::
+
dmsetup create pool \
--table "0 20971520 thin-pool $metadata_dev $data_dev \
$data_block_size $low_water_mark"
@@ -154,7 +160,7 @@ Thin provisioning
i) Creating a new thinly-provisioned volume.
To create a new thinly- provisioned volume you must send a message to an
- active pool device, /dev/mapper/pool in this example.
+ active pool device, /dev/mapper/pool in this example::
dmsetup message /dev/mapper/pool 0 "create_thin 0"
@@ -164,7 +170,7 @@ i) Creating a new thinly-provisioned volume.
ii) Using a thinly-provisioned volume.
- Thinly-provisioned volumes are activated using the 'thin' target:
+ Thinly-provisioned volumes are activated using the 'thin' target::
dmsetup create thin --table "0 2097152 thin /dev/mapper/pool 0"
@@ -181,6 +187,8 @@ i) Creating an internal snapshot.
must suspend it before creating the snapshot to avoid corruption.
This is NOT enforced at the moment, so please be careful!
+ ::
+
dmsetup suspend /dev/mapper/thin
dmsetup message /dev/mapper/pool 0 "create_snap 1 0"
dmsetup resume /dev/mapper/thin
@@ -198,14 +206,14 @@ ii) Using an internal snapshot.
activating or removing them both. (This differs from conventional
device-mapper snapshots.)
- Activate it exactly the same way as any other thinly-provisioned volume:
+ Activate it exactly the same way as any other thinly-provisioned volume::
dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 1"
External snapshots
------------------
-You can use an external _read only_ device as an origin for a
+You can use an external **read only** device as an origin for a
thinly-provisioned volume. Any read to an unprovisioned area of the
thin device will be passed through to the origin. Writes trigger
the allocation of new blocks as usual.
@@ -223,11 +231,13 @@ i) Creating a snapshot of an external device
This is the same as creating a thin device.
You don't mention the origin at this stage.
+ ::
+
dmsetup message /dev/mapper/pool 0 "create_thin 0"
ii) Using a snapshot of an external device.
- Append an extra parameter to the thin target specifying the origin:
+ Append an extra parameter to the thin target specifying the origin::
dmsetup create snap --table "0 2097152 thin /dev/mapper/pool 0 /dev/image"
@@ -240,6 +250,8 @@ Deactivation
All devices using a pool must be deactivated before the pool itself
can be.
+::
+
dmsetup remove thin
dmsetup remove snap
dmsetup remove pool
@@ -252,25 +264,32 @@ Reference
i) Constructor
- thin-pool <metadata dev> <data dev> <data block size (sectors)> \
- <low water mark (blocks)> [<number of feature args> [<arg>]*]
+ ::
+
+ thin-pool <metadata dev> <data dev> <data block size (sectors)> \
+ <low water mark (blocks)> [<number of feature args> [<arg>]*]
Optional feature arguments:
- skip_block_zeroing: Skip the zeroing of newly-provisioned blocks.
+ skip_block_zeroing:
+ Skip the zeroing of newly-provisioned blocks.
- ignore_discard: Disable discard support.
+ ignore_discard:
+ Disable discard support.
- no_discard_passdown: Don't pass discards down to the underlying
- data device, but just remove the mapping.
+ no_discard_passdown:
+ Don't pass discards down to the underlying
+ data device, but just remove the mapping.
- read_only: Don't allow any changes to be made to the pool
+ read_only:
+ Don't allow any changes to be made to the pool
metadata. This mode is only available after the
thin-pool has been created and first used in full
read/write mode. It cannot be specified on initial
thin-pool creation.
- error_if_no_space: Error IOs, instead of queueing, if no space.
+ error_if_no_space:
+ Error IOs, instead of queueing, if no space.
Data block size must be between 64KB (128 sectors) and 1GB
(2097152 sectors) inclusive.
@@ -278,10 +297,12 @@ i) Constructor
ii) Status
- <transaction id> <used metadata blocks>/<total metadata blocks>
- <used data blocks>/<total data blocks> <held metadata root>
- ro|rw|out_of_data_space [no_]discard_passdown [error|queue]_if_no_space
- needs_check|- metadata_low_watermark
+ ::
+
+ <transaction id> <used metadata blocks>/<total metadata blocks>
+ <used data blocks>/<total data blocks> <held metadata root>
+ ro|rw|out_of_data_space [no_]discard_passdown [error|queue]_if_no_space
+ needs_check|- metadata_low_watermark
transaction id:
A 64-bit number used by userspace to help synchronise with metadata
@@ -336,13 +357,11 @@ ii) Status
iii) Messages
create_thin <dev id>
-
Create a new thinly-provisioned device.
<dev id> is an arbitrary unique 24-bit identifier chosen by
the caller.
create_snap <dev id> <origin id>
-
Create a new snapshot of another thinly-provisioned device.
<dev id> is an arbitrary unique 24-bit identifier chosen by
the caller.
@@ -350,11 +369,9 @@ iii) Messages
of which the new device will be a snapshot.
delete <dev id>
-
Deletes a thin device. Irreversible.
set_transaction_id <current id> <new id>
-
Userland volume managers, such as LVM, need a way to
synchronise their external metadata with the internal metadata of the
pool target. The thin-pool target offers to store an
@@ -364,14 +381,12 @@ iii) Messages
compare-and-swap message.
reserve_metadata_snap
-
Reserve a copy of the data mapping btree for use by userland.
This allows userland to inspect the mappings as they were when
this message was executed. Use the pool's status command to
get the root block associated with the metadata snapshot.
release_metadata_snap
-
Release a previously reserved copy of the data mapping btree.
'thin' target
@@ -379,7 +394,9 @@ iii) Messages
i) Constructor
- thin <pool dev> <dev id> [<external origin dev>]
+ ::
+
+ thin <pool dev> <dev id> [<external origin dev>]
pool dev:
the thin-pool device, e.g. /dev/mapper/my_pool or 253:0
@@ -401,8 +418,7 @@ provisioned as and when needed.
ii) Status
- <nr mapped sectors> <highest mapped sector>
-
+ <nr mapped sectors> <highest mapped sector>
If the pool has encountered device errors and failed, the status
will just contain the string 'Fail'. The userspace recovery
tools should then be used.
diff --git a/Documentation/device-mapper/unstriped.txt b/Documentation/device-mapper/unstriped.rst
index 0b2a306c54ee..0a8d3eb3f072 100644
--- a/Documentation/device-mapper/unstriped.txt
+++ b/Documentation/device-mapper/unstriped.rst
@@ -1,3 +1,7 @@
+================================
+Device-mapper "unstriped" target
+================================
+
Introduction
============
@@ -34,46 +38,46 @@ striped target to combine the 4 devices into one. It then will use
the unstriped target ontop of the striped device to access the
individual backing loop devices. We write data to the newly exposed
unstriped devices and verify the data written matches the correct
-underlying device on the striped array.
+underlying device on the striped array::
-#!/bin/bash
+ #!/bin/bash
-MEMBER_SIZE=$((128 * 1024 * 1024))
-NUM=4
-SEQ_END=$((${NUM}-1))
-CHUNK=256
-BS=4096
+ MEMBER_SIZE=$((128 * 1024 * 1024))
+ NUM=4
+ SEQ_END=$((${NUM}-1))
+ CHUNK=256
+ BS=4096
-RAID_SIZE=$((${MEMBER_SIZE}*${NUM}/512))
-DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}"
-COUNT=$((${MEMBER_SIZE} / ${BS}))
+ RAID_SIZE=$((${MEMBER_SIZE}*${NUM}/512))
+ DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}"
+ COUNT=$((${MEMBER_SIZE} / ${BS}))
-for i in $(seq 0 ${SEQ_END}); do
- dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct
- losetup /dev/loop${i} member-${i}
- DM_PARMS+=" /dev/loop${i} 0"
-done
+ for i in $(seq 0 ${SEQ_END}); do
+ dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct
+ losetup /dev/loop${i} member-${i}
+ DM_PARMS+=" /dev/loop${i} 0"
+ done
-echo $DM_PARMS | dmsetup create raid0
-for i in $(seq 0 ${SEQ_END}); do
- echo "0 1 unstriped ${NUM} ${CHUNK} ${i} /dev/mapper/raid0 0" | dmsetup create set-${i}
-done;
+ echo $DM_PARMS | dmsetup create raid0
+ for i in $(seq 0 ${SEQ_END}); do
+ echo "0 1 unstriped ${NUM} ${CHUNK} ${i} /dev/mapper/raid0 0" | dmsetup create set-${i}
+ done;
-for i in $(seq 0 ${SEQ_END}); do
- dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct
- diff /dev/mapper/set-${i} member-${i}
-done;
+ for i in $(seq 0 ${SEQ_END}); do
+ dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct
+ diff /dev/mapper/set-${i} member-${i}
+ done;
-for i in $(seq 0 ${SEQ_END}); do
- dmsetup remove set-${i}
-done
+ for i in $(seq 0 ${SEQ_END}); do
+ dmsetup remove set-${i}
+ done
-dmsetup remove raid0
+ dmsetup remove raid0
-for i in $(seq 0 ${SEQ_END}); do
- losetup -d /dev/loop${i}
- rm -f member-${i}
-done
+ for i in $(seq 0 ${SEQ_END}); do
+ losetup -d /dev/loop${i}
+ rm -f member-${i}
+ done
Another example
---------------
@@ -81,7 +85,7 @@ Another example
Intel NVMe drives contain two cores on the physical device.
Each core of the drive has segregated access to its LBA range.
The current LBA model has a RAID 0 128k chunk on each core, resulting
-in a 256k stripe across the two cores:
+in a 256k stripe across the two cores::
Core 0: Core 1:
__________ __________
@@ -108,17 +112,24 @@ Example dmsetup usage
unstriped ontop of Intel NVMe device that has 2 cores
-----------------------------------------------------
-dmsetup create nvmset0 --table '0 512 unstriped 2 256 0 /dev/nvme0n1 0'
-dmsetup create nvmset1 --table '0 512 unstriped 2 256 1 /dev/nvme0n1 0'
+
+::
+
+ dmsetup create nvmset0 --table '0 512 unstriped 2 256 0 /dev/nvme0n1 0'
+ dmsetup create nvmset1 --table '0 512 unstriped 2 256 1 /dev/nvme0n1 0'
There will now be two devices that expose Intel NVMe core 0 and 1
-respectively:
-/dev/mapper/nvmset0
-/dev/mapper/nvmset1
+respectively::
+
+ /dev/mapper/nvmset0
+ /dev/mapper/nvmset1
unstriped ontop of striped with 4 drives using 128K chunk size
--------------------------------------------------------------
-dmsetup create raid_disk0 --table '0 512 unstriped 4 256 0 /dev/mapper/striped 0'
-dmsetup create raid_disk1 --table '0 512 unstriped 4 256 1 /dev/mapper/striped 0'
-dmsetup create raid_disk2 --table '0 512 unstriped 4 256 2 /dev/mapper/striped 0'
-dmsetup create raid_disk3 --table '0 512 unstriped 4 256 3 /dev/mapper/striped 0'
+
+::
+
+ dmsetup create raid_disk0 --table '0 512 unstriped 4 256 0 /dev/mapper/striped 0'
+ dmsetup create raid_disk1 --table '0 512 unstriped 4 256 1 /dev/mapper/striped 0'
+ dmsetup create raid_disk2 --table '0 512 unstriped 4 256 2 /dev/mapper/striped 0'
+ dmsetup create raid_disk3 --table '0 512 unstriped 4 256 3 /dev/mapper/striped 0'
diff --git a/Documentation/device-mapper/verity.txt b/Documentation/device-mapper/verity.rst
index b3d2e4a42255..a4d1c1476d72 100644
--- a/Documentation/device-mapper/verity.txt
+++ b/Documentation/device-mapper/verity.rst
@@ -1,5 +1,6 @@
+=========
dm-verity
-==========
+=========
Device-Mapper's "verity" target provides transparent integrity checking of
block devices using a cryptographic digest provided by the kernel crypto API.
@@ -7,6 +8,9 @@ This target is read-only.
Construction Parameters
=======================
+
+::
+
<version> <dev> <hash_dev>
<data_block_size> <hash_block_size>
<num_data_blocks> <hash_start_block>
@@ -160,7 +164,9 @@ calculating the parent node.
The tree looks something like:
-alg = sha256, num_blocks = 32768, block_size = 4096
+ alg = sha256, num_blocks = 32768, block_size = 4096
+
+::
[ root ]
/ . . . \
@@ -189,6 +195,7 @@ block boundary) are the hash blocks which are stored a depth at a time
The full specification of kernel parameters and on-disk metadata format
is available at the cryptsetup project's wiki page
+
https://gitlab.com/cryptsetup/cryptsetup/wikis/DMVerity
Status
@@ -198,7 +205,8 @@ If any check failed, C (for Corruption) is returned.
Example
=======
-Set up a device:
+Set up a device::
+
# dmsetup create vroot --readonly --table \
"0 2097152 verity 1 /dev/sda1 /dev/sda2 4096 4096 262144 1 sha256 "\
"4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076 "\
@@ -209,11 +217,13 @@ the hash tree or activate the kernel device. This is available from
the cryptsetup upstream repository https://gitlab.com/cryptsetup/cryptsetup/
(as a libcryptsetup extension).
-Create hash on the device:
+Create hash on the device::
+
# veritysetup format /dev/sda1 /dev/sda2
...
Root hash: 4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076
-Activate the device:
+Activate the device::
+
# veritysetup create vroot /dev/sda1 /dev/sda2 \
4392712ba01368efdf14b05c76f9e4df0d53664630b5d48632ed17a137f39076
diff --git a/Documentation/device-mapper/writecache.txt b/Documentation/device-mapper/writecache.rst
index 01532b3008ae..d3d7690f5e8d 100644
--- a/Documentation/device-mapper/writecache.txt
+++ b/Documentation/device-mapper/writecache.rst
@@ -1,3 +1,7 @@
+=================
+Writecache target
+=================
+
The writecache target caches writes on persistent memory or on SSD. It
doesn't cache reads because reads are supposed to be cached in page cache
in normal RAM.
@@ -6,15 +10,18 @@ When the device is constructed, the first sector should be zeroed or the
first sector should contain valid superblock from previous invocation.
Constructor parameters:
+
1. type of the cache device - "p" or "s"
- p - persistent memory
- s - SSD
+
+ - p - persistent memory
+ - s - SSD
2. the underlying device that will be cached
3. the cache device
4. block size (4096 is recommended; the maximum block size is the page
size)
5. the number of optional parameters (the parameters with an argument
count as two)
+
start_sector n (default: 0)
offset from the start of cache device in 512-byte sectors
high_watermark n (default: 50)
@@ -43,6 +50,7 @@ Constructor parameters:
applicable only to persistent memory - don't use the FUA
flag when writing back data and send the FLUSH request
afterwards
+
- some underlying devices perform better with fua, some
with nofua. The user should test it
@@ -60,6 +68,7 @@ Messages:
flush the cache device on next suspend. Use this message
when you are going to remove the cache device. The proper
sequence for removing the cache device is:
+
1. send the "flush_on_suspend" message
2. load an inactive table with a linear target that maps
to the underlying device
diff --git a/Documentation/device-mapper/zero.txt b/Documentation/device-mapper/zero.rst
index 20fb38e7fa7e..11fb5cf4597c 100644
--- a/Documentation/device-mapper/zero.txt
+++ b/Documentation/device-mapper/zero.rst
@@ -1,3 +1,4 @@
+=======
dm-zero
=======
@@ -18,20 +19,19 @@ filesystem limitations.
To create a sparse device, start by creating a dm-zero device that's the
desired size of the sparse device. For this example, we'll assume a 10TB
-sparse device.
+sparse device::
-TEN_TERABYTES=`expr 10 \* 1024 \* 1024 \* 1024 \* 2` # 10 TB in sectors
-echo "0 $TEN_TERABYTES zero" | dmsetup create zero1
+ TEN_TERABYTES=`expr 10 \* 1024 \* 1024 \* 1024 \* 2` # 10 TB in sectors
+ echo "0 $TEN_TERABYTES zero" | dmsetup create zero1
Then create a snapshot of the zero device, using any available block-device as
the COW device. The size of the COW device will determine the amount of real
space available to the sparse device. For this example, we'll assume /dev/sdb1
-is an available 10GB partition.
+is an available 10GB partition::
-echo "0 $TEN_TERABYTES snapshot /dev/mapper/zero1 /dev/sdb1 p 128" | \
- dmsetup create sparse1
+ echo "0 $TEN_TERABYTES snapshot /dev/mapper/zero1 /dev/sdb1 p 128" | \
+ dmsetup create sparse1
This will create a 10TB sparse device called /dev/mapper/sparse1 that has
10GB of actual storage space available. If more than 10GB of data is written
to this device, it will start returning I/O errors.
-
diff --git a/Documentation/devicetree/bindings/net/fsl-enetc.txt b/Documentation/devicetree/bindings/net/fsl-enetc.txt
index c812e25ae90f..25fc687419db 100644
--- a/Documentation/devicetree/bindings/net/fsl-enetc.txt
+++ b/Documentation/devicetree/bindings/net/fsl-enetc.txt
@@ -16,8 +16,8 @@ Required properties:
In this case, the ENETC node should include a "mdio" sub-node
that in turn should contain the "ethernet-phy" node describing the
external phy. Below properties are required, their bindings
-already defined in ethernet.txt or phy.txt, under
-Documentation/devicetree/bindings/net/*.
+already defined in Documentation/devicetree/bindings/net/ethernet.txt or
+Documentation/devicetree/bindings/net/phy.txt.
Required:
@@ -51,8 +51,7 @@ Example:
connection:
In this case, the ENETC port node defines a fixed link connection,
-as specified by "fixed-link.txt", under
-Documentation/devicetree/bindings/net/*.
+as specified by Documentation/devicetree/bindings/net/fixed-link.txt.
Required:
diff --git a/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt b/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
index 12b18f82d441..efa2c8b9b85a 100644
--- a/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
+++ b/Documentation/devicetree/bindings/pci/amlogic,meson-pcie.txt
@@ -3,7 +3,7 @@ Amlogic Meson AXG DWC PCIE SoC controller
Amlogic Meson PCIe host controller is based on the Synopsys DesignWare PCI core.
It shares common functions with the PCIe DesignWare core driver and
inherits common properties defined in
-Documentation/devicetree/bindings/pci/designware-pci.txt.
+Documentation/devicetree/bindings/pci/designware-pcie.txt.
Additional properties are described here:
diff --git a/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
index 7ef2dbe48e8a..14d2eee96b3d 100644
--- a/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/qcom,rpmh-regulator.txt
@@ -97,7 +97,7 @@ Second Level Nodes - Regulators
sent for this regulator including those which are for a
strictly lower power state.
-Other properties defined in Documentation/devicetree/bindings/regulator.txt
+Other properties defined in Documentation/devicetree/bindings/regulator/regulator.txt
may also be used. regulator-initial-mode and regulator-allowed-modes may be
specified for VRM regulators using mode values from
include/dt-bindings/regulator/qcom,rpmh-regulator.h. regulator-allow-bypass
diff --git a/Documentation/devicetree/booting-without-of.txt b/Documentation/devicetree/booting-without-of.txt
index e86bd2f64117..60f8640f2b2f 100644
--- a/Documentation/devicetree/booting-without-of.txt
+++ b/Documentation/devicetree/booting-without-of.txt
@@ -277,7 +277,7 @@ it with special cases.
the decompressor (the real mode entry point goes to the same 32bit
entry point once it switched into protected mode). That entry point
supports one calling convention which is documented in
- Documentation/x86/boot.txt
+ Documentation/x86/boot.rst
The physical pointer to the device-tree block (defined in chapter II)
is passed via setup_data which requires at least boot protocol 2.09.
The type filed is defined as
diff --git a/Documentation/doc-guide/kernel-doc.rst b/Documentation/doc-guide/kernel-doc.rst
index f96059767c8c..192c36af39e2 100644
--- a/Documentation/doc-guide/kernel-doc.rst
+++ b/Documentation/doc-guide/kernel-doc.rst
@@ -359,7 +359,7 @@ Domain`_ references.
``monospaced font``.
Useful if you need to use special characters that would otherwise have some
- meaning either by kernel-doc script of by reStructuredText.
+ meaning either by kernel-doc script or by reStructuredText.
This is particularly useful if you need to use things like ``%ph`` inside
a function description.
diff --git a/Documentation/doc-guide/sphinx.rst b/Documentation/doc-guide/sphinx.rst
index c039224b404e..f71ddd592aaa 100644
--- a/Documentation/doc-guide/sphinx.rst
+++ b/Documentation/doc-guide/sphinx.rst
@@ -27,8 +27,7 @@ Sphinx Install
==============
The ReST markups currently used by the Documentation/ files are meant to be
-built with ``Sphinx`` version 1.3 or higher. If you desire to build
-PDF output, it is recommended to use version 1.4.6 or higher.
+built with ``Sphinx`` version 1.3 or higher.
There's a script that checks for the Sphinx requirements. Please see
:ref:`sphinx-pre-install` for further details.
@@ -56,13 +55,13 @@ or ``virtualenv``, depending on how your distribution packaged Python 3.
those expressions are written using LaTeX notation. It needs texlive
installed with amdfonts and amsmath in order to evaluate them.
-In summary, if you want to install Sphinx version 1.4.9, you should do::
+In summary, if you want to install Sphinx version 1.7.9, you should do::
- $ virtualenv sphinx_1.4
- $ . sphinx_1.4/bin/activate
- (sphinx_1.4) $ pip install -r Documentation/sphinx/requirements.txt
+ $ virtualenv sphinx_1.7.9
+ $ . sphinx_1.7.9/bin/activate
+ (sphinx_1.7.9) $ pip install -r Documentation/sphinx/requirements.txt
-After running ``. sphinx_1.4/bin/activate``, the prompt will change,
+After running ``. sphinx_1.7.9/bin/activate``, the prompt will change,
in order to indicate that you're using the new environment. If you
open a new shell, you need to rerun this command to enter again at
the virtual environment before building the documentation.
@@ -105,8 +104,8 @@ command line options for your distro::
You should run:
sudo dnf install -y texlive-luatex85
- /usr/bin/virtualenv sphinx_1.4
- . sphinx_1.4/bin/activate
+ /usr/bin/virtualenv sphinx_1.7.9
+ . sphinx_1.7.9/bin/activate
pip install -r Documentation/sphinx/requirements.txt
Can't build as 1 mandatory dependency is missing at ./scripts/sphinx-pre-install line 468.
@@ -218,7 +217,7 @@ Here are some specific guidelines for the kernel documentation:
examples, etc.), use ``::`` for anything that doesn't really benefit
from syntax highlighting, especially short snippets. Use
``.. code-block:: <language>`` for longer code blocks that benefit
- from highlighting.
+ from highlighting. For a short snippet of code embedded in the text, use \`\`.
the C domain
@@ -242,11 +241,14 @@ The C domain of the kernel-doc has some additional features. E.g. you can
The func-name (e.g. ioctl) remains in the output but the ref-name changed from
``ioctl`` to ``VIDIOC_LOG_STATUS``. The index entry for this function is also
-changed to ``VIDIOC_LOG_STATUS`` and the function can now referenced by:
-
-.. code-block:: rst
-
- :c:func:`VIDIOC_LOG_STATUS`
+changed to ``VIDIOC_LOG_STATUS``.
+
+Please note that there is no need to use ``c:func:`` to generate cross
+references to function documentation. Due to some Sphinx extension magic,
+the documentation build system will automatically turn a reference to
+``function()`` into a cross reference if an index entry for the given
+function name exists. If you see ``c:func:`` use in a kernel document,
+please feel free to remove it.
list tables
diff --git a/Documentation/docutils.conf b/Documentation/docutils.conf
index 2830772264c8..f1a180b97dec 100644
--- a/Documentation/docutils.conf
+++ b/Documentation/docutils.conf
@@ -4,4 +4,4 @@
# http://docutils.sourceforge.net/docs/user/config.html
[general]
-halt_level: severe \ No newline at end of file
+halt_level: severe
diff --git a/Documentation/driver-api/basics.rst b/Documentation/driver-api/basics.rst
index e970fadf4d1a..1ba88c7b3984 100644
--- a/Documentation/driver-api/basics.rst
+++ b/Documentation/driver-api/basics.rst
@@ -115,9 +115,6 @@ Kernel utility functions
.. kernel-doc:: kernel/rcu/tree.c
:export:
-.. kernel-doc:: kernel/rcu/tree_plugin.h
- :export:
-
.. kernel-doc:: kernel/rcu/update.c
:export:
diff --git a/Documentation/driver-api/clk.rst b/Documentation/driver-api/clk.rst
index 593cca5058b1..3cad45d14187 100644
--- a/Documentation/driver-api/clk.rst
+++ b/Documentation/driver-api/clk.rst
@@ -175,9 +175,9 @@ the following::
To take advantage of your data you'll need to support valid operations
for your clk::
- struct clk_ops clk_foo_ops {
- .enable = &clk_foo_enable;
- .disable = &clk_foo_disable;
+ struct clk_ops clk_foo_ops = {
+ .enable = &clk_foo_enable,
+ .disable = &clk_foo_disable,
};
Implement the above functions using container_of::
diff --git a/Documentation/driver-api/firmware/other_interfaces.rst b/Documentation/driver-api/firmware/other_interfaces.rst
index a4ac54b5fd79..b81794e0cfbb 100644
--- a/Documentation/driver-api/firmware/other_interfaces.rst
+++ b/Documentation/driver-api/firmware/other_interfaces.rst
@@ -33,7 +33,7 @@ of the requests on to a secure monitor (EL3).
:functions: stratix10_svc_client_msg
.. kernel-doc:: include/linux/firmware/intel/stratix10-svc-client.h
- :functions: stratix10_svc_command_reconfig_payload
+ :functions: stratix10_svc_command_config_type
.. kernel-doc:: include/linux/firmware/intel/stratix10-svc-client.h
:functions: stratix10_svc_cb_data
diff --git a/Documentation/driver-api/gpio/board.rst b/Documentation/driver-api/gpio/board.rst
index b37f3f7b8926..ce91518bf9f4 100644
--- a/Documentation/driver-api/gpio/board.rst
+++ b/Documentation/driver-api/gpio/board.rst
@@ -101,7 +101,7 @@ with the help of _DSD (Device Specific Data), introduced in ACPI 5.1::
}
For more information about the ACPI GPIO bindings see
-Documentation/acpi/gpio-properties.txt.
+Documentation/firmware-guide/acpi/gpio-properties.rst.
Platform Data
-------------
diff --git a/Documentation/driver-api/gpio/consumer.rst b/Documentation/driver-api/gpio/consumer.rst
index 9559aa3cbcef..423492d125b9 100644
--- a/Documentation/driver-api/gpio/consumer.rst
+++ b/Documentation/driver-api/gpio/consumer.rst
@@ -435,7 +435,7 @@ case, it will be handled by the GPIO subsystem automatically. However, if the
_DSD is not present, the mappings between GpioIo()/GpioInt() resources and GPIO
connection IDs need to be provided by device drivers.
-For details refer to Documentation/acpi/gpio-properties.txt
+For details refer to Documentation/firmware-guide/acpi/gpio-properties.rst
Interacting With the Legacy GPIO Subsystem
diff --git a/Documentation/driver-api/iio/hw-consumer.rst b/Documentation/driver-api/iio/hw-consumer.rst
index e0fe0b98230e..819fb9edc005 100644
--- a/Documentation/driver-api/iio/hw-consumer.rst
+++ b/Documentation/driver-api/iio/hw-consumer.rst
@@ -45,7 +45,6 @@ A typical IIO HW consumer setup looks like this::
More details
============
-.. kernel-doc:: include/linux/iio/hw-consumer.h
.. kernel-doc:: drivers/iio/buffer/industrialio-hw-consumer.c
:export:
diff --git a/Documentation/pps/pps.txt b/Documentation/driver-api/pps.rst
index 99f5d8c4c652..1456d2c32ebd 100644
--- a/Documentation/pps/pps.txt
+++ b/Documentation/driver-api/pps.rst
@@ -1,8 +1,10 @@
+:orphan:
- PPS - Pulse Per Second
- ----------------------
+======================
+PPS - Pulse Per Second
+======================
-(C) Copyright 2007 Rodolfo Giometti <giometti@enneenne.com>
+Copyright (C) 2007 Rodolfo Giometti <giometti@enneenne.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@@ -88,7 +90,7 @@ Coding example
--------------
To register a PPS source into the kernel you should define a struct
-pps_source_info as follows:
+pps_source_info as follows::
static struct pps_source_info pps_ktimer_info = {
.name = "ktimer",
@@ -101,12 +103,12 @@ pps_source_info as follows:
};
and then calling the function pps_register_source() in your
-initialization routine as follows:
+initialization routine as follows::
source = pps_register_source(&pps_ktimer_info,
PPS_CAPTUREASSERT | PPS_OFFSETASSERT);
-The pps_register_source() prototype is:
+The pps_register_source() prototype is::
int pps_register_source(struct pps_source_info *info, int default_params)
@@ -118,7 +120,7 @@ pps_source_info which describe the capabilities of the driver).
Once you have registered a new PPS source into the system you can
signal an assert event (for example in the interrupt handler routine)
-just using:
+just using::
pps_event(source, &ts, PPS_CAPTUREASSERT, ptr)
@@ -134,13 +136,13 @@ Please see the file drivers/pps/clients/pps-ktimer.c for example code.
SYSFS support
-------------
-If the SYSFS filesystem is enabled in the kernel it provides a new class:
+If the SYSFS filesystem is enabled in the kernel it provides a new class::
$ ls /sys/class/pps/
pps0/ pps1/ pps2/
Every directory is the ID of a PPS sources defined in the system and
-inside you find several files:
+inside you find several files::
$ ls -F /sys/class/pps/pps0/
assert dev mode path subsystem@
@@ -148,7 +150,7 @@ inside you find several files:
Inside each "assert" and "clear" file you can find the timestamp and a
-sequence number:
+sequence number::
$ cat /sys/class/pps/pps0/assert
1170026870.983207967#8
@@ -175,11 +177,11 @@ and the userland tools available in your distribution's pps-tools package,
http://linuxpps.org , or https://github.com/redlab-i/pps-tools.
Once you have enabled the compilation of pps-ktimer just modprobe it (if
-not statically compiled):
+not statically compiled)::
# modprobe pps-ktimer
-and the run ppstest as follow:
+and the run ppstest as follow::
$ ./ppstest /dev/pps1
trying PPS source "/dev/pps1"
@@ -204,26 +206,27 @@ nor affordable. The cheap way is to load a PPS generator on one of the
computers (master) and PPS clients on others (slaves), and use very simple
cables to deliver signals using parallel ports, for example.
-Parallel port cable pinout:
-pin name master slave
-1 STROBE *------ *
-2 D0 * | *
-3 D1 * | *
-4 D2 * | *
-5 D3 * | *
-6 D4 * | *
-7 D5 * | *
-8 D6 * | *
-9 D7 * | *
-10 ACK * ------*
-11 BUSY * *
-12 PE * *
-13 SEL * *
-14 AUTOFD * *
-15 ERROR * *
-16 INIT * *
-17 SELIN * *
-18-25 GND *-----------*
+Parallel port cable pinout::
+
+ pin name master slave
+ 1 STROBE *------ *
+ 2 D0 * | *
+ 3 D1 * | *
+ 4 D2 * | *
+ 5 D3 * | *
+ 6 D4 * | *
+ 7 D5 * | *
+ 8 D6 * | *
+ 9 D7 * | *
+ 10 ACK * ------*
+ 11 BUSY * *
+ 12 PE * *
+ 13 SEL * *
+ 14 AUTOFD * *
+ 15 ERROR * *
+ 16 INIT * *
+ 17 SELIN * *
+ 18-25 GND *-----------*
Please note that parallel port interrupt occurs only on high->low transition,
so it is used for PPS assert edge. PPS clear edge can be determined only
diff --git a/Documentation/ptp/ptp.txt b/Documentation/driver-api/ptp.rst
index 11e904ee073f..b6e65d66d37a 100644
--- a/Documentation/ptp/ptp.txt
+++ b/Documentation/driver-api/ptp.rst
@@ -1,5 +1,8 @@
+:orphan:
-* PTP hardware clock infrastructure for Linux
+===========================================
+PTP hardware clock infrastructure for Linux
+===========================================
This patch set introduces support for IEEE 1588 PTP clocks in
Linux. Together with the SO_TIMESTAMPING socket options, this
@@ -22,7 +25,8 @@
- Period output signals configurable from user space
- Synchronization of the Linux system time via the PPS subsystem
-** PTP hardware clock kernel API
+PTP hardware clock kernel API
+=============================
A PTP clock driver registers itself with the class driver. The
class driver handles all of the dealings with user space. The
@@ -36,7 +40,8 @@
development, it can be useful to have more than one clock in a
single system, in order to allow performance comparisons.
-** PTP hardware clock user space API
+PTP hardware clock user space API
+=================================
The class driver also creates a character device for each
registered clock. User space can use an open file descriptor from
@@ -49,7 +54,8 @@
ancillary clock features. User space can receive time stamped
events via blocking read() and poll().
-** Writing clock drivers
+Writing clock drivers
+=====================
Clock drivers include include/linux/ptp_clock_kernel.h and register
themselves by presenting a 'struct ptp_clock_info' to the
@@ -66,14 +72,17 @@
class driver, since the lock may also be needed by the clock
driver's interrupt service routine.
-** Supported hardware
+Supported hardware
+==================
+
+ * Freescale eTSEC gianfar
- + Freescale eTSEC gianfar
- 2 Time stamp external triggers, programmable polarity (opt. interrupt)
- 2 Alarm registers (optional interrupt)
- 3 Periodic signals (optional interrupt)
- + National DP83640
+ * National DP83640
+
- 6 GPIOs programmable as inputs or outputs
- 6 GPIOs with dedicated functions (LED/JTAG/clock) can also be
used as general inputs or outputs
@@ -81,6 +90,7 @@
- GPIO outputs can produce periodic signals
- 1 interrupt pin
- + Intel IXP465
+ * Intel IXP465
+
- Auxiliary Slave/Master Mode Snapshot (optional interrupt)
- Target Time (optional interrupt)
diff --git a/Documentation/driver-api/target.rst b/Documentation/driver-api/target.rst
index 4363611dd86d..620ec6173a93 100644
--- a/Documentation/driver-api/target.rst
+++ b/Documentation/driver-api/target.rst
@@ -10,8 +10,8 @@ TBD
Target core device interfaces
=============================
-.. kernel-doc:: drivers/target/target_core_device.c
- :export:
+This section is blank because no kerneldoc comments have been added to
+drivers/target/target_core_device.c.
Target core transport interfaces
================================
diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.rst
index a17517a083c3..f51bb21d20e4 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.rst
@@ -1,3 +1,4 @@
+===========================================
Fault injection capabilities infrastructure
===========================================
@@ -7,36 +8,36 @@ See also drivers/md/md-faulty.c and "every_nth" module option for scsi_debug.
Available fault injection capabilities
--------------------------------------
-o failslab
+- failslab
injects slab allocation failures. (kmalloc(), kmem_cache_alloc(), ...)
-o fail_page_alloc
+- fail_page_alloc
injects page allocation failures. (alloc_pages(), get_free_pages(), ...)
-o fail_futex
+- fail_futex
injects futex deadlock and uaddr fault errors.
-o fail_make_request
+- fail_make_request
injects disk IO errors on devices permitted by setting
/sys/block/<device>/make-it-fail or
/sys/block/<device>/<partition>/make-it-fail. (generic_make_request())
-o fail_mmc_request
+- fail_mmc_request
injects MMC data errors on devices permitted by setting
debugfs entries under /sys/kernel/debug/mmc0/fail_mmc_request
-o fail_function
+- fail_function
injects error return on specific functions, which are marked by
ALLOW_ERROR_INJECTION() macro, by setting debugfs entries
under /sys/kernel/debug/fail_function. No boot option supported.
-o NVMe fault injection
+- NVMe fault injection
inject NVMe status code and retry flag on devices permitted by setting
debugfs entries under /sys/kernel/debug/nvme*/fault_inject. The default
@@ -47,7 +48,8 @@ o NVMe fault injection
Configure fault-injection capabilities behavior
-----------------------------------------------
-o debugfs entries
+debugfs entries
+^^^^^^^^^^^^^^^
fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.
@@ -55,6 +57,7 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail*/probability:
likelihood of failure injection, in percent.
+
Format: <percent>
Note that one-failure-per-hundred is a very high error rate
@@ -83,6 +86,7 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail*/verbose
Format: { 0 | 1 | 2 }
+
specifies the verbosity of the messages when failure is
injected. '0' means no messages; '1' will print only a single
log line per failure; '2' will print a call trace too -- useful
@@ -91,14 +95,15 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail*/task-filter:
Format: { 'Y' | 'N' }
+
A value of 'N' disables filtering by process (default).
Any positive value limits failures to only processes indicated by
/proc/<pid>/make-it-fail==1.
-- /sys/kernel/debug/fail*/require-start:
-- /sys/kernel/debug/fail*/require-end:
-- /sys/kernel/debug/fail*/reject-start:
-- /sys/kernel/debug/fail*/reject-end:
+- /sys/kernel/debug/fail*/require-start,
+ /sys/kernel/debug/fail*/require-end,
+ /sys/kernel/debug/fail*/reject-start,
+ /sys/kernel/debug/fail*/reject-end:
specifies the range of virtual addresses tested during
stacktrace walking. Failure is injected only if some caller
@@ -116,6 +121,7 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
Format: { 'Y' | 'N' }
+
default is 'N', setting it to 'Y' won't inject failures into
highmem/user allocations.
@@ -123,6 +129,7 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
Format: { 'Y' | 'N' }
+
default is 'N', setting it to 'Y' will inject failures
only into non-sleep allocations (GFP_ATOMIC allocations).
@@ -134,12 +141,14 @@ configuration of fault-injection capabilities.
- /sys/kernel/debug/fail_futex/ignore-private:
Format: { 'Y' | 'N' }
+
default is 'N', setting it to 'Y' will disable failure injections
when dealing with private (address space) futexes.
- /sys/kernel/debug/fail_function/inject:
Format: { 'function-name' | '!function-name' | '' }
+
specifies the target function of error injection by name.
If the function name leads '!' prefix, given function is
removed from injection list. If nothing specified ('')
@@ -160,10 +169,11 @@ configuration of fault-injection capabilities.
function for given function. This will be created when
user specifies new injection entry.
-o Boot option
+Boot option
+^^^^^^^^^^^
In order to inject faults while debugfs is not available (early boot time),
-use the boot option:
+use the boot option::
failslab=
fail_page_alloc=
@@ -171,10 +181,11 @@ use the boot option:
fail_futex=
mmc_core.fail_request=<interval>,<probability>,<space>,<times>
-o proc entries
+proc entries
+^^^^^^^^^^^^
-- /proc/<pid>/fail-nth:
-- /proc/self/task/<tid>/fail-nth:
+- /proc/<pid>/fail-nth,
+ /proc/self/task/<tid>/fail-nth:
Write to this file of integer N makes N-th call in the task fail.
Read from this file returns a integer value. A value of '0' indicates
@@ -191,16 +202,16 @@ o proc entries
How to add new fault injection capability
-----------------------------------------
-o #include <linux/fault-inject.h>
+- #include <linux/fault-inject.h>
-o define the fault attributes
+- define the fault attributes
DECLARE_FAULT_ATTR(name);
Please see the definition of struct fault_attr in fault-inject.h
for details.
-o provide a way to configure fault attributes
+- provide a way to configure fault attributes
- boot option
@@ -222,126 +233,126 @@ o provide a way to configure fault attributes
single kernel module, it is better to provide module parameters to
configure the fault attributes.
-o add a hook to insert failures
+- add a hook to insert failures
- Upon should_fail() returning true, client code should inject a failure.
+ Upon should_fail() returning true, client code should inject a failure:
should_fail(attr, size);
Application Examples
--------------------
-o Inject slab allocation failures into module init/exit code
+- Inject slab allocation failures into module init/exit code::
-#!/bin/bash
+ #!/bin/bash
-FAILTYPE=failslab
-echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
-echo 10 > /sys/kernel/debug/$FAILTYPE/probability
-echo 100 > /sys/kernel/debug/$FAILTYPE/interval
-echo -1 > /sys/kernel/debug/$FAILTYPE/times
-echo 0 > /sys/kernel/debug/$FAILTYPE/space
-echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
-echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
+ FAILTYPE=failslab
+ echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
+ echo 10 > /sys/kernel/debug/$FAILTYPE/probability
+ echo 100 > /sys/kernel/debug/$FAILTYPE/interval
+ echo -1 > /sys/kernel/debug/$FAILTYPE/times
+ echo 0 > /sys/kernel/debug/$FAILTYPE/space
+ echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
+ echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
-faulty_system()
-{
+ faulty_system()
+ {
bash -c "echo 1 > /proc/self/make-it-fail && exec $*"
-}
+ }
-if [ $# -eq 0 ]
-then
+ if [ $# -eq 0 ]
+ then
echo "Usage: $0 modulename [ modulename ... ]"
exit 1
-fi
+ fi
-for m in $*
-do
+ for m in $*
+ do
echo inserting $m...
faulty_system modprobe $m
echo removing $m...
faulty_system modprobe -r $m
-done
+ done
------------------------------------------------------------------------------
-o Inject page allocation failures only for a specific module
+- Inject page allocation failures only for a specific module::
-#!/bin/bash
+ #!/bin/bash
-FAILTYPE=fail_page_alloc
-module=$1
+ FAILTYPE=fail_page_alloc
+ module=$1
-if [ -z $module ]
-then
+ if [ -z $module ]
+ then
echo "Usage: $0 <modulename>"
exit 1
-fi
+ fi
-modprobe $module
+ modprobe $module
-if [ ! -d /sys/module/$module/sections ]
-then
+ if [ ! -d /sys/module/$module/sections ]
+ then
echo Module $module is not loaded
exit 1
-fi
+ fi
-cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
-cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
+ cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
+ cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
-echo N > /sys/kernel/debug/$FAILTYPE/task-filter
-echo 10 > /sys/kernel/debug/$FAILTYPE/probability
-echo 100 > /sys/kernel/debug/$FAILTYPE/interval
-echo -1 > /sys/kernel/debug/$FAILTYPE/times
-echo 0 > /sys/kernel/debug/$FAILTYPE/space
-echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
-echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
-echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
-echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
+ echo N > /sys/kernel/debug/$FAILTYPE/task-filter
+ echo 10 > /sys/kernel/debug/$FAILTYPE/probability
+ echo 100 > /sys/kernel/debug/$FAILTYPE/interval
+ echo -1 > /sys/kernel/debug/$FAILTYPE/times
+ echo 0 > /sys/kernel/debug/$FAILTYPE/space
+ echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
+ echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
+ echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
+ echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
-trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
+ trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
-echo "Injecting errors into the module $module... (interrupt to stop)"
-sleep 1000000
+ echo "Injecting errors into the module $module... (interrupt to stop)"
+ sleep 1000000
------------------------------------------------------------------------------
-o Inject open_ctree error while btrfs mount
-
-#!/bin/bash
-
-rm -f testfile.img
-dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
-DEVICE=$(losetup --show -f testfile.img)
-mkfs.btrfs -f $DEVICE
-mkdir -p tmpmnt
-
-FAILTYPE=fail_function
-FAILFUNC=open_ctree
-echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
-echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
-echo N > /sys/kernel/debug/$FAILTYPE/task-filter
-echo 100 > /sys/kernel/debug/$FAILTYPE/probability
-echo 0 > /sys/kernel/debug/$FAILTYPE/interval
-echo -1 > /sys/kernel/debug/$FAILTYPE/times
-echo 0 > /sys/kernel/debug/$FAILTYPE/space
-echo 1 > /sys/kernel/debug/$FAILTYPE/verbose
-
-mount -t btrfs $DEVICE tmpmnt
-if [ $? -ne 0 ]
-then
+- Inject open_ctree error while btrfs mount::
+
+ #!/bin/bash
+
+ rm -f testfile.img
+ dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1
+ DEVICE=$(losetup --show -f testfile.img)
+ mkfs.btrfs -f $DEVICE
+ mkdir -p tmpmnt
+
+ FAILTYPE=fail_function
+ FAILFUNC=open_ctree
+ echo $FAILFUNC > /sys/kernel/debug/$FAILTYPE/inject
+ echo -12 > /sys/kernel/debug/$FAILTYPE/$FAILFUNC/retval
+ echo N > /sys/kernel/debug/$FAILTYPE/task-filter
+ echo 100 > /sys/kernel/debug/$FAILTYPE/probability
+ echo 0 > /sys/kernel/debug/$FAILTYPE/interval
+ echo -1 > /sys/kernel/debug/$FAILTYPE/times
+ echo 0 > /sys/kernel/debug/$FAILTYPE/space
+ echo 1 > /sys/kernel/debug/$FAILTYPE/verbose
+
+ mount -t btrfs $DEVICE tmpmnt
+ if [ $? -ne 0 ]
+ then
echo "SUCCESS!"
-else
+ else
echo "FAILED!"
umount tmpmnt
-fi
+ fi
-echo > /sys/kernel/debug/$FAILTYPE/inject
+ echo > /sys/kernel/debug/$FAILTYPE/inject
-rmdir tmpmnt
-losetup -d $DEVICE
-rm testfile.img
+ rmdir tmpmnt
+ losetup -d $DEVICE
+ rm testfile.img
Tool to run command with failslab or fail_page_alloc
@@ -354,43 +365,43 @@ see the following examples.
Examples:
Run a command "make -C tools/testing/selftests/ run_tests" with injecting slab
-allocation failure.
+allocation failure::
# ./tools/testing/fault-injection/failcmd.sh \
-- make -C tools/testing/selftests/ run_tests
Same as above except to specify 100 times failures at most instead of one time
-at most by default.
+at most by default::
# ./tools/testing/fault-injection/failcmd.sh --times=100 \
-- make -C tools/testing/selftests/ run_tests
Same as above except to inject page allocation failure instead of slab
-allocation failure.
+allocation failure::
# env FAILCMD_TYPE=fail_page_alloc \
./tools/testing/fault-injection/failcmd.sh --times=100 \
- -- make -C tools/testing/selftests/ run_tests
+ -- make -C tools/testing/selftests/ run_tests
Systematic faults using fail-nth
---------------------------------
The following code systematically faults 0-th, 1-st, 2-nd and so on
-capabilities in the socketpair() system call.
-
-#include <sys/types.h>
-#include <sys/stat.h>
-#include <sys/socket.h>
-#include <sys/syscall.h>
-#include <fcntl.h>
-#include <unistd.h>
-#include <string.h>
-#include <stdlib.h>
-#include <stdio.h>
-#include <errno.h>
-
-int main()
-{
+capabilities in the socketpair() system call::
+
+ #include <sys/types.h>
+ #include <sys/stat.h>
+ #include <sys/socket.h>
+ #include <sys/syscall.h>
+ #include <fcntl.h>
+ #include <unistd.h>
+ #include <string.h>
+ #include <stdlib.h>
+ #include <stdio.h>
+ #include <errno.h>
+
+ int main()
+ {
int i, err, res, fail_nth, fds[2];
char buf[128];
@@ -413,23 +424,23 @@ int main()
break;
}
return 0;
-}
-
-An example output:
-
-1-th fault Y: res=-1/23
-2-th fault Y: res=-1/23
-3-th fault Y: res=-1/12
-4-th fault Y: res=-1/12
-5-th fault Y: res=-1/23
-6-th fault Y: res=-1/23
-7-th fault Y: res=-1/23
-8-th fault Y: res=-1/12
-9-th fault Y: res=-1/12
-10-th fault Y: res=-1/12
-11-th fault Y: res=-1/12
-12-th fault Y: res=-1/12
-13-th fault Y: res=-1/12
-14-th fault Y: res=-1/12
-15-th fault Y: res=-1/12
-16-th fault N: res=0/12
+ }
+
+An example output::
+
+ 1-th fault Y: res=-1/23
+ 2-th fault Y: res=-1/23
+ 3-th fault Y: res=-1/12
+ 4-th fault Y: res=-1/12
+ 5-th fault Y: res=-1/23
+ 6-th fault Y: res=-1/23
+ 7-th fault Y: res=-1/23
+ 8-th fault Y: res=-1/12
+ 9-th fault Y: res=-1/12
+ 10-th fault Y: res=-1/12
+ 11-th fault Y: res=-1/12
+ 12-th fault Y: res=-1/12
+ 13-th fault Y: res=-1/12
+ 14-th fault Y: res=-1/12
+ 15-th fault Y: res=-1/12
+ 16-th fault N: res=0/12
diff --git a/Documentation/fault-injection/index.rst b/Documentation/fault-injection/index.rst
new file mode 100644
index 000000000000..92b5639ed07a
--- /dev/null
+++ b/Documentation/fault-injection/index.rst
@@ -0,0 +1,20 @@
+:orphan:
+
+===============
+fault-injection
+===============
+
+.. toctree::
+ :maxdepth: 1
+
+ fault-injection
+ notifier-error-inject
+ nvme-fault-injection
+ provoke-crashes
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/fault-injection/notifier-error-inject.txt b/Documentation/fault-injection/notifier-error-inject.rst
index e861d761de24..1668b6e48d3a 100644
--- a/Documentation/fault-injection/notifier-error-inject.txt
+++ b/Documentation/fault-injection/notifier-error-inject.rst
@@ -14,7 +14,8 @@ modules that can be used to test the following notifiers.
PM notifier error injection module
----------------------------------
This feature is controlled through debugfs interface
-/sys/kernel/debug/notifier-error-inject/pm/actions/<notifier event>/error
+
+ /sys/kernel/debug/notifier-error-inject/pm/actions/<notifier event>/error
Possible PM notifier events to be failed are:
@@ -22,7 +23,7 @@ Possible PM notifier events to be failed are:
* PM_SUSPEND_PREPARE
* PM_RESTORE_PREPARE
-Example: Inject PM suspend error (-12 = -ENOMEM)
+Example: Inject PM suspend error (-12 = -ENOMEM)::
# cd /sys/kernel/debug/notifier-error-inject/pm/
# echo -12 > actions/PM_SUSPEND_PREPARE/error
@@ -32,14 +33,15 @@ Example: Inject PM suspend error (-12 = -ENOMEM)
Memory hotplug notifier error injection module
----------------------------------------------
This feature is controlled through debugfs interface
-/sys/kernel/debug/notifier-error-inject/memory/actions/<notifier event>/error
+
+ /sys/kernel/debug/notifier-error-inject/memory/actions/<notifier event>/error
Possible memory notifier events to be failed are:
* MEM_GOING_ONLINE
* MEM_GOING_OFFLINE
-Example: Inject memory hotplug offline error (-12 == -ENOMEM)
+Example: Inject memory hotplug offline error (-12 == -ENOMEM)::
# cd /sys/kernel/debug/notifier-error-inject/memory
# echo -12 > actions/MEM_GOING_OFFLINE/error
@@ -49,7 +51,8 @@ Example: Inject memory hotplug offline error (-12 == -ENOMEM)
powerpc pSeries reconfig notifier error injection module
--------------------------------------------------------
This feature is controlled through debugfs interface
-/sys/kernel/debug/notifier-error-inject/pSeries-reconfig/actions/<notifier event>/error
+
+ /sys/kernel/debug/notifier-error-inject/pSeries-reconfig/actions/<notifier event>/error
Possible pSeries reconfig notifier events to be failed are:
@@ -61,7 +64,8 @@ Possible pSeries reconfig notifier events to be failed are:
Netdevice notifier error injection module
----------------------------------------------
This feature is controlled through debugfs interface
-/sys/kernel/debug/notifier-error-inject/netdev/actions/<notifier event>/error
+
+ /sys/kernel/debug/notifier-error-inject/netdev/actions/<notifier event>/error
Netdevice notifier events which can be failed are:
@@ -75,7 +79,7 @@ Netdevice notifier events which can be failed are:
* NETDEV_PRECHANGEUPPER
* NETDEV_CHANGEUPPER
-Example: Inject netdevice mtu change error (-22 == -EINVAL)
+Example: Inject netdevice mtu change error (-22 == -EINVAL)::
# cd /sys/kernel/debug/notifier-error-inject/netdev
# echo -22 > actions/NETDEV_CHANGEMTU/error
diff --git a/Documentation/fault-injection/nvme-fault-injection.rst b/Documentation/fault-injection/nvme-fault-injection.rst
new file mode 100644
index 000000000000..cdb2e829228e
--- /dev/null
+++ b/Documentation/fault-injection/nvme-fault-injection.rst
@@ -0,0 +1,178 @@
+NVMe Fault Injection
+====================
+Linux's fault injection framework provides a systematic way to support
+error injection via debugfs in the /sys/kernel/debug directory. When
+enabled, the default NVME_SC_INVALID_OPCODE with no retry will be
+injected into the nvme_end_request. Users can change the default status
+code and no retry flag via the debugfs. The list of Generic Command
+Status can be found in include/linux/nvme.h
+
+Following examples show how to inject an error into the nvme.
+
+First, enable CONFIG_FAULT_INJECTION_DEBUG_FS kernel config,
+recompile the kernel. After booting up the kernel, do the
+following.
+
+Example 1: Inject default status code with no retry
+---------------------------------------------------
+
+::
+
+ mount /dev/nvme0n1 /mnt
+ echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
+ echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
+ cp a.file /mnt
+
+Expected Result::
+
+ cp: cannot stat ‘/mnt/a.file’: Input/output error
+
+Message from dmesg::
+
+ FAULT_INJECTION: forcing a failure.
+ name fault_inject, interval 1, probability 100, space 0, times 1
+ CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc8+ #2
+ Hardware name: innotek GmbH VirtualBox/VirtualBox,
+ BIOS VirtualBox 12/01/2006
+ Call Trace:
+ <IRQ>
+ dump_stack+0x5c/0x7d
+ should_fail+0x148/0x170
+ nvme_should_fail+0x2f/0x50 [nvme_core]
+ nvme_process_cq+0xe7/0x1d0 [nvme]
+ nvme_irq+0x1e/0x40 [nvme]
+ __handle_irq_event_percpu+0x3a/0x190
+ handle_irq_event_percpu+0x30/0x70
+ handle_irq_event+0x36/0x60
+ handle_fasteoi_irq+0x78/0x120
+ handle_irq+0xa7/0x130
+ ? tick_irq_enter+0xa8/0xc0
+ do_IRQ+0x43/0xc0
+ common_interrupt+0xa2/0xa2
+ </IRQ>
+ RIP: 0010:native_safe_halt+0x2/0x10
+ RSP: 0018:ffffffff82003e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
+ RAX: ffffffff817a10c0 RBX: ffffffff82012480 RCX: 0000000000000000
+ RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
+ RBP: 0000000000000000 R08: 000000008e38ce64 R09: 0000000000000000
+ R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82012480
+ R13: ffffffff82012480 R14: 0000000000000000 R15: 0000000000000000
+ ? __sched_text_end+0x4/0x4
+ default_idle+0x18/0xf0
+ do_idle+0x150/0x1d0
+ cpu_startup_entry+0x6f/0x80
+ start_kernel+0x4c4/0x4e4
+ ? set_init_arg+0x55/0x55
+ secondary_startup_64+0xa5/0xb0
+ print_req_error: I/O error, dev nvme0n1, sector 9240
+ EXT4-fs error (device nvme0n1): ext4_find_entry:1436:
+ inode #2: comm cp: reading directory lblock 0
+
+Example 2: Inject default status code with retry
+------------------------------------------------
+
+::
+
+ mount /dev/nvme0n1 /mnt
+ echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
+ echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
+ echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/status
+ echo 0 > /sys/kernel/debug/nvme0n1/fault_inject/dont_retry
+
+ cp a.file /mnt
+
+Expected Result::
+
+ command success without error
+
+Message from dmesg::
+
+ FAULT_INJECTION: forcing a failure.
+ name fault_inject, interval 1, probability 100, space 0, times 1
+ CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc8+ #4
+ Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
+ Call Trace:
+ <IRQ>
+ dump_stack+0x5c/0x7d
+ should_fail+0x148/0x170
+ nvme_should_fail+0x30/0x60 [nvme_core]
+ nvme_loop_queue_response+0x84/0x110 [nvme_loop]
+ nvmet_req_complete+0x11/0x40 [nvmet]
+ nvmet_bio_done+0x28/0x40 [nvmet]
+ blk_update_request+0xb0/0x310
+ blk_mq_end_request+0x18/0x60
+ flush_smp_call_function_queue+0x3d/0xf0
+ smp_call_function_single_interrupt+0x2c/0xc0
+ call_function_single_interrupt+0xa2/0xb0
+ </IRQ>
+ RIP: 0010:native_safe_halt+0x2/0x10
+ RSP: 0018:ffffc9000068bec0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04
+ RAX: ffffffff817a10c0 RBX: ffff88011a3c9680 RCX: 0000000000000000
+ RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
+ RBP: 0000000000000001 R08: 000000008e38c131 R09: 0000000000000000
+ R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011a3c9680
+ R13: ffff88011a3c9680 R14: 0000000000000000 R15: 0000000000000000
+ ? __sched_text_end+0x4/0x4
+ default_idle+0x18/0xf0
+ do_idle+0x150/0x1d0
+ cpu_startup_entry+0x6f/0x80
+ start_secondary+0x187/0x1e0
+ secondary_startup_64+0xa5/0xb0
+
+Example 3: Inject an error into the 10th admin command
+------------------------------------------------------
+
+::
+
+ echo 100 > /sys/kernel/debug/nvme0/fault_inject/probability
+ echo 10 > /sys/kernel/debug/nvme0/fault_inject/space
+ echo 1 > /sys/kernel/debug/nvme0/fault_inject/times
+ nvme reset /dev/nvme0
+
+Expected Result::
+
+ After NVMe controller reset, the reinitialization may or may not succeed.
+ It depends on which admin command is actually forced to fail.
+
+Message from dmesg::
+
+ nvme nvme0: resetting controller
+ FAULT_INJECTION: forcing a failure.
+ name fault_inject, interval 1, probability 100, space 1, times 1
+ CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-rc2+ #2
+ Hardware name: MSI MS-7A45/B150M MORTAR ARCTIC (MS-7A45), BIOS 1.50 04/25/2017
+ Call Trace:
+ <IRQ>
+ dump_stack+0x63/0x85
+ should_fail+0x14a/0x170
+ nvme_should_fail+0x38/0x80 [nvme_core]
+ nvme_irq+0x129/0x280 [nvme]
+ ? blk_mq_end_request+0xb3/0x120
+ __handle_irq_event_percpu+0x84/0x1a0
+ handle_irq_event_percpu+0x32/0x80
+ handle_irq_event+0x3b/0x60
+ handle_edge_irq+0x7f/0x1a0
+ handle_irq+0x20/0x30
+ do_IRQ+0x4e/0xe0
+ common_interrupt+0xf/0xf
+ </IRQ>
+ RIP: 0010:cpuidle_enter_state+0xc5/0x460
+ Code: ff e8 8f 5f 86 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 69 03 00 00 31 ff e8 62 aa 8c ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 37 03 00 00 4c 8b 45 d0 4c 2b 45 b8 48 ba cf f7 53
+ RSP: 0018:ffffffff88c03dd0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
+ RAX: ffff9dac25a2ac80 RBX: ffffffff88d53760 RCX: 000000000000001f
+ RDX: 0000000000000000 RSI: 000000002d958403 RDI: 0000000000000000
+ RBP: ffffffff88c03e18 R08: fffffff75e35ffb7 R09: 00000a49a56c0b48
+ R10: ffffffff88c03da0 R11: 0000000000001b0c R12: ffff9dac25a34d00
+ R13: 0000000000000006 R14: 0000000000000006 R15: ffffffff88d53760
+ cpuidle_enter+0x2e/0x40
+ call_cpuidle+0x23/0x40
+ do_idle+0x201/0x280
+ cpu_startup_entry+0x1d/0x20
+ rest_init+0xaa/0xb0
+ arch_call_rest_init+0xe/0x1b
+ start_kernel+0x51c/0x53b
+ x86_64_start_reservations+0x24/0x26
+ x86_64_start_kernel+0x74/0x77
+ secondary_startup_64+0xa4/0xb0
+ nvme nvme0: Could not set queue count (16385)
+ nvme nvme0: IO queues not created
diff --git a/Documentation/fault-injection/nvme-fault-injection.txt b/Documentation/fault-injection/nvme-fault-injection.txt
deleted file mode 100644
index efcb339a3add..000000000000
--- a/Documentation/fault-injection/nvme-fault-injection.txt
+++ /dev/null
@@ -1,172 +0,0 @@
-NVMe Fault Injection
-====================
-Linux's fault injection framework provides a systematic way to support
-error injection via debugfs in the /sys/kernel/debug directory. When
-enabled, the default NVME_SC_INVALID_OPCODE with no retry will be
-injected into the nvme_end_request. Users can change the default status
-code and no retry flag via the debugfs. The list of Generic Command
-Status can be found in include/linux/nvme.h
-
-Following examples show how to inject an error into the nvme.
-
-First, enable CONFIG_FAULT_INJECTION_DEBUG_FS kernel config,
-recompile the kernel. After booting up the kernel, do the
-following.
-
-Example 1: Inject default status code with no retry
----------------------------------------------------
-
-mount /dev/nvme0n1 /mnt
-echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
-echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
-cp a.file /mnt
-
-Expected Result:
-
-cp: cannot stat ‘/mnt/a.file’: Input/output error
-
-Message from dmesg:
-
-FAULT_INJECTION: forcing a failure.
-name fault_inject, interval 1, probability 100, space 0, times 1
-CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-rc8+ #2
-Hardware name: innotek GmbH VirtualBox/VirtualBox,
-BIOS VirtualBox 12/01/2006
-Call Trace:
- <IRQ>
- dump_stack+0x5c/0x7d
- should_fail+0x148/0x170
- nvme_should_fail+0x2f/0x50 [nvme_core]
- nvme_process_cq+0xe7/0x1d0 [nvme]
- nvme_irq+0x1e/0x40 [nvme]
- __handle_irq_event_percpu+0x3a/0x190
- handle_irq_event_percpu+0x30/0x70
- handle_irq_event+0x36/0x60
- handle_fasteoi_irq+0x78/0x120
- handle_irq+0xa7/0x130
- ? tick_irq_enter+0xa8/0xc0
- do_IRQ+0x43/0xc0
- common_interrupt+0xa2/0xa2
- </IRQ>
-RIP: 0010:native_safe_halt+0x2/0x10
-RSP: 0018:ffffffff82003e90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
-RAX: ffffffff817a10c0 RBX: ffffffff82012480 RCX: 0000000000000000
-RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
-RBP: 0000000000000000 R08: 000000008e38ce64 R09: 0000000000000000
-R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82012480
-R13: ffffffff82012480 R14: 0000000000000000 R15: 0000000000000000
- ? __sched_text_end+0x4/0x4
- default_idle+0x18/0xf0
- do_idle+0x150/0x1d0
- cpu_startup_entry+0x6f/0x80
- start_kernel+0x4c4/0x4e4
- ? set_init_arg+0x55/0x55
- secondary_startup_64+0xa5/0xb0
- print_req_error: I/O error, dev nvme0n1, sector 9240
-EXT4-fs error (device nvme0n1): ext4_find_entry:1436:
-inode #2: comm cp: reading directory lblock 0
-
-Example 2: Inject default status code with retry
-------------------------------------------------
-
-mount /dev/nvme0n1 /mnt
-echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/times
-echo 100 > /sys/kernel/debug/nvme0n1/fault_inject/probability
-echo 1 > /sys/kernel/debug/nvme0n1/fault_inject/status
-echo 0 > /sys/kernel/debug/nvme0n1/fault_inject/dont_retry
-
-cp a.file /mnt
-
-Expected Result:
-
-command success without error
-
-Message from dmesg:
-
-FAULT_INJECTION: forcing a failure.
-name fault_inject, interval 1, probability 100, space 0, times 1
-CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc8+ #4
-Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
-Call Trace:
- <IRQ>
- dump_stack+0x5c/0x7d
- should_fail+0x148/0x170
- nvme_should_fail+0x30/0x60 [nvme_core]
- nvme_loop_queue_response+0x84/0x110 [nvme_loop]
- nvmet_req_complete+0x11/0x40 [nvmet]
- nvmet_bio_done+0x28/0x40 [nvmet]
- blk_update_request+0xb0/0x310
- blk_mq_end_request+0x18/0x60
- flush_smp_call_function_queue+0x3d/0xf0
- smp_call_function_single_interrupt+0x2c/0xc0
- call_function_single_interrupt+0xa2/0xb0
- </IRQ>
-RIP: 0010:native_safe_halt+0x2/0x10
-RSP: 0018:ffffc9000068bec0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff04
-RAX: ffffffff817a10c0 RBX: ffff88011a3c9680 RCX: 0000000000000000
-RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
-RBP: 0000000000000001 R08: 000000008e38c131 R09: 0000000000000000
-R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011a3c9680
-R13: ffff88011a3c9680 R14: 0000000000000000 R15: 0000000000000000
- ? __sched_text_end+0x4/0x4
- default_idle+0x18/0xf0
- do_idle+0x150/0x1d0
- cpu_startup_entry+0x6f/0x80
- start_secondary+0x187/0x1e0
- secondary_startup_64+0xa5/0xb0
-
-Example 3: Inject an error into the 10th admin command
-------------------------------------------------------
-
-echo 100 > /sys/kernel/debug/nvme0/fault_inject/probability
-echo 10 > /sys/kernel/debug/nvme0/fault_inject/space
-echo 1 > /sys/kernel/debug/nvme0/fault_inject/times
-nvme reset /dev/nvme0
-
-Expected Result:
-
-After NVMe controller reset, the reinitialization may or may not succeed.
-It depends on which admin command is actually forced to fail.
-
-Message from dmesg:
-
-nvme nvme0: resetting controller
-FAULT_INJECTION: forcing a failure.
-name fault_inject, interval 1, probability 100, space 1, times 1
-CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-rc2+ #2
-Hardware name: MSI MS-7A45/B150M MORTAR ARCTIC (MS-7A45), BIOS 1.50 04/25/2017
-Call Trace:
- <IRQ>
- dump_stack+0x63/0x85
- should_fail+0x14a/0x170
- nvme_should_fail+0x38/0x80 [nvme_core]
- nvme_irq+0x129/0x280 [nvme]
- ? blk_mq_end_request+0xb3/0x120
- __handle_irq_event_percpu+0x84/0x1a0
- handle_irq_event_percpu+0x32/0x80
- handle_irq_event+0x3b/0x60
- handle_edge_irq+0x7f/0x1a0
- handle_irq+0x20/0x30
- do_IRQ+0x4e/0xe0
- common_interrupt+0xf/0xf
- </IRQ>
-RIP: 0010:cpuidle_enter_state+0xc5/0x460
-Code: ff e8 8f 5f 86 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 69 03 00 00 31 ff e8 62 aa 8c ff fb 66 0f 1f 44 00 00 <45> 85 ed 0f 88 37 03 00 00 4c 8b 45 d0 4c 2b 45 b8 48 ba cf f7 53
-RSP: 0018:ffffffff88c03dd0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
-RAX: ffff9dac25a2ac80 RBX: ffffffff88d53760 RCX: 000000000000001f
-RDX: 0000000000000000 RSI: 000000002d958403 RDI: 0000000000000000
-RBP: ffffffff88c03e18 R08: fffffff75e35ffb7 R09: 00000a49a56c0b48
-R10: ffffffff88c03da0 R11: 0000000000001b0c R12: ffff9dac25a34d00
-R13: 0000000000000006 R14: 0000000000000006 R15: ffffffff88d53760
- cpuidle_enter+0x2e/0x40
- call_cpuidle+0x23/0x40
- do_idle+0x201/0x280
- cpu_startup_entry+0x1d/0x20
- rest_init+0xaa/0xb0
- arch_call_rest_init+0xe/0x1b
- start_kernel+0x51c/0x53b
- x86_64_start_reservations+0x24/0x26
- x86_64_start_kernel+0x74/0x77
- secondary_startup_64+0xa4/0xb0
-nvme nvme0: Could not set queue count (16385)
-nvme nvme0: IO queues not created
diff --git a/Documentation/fault-injection/provoke-crashes.rst b/Documentation/fault-injection/provoke-crashes.rst
new file mode 100644
index 000000000000..9279a3e12278
--- /dev/null
+++ b/Documentation/fault-injection/provoke-crashes.rst
@@ -0,0 +1,48 @@
+===============
+Provoke crashes
+===============
+
+The lkdtm module provides an interface to crash or injure the kernel at
+predefined crashpoints to evaluate the reliability of crash dumps obtained
+using different dumping solutions. The module uses KPROBEs to instrument
+crashing points, but can also crash the kernel directly without KRPOBE
+support.
+
+
+You can provide the way either through module arguments when inserting
+the module, or through a debugfs interface.
+
+Usage::
+
+ insmod lkdtm.ko [recur_count={>0}] cpoint_name=<> cpoint_type=<>
+ [cpoint_count={>0}]
+
+recur_count
+ Recursion level for the stack overflow test. Default is 10.
+
+cpoint_name
+ Crash point where the kernel is to be crashed. It can be
+ one of INT_HARDWARE_ENTRY, INT_HW_IRQ_EN, INT_TASKLET_ENTRY,
+ FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_DISPATCH_CMD,
+ IDE_CORE_CP, DIRECT
+
+cpoint_type
+ Indicates the action to be taken on hitting the crash point.
+ It can be one of PANIC, BUG, EXCEPTION, LOOP, OVERFLOW,
+ CORRUPT_STACK, UNALIGNED_LOAD_STORE_WRITE, OVERWRITE_ALLOCATION,
+ WRITE_AFTER_FREE,
+
+cpoint_count
+ Indicates the number of times the crash point is to be hit
+ to trigger an action. The default is 10.
+
+You can also induce failures by mounting debugfs and writing the type to
+<mountpoint>/provoke-crash/<crashpoint>. E.g.::
+
+ mount -t debugfs debugfs /mnt
+ echo EXCEPTION > /mnt/provoke-crash/INT_HARDWARE_ENTRY
+
+
+A special file is `DIRECT` which will induce the crash directly without
+KPROBE instrumentation. This mode is the only one available when the module
+is built on a kernel without KPROBEs support.
diff --git a/Documentation/fault-injection/provoke-crashes.txt b/Documentation/fault-injection/provoke-crashes.txt
deleted file mode 100644
index 7a9d3d81525b..000000000000
--- a/Documentation/fault-injection/provoke-crashes.txt
+++ /dev/null
@@ -1,38 +0,0 @@
-The lkdtm module provides an interface to crash or injure the kernel at
-predefined crashpoints to evaluate the reliability of crash dumps obtained
-using different dumping solutions. The module uses KPROBEs to instrument
-crashing points, but can also crash the kernel directly without KRPOBE
-support.
-
-
-You can provide the way either through module arguments when inserting
-the module, or through a debugfs interface.
-
-Usage: insmod lkdtm.ko [recur_count={>0}] cpoint_name=<> cpoint_type=<>
- [cpoint_count={>0}]
-
- recur_count : Recursion level for the stack overflow test. Default is 10.
-
- cpoint_name : Crash point where the kernel is to be crashed. It can be
- one of INT_HARDWARE_ENTRY, INT_HW_IRQ_EN, INT_TASKLET_ENTRY,
- FS_DEVRW, MEM_SWAPOUT, TIMERADD, SCSI_DISPATCH_CMD,
- IDE_CORE_CP, DIRECT
-
- cpoint_type : Indicates the action to be taken on hitting the crash point.
- It can be one of PANIC, BUG, EXCEPTION, LOOP, OVERFLOW,
- CORRUPT_STACK, UNALIGNED_LOAD_STORE_WRITE, OVERWRITE_ALLOCATION,
- WRITE_AFTER_FREE,
-
- cpoint_count : Indicates the number of times the crash point is to be hit
- to trigger an action. The default is 10.
-
-You can also induce failures by mounting debugfs and writing the type to
-<mountpoint>/provoke-crash/<crashpoint>. E.g.,
-
- mount -t debugfs debugfs /mnt
- echo EXCEPTION > /mnt/provoke-crash/INT_HARDWARE_ENTRY
-
-
-A special file is `DIRECT' which will induce the crash directly without
-KPROBE instrumentation. This mode is the only one available when the module
-is built on a kernel without KPROBEs support.
diff --git a/Documentation/fb/api.txt b/Documentation/fb/api.rst
index d52cf1e3b975..79ec33dded74 100644
--- a/Documentation/fb/api.txt
+++ b/Documentation/fb/api.rst
@@ -1,5 +1,6 @@
- The Frame Buffer Device API
- ---------------------------
+===========================
+The Frame Buffer Device API
+===========================
Last revised: June 21, 2011
@@ -21,13 +22,13 @@ deal with different behaviours.
---------------
Device and driver capabilities are reported in the fixed screen information
-capabilities field.
+capabilities field::
-struct fb_fix_screeninfo {
+ struct fb_fix_screeninfo {
...
__u16 capabilities; /* see FB_CAP_* */
...
-};
+ };
Application should use those capabilities to find out what features they can
expect from the device and driver.
@@ -151,9 +152,9 @@ fb_fix_screeninfo and fb_var_screeninfo structure respectively.
struct fb_fix_screeninfo stores device independent unchangeable information
about the frame buffer device and the current format. Those information can't
be directly modified by applications, but can be changed by the driver when an
-application modifies the format.
+application modifies the format::
-struct fb_fix_screeninfo {
+ struct fb_fix_screeninfo {
char id[16]; /* identification string eg "TT Builtin" */
unsigned long smem_start; /* Start of frame buffer mem */
/* (physical address) */
@@ -172,13 +173,13 @@ struct fb_fix_screeninfo {
/* specific chip/card we have */
__u16 capabilities; /* see FB_CAP_* */
__u16 reserved[2]; /* Reserved for future compatibility */
-};
+ };
struct fb_var_screeninfo stores device independent changeable information
about a frame buffer device, its current format and video mode, as well as
-other miscellaneous parameters.
+other miscellaneous parameters::
-struct fb_var_screeninfo {
+ struct fb_var_screeninfo {
__u32 xres; /* visible resolution */
__u32 yres;
__u32 xres_virtual; /* virtual resolution */
@@ -216,7 +217,7 @@ struct fb_var_screeninfo {
__u32 rotate; /* angle we rotate counter clockwise */
__u32 colorspace; /* colorspace for FOURCC-based modes */
__u32 reserved[4]; /* Reserved for future compatibility */
-};
+ };
To modify variable information, applications call the FBIOPUT_VSCREENINFO
ioctl with a pointer to a fb_var_screeninfo structure. If the call is
@@ -255,14 +256,14 @@ monochrome, grayscale or pseudocolor visuals, although this is not required.
- For truecolor and directcolor formats, applications set the grayscale field
to zero, and the red, blue, green and transp fields to describe the layout of
- color components in memory.
+ color components in memory::
-struct fb_bitfield {
+ struct fb_bitfield {
__u32 offset; /* beginning of bitfield */
__u32 length; /* length of bitfield */
__u32 msb_right; /* != 0 : Most significant bit is */
/* right */
-};
+ };
Pixel values are bits_per_pixel wide and are split in non-overlapping red,
green, blue and alpha (transparency) components. Location and size of each
diff --git a/Documentation/fb/arkfb.txt b/Documentation/fb/arkfb.rst
index e8487a9d6a05..aeca8773dd7e 100644
--- a/Documentation/fb/arkfb.txt
+++ b/Documentation/fb/arkfb.rst
@@ -1,6 +1,6 @@
-
- arkfb - fbdev driver for ARK Logic chips
- ========================================
+========================================
+arkfb - fbdev driver for ARK Logic chips
+========================================
Supported Hardware
@@ -47,7 +47,7 @@ Missing Features
(alias TODO list)
* secondary (not initialized by BIOS) device support
- * big endian support
+ * big endian support
* DPMS support
* MMIO support
* interlaced mode variant
diff --git a/Documentation/fb/aty128fb.txt b/Documentation/fb/aty128fb.rst
index b605204fcfe1..3f107718f933 100644
--- a/Documentation/fb/aty128fb.txt
+++ b/Documentation/fb/aty128fb.rst
@@ -1,8 +1,9 @@
-[This file is cloned from VesaFB/matroxfb]
-
+=================
What is aty128fb?
=================
+.. [This file is cloned from VesaFB/matroxfb]
+
This is a driver for a graphic framebuffer for ATI Rage128 based devices
on Intel and PPC boxes.
@@ -24,15 +25,15 @@ How to use it?
==============
Switching modes is done using the video=aty128fb:<resolution>... modedb
-boot parameter or using `fbset' program.
+boot parameter or using `fbset` program.
-See Documentation/fb/modedb.txt for more information on modedb
+See Documentation/fb/modedb.rst for more information on modedb
resolutions.
You should compile in both vgacon (to boot if you remove your Rage128 from
box) and aty128fb (for graphics mode). You should not compile-in vesafb
-unless you have primary display on non-Rage128 VBE2.0 device (see
-Documentation/fb/vesafb.txt for details).
+unless you have primary display on non-Rage128 VBE2.0 device (see
+Documentation/fb/vesafb.rst for details).
X11
@@ -48,16 +49,18 @@ Configuration
=============
You can pass kernel command line options to vesafb with
-`video=aty128fb:option1,option2:value2,option3' (multiple options should
-be separated by comma, values are separated from options by `:').
+`video=aty128fb:option1,option2:value2,option3` (multiple options should
+be separated by comma, values are separated from options by `:`).
Accepted options:
-noaccel - do not use acceleration engine. It is default.
-accel - use acceleration engine. Not finished.
-vmode:x - chooses PowerMacintosh video mode <x>. Deprecated.
-cmode:x - chooses PowerMacintosh colour mode <x>. Deprecated.
-<XxX@X> - selects startup videomode. See modedb.txt for detailed
- explanation. Default is 640x480x8bpp.
+========= =======================================================
+noaccel do not use acceleration engine. It is default.
+accel use acceleration engine. Not finished.
+vmode:x chooses PowerMacintosh video mode <x>. Deprecated.
+cmode:x chooses PowerMacintosh colour mode <x>. Deprecated.
+<XxX@X> selects startup videomode. See modedb.txt for detailed
+ explanation. Default is 640x480x8bpp.
+========= =======================================================
Limitations
@@ -65,8 +68,8 @@ Limitations
There are known and unknown bugs, features and misfeatures.
Currently there are following known bugs:
- + This driver is still experimental and is not finished. Too many
+
+ - This driver is still experimental and is not finished. Too many
bugs/errata to list here.
---
Brad Douglas <brad@neruo.com>
diff --git a/Documentation/fb/cirrusfb.txt b/Documentation/fb/cirrusfb.rst
index f75950d330a4..8c3e6c6cb114 100644
--- a/Documentation/fb/cirrusfb.txt
+++ b/Documentation/fb/cirrusfb.rst
@@ -1,32 +1,32 @@
+============================================
+Framebuffer driver for Cirrus Logic chipsets
+============================================
- Framebuffer driver for Cirrus Logic chipsets
- Copyright 1999 Jeff Garzik <jgarzik@pobox.com>
+Copyright 1999 Jeff Garzik <jgarzik@pobox.com>
-
-{ just a little something to get people going; contributors welcome! }
-
+.. just a little something to get people going; contributors welcome!
Chip families supported:
- SD64
- Piccolo
- Picasso
- Spectrum
- Alpine (GD-543x/4x)
- Picasso4 (GD-5446)
- GD-5480
- Laguna (GD-546x)
+ - SD64
+ - Piccolo
+ - Picasso
+ - Spectrum
+ - Alpine (GD-543x/4x)
+ - Picasso4 (GD-5446)
+ - GD-5480
+ - Laguna (GD-546x)
Bus's supported:
- PCI
- Zorro
+ - PCI
+ - Zorro
Architectures supported:
- i386
- Alpha
- PPC (Motorola Powerstack)
- m68k (Amiga)
+ - i386
+ - Alpha
+ - PPC (Motorola Powerstack)
+ - m68k (Amiga)
@@ -34,10 +34,9 @@ Default video modes
-------------------
At the moment, there are two kernel command line arguments supported:
-mode:640x480
-mode:800x600
- or
-mode:1024x768
+- mode:640x480
+- mode:800x600
+- mode:1024x768
Full support for startup video modes (modedb) will be integrated soon.
@@ -93,5 +92,3 @@ Version 1.9.4
Version 1.9.3
-------------
* Bundled with kernel 2.3.14-pre1 or later.
-
-
diff --git a/Documentation/fb/cmap_xfbdev.txt b/Documentation/fb/cmap_xfbdev.rst
index 55e1f0a3d2b4..5db5e9787361 100644
--- a/Documentation/fb/cmap_xfbdev.txt
+++ b/Documentation/fb/cmap_xfbdev.rst
@@ -1,26 +1,29 @@
+==========================
Understanding fbdev's cmap
---------------------------
+==========================
These notes explain how X's dix layer uses fbdev's cmap structures.
-*. example of relevant structures in fbdev as used for a 3-bit grayscale cmap
-struct fb_var_screeninfo {
- .bits_per_pixel = 8,
- .grayscale = 1,
- .red = { 4, 3, 0 },
- .green = { 0, 0, 0 },
- .blue = { 0, 0, 0 },
-}
-struct fb_fix_screeninfo {
- .visual = FB_VISUAL_STATIC_PSEUDOCOLOR,
-}
-for (i = 0; i < 8; i++)
+- example of relevant structures in fbdev as used for a 3-bit grayscale cmap::
+
+ struct fb_var_screeninfo {
+ .bits_per_pixel = 8,
+ .grayscale = 1,
+ .red = { 4, 3, 0 },
+ .green = { 0, 0, 0 },
+ .blue = { 0, 0, 0 },
+ }
+ struct fb_fix_screeninfo {
+ .visual = FB_VISUAL_STATIC_PSEUDOCOLOR,
+ }
+ for (i = 0; i < 8; i++)
info->cmap.red[i] = (((2*i)+1)*(0xFFFF))/16;
-memcpy(info->cmap.green, info->cmap.red, sizeof(u16)*8);
-memcpy(info->cmap.blue, info->cmap.red, sizeof(u16)*8);
+ memcpy(info->cmap.green, info->cmap.red, sizeof(u16)*8);
+ memcpy(info->cmap.blue, info->cmap.red, sizeof(u16)*8);
-*. X11 apps do something like the following when trying to use grayscale.
-for (i=0; i < 8; i++) {
+- X11 apps do something like the following when trying to use grayscale::
+
+ for (i=0; i < 8; i++) {
char colorspec[64];
memset(colorspec,0,64);
sprintf(colorspec, "rgb:%x/%x/%x", i*36,i*36,i*36);
@@ -28,26 +31,26 @@ for (i=0; i < 8; i++) {
printf("Can't get color %s\n",colorspec);
XAllocColor(outputDisplay, testColormap, &wantedColor);
grays[i] = wantedColor;
-}
+ }
+
There's also named equivalents like gray1..x provided you have an rgb.txt.
Somewhere in X's callchain, this results in a call to X code that handles the
colormap. For example, Xfbdev hits the following:
-xc-011010/programs/Xserver/dix/colormap.c:
+xc-011010/programs/Xserver/dix/colormap.c::
-FindBestPixel(pentFirst, size, prgb, channel)
+ FindBestPixel(pentFirst, size, prgb, channel)
-dr = (long) pent->co.local.red - prgb->red;
-dg = (long) pent->co.local.green - prgb->green;
-db = (long) pent->co.local.blue - prgb->blue;
-sq = dr * dr;
-UnsignedToBigNum (sq, &sum);
-BigNumAdd (&sum, &temp, &sum);
+ dr = (long) pent->co.local.red - prgb->red;
+ dg = (long) pent->co.local.green - prgb->green;
+ db = (long) pent->co.local.blue - prgb->blue;
+ sq = dr * dr;
+ UnsignedToBigNum (sq, &sum);
+ BigNumAdd (&sum, &temp, &sum);
co.local.red are entries that were brought in through FBIOGETCMAP which come
directly from the info->cmap.red that was listed above. The prgb is the rgb
that the app wants to match to. The above code is doing what looks like a least
squares matching function. That's why the cmap entries can't be set to the left
hand side boundaries of a color range.
-
diff --git a/Documentation/fb/deferred_io.txt b/Documentation/fb/deferred_io.rst
index 748328370250..7300cff255a3 100644
--- a/Documentation/fb/deferred_io.txt
+++ b/Documentation/fb/deferred_io.rst
@@ -1,5 +1,6 @@
+===========
Deferred IO
------------
+===========
Deferred IO is a way to delay and repurpose IO. It uses host memory as a
buffer and the MMU pagefault as a pretrigger for when to perform the device
@@ -16,7 +17,7 @@ works:
- app continues writing to that page with no additional cost. this is
the key benefit.
- the workqueue task comes in and mkcleans the pages on the list, then
- completes the work associated with updating the framebuffer. this is
+ completes the work associated with updating the framebuffer. this is
the real work talking to the device.
- app tries to write to the address (that has now been mkcleaned)
- get pagefault and the above sequence occurs again
@@ -47,29 +48,32 @@ How to use it: (for fbdev drivers)
----------------------------------
The following example may be helpful.
-1. Setup your structure. Eg:
+1. Setup your structure. Eg::
-static struct fb_deferred_io hecubafb_defio = {
- .delay = HZ,
- .deferred_io = hecubafb_dpy_deferred_io,
-};
+ static struct fb_deferred_io hecubafb_defio = {
+ .delay = HZ,
+ .deferred_io = hecubafb_dpy_deferred_io,
+ };
The delay is the minimum delay between when the page_mkwrite trigger occurs
and when the deferred_io callback is called. The deferred_io callback is
explained below.
-2. Setup your deferred IO callback. Eg:
-static void hecubafb_dpy_deferred_io(struct fb_info *info,
- struct list_head *pagelist)
+2. Setup your deferred IO callback. Eg::
+
+ static void hecubafb_dpy_deferred_io(struct fb_info *info,
+ struct list_head *pagelist)
The deferred_io callback is where you would perform all your IO to the display
device. You receive the pagelist which is the list of pages that were written
to during the delay. You must not modify this list. This callback is called
from a workqueue.
-3. Call init
+3. Call init::
+
info->fbdefio = &hecubafb_defio;
fb_deferred_io_init(info);
-4. Call cleanup
+4. Call cleanup::
+
fb_deferred_io_cleanup(info);
diff --git a/Documentation/fb/efifb.txt b/Documentation/fb/efifb.rst
index 1a85c1bdaf38..04840331a00e 100644
--- a/Documentation/fb/efifb.txt
+++ b/Documentation/fb/efifb.rst
@@ -1,6 +1,6 @@
-
+==============
What is efifb?
-===============
+==============
This is a generic EFI platform driver for Intel based Apple computers.
efifb is only for EFI booted Intel Macs.
@@ -8,16 +8,17 @@ efifb is only for EFI booted Intel Macs.
Supported Hardware
==================
-iMac 17"/20"
-Macbook
-Macbook Pro 15"/17"
-MacMini
+- iMac 17"/20"
+- Macbook
+- Macbook Pro 15"/17"
+- MacMini
How to use it?
==============
efifb does not have any kind of autodetection of your machine.
-You have to add the following kernel parameters in your elilo.conf:
+You have to add the following kernel parameters in your elilo.conf::
+
Macbook :
video=efifb:macbook
MacMini :
@@ -29,9 +30,10 @@ You have to add the following kernel parameters in your elilo.conf:
Accepted options:
+======= ===========================================================
nowc Don't map the framebuffer write combined. This can be used
to workaround side-effects and slowdowns on other CPU cores
when large amounts of console data are written.
+======= ===========================================================
---
Edgar Hucek <gimli@dark-green.com>
diff --git a/Documentation/fb/ep93xx-fb.txt b/Documentation/fb/ep93xx-fb.rst
index 5af1bd9effae..6f7767926d1a 100644
--- a/Documentation/fb/ep93xx-fb.txt
+++ b/Documentation/fb/ep93xx-fb.rst
@@ -4,7 +4,7 @@ Driver for EP93xx LCD controller
The EP93xx LCD controller can drive both standard desktop monitors and
embedded LCD displays. If you have a standard desktop monitor then you
-can use the standard Linux video mode database. In your board file:
+can use the standard Linux video mode database. In your board file::
static struct ep93xxfb_mach_info some_board_fb_info = {
.num_modes = EP93XXFB_USE_MODEDB,
@@ -12,7 +12,7 @@ can use the standard Linux video mode database. In your board file:
};
If you have an embedded LCD display then you need to define a video
-mode for it as follows:
+mode for it as follows::
static struct fb_videomode some_board_video_modes[] = {
{
@@ -23,11 +23,11 @@ mode for it as follows:
Note that the pixel clock value is in pico-seconds. You can use the
KHZ2PICOS macro to convert the pixel clock value. Most other values
-are in pixel clocks. See Documentation/fb/framebuffer.txt for further
+are in pixel clocks. See Documentation/fb/framebuffer.rst for further
details.
The ep93xxfb_mach_info structure for your board should look like the
-following:
+following::
static struct ep93xxfb_mach_info some_board_fb_info = {
.num_modes = ARRAY_SIZE(some_board_video_modes),
@@ -37,7 +37,7 @@ following:
};
The framebuffer device can be registered by adding the following to
-your board initialisation function:
+your board initialisation function::
ep93xx_register_fb(&some_board_fb_info);
@@ -50,6 +50,7 @@ to configure the controller. The video attributes flags are fully
documented in section 7 of the EP93xx users' guide. The following
flags are available:
+=============================== ==========================================
EP93XXFB_PCLK_FALLING Clock data on the falling edge of the
pixel clock. The default is to clock
data on the rising edge.
@@ -62,10 +63,12 @@ EP93XXFB_SYNC_HORIZ_HIGH Horizontal sync is active high. By
EP93XXFB_SYNC_VERT_HIGH Vertical sync is active high. By
default the vertical sync is active high.
+=============================== ==========================================
The physical address of the framebuffer can be controlled using the
following flags:
+=============================== ======================================
EP93XXFB_USE_SDCSN0 Use SDCSn[0] for the framebuffer. This
is the default setting.
@@ -74,6 +77,7 @@ EP93XXFB_USE_SDCSN1 Use SDCSn[1] for the framebuffer.
EP93XXFB_USE_SDCSN2 Use SDCSn[2] for the framebuffer.
EP93XXFB_USE_SDCSN3 Use SDCSn[3] for the framebuffer.
+=============================== ======================================
==================
Platform callbacks
@@ -87,7 +91,7 @@ blanked or unblanked.
The setup and teardown devices pass the platform_device structure as
an argument. The fb_info and ep93xxfb_mach_info structures can be
-obtained as follows:
+obtained as follows::
static int some_board_fb_setup(struct platform_device *pdev)
{
@@ -101,17 +105,17 @@ obtained as follows:
Setting the video mode
======================
-The video mode is set using the following syntax:
+The video mode is set using the following syntax::
video=XRESxYRES[-BPP][@REFRESH]
If the EP93xx video driver is built-in then the video mode is set on
-the Linux kernel command line, for example:
+the Linux kernel command line, for example::
video=ep93xx-fb:800x600-16@60
If the EP93xx video driver is built as a module then the video mode is
-set when the module is installed:
+set when the module is installed::
modprobe ep93xx-fb video=320x240
@@ -121,13 +125,14 @@ Screenpage bug
At least on the EP9315 there is a silicon bug which causes bit 27 of
the VIDSCRNPAGE (framebuffer physical offset) to be tied low. There is
-an unofficial errata for this bug at:
+an unofficial errata for this bug at::
+
http://marc.info/?l=linux-arm-kernel&m=110061245502000&w=2
By default the EP93xx framebuffer driver checks if the allocated physical
address has bit 27 set. If it does, then the memory is freed and an
error is returned. The check can be disabled by adding the following
-option when loading the driver:
+option when loading the driver::
ep93xx-fb.check_screenpage_bug=0
diff --git a/Documentation/fb/fbcon.txt b/Documentation/fb/fbcon.rst
index 5a865437b33f..1da65b9000de 100644
--- a/Documentation/fb/fbcon.txt
+++ b/Documentation/fb/fbcon.rst
@@ -1,39 +1,41 @@
+=======================
The Framebuffer Console
=======================
- The framebuffer console (fbcon), as its name implies, is a text
+The framebuffer console (fbcon), as its name implies, is a text
console running on top of the framebuffer device. It has the functionality of
any standard text console driver, such as the VGA console, with the added
features that can be attributed to the graphical nature of the framebuffer.
- In the x86 architecture, the framebuffer console is optional, and
+In the x86 architecture, the framebuffer console is optional, and
some even treat it as a toy. For other architectures, it is the only available
display device, text or graphical.
- What are the features of fbcon? The framebuffer console supports
+What are the features of fbcon? The framebuffer console supports
high resolutions, varying font types, display rotation, primitive multihead,
etc. Theoretically, multi-colored fonts, blending, aliasing, and any feature
made available by the underlying graphics card are also possible.
A. Configuration
+================
- The framebuffer console can be enabled by using your favorite kernel
+The framebuffer console can be enabled by using your favorite kernel
configuration tool. It is under Device Drivers->Graphics Support->Frame
buffer Devices->Console display driver support->Framebuffer Console Support.
Select 'y' to compile support statically or 'm' for module support. The
module will be fbcon.
- In order for fbcon to activate, at least one framebuffer driver is
+In order for fbcon to activate, at least one framebuffer driver is
required, so choose from any of the numerous drivers available. For x86
systems, they almost universally have VGA cards, so vga16fb and vesafb will
always be available. However, using a chipset-specific driver will give you
more speed and features, such as the ability to change the video mode
dynamically.
- To display the penguin logo, choose any logo available in Graphics
+To display the penguin logo, choose any logo available in Graphics
support->Bootup logo.
- Also, you will need to select at least one compiled-in font, but if
+Also, you will need to select at least one compiled-in font, but if
you don't do anything, the kernel configuration tool will select one for you,
usually an 8x16 font.
@@ -44,6 +46,7 @@ fortunate to have a driver that does not alter the graphics chip, then you
will still get a VGA console.
B. Loading
+==========
Possible scenarios:
@@ -72,33 +75,33 @@ Possible scenarios:
C. Boot options
- The framebuffer console has several, largely unknown, boot options
- that can change its behavior.
+ The framebuffer console has several, largely unknown, boot options
+ that can change its behavior.
1. fbcon=font:<name>
- Select the initial font to use. The value 'name' can be any of the
- compiled-in fonts: 10x18, 6x10, 7x14, Acorn8x8, MINI4x6,
- PEARL8x8, ProFont6x11, SUN12x22, SUN8x16, TER16x32, VGA8x16, VGA8x8.
+ Select the initial font to use. The value 'name' can be any of the
+ compiled-in fonts: 10x18, 6x10, 7x14, Acorn8x8, MINI4x6,
+ PEARL8x8, ProFont6x11, SUN12x22, SUN8x16, TER16x32, VGA8x16, VGA8x8.
Note, not all drivers can handle font with widths not divisible by 8,
- such as vga16fb.
+ such as vga16fb.
2. fbcon=scrollback:<value>[k]
- The scrollback buffer is memory that is used to preserve display
- contents that has already scrolled past your view. This is accessed
- by using the Shift-PageUp key combination. The value 'value' is any
- integer. It defaults to 32KB. The 'k' suffix is optional, and will
- multiply the 'value' by 1024.
+ The scrollback buffer is memory that is used to preserve display
+ contents that has already scrolled past your view. This is accessed
+ by using the Shift-PageUp key combination. The value 'value' is any
+ integer. It defaults to 32KB. The 'k' suffix is optional, and will
+ multiply the 'value' by 1024.
3. fbcon=map:<0123>
- This is an interesting option. It tells which driver gets mapped to
- which console. The value '0123' is a sequence that gets repeated until
- the total length is 64 which is the number of consoles available. In
- the above example, it is expanded to 012301230123... and the mapping
- will be:
+ This is an interesting option. It tells which driver gets mapped to
+ which console. The value '0123' is a sequence that gets repeated until
+ the total length is 64 which is the number of consoles available. In
+ the above example, it is expanded to 012301230123... and the mapping
+ will be::
tty | 1 2 3 4 5 6 7 8 9 ...
fb | 0 1 2 3 0 1 2 3 0 ...
@@ -126,20 +129,20 @@ C. Boot options
4. fbcon=rotate:<n>
- This option changes the orientation angle of the console display. The
- value 'n' accepts the following:
+ This option changes the orientation angle of the console display. The
+ value 'n' accepts the following:
- 0 - normal orientation (0 degree)
- 1 - clockwise orientation (90 degrees)
- 2 - upside down orientation (180 degrees)
- 3 - counterclockwise orientation (270 degrees)
+ - 0 - normal orientation (0 degree)
+ - 1 - clockwise orientation (90 degrees)
+ - 2 - upside down orientation (180 degrees)
+ - 3 - counterclockwise orientation (270 degrees)
The angle can be changed anytime afterwards by 'echoing' the same
numbers to any one of the 2 attributes found in
/sys/class/graphics/fbcon:
- rotate - rotate the display of the active console
- rotate_all - rotate the display of all consoles
+ - rotate - rotate the display of the active console
+ - rotate_all - rotate the display of all consoles
Console rotation will only become available if Framebuffer Console
Rotation support is compiled in your kernel.
@@ -177,9 +180,9 @@ Before going on to how to attach, detach and unload the framebuffer console, an
illustration of the dependencies may help.
The console layer, as with most subsystems, needs a driver that interfaces with
-the hardware. Thus, in a VGA console:
+the hardware. Thus, in a VGA console::
-console ---> VGA driver ---> hardware.
+ console ---> VGA driver ---> hardware.
Assuming the VGA driver can be unloaded, one must first unbind the VGA driver
from the console layer before unloading the driver. The VGA driver cannot be
@@ -187,9 +190,9 @@ unloaded if it is still bound to the console layer. (See
Documentation/console/console.txt for more information).
This is more complicated in the case of the framebuffer console (fbcon),
-because fbcon is an intermediate layer between the console and the drivers:
+because fbcon is an intermediate layer between the console and the drivers::
-console ---> fbcon ---> fbdev drivers ---> hardware
+ console ---> fbcon ---> fbdev drivers ---> hardware
The fbdev drivers cannot be unloaded if bound to fbcon, and fbcon cannot
be unloaded if it's bound to the console layer.
@@ -204,12 +207,12 @@ So, how do we unbind fbcon from the console? Part of the answer is in
Documentation/console/console.txt. To summarize:
Echo a value to the bind file that represents the framebuffer console
-driver. So assuming vtcon1 represents fbcon, then:
+driver. So assuming vtcon1 represents fbcon, then::
-echo 1 > sys/class/vtconsole/vtcon1/bind - attach framebuffer console to
- console layer
-echo 0 > sys/class/vtconsole/vtcon1/bind - detach framebuffer console from
- console layer
+ echo 1 > sys/class/vtconsole/vtcon1/bind - attach framebuffer console to
+ console layer
+ echo 0 > sys/class/vtconsole/vtcon1/bind - detach framebuffer console from
+ console layer
If fbcon is detached from the console layer, your boot console driver (which is
usually VGA text mode) will take over. A few drivers (rivafb and i810fb) will
@@ -223,19 +226,19 @@ restored properly. The following is one of the several methods that you can do:
2. In your kernel configuration, ensure that CONFIG_FRAMEBUFFER_CONSOLE is set
to 'y' or 'm'. Enable one or more of your favorite framebuffer drivers.
-3. Boot into text mode and as root run:
+3. Boot into text mode and as root run::
vbetool vbestate save > <vga state file>
- The above command saves the register contents of your graphics
- hardware to <vga state file>. You need to do this step only once as
- the state file can be reused.
+ The above command saves the register contents of your graphics
+ hardware to <vga state file>. You need to do this step only once as
+ the state file can be reused.
-4. If fbcon is compiled as a module, load fbcon by doing:
+4. If fbcon is compiled as a module, load fbcon by doing::
modprobe fbcon
-5. Now to detach fbcon:
+5. Now to detach fbcon::
vbetool vbestate restore < <vga state file> && \
echo 0 > /sys/class/vtconsole/vtcon1/bind
@@ -243,7 +246,7 @@ restored properly. The following is one of the several methods that you can do:
6. That's it, you're back to VGA mode. And if you compiled fbcon as a module,
you can unload it by 'rmmod fbcon'.
-7. To reattach fbcon:
+7. To reattach fbcon::
echo 1 > /sys/class/vtconsole/vtcon1/bind
@@ -266,82 +269,82 @@ the following:
Variation 1:
- a. Before detaching fbcon, do
+ a. Before detaching fbcon, do::
- vbetool vbemode save > <vesa state file> # do once for each vesafb mode,
- # the file can be reused
+ vbetool vbemode save > <vesa state file> # do once for each vesafb mode,
+ # the file can be reused
b. Detach fbcon as in step 5.
- c. Attach fbcon
+ c. Attach fbcon::
- vbetool vbestate restore < <vesa state file> && \
+ vbetool vbestate restore < <vesa state file> && \
echo 1 > /sys/class/vtconsole/vtcon1/bind
Variation 2:
- a. Before detaching fbcon, do:
- echo <ID> > /sys/class/tty/console/bind
+ a. Before detaching fbcon, do::
+ echo <ID> > /sys/class/tty/console/bind
- vbetool vbemode get
+ vbetool vbemode get
b. Take note of the mode number
b. Detach fbcon as in step 5.
- c. Attach fbcon:
+ c. Attach fbcon::
- vbetool vbemode set <mode number> && \
- echo 1 > /sys/class/vtconsole/vtcon1/bind
+ vbetool vbemode set <mode number> && \
+ echo 1 > /sys/class/vtconsole/vtcon1/bind
Samples:
========
Here are 2 sample bash scripts that you can use to bind or unbind the
-framebuffer console driver if you are on an X86 box:
+framebuffer console driver if you are on an X86 box::
----------------------------------------------------------------------------
-#!/bin/bash
-# Unbind fbcon
+ #!/bin/bash
+ # Unbind fbcon
-# Change this to where your actual vgastate file is located
-# Or Use VGASTATE=$1 to indicate the state file at runtime
-VGASTATE=/tmp/vgastate
+ # Change this to where your actual vgastate file is located
+ # Or Use VGASTATE=$1 to indicate the state file at runtime
+ VGASTATE=/tmp/vgastate