author Andrew Morgan <morgan@kernel.org> 2007-10-18 03:05:59 -0700
committer Linus Torvalds <torvalds@woody.linux-foundation.org> 2007-10-18 14:37:24 -0700
commit 72c2d5823fc7be799a12184974c3bdc57acea3c4
tree 5c17418efb57cd5b2cdc0d751f577b2c64012423 /kernel
parent 7058cb02ddab4bce70a46e519804fccb7ac0a060
V3 file capabilities: alter behavior of cap_setpcap
The non-filesystem capability meaning of CAP_SETPCAP is that a process, p1,
can change the capabilities of another process, p2. This is not the
meaning that was intended for this capability at all, and this
implementation came about purely because, without filesystem capabilities,
there was no way to use capabilities without one process bestowing them on
another.

Since we now have filesystem support for capabilities we can fix the
implementation of CAP_SETPCAP.

The most significant thing about this change is that, with it in effect, no
process can set the capabilities of another process.

The capabilities of a program are set via the capability convolution
rules:

   pI(post-exec) = pI(pre-exec)
   pP(post-exec) = (X(aka cap_bset) & fP) | (pI(post-exec) & fI)
   pE(post-exec) = fE ? pP(post-exec) : 0

at exec() time. As such, the only influence the pre-exec() program can
have on the post-exec() program's capabilities is through the pI
capability set.

The correct implementation for CAP_SETPCAP (and that enabled by this patch)
is that it can be used to add extra pI capabilities to the current process
- to be picked up by subsequent exec()s when the above convolution rules
are applied.

Here is how it works:

Let's say we have a process, p. It has capability sets pE, pP and pI.
Generally, p can change the value of its own pI to pI' where

   (pI' & ~pI) & ~pP = 0.

That is, the only new things in pI' that were not present in pI need to
be present in pP.

The role of CAP_SETPCAP is basically to permit changes to pI beyond
the above:

   if (pE & CAP_SETPCAP) {
      pI' = anything; /* i.e., even (pI' & ~pI) & ~pP != 0 */
   }

This capability is useful for things like login, which (say, via
pam_cap) might want to raise certain inheritable capabilities for use
by the children of the logged-in user's shell, but those capabilities
are not useful to or needed by the login program itself.

One such use might be to limit who can run ping. You set the
capabilities of the 'ping' program to be "= cap_net_raw+i", and then
only shells that have (pI & CAP_NET_RAW) will be able to run
it. Without CAP_SETPCAP implemented as described above, login(pam_cap)
would have to also have (pP & CAP_NET_RAW) in order to raise this
capability and pass it on through the inheritable set.

Signed-off-by: Andrew Morgan <morgan@kernel.org>
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
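
The rules in the commit message can be modelled in user space as a rough sketch. This is not the kernel implementation: the types `capset`, `proc_caps` and `file_caps` and the three function names are hypothetical, capability sets are reduced to a single 32-bit mask, and only the two bit numbers (`CAP_SETPCAP` = 8, `CAP_NET_RAW` = 13) are taken from `<linux/capability.h>`. The last function walks through the ping scenario under the simplifying assumption that login execs ping directly, which is harmless here because pI is preserved across exec().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simplification: one 32-bit mask per capability set. */
typedef uint32_t capset;

/* Bit numbers as defined in <linux/capability.h>. */
#define CAP_SETPCAP 8
#define CAP_NET_RAW 13
#define CAP_MASK(c) ((capset)1 << (c))

struct proc_caps { capset pE, pP, pI; };          /* process sets */
struct file_caps { capset fP, fI; bool fE; };     /* file sets; fE is a flag */

/* Exec-time convolution rules; X plays the role of cap_bset. */
static struct proc_caps caps_after_exec(struct proc_caps pre,
                                        struct file_caps f, capset X)
{
    struct proc_caps post;

    post.pI = pre.pI;
    post.pP = (X & f.fP) | (post.pI & f.fI);
    post.pE = f.fE ? post.pP : 0;
    return post;
}

/* May a process replace its own pI with new_pI? */
static bool may_set_inheritable(struct proc_caps p, capset new_pI)
{
    if (p.pE & CAP_MASK(CAP_SETPCAP))
        return true;                              /* pI' = anything */
    /* Otherwise every newly added bit must already be in pP. */
    return ((new_pI & ~p.pI) & ~p.pP) == 0;
}

/*
 * Ping scenario: a login-like process whose pE and pP both equal
 * login_caps tries to raise CAP_NET_RAW in pI, then execs a ping
 * binary marked "= cap_net_raw+i" (fI only, fP empty, fE clear).
 * Returns the permitted set ping ends up with.
 */
static capset ping_permitted_after_login(capset login_caps)
{
    struct proc_caps login = { .pE = login_caps, .pP = login_caps, .pI = 0 };
    struct file_caps ping = { .fP = 0, .fI = CAP_MASK(CAP_NET_RAW), .fE = false };
    capset want = CAP_MASK(CAP_NET_RAW);

    if (may_set_inheritable(login, want))
        login.pI = want;

    return caps_after_exec(login, ping, ~(capset)0).pP;
}
```

In this model, calling `ping_permitted_after_login(CAP_MASK(CAP_SETPCAP))` yields a permitted set containing only CAP_NET_RAW, even though the login process never held CAP_NET_RAW in pP, while a login process with no capabilities at all leaves ping with an empty permitted set.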
Diffstat (limited to 'kernel')
3 files changed, 12 insertions, 6 deletions
diff --git a/kernel/capability.c b/kernel/capability.c
index 4e350a36ed6..14853be5944 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -3,7 +3,7 @@
* Copyright (C) 1997 Andrew Main <zefram@fysh.org>
- * Integrated into 2.1.97+, Andrew G. Morgan <morgan@transmeta.com>
+ * Integrated into 2.1.97+, Andrew G. Morgan <morgan@kernel.org>
* 30 May 2002: Cleanup, Robert M. Love <rml@tech9.net>
@@ -14,9 +14,6 @@
#include <linux/syscalls.h>
#include <asm/uaccess.h>
-unsigned securebits = SECUREBITS_DEFAULT; /* systemwide security settings */
-kernel_cap_t cap_bset = CAP_INIT_EFF_SET;
* This lock protects task->cap_* for all tasks including current.
* Locking rule: acquire this prior to tasklist_lock.
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index c25e67e19af..067554bda8b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -24,7 +24,7 @@
#include <linux/slab.h>
#include <linux/sysctl.h>
#include <linux/proc_fs.h>
-#include <linux/capability.h>
+#include <linux/security.h>
#include <linux/ctype.h>
#include <linux/utsname.h>
#include <linux/smp_lock.h>
@@ -371,6 +371,7 @@ static struct ctl_table kern_table[] = {
.proc_handler = &proc_dointvec_taint,
.procname = "cap-bound",
.data = &cap_bset,
@@ -378,6 +379,7 @@ static struct ctl_table kern_table[] = {
.mode = 0600,
.proc_handler = &proc_dointvec_bset,
@@ -1872,10 +1874,11 @@ static int do_proc_dointvec_bset_conv(int *negp, unsigned long *lvalp,
return 0;
* init may raise the set.
int proc_dointvec_bset(struct ctl_table *table, int write, struct file *filp,
void __user *buffer, size_t *lenp, loff_t *ppos)
@@ -1889,6 +1892,7 @@ int proc_dointvec_bset(struct ctl_table *table, int write, struct file *filp,
return do_proc_dointvec(table,write,filp,buffer,lenp,ppos,
* Taint values can only be increased
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index f47c33d1703..3c9ef5a7d57 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -38,7 +38,10 @@ static struct trans_ctl_table trans_kern_table[] = {
{ KERN_NODENAME, "hostname" },
{ KERN_DOMAINNAME, "domainname" },
{ KERN_CAP_BSET, "cap-bound" },
{ KERN_PANIC, "panic" },
{ KERN_REALROOTDEV, "real-root-dev" },
@@ -1532,7 +1535,9 @@ int sysctl_check_table(struct ctl_table *table)
(table->strategy == sysctl_ms_jiffies) ||
(table->proc_handler == proc_dostring) ||
(table->proc_handler == proc_dointvec) ||
(table->proc_handler == proc_dointvec_bset) ||
(table->proc_handler == proc_dointvec_minmax) ||
(table->proc_handler == proc_dointvec_jiffies) ||
(table->proc_handler == proc_dointvec_userhz_jiffies) ||