.\" Copyright (c) 2002 by Michael Kerrisk <mtk.manpages@gmail.com>
.\"
+.\" %%%LICENSE_START(VERBATIM)
.\" Permission is granted to make and distribute verbatim copies of this
.\" manual provided the copyright notice and this permission notice are
.\" preserved on all copies.
.\"
.\" Formatted or processed versions of this manual, if unaccompanied by
.\" the source, must acknowledge the copyright and authors of this work.
+.\" %%%LICENSE_END
.\"
.\" 6 Aug 2002 - Initial Creation
.\" Modified 2003-05-23, Michael Kerrisk, <mtk.manpages@gmail.com>
.\" Add text noting that if we set the effective flag for one file
.\" capability, then we must also set the effective flag for all
.\" other capabilities where the permitted or inheritable bit is set.
+.\" 2011-09-07, mtk/Serge Hallyn: Add CAP_SYSLOG
.\"
-.TH CAPABILITIES 7 2010-06-19 "Linux" "Linux Programmer's Manual"
+.TH CAPABILITIES 7 2015-02-01 "Linux" "Linux Programmer's Manual"
.SH NAME
capabilities \- overview of Linux capabilities
.SH DESCRIPTION
For the purpose of performing permission checks,
-traditional Unix implementations distinguish two categories of processes:
+traditional UNIX implementations distinguish two categories of processes:
.I privileged
processes (whose effective user ID is 0, referred to as superuser or root),
and
which can be independently enabled and disabled.
Capabilities are a per-thread attribute.
.\"
-.SS Capabilities List
+.SS Capabilities list
The following list shows the capabilities implemented on Linux,
and the operations or behaviors that each capability permits:
.TP
Enable and disable kernel auditing; change auditing filter rules;
retrieve auditing status and filtering rules.
.TP
+.BR CAP_AUDIT_READ " (since Linux 3.16)"
+.\" commit a29b694aa1739f9d76538e34ae25524f9c549d59
+.\" commit 3a101b8de0d39403b2c7e5c23fd0b005668acf48
+Allow reading the audit log via a multicast netlink socket.
+.TP
.BR CAP_AUDIT_WRITE " (since Linux 2.6.11)"
Write records to kernel auditing log.
.TP
+.BR CAP_BLOCK_SUSPEND " (since Linux 3.5)"
+Employ features that can block system suspend
+.RB ( epoll (7)
+.BR EPOLLWAKEUP ,
+.IR /sys/power/wake_lock ).
+.TP
.B CAP_CHOWN
Make arbitrary changes to file UIDs and GIDs (see
.BR chown (2)).
(DAC is an abbreviation of "discretionary access control".)
.TP
.B CAP_DAC_READ_SEARCH
+.PD 0
+.RS
+.IP * 2
Bypass file read permission checks and
-directory read and execute permission checks.
+directory read and execute permission checks;
+.IP *
+Invoke
+.BR open_by_handle_at (2).
+.RE
+.PD
+
.TP
.B CAP_FOWNER
.PD 0
.RS
.IP * 2
Bypass permission checks on operations that normally
-require the file system UID of the process to match the UID of
+require the filesystem UID of the process to match the UID of
the file (e.g.,
.BR chmod (2),
.BR utime (2)),
Don't clear set-user-ID and set-group-ID permission
bits when a file is modified;
set the set-group-ID bit for a file whose GID does not match
-the file system or any of the supplementary GIDs of the calling process.
+the filesystem or any of the supplementary GIDs of the calling process.
.TP
.B CAP_IPC_LOCK
+.\" FIXME . As at Linux 3.2, there are some strange uses of this capability
+.\" in other places; they probably should be replaced with something else.
Lock memory
.RB ( mlock (2),
.BR mlockall (2),
.BR ioctl (2)
.B KDSIGACCEPT
operation.
-.\" FIXME CAP_KILL also has an effect for threads + setting child
+.\" FIXME . CAP_KILL also has an effect for threads + setting child
.\" termination signal to other than SIGCHLD: without this
.\" capability, the termination signal reverts to SIGCHLD
.\" if the child does an exec(). What is the rationale
and
.B FS_IMMUTABLE_FL
.\" These attributes are now available on ext2, ext3, Reiserfs, XFS, JFS
-i-node flags (see
+inode flags (see
.BR chattr (1)).
.TP
.BR CAP_MAC_ADMIN " (since Linux 2.6.25)"
.BR mknod (2).
.TP
.B CAP_NET_ADMIN
-Perform various network-related operations
-(e.g., setting privileged socket options,
-enabling multicasting, interface configuration,
-modifying routing tables).
+Perform various network-related operations:
+.PD 0
+.RS
+.IP * 2
+interface configuration;
+.IP *
+administration of IP firewall, masquerading, and accounting;
+.IP *
+modify routing tables;
+.IP *
+bind to any address for transparent proxying;
+.IP *
+set type-of-service (TOS);
+.IP *
+clear driver statistics;
+.IP *
+set promiscuous mode;
+.IP *
+enable multicasting;
+.IP *
+use
+.BR setsockopt (2)
+to set the following socket options:
+.BR SO_DEBUG ,
+.BR SO_MARK ,
+.BR SO_PRIORITY
+(for a priority outside the range 0 to 6),
+.BR SO_RCVBUFFORCE ,
+and
+.BR SO_SNDBUFFORCE .
+.RE
+.PD
.TP
.B CAP_NET_BIND_SERVICE
Bind a socket to Internet domain privileged ports
(Unused) Make socket broadcasts, and listen to multicasts.
.TP
.B CAP_NET_RAW
-Use RAW and PACKET sockets.
+.PD 0
+.RS
+.IP * 2
+Use RAW and PACKET sockets;
+.IP *
+bind to any address for transparent proxying.
+.RE
+.PD
.\" Also various IP options and setsockopt(SO_BINDTODEVICE)
.TP
.B CAP_SETGID
Make arbitrary manipulations of process GIDs and supplementary GID list;
-forge GID when passing socket credentials via Unix domain sockets.
+forge GID when passing socket credentials via UNIX domain sockets;
+write a group ID mapping in a user namespace (see
+.BR user_namespaces (7)).
.TP
.BR CAP_SETFCAP " (since Linux 2.6.24)"
Set file capabilities.
.BR setreuid (2),
.BR setresuid (2),
.BR setfsuid (2));
-make forged UID when passing socket credentials via Unix domain sockets.
+forge UID when passing socket credentials via UNIX domain sockets;
+write a user ID mapping in a user namespace (see
+.BR user_namespaces (7)).
.\" FIXME CAP_SETUID also an effect in exec(); document this.
.TP
.B CAP_SYS_ADMIN
and
.BR setdomainname (2);
.IP *
+perform privileged
+.BR syslog (2)
+operations (since Linux 2.6.37,
+.BR CAP_SYSLOG
+should be used to permit such operations);
+.IP *
+perform
+.B VM86_REQUEST_IRQ
+.BR vm86 (2)
+command;
+.IP *
perform
.B IPC_SET
and
.B IPC_RMID
operations on arbitrary System V IPC objects;
.IP *
+override
+.B RLIMIT_NPROC
+resource limit;
+.IP *
perform operations on
.I trusted
and
.B IOPRIO_CLASS_IDLE
I/O scheduling classes;
.IP *
-forge UID when passing socket credentials;
+forge PID when passing socket credentials via UNIX domain sockets;
.IP *
exceed
.IR /proc/sys/fs/file-max ,
.BR pipe (2));
.IP *
employ
-.B CLONE_NEWNS
-flag with
+.B CLONE_*
+flags that create new namespaces with
.BR clone (2)
and
-.BR unshare (2);
+.BR unshare (2)
+(but, since Linux 3.8,
+creating user namespaces does not require any capability);
+.IP *
+call
+.BR perf_event_open (2);
+.IP *
+access privileged
+.I perf
+event information;
+.IP *
+call
+.BR setns (2)
+(requires
+.B CAP_SYS_ADMIN
+in the
+.I target
+namespace);
+.IP *
+call
+.BR fanotify_init (2);
.IP *
perform
.B KEYCTL_CHOWN
perform
.BR madvise (2)
.B MADV_HWPOISON
-operation.
+operation;
+.IP *
+employ the
+.B TIOCSTI
+.BR ioctl (2)
+to insert characters into the input queue of a terminal other than
+the caller's controlling terminal;
+.IP *
+employ the obsolete
+.BR nfsservctl (2)
+system call;
+.IP *
+employ the obsolete
+.BR bdflush (2)
+system call;
+.IP *
+perform various privileged block-device
+.BR ioctl (2)
+operations;
+.IP *
+perform various privileged filesystem
+.BR ioctl (2)
+operations;
+.IP *
+perform administrative operations on many device drivers.
.RE
.PD
.TP
set real-time scheduling policies for calling process,
and set scheduling policies and priorities for arbitrary processes
.RB ( sched_setscheduler (2),
-.BR sched_setparam (2));
+.BR sched_setparam (2),
+.BR sched_setattr (2));
.IP *
set CPU affinity for arbitrary processes
.RB ( sched_setaffinity (2));
.\" migrate_pages(2):
.\" do_migrate_pages(mm, &old, &new,
.\" capable(CAP_SYS_NICE) ? MPOL_MF_MOVE_ALL : MPOL_MF_MOVE);
+.\" Document this.
.IP *
apply
.BR move_pages (2)
.BR acct (2).
.TP
.B CAP_SYS_PTRACE
+.PD 0
+.RS
+.IP * 3
Trace arbitrary processes using
.BR ptrace (2);
+.IP *
apply
.BR get_robust_list (2)
-to arbitrary processes.
+to arbitrary processes;
+.IP *
+transfer data to or from the memory of arbitrary processes using
+.BR process_vm_readv (2)
+and
+.BR process_vm_writev (2);
+.IP *
+inspect processes using
+.BR kcmp (2).
+.RE
+.PD
.TP
.B CAP_SYS_RAWIO
+.PD 0
+.RS
+.IP * 2
Perform I/O port operations
.RB ( iopl (2)
and
.BR ioperm (2));
+.IP *
access
-.IR /proc/kcore .
+.IR /proc/kcore ;
+.IP *
+employ the
+.B FIBMAP
+.BR ioctl (2)
+operation;
+.IP *
+open devices for accessing x86 model-specific registers (MSRs, see
+.BR msr (4));
+.IP *
+update
+.IR /proc/sys/vm/mmap_min_addr ;
+.IP *
+create memory mappings at addresses below the value specified by
+.IR /proc/sys/vm/mmap_min_addr ;
+.IP *
+map files in
+.IR /proc/bus/pci ;
+.IP *
+open
+.IR /dev/mem
+and
+.IR /dev/kmem ;
+.IP *
+perform various SCSI device commands;
+.IP *
+perform certain operations on
+.BR hpsa (4)
+and
+.BR cciss (4)
+devices;
+.IP *
+perform a range of device-specific operations on other devices.
+.RE
+.PD
.TP
.B CAP_SYS_RESOURCE
.PD 0
.RS
.IP * 2
-Use reserved space on ext2 file systems;
+Use reserved space on ext2 filesystems;
.IP *
make
.BR ioctl (2)
.B RLIMIT_NPROC
resource limit;
.IP *
+override maximum number of consoles on console allocation;
+.IP *
+override maximum number of keymaps;
+.IP *
+allow more than 64 Hz interrupts from the real-time clock;
+.IP *
raise
.I msg_qbytes
limit for a System V message queue above the limit in
(see
.BR msgop (2)
and
-.BR msgctl (2)).
+.BR msgctl (2));
+.IP *
+override the
+.I /proc/sys/fs/pipe-max-size
+limit when setting the capacity of a pipe using the
+.B F_SETPIPE_SZ
+.BR fcntl (2)
+command;
.IP *
use
.BR F_SETPIPE_SZ
to increase the capacity of a pipe above the limit specified by
-.IR /proc/sys/fs/pipe-max-size .
+.IR /proc/sys/fs/pipe-max-size ;
+.IP *
+override
+.I /proc/sys/fs/mqueue/queues_max
+limit when creating POSIX message queues (see
+.BR mq_overview (7));
+.IP *
+employ
+.BR prctl (2)
+.B PR_SET_MM
+operation;
+.IP *
+set
+.IR /proc/PID/oom_score_adj
+to a value lower than the value last set by a process with
+.BR CAP_SYS_RESOURCE .
.RE
.PD
.TP
.TP
.B CAP_SYS_TTY_CONFIG
Use
-.BR vhangup (2).
+.BR vhangup (2);
+employ various privileged
+.BR ioctl (2)
+operations on virtual terminals.
+.TP
+.BR CAP_SYSLOG " (since Linux 2.6.37)"
+.RS
+.PD 0
+.IP * 3
+Perform privileged
+.BR syslog (2)
+operations.
+See
+.BR syslog (2)
+for information on which operations require privilege.
+.IP *
+View kernel addresses exposed via
+.I /proc
+and other interfaces when
+.IR /proc/sys/kernel/kptr_restrict
+has the value 1.
+(See the discussion of
+.I kptr_restrict
+in
+.BR proc (5).)
+.PD
+.RE
+.TP
+.BR CAP_WAKE_ALARM " (since Linux 3.0)"
+Trigger something that will wake up the system (set
+.B CLOCK_REALTIME_ALARM
+and
+.B CLOCK_BOOTTIME_ALARM
+timers).
.\"
-.SS Past and Current Implementation
+.SS Past and current implementation
A full implementation of capabilities requires that:
.IP 1. 3
For all privileged operations,
The kernel must provide system calls allowing a thread's capability sets to
be changed and retrieved.
.IP 3.
-The file system must support attaching capabilities to an executable file,
+The filesystem must support attaching capabilities to an executable file,
so that a process gains those capabilities when the file is executed.
.PP
Before kernel 2.6.24, only the first two of these requirements are met;
since kernel 2.6.24, all three requirements are met.
.\"
-.SS Thread Capability Sets
+.SS Thread capability sets
Each thread has three capability sets containing zero or more
of the above capabilities:
.TP
Using
.BR capset (2),
a thread may manipulate its own capability sets (see below).
+.PP
+Since Linux 3.2, the file
+.I /proc/sys/kernel/cap_last_cap
+.\" commit 73efc0394e148d0e15583e13712637831f926720
+exposes the numerical value of the highest capability
+supported by the running kernel;
+this can be used to determine the highest bit
+that may be set in a capability set.
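.PP
As a minimal sketch of how this value might be used,
the following C fragment reads the file and builds a 64-bit mask
covering every capability bit that the running kernel supports.
The function names are hypothetical and not part of any library;
the sketch assumes that capability sets fit in 64 bits.

```c
#include <stdint.h>
#include <stdio.h>

/* Build a mask with bits 0..last_cap set, where last_cap is the value
   read from /proc/sys/kernel/cap_last_cap (hypothetical helper). */
uint64_t valid_cap_mask(int last_cap)
{
    if (last_cap < 0)
        return 0;
    if (last_cap >= 63)
        return UINT64_MAX;
    return (UINT64_C(1) << (last_cap + 1)) - 1;
}

/* Read the highest supported capability number.
   Returns -1 on failure, for example if the file does not exist
   (kernels before Linux 3.2). */
int read_cap_last_cap(void)
{
    FILE *fp = fopen("/proc/sys/kernel/cap_last_cap", "r");
    int last = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &last) != 1)
        last = -1;
    fclose(fp);
    return last;
}
```

A caller could combine the two as
valid_cap_mask(read_cap_last_cap())
and treat any bit outside the resulting mask as nonexistent.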
.\"
-.SS File Capabilities
+.SS File capabilities
Since kernel 2.6.24, the kernel supports
associating capability sets with an executable file using
.BR setcap (8).
for all other capabilities for which the corresponding permitted or
inheritable flags is enabled.
.\"
-.SS Transformation of Capabilities During execve()
+.SS Transformation of capabilities during execve()
.PP
During an
.BR execve (2),
.\" exec(), then it gets all capabilities in its
.\" permitted set, and no effective capabilities
This provides semantics that are the same as those provided by
-traditional Unix systems.
+traditional UNIX systems.
.SS Capability bounding set
The capability bounding set is a security mechanism that can be used
to limit the capabilities that can be gained during an
to Linux starting with kernel version 2.2.11.
.\"
.PP
-.B "Capability bounding set from Linux 2.6.25 onwards"
+.B "Capability bounding set from Linux 2.6.25 onward"
.PP
From Linux 2.6.25, the
.I "capability bounding set"
.B PR_CAPBSET_READ
operation.
-Removing capabilities from the bounding set is only supported if file
-capabilities are compiled into the kernel
-(CONFIG_SECURITY_FILE_CAPABILITIES).
-In that case, the
+Removing capabilities from the bounding set is supported only if file
+capabilities are compiled into the kernel.
+In kernels before Linux 2.6.33,
+file capabilities were an optional feature configurable via the
+.B CONFIG_SECURITY_FILE_CAPABILITIES
+option.
+Since Linux 2.6.33, the configuration option has been removed
+and file capabilities are always part of the kernel.
+When file capabilities are compiled into the kernel, the
.B init
process (the ancestor of all processes) begins with a full bounding set.
If file capabilities are not compiled into the kernel, then
back into the thread's inherited set in the future.
.\"
.\"
-.SS Effect of User ID Changes on Capabilities
+.SS Effect of user ID changes on capabilities
To preserve the traditional semantics for transitions between
0 and nonzero user IDs,
the kernel makes the following changes to a thread's capability
sets on changes to the thread's real, effective, saved set,
-and file system user IDs (using
+and filesystem user IDs (using
.BR setuid (2),
.BR setresuid (2),
or similar):
If the effective user ID is changed from nonzero to 0,
then the permitted set is copied to the effective set.
.IP 4.
-If the file system user ID is changed from 0 to nonzero (see
-.BR setfsuid (2))
+If the filesystem user ID is changed from 0 to nonzero (see
+.BR setfsuid (2)),
then the following capabilities are cleared from the effective set:
.BR CAP_CHOWN ,
.BR CAP_DAC_OVERRIDE ,
.BR CAP_FOWNER ,
.BR CAP_FSETID ,
.B CAP_LINUX_IMMUTABLE
-(since Linux 2.2.30),
+(since Linux 2.6.30),
.BR CAP_MAC_OVERRIDE ,
and
.B CAP_MKNOD
-(since Linux 2.2.30).
-If the file system UID is changed from nonzero to 0,
+(since Linux 2.6.30).
+If the filesystem UID is changed from nonzero to 0,
then any of these capabilities that are enabled in the permitted set
are enabled in the effective set.
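.PP
The filesystem UID rule (item 4) can be modeled as a pure function
over bit masks.
This is an illustrative sketch, not kernel code:
the EX_ names are local to the example,
the capability numbers are those in include/linux/capability.h,
the mask contains only the capabilities named in the list above,
and the sketch assumes that
SECBIT_NO_SETUID_FIXUP
is not set.

```c
#include <stdint.h>

/* Capability numbers as defined in include/linux/capability.h.
   The EX_ prefix marks names that exist only in this sketch. */
enum {
    EX_CAP_CHOWN = 0,
    EX_CAP_DAC_OVERRIDE = 1,
    EX_CAP_FOWNER = 3,
    EX_CAP_FSETID = 4,
    EX_CAP_LINUX_IMMUTABLE = 9,
    EX_CAP_MKNOD = 27,
    EX_CAP_MAC_OVERRIDE = 32
};

#define EX_BIT(cap) (UINT64_C(1) << (cap))

/* The filesystem-related capabilities named in item 4. */
#define EX_FS_MASK (EX_BIT(EX_CAP_CHOWN) | EX_BIT(EX_CAP_DAC_OVERRIDE) | \
                    EX_BIT(EX_CAP_FOWNER) | EX_BIT(EX_CAP_FSETID) | \
                    EX_BIT(EX_CAP_LINUX_IMMUTABLE) | EX_BIT(EX_CAP_MKNOD) | \
                    EX_BIT(EX_CAP_MAC_OVERRIDE))

/* Return the effective set that results when the filesystem UID
   changes from old_fsuid to new_fsuid. */
uint64_t effective_after_fsuid_change(uint64_t effective, uint64_t permitted,
                                      unsigned old_fsuid, unsigned new_fsuid)
{
    if (old_fsuid == 0 && new_fsuid != 0)
        return effective & ~EX_FS_MASK;              /* 0 -> nonzero: clear */
    if (old_fsuid != 0 && new_fsuid == 0)
        return effective | (permitted & EX_FS_MASK); /* nonzero -> 0: raise */
    return effective;                                /* no 0/nonzero change */
}
```

Capabilities outside the mask are never touched by this transition,
which is why the function only masks and never replaces the whole set.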
.PP
the new inheritable set must be a subset of the combination
of the existing inheritable and permitted sets.
.IP 2.
-(Since kernel 2.6.25)
+(Since Linux 2.6.25)
The new inheritable set must be a subset of the combination of the
existing inheritable set and the capability bounding set.
.IP 3.
that the thread does not currently have).
.IP 4.
The new effective set must be a subset of the new permitted set.
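.PP
The four rules above amount to a series of subset tests.
The following sketch expresses them over 64-bit masks;
the structure and function names are hypothetical,
and the caller is assumed to pass in whether the thread has
CAP_SETPCAP,
which is taken here to be the condition that lifts rule 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three per-thread capability sets, one bit per capability
   (hypothetical type for this sketch). */
struct ex_capsets {
    uint64_t effective;
    uint64_t permitted;
    uint64_t inheritable;
};

/* True if every bit set in a is also set in b. */
static bool is_subset(uint64_t a, uint64_t b)
{
    return (a & ~b) == 0;
}

/* Check a proposed change from *old to *next against rules 1 to 4. */
bool capset_change_allowed(const struct ex_capsets *old,
                           const struct ex_capsets *next,
                           uint64_t bounding, bool has_setpcap)
{
    /* Rule 1: without CAP_SETPCAP, inheritable is capped by I | P. */
    if (!has_setpcap &&
        !is_subset(next->inheritable, old->inheritable | old->permitted))
        return false;
    /* Rule 2: inheritable is capped by I | bounding set. */
    if (!is_subset(next->inheritable, old->inheritable | bounding))
        return false;
    /* Rule 3: permitted can only shrink. */
    if (!is_subset(next->permitted, old->permitted))
        return false;
    /* Rule 4: effective must lie within the new permitted set. */
    return is_subset(next->effective, next->permitted);
}
```

A change that fails any one test is rejected as a whole;
the sketch returns false where the real system call would fail.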
-.SS The """securebits"" flags: establishing a capabilities-only environment
+.SS The securebits flags: establishing a capabilities-only environment
.\" For some background:
.\" see http://lwn.net/Articles/280279/ and
.\" http://article.gmane.org/gmane.linux.kernel.lsm/5476/
operation.)
.TP
.B SECBIT_NO_SETUID_FIXUP
-Setting this flag stops the kernel from adjusting capability sets when
-the threads's effective and file system UIDs are switched between
+Setting this flag stops the kernel from adjusting capability sets when
+the thread's effective and filesystem UIDs are switched between
zero and nonzero values.
(See the subsection
.IR "Effect of User ID Changes on Capabilities" .)
During an
.BR execve (2),
all of the flags are preserved, except
-.B SECURE_KEEP_CAPS
+.B SECBIT_KEEP_CAPS
which is always cleared.
An application can use the following call to lock itself,
SECBIT_NOROOT_LOCKED);
.fi
.in
-.SH "CONFORMING TO"
+.SS Interaction with user namespaces
+For a discussion of the interaction of capabilities and user namespaces, see
+.BR user_namespaces (7).
+.SH CONFORMING TO
.PP
No standards govern capabilities, but the Linux capability implementation
is based on the withdrawn POSIX.1e draft standard; see
-.IR http://wt.xpilot.org/publications/posix.1e/ .
+.UR http://wt.tuxomania.net\:/publications\:/posix.1e/
+.UE .
.SH NOTES
Since kernel 2.5.27, capabilities are an optional kernel component,
-and can be enabled/disabled via the CONFIG_SECURITY_CAPABILITIES
+and can be enabled/disabled via the
+.B CONFIG_SECURITY_CAPABILITIES
kernel configuration option.
The
The
.I /proc/PID/status
file shows the capability sets of a process's main thread.
+Before Linux 3.8, nonexistent capabilities were shown as being
+enabled (1) in these sets.
+Since Linux 3.8,
+.\" 7b9a7ec565505699f503b4fcf61500dceb36e744
+all nonexistent capabilities (above
+.BR CAP_LAST_CAP )
+are shown as disabled (0).
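.PP
As a rough illustration, a line from this file can be decoded as follows.
The sketch assumes the usual layout of a field name, a colon,
and a hexadecimal mask (for example, a line beginning with "CapEff:");
the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse one status line such as "CapEff:\t0000000000000021".
   On success, store the mask in *mask and return true. */
bool parse_cap_line(const char *line, const char *field, uint64_t *mask)
{
    size_t n = strlen(field);
    const char *digits;
    char *end;

    if (strncmp(line, field, n) != 0 || line[n] != ':')
        return false;
    digits = line + n + 1;
    *mask = strtoull(digits, &end, 16);   /* skips the leading tab */
    return end != digits;                 /* false if no digits were read */
}

/* Test whether capability number cap is enabled in mask. */
bool cap_is_set(uint64_t mask, int cap)
{
    return cap >= 0 && cap < 64 && ((mask >> cap) & 1) != 0;
}
```

A program would typically read the file line by line with fgets(3)
and pass each line to parse_cap_line() until the wanted field is found.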
The
.I libcap
programs.
It can be found at
.br
-.IR http://www.kernel.org/pub/linux/libs/security/linux-privs .
+.UR http://www.kernel.org\:/pub\:/linux\:/libs\:/security\:/linux-privs
+.UE .
Before kernel 2.6.24, and since kernel 2.6.24 if
file capabilities are not enabled, a thread with the
starts out with this capability removed from its per-process bounding
set, and that bounding set is inherited by all other processes
created on the system.
-.SH "SEE ALSO"
-.BR capget (2),
+.SH SEE ALSO
+.BR capsh (1),
+.BR setpriv (1),
.BR prctl (2),
.BR setfsuid (2),
.BR cap_clear (3),
.BR cap_init (3),
.BR capgetp (3),
.BR capsetp (3),
+.BR libcap (3),
.BR credentials (7),
+.BR user_namespaces (7),
.BR pthreads (7),
.BR getcap (8),
.BR setcap (8)
.PP
.I include/linux/capability.h
-in the kernel source
+in the Linux kernel source tree
+.SH COLOPHON
+This page is part of release 3.79 of the Linux
+.I man-pages
+project.
+A description of the project,
+information about reporting bugs,
+and the latest version of this page,
+can be found at
+\%http://www.kernel.org/doc/man\-pages/.