sys/Documentation/credentials.txt


    1                              ====================
    2                              CREDENTIALS IN LINUX
    3                              ====================
    4 
    5 By: David Howells <dhowells@redhat.com>
    6 
    7 Contents:
    8 
    9  (*) Overview.
   10 
   11  (*) Types of credentials.
   12 
   13  (*) File markings.
   14 
   15  (*) Task credentials.
   16 
   17      - Immutable credentials.
   18      - Accessing task credentials.
   19      - Accessing another task's credentials.
   20      - Altering credentials.
   21      - Managing credentials.
   22 
   23  (*) Open file credentials.
   24 
   25  (*) Overriding the VFS's use of credentials.
   26 
   27 
   28 ========
   29 OVERVIEW
   30 ========
   31 
   32 There are several parts to the security check performed by Linux when one
   33 object acts upon another:
   34 
   35  (1) Objects.
   36 
   37      Objects are things in the system that may be acted upon directly by
   38      userspace programs.  Linux has a variety of actionable objects, including:
   39 
   40         - Tasks
   41         - Files/inodes
   42         - Sockets
   43         - Message queues
   44         - Shared memory segments
   45         - Semaphores
   46         - Keys
   47 
   48      As a part of the description of all these objects there is a set of
   49      credentials.  What's in the set depends on the type of object.
   50 
   51  (2) Object ownership.
   52 
   53      Amongst the credentials of most objects, there will be a subset that
   54      indicates the ownership of that object.  This is used for resource
   55      accounting and limitation (disk quotas and task rlimits for example).
   56 
   57      In a standard UNIX filesystem, for instance, this will be defined by the
   58      UID marked on the inode.
   59 
   60  (3) The objective context.
   61 
   62      Also amongst the credentials of those objects, there will be a subset that
   63      indicates the 'objective context' of that object.  This may or may not be
   64      the same set as in (2) - in standard UNIX files, for instance, this is
   65      defined by the UID and the GID marked on the inode.
   66 
   67      The objective context is used as part of the security calculation that is
   68      carried out when an object is acted upon.
   69 
   70  (4) Subjects.
   71 
   72      A subject is an object that is acting upon another object.
   73 
   74      Most of the objects in the system are inactive: they don't act on other
   75      objects within the system.  Processes/tasks are the obvious exception:
   76      they do stuff; they access and manipulate things.
   77 
   78      Objects other than tasks may under some circumstances also be subjects.
   79      For instance an open file may send SIGIO to a task using the UID and EUID
   80      given to it by a task that called fcntl(F_SETOWN) upon it.  In this case,
   81      the file struct will have a subjective context too.
   82 
   83  (5) The subjective context.
   84 
   85      A subject has an additional interpretation of its credentials.  A subset
   86      of its credentials forms the 'subjective context'.  The subjective context
   87      is used as part of the security calculation that is carried out when a
   88      subject acts.
   89 
   90      A Linux task, for example, has the FSUID, FSGID and the supplementary
   91      group list for when it is acting upon a file - which are quite separate
   92      from the real UID and GID that normally form the objective context of the
   93      task.
   94 
   95  (6) Actions.
   96 
   97      Linux has a number of actions available that a subject may perform upon an
   98      object.  The set of actions available depends on the nature of the subject
   99      and the object.
  100 
  101      Actions include reading, writing, creating and deleting files, and
  102      forking, signalling and tracing tasks.
  103 
  104  (7) Rules, access control lists and security calculations.
  105 
  106      When a subject acts upon an object, a security calculation is made.  This
  107      involves taking the subjective context, the objective context and the
  108      action, and searching one or more sets of rules to see whether the subject
  109      is granted or denied permission to act in the desired manner on the
  110      object, given those contexts.
  111 
  112      There are two main sources of rules:
  113 
  114      (a) Discretionary access control (DAC):
  115 
  116          Sometimes the object will include sets of rules as part of its
  117          description.  This is an 'Access Control List' or 'ACL'.  A Linux
  118          file may supply more than one ACL.
  119 
  120          A traditional UNIX file, for example, includes a permissions mask that
  121          is an abbreviated ACL with three fixed classes of subject ('user',
  122          'group' and 'other'), each of which may be granted certain privileges
  123          ('read', 'write' and 'execute' - whatever those map to for the object
  124          in question).  UNIX file permissions do not allow the arbitrary
  125          specification of subjects, however, and so are of limited use.
  126 
  127          A Linux file might also sport a POSIX ACL.  This is a list of rules
  128          that grants various permissions to arbitrary subjects.
  129 
  130      (b) Mandatory access control (MAC):
  131 
  132          The system as a whole may have one or more sets of rules that get
  133          applied to all subjects and objects, regardless of their source.
  134          SELinux and Smack are examples of this.
  135 
  136          In the case of SELinux and Smack, each object is given a label as part
  137          of its credentials.  When an action is requested, they take the
  138          subject label, the object label and the action and look for a rule
  139          that says that this action is either granted or denied.
  140 
  141 
  142 ====================
  143 TYPES OF CREDENTIALS
  144 ====================
  145 
  146 The Linux kernel supports the following types of credentials:
  147 
  148  (1) Traditional UNIX credentials.
  149 
  150         Real User ID
  151         Real Group ID
  152 
  153      The UID and GID are carried by most, if not all, Linux objects, even if in
  154      some cases they have to be invented (FAT or CIFS files for example, which are
  155      derived from Windows).  These (mostly) define the objective context of
  156      that object, with tasks being slightly different in some cases.
  157 
  158         Effective, Saved and FS User ID
  159         Effective, Saved and FS Group ID
  160         Supplementary groups
  161 
  162      These are additional credentials used by tasks only.  Usually, an
  163      EUID/EGID/GROUPS will be used as the subjective context, and real UID/GID
  164      will be used as the objective.  For tasks, it should be noted that this is
  165      not always true.
  166 
  167  (2) Capabilities.
  168 
  169         Set of permitted capabilities
  170         Set of inheritable capabilities
  171         Set of effective capabilities
  172         Capability bounding set
  173 
  174      These are only carried by tasks.  They indicate superior capabilities
  175      granted piecemeal to a task that an ordinary task wouldn't otherwise have.
  176      These are manipulated implicitly by changes to the traditional UNIX
  177      credentials, but can also be manipulated directly by the capset() system
  178      call.
  179 
  180      The permitted capabilities are those caps that the process may grant
  181      itself in its effective set through capset().  Additions to the
  182      inheritable set may also be so constrained.
  183 
  184      The effective capabilities are the ones that a task is actually allowed to
  185      make use of itself.
  186 
  187      The inheritable capabilities are the ones that may get passed across
  188      execve().
  189 
  190      The bounding set limits the capabilities that may be inherited across
  191      execve(), especially when a binary is executed that will execute as UID 0.
  192 
  193  (3) Secure management flags (securebits).
  194 
  195      These are only carried by tasks.  These govern the way the above
  196      credentials are manipulated and inherited over certain operations such as
  197      execve().  They aren't used directly as objective or subjective
  198      credentials.
  199 
  200  (4) Keys and keyrings.
  201 
  202      These are only carried by tasks.  They carry and cache security tokens
  203      that don't fit into the other standard UNIX credentials.  They are for
  204      making such things as network filesystem keys available to the file
  205      accesses performed by processes, without the necessity of ordinary
  206      programs having to know about the security details involved.
  207 
  208      Keyrings are a special type of key.  They carry sets of other keys and can
  209      be searched for the desired key.  Each process may subscribe to a number
  210      of keyrings:
  211 
  212         Per-thread keyring
  213         Per-process keyring
  214         Per-session keyring
  215 
  216      When a process accesses a key, if not already present, it will normally be
  217      cached on one of these keyrings for future accesses to find.
  218 
  219      For more information on using keys, see Documentation/keys.txt.
  220 
  221  (5) LSM
  222 
  223      The Linux Security Module allows extra controls to be placed over the
  224      operations that a task may do.  Currently Linux supports two main
  225      alternate LSM options: SELinux and Smack.
  226 
  227      Both work by labelling the objects in a system and then applying sets of
  228      rules (policies) that say what operations a task with one label may do to
  229      an object with another label.
  230 
  231  (6) AF_KEY
  232 
  233      This is a socket-based approach to credential management for networking
  234      stacks [RFC 2367].  It isn't discussed by this document as it doesn't
  235      interact directly with task and file credentials; rather it keeps system
  236      level credentials.
  237 
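      The relationship between the capability sets in (2) can be sketched as
      a set of subset checks.  This is a simplified userspace model of the
      constraints capset() is understood to apply to a task that lacks
      CAP_SETPCAP, as described in the capabilities(7) manual page.  The
      capset_t type, both structs and caps_change_allowed() are hypothetical
      names introduced for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t capset_t;	/* one bit per capability (hypothetical) */

/* The capability sets a task currently carries. */
struct task_caps {
	capset_t permitted;
	capset_t inheritable;
	capset_t effective;
	capset_t bounding;
};

/* The three sets a capset()-style request asks for. */
struct cap_request {
	capset_t permitted;
	capset_t inheritable;
	capset_t effective;
};

static bool is_subset(capset_t a, capset_t b)
{
	return (a & ~b) == 0;
}

/* Model of the checks for a task without CAP_SETPCAP (an assumption). */
bool caps_change_allowed(const struct task_caps *cur,
			 const struct cap_request *req)
{
	assert(cur && req);

	/* The permitted set may shrink but never grow. */
	if (!is_subset(req->permitted, cur->permitted))
		return false;
	/* Only permitted capabilities may be made effective. */
	if (!is_subset(req->effective, req->permitted))
		return false;
	/* Additions to inheritable must come from the permitted set... */
	if (!is_subset(req->inheritable, cur->inheritable | cur->permitted))
		return false;
	/* ...and must lie within the bounding set. */
	if (!is_subset(req->inheritable, cur->inheritable | cur->bounding))
		return false;
	return true;
}
```

      The bounding set does not appear in the request because, in this
      model, it is not something a capset()-style call can change.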
  238 
  239 When a file is opened, part of the opening task's subjective context is
  240 recorded in the file struct created.  This allows operations using that file
  241 struct to use those credentials instead of the subjective context of the task
  242 that issued the operation.  An example of this would be a file opened on a
  243 network filesystem where the credentials of the opened file should be presented
  244 to the server, regardless of who is actually doing a read or a write upon it.
  245 
  246 
  247 =============
  248 FILE MARKINGS
  249 =============
  250 
  251 Files on disk or obtained over the network may have annotations that form the
  252 objective security context of that file.  Depending on the type of filesystem,
  253 this may include one or more of the following:
  254 
  255  (*) UNIX UID, GID, mode;
  256 
  257  (*) Windows user ID;
  258 
  259  (*) Access control list;
  260 
  261  (*) LSM security label;
  262 
  263  (*) UNIX exec privilege escalation bits (SUID/SGID);
  264 
  265  (*) File capabilities exec privilege escalation bits.
  266 
  267 These are compared to the task's subjective security context, and certain
  268 operations allowed or disallowed as a result.  In the case of execve(), the
  269 privilege escalation bits come into play, and may allow the resulting process
  270 extra privileges, based on the annotations on the executable file.
  271 
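      The effect of the SUID/SGID privilege escalation bits on execve() can be
      sketched as follows.  This is a simplified userspace model, not the
      kernel's implementation: struct exec_ids and ids_after_exec() are
      hypothetical names, and the model assumes nothing else (mount options,
      tracing, file capabilities, LSM policy) interferes.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Effective IDs of a task (hypothetical; the real IDs are left out). */
struct exec_ids {
	uid_t euid;
	gid_t egid;
};

/*
 * Compute the effective IDs a task would have after executing a file with
 * the given owner, group and mode, under the simplifying assumptions above.
 */
struct exec_ids ids_after_exec(struct exec_ids before, uid_t file_uid,
			       gid_t file_gid, mode_t file_mode)
{
	struct exec_ids after = before;

	/* SUID: the effective UID becomes the file's owner. */
	if (file_mode & S_ISUID)
		after.euid = file_uid;

	/* SGID is only honoured when the file is also group-executable. */
	if ((file_mode & S_ISGID) && (file_mode & S_IXGRP))
		after.egid = file_gid;

	return after;
}
```

      For instance, a task with effective UID 1000 executing a root-owned
      file with mode 04755 would, in this model, continue with effective
      UID 0 and an unchanged effective GID.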
  272 
  273 ================
  274 TASK CREDENTIALS
  275 ================
  276 
  277 In Linux, all of a task's credentials are held, directly (uid, gid) or through
  278 pointers (groups, keys, LSM security), in a refcounted 'struct cred'.
  279 Each task points to its credentials by a pointer called 'cred' in its
  280 task_struct.
  281 
  282 Once a set of credentials has been prepared and committed, it may not be
  283 changed, barring the following exceptions:
  284 
  285  (1) its reference count may be changed;
  286 
  287  (2) the reference count on the group_info struct it points to may be changed;
  288 
  289  (3) the reference count on the security data it points to may be changed;
  290 
  291  (4) the reference count on any keyrings it points to may be changed;
  292 
  293  (5) any keyrings it points to may be revoked, expired or have their security
  294      attributes changed; and
  295 
  296  (6) the contents of any keyrings to which it points may be changed (the whole
  297      point of keyrings being a shared set of credentials, modifiable by anyone
  298      with appropriate access).
  299 
  300 To alter anything in the cred struct, the copy-and-replace principle must be
  301 adhered to.  First take a copy, then alter the copy and then use RCU to change
  302 the task pointer to make it point to the new copy.  There are wrappers to aid
  303 with this (see below).
  304 
  305 A task may only alter its _own_ credentials; it is no longer permitted for a
  306 task to alter another's credentials.  This means the capset() system call is no
  307 longer permitted to take any PID other than the one of the current process.
  308 Also, the keyctl_instantiate() and keyctl_negate() functions no longer permit
  309 attachment to process-specific keyrings in the requesting process as the
  310 instantiating process may need to create them.
  311 
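      The copy-and-replace principle described above can be modelled in
      userspace.  In this sketch a C11 atomic pointer store stands in for the
      RCU pointer update; it is not the kernel mechanism, and struct
      cred_model, struct task_model and replace_suid() are hypothetical names.
      Reclaiming the old copy is left to the caller, whereas the kernel defers
      it until readers have finished.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct cred. */
struct cred_model {
	uid_t uid, euid, suid;
};

/* Hypothetical stand-in for task_struct: it only ever sees a const view. */
struct task_model {
	_Atomic(const struct cred_model *) cred;
};

/*
 * Copy the published credentials, alter the copy, then publish the copy with
 * a single atomic store.  Returns the previous credentials so that the caller
 * can reclaim them, or NULL if the copy could not be allocated.
 */
const struct cred_model *replace_suid(struct task_model *t, uid_t suid)
{
	const struct cred_model *old;
	struct cred_model *copy;

	assert(t);
	old = atomic_load_explicit(&t->cred, memory_order_acquire);

	copy = malloc(sizeof(*copy));
	if (!copy)
		return NULL;

	*copy = *old;		/* take a copy */
	copy->suid = suid;	/* alter the copy, never the original */

	/* Readers see either the old set or the complete new set. */
	atomic_store_explicit(&t->cred, copy, memory_order_release);
	return old;
}
```

      The published structure is never written to after the store, which is
      what allows readers to use it without taking a lock.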
  312 
  313 IMMUTABLE CREDENTIALS
  314 ---------------------
  315 
  316 Once a set of credentials has been made public (by calling commit_creds() for
  317 example), it must be considered immutable, barring two exceptions:
  318 
  319  (1) The reference count may be altered.
  320 
  321  (2) Whilst the keyring subscriptions of a set of credentials may not be
  322      changed, the keyrings subscribed to may have their contents altered.
  323 
  324 To catch accidental credential alteration at compile time, struct task_struct
  325 has _const_ pointers to its credential sets, as does struct file.  Furthermore,
  326 certain functions such as get_cred() and put_cred() operate on const pointers,
  327 thus rendering casts unnecessary in callers, but they must temporarily cast away
  328 the const qualification internally to be able to alter the reference count.
  329 
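      How a reference count can be altered through a const pointer is shown
      in the userspace sketch below.  It is a model under stated assumptions
      rather than the kernel's code: struct cred_ref and the cred_ref_*()
      functions are hypothetical, and the object is freed immediately instead
      of being scheduled for destruction through RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct cred. */
struct cred_ref {
	atomic_int usage;	/* reference count */
	uid_t uid;
};

/* Allocate a new set holding one reference; returns NULL on failure. */
struct cred_ref *cred_ref_new(uid_t uid)
{
	struct cred_ref *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	atomic_init(&c->usage, 1);
	c->uid = uid;
	return c;
}

/* Take a reference.  The cast is confined to this helper. */
const struct cred_ref *cred_ref_get(const struct cred_ref *c)
{
	struct cred_ref *nonconst = (struct cred_ref *)c;

	assert(c);
	atomic_fetch_add(&nonconst->usage, 1);
	return c;
}

/* Drop a reference.  Returns 1 if this was the last one and it was freed. */
int cred_ref_put(const struct cred_ref *c)
{
	struct cred_ref *nonconst = (struct cred_ref *)c;

	assert(c);
	if (atomic_fetch_sub(&nonconst->usage, 1) == 1) {
		free(nonconst);
		return 1;
	}
	return 0;
}
```

      Callers hold only 'const struct cred_ref *', so an accidental write to
      any other field is rejected by the compiler.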
  330 
  331 ACCESSING TASK CREDENTIALS
  332 --------------------------
  333 
  334 A task being able to alter only its own credentials permits the current process
  335 to read or replace its own credentials without the need for any form of locking
  336 - which simplifies things greatly.  It can just call:
  337 
  338         const struct cred *current_cred()
  339 
  340 to get a pointer to its credentials structure, and it doesn't have to release
  341 it afterwards.
  342 
  343 There are convenience wrappers for retrieving specific aspects of a task's
  344 credentials (the value is simply returned in each case):
  345 
  346         uid_t current_uid(void)         Current's real UID
  347         gid_t current_gid(void)         Current's real GID
  348         uid_t current_euid(void)        Current's effective UID
  349         gid_t current_egid(void)        Current's effective GID
  350         uid_t current_fsuid(void)       Current's file access UID
  351         gid_t current_fsgid(void)       Current's file access GID
  352         kernel_cap_t current_cap(void)  Current's effective capabilities
  353         void *current_security(void)    Current's LSM security pointer
  354         struct user_struct *current_user(void)  Current's user account
  355 
  356 There are also convenience wrappers for retrieving specific associated pairs of
  357 a task's credentials:
  358 
  359         void current_uid_gid(uid_t *, gid_t *);
  360         void current_euid_egid(uid_t *, gid_t *);
  361         void current_fsuid_fsgid(uid_t *, gid_t *);
  362 
  363 which return these pairs of values through their arguments after retrieving
  364 them from the current task's credentials.
  365 
  366 
  367 In addition, there is a function for obtaining a reference on the current
  368 process's current set of credentials:
  369 
  370         const struct cred *get_current_cred(void);
  371 
  372 and functions for getting references to one of the credentials that don't
  373 actually live in struct cred:
  374 
  375         struct user_struct *get_current_user(void);
  376         struct group_info *get_current_groups(void);
  377 
  378 which get references to the current process's user accounting structure and
  379 supplementary groups list respectively.
  380 
  381 Once a reference has been obtained, it must be released with put_cred(),
  382 free_uid() or put_group_info() as appropriate.
  383 
  384 
  385 ACCESSING ANOTHER TASK'S CREDENTIALS
  386 ------------------------------------
  387 
  388 Whilst a task may access its own credentials without the need for locking, the
  389 same is not true of a task wanting to access another task's credentials.  It
  390 must use the RCU read lock and rcu_dereference().
  391 
  392 The rcu_dereference() is wrapped by:
  393 
  394         const struct cred *__task_cred(struct task_struct *task);
  395 
  396 This should be used inside the RCU read lock, as in the following example:
  397 
  398         void foo(struct task_struct *t, struct foo_data *f)
  399         {
  400                 const struct cred *tcred;
  401                 ...
  402                 rcu_read_lock();
  403                 tcred = __task_cred(t);
  404                 f->uid = tcred->uid;
  405                 f->gid = tcred->gid;
  406                 f->groups = get_group_info(tcred->groups);
  407                 rcu_read_unlock();
  408                 ...
  409         }
  410 
  411 Should it be necessary to hold another task's credentials for a long period of
  412 time, and possibly to sleep whilst doing so, then the caller should get a
  413 reference on them using:
  414 
  415         const struct cred *get_task_cred(struct task_struct *task);
  416 
  417 This does all the RCU magic inside of it.  The caller must call put_cred() on
  418 the credentials so obtained when they're finished with.
  419 
  420  [*] Note: The result of __task_cred() should not be passed directly to
  421      get_cred() as this may race with commit_creds().
  422 
  423 There are a couple of convenience functions to access bits of another task's
  424 credentials, hiding the RCU magic from the caller:
  425 
  426         uid_t task_uid(task)            Task's real UID
  427         uid_t task_euid(task)           Task's effective UID
  428 
  429 If the caller is holding the RCU read lock at the time anyway, then:
  430 
  431         __task_cred(task)->uid
  432         __task_cred(task)->euid
  433 
  434 should be used instead.  Similarly, if multiple aspects of a task's credentials
  435 need to be accessed, the RCU read lock should be held, __task_cred() called, the
  436 result stored in a temporary pointer and then the credential aspects read
  437 from that before dropping the lock.  This prevents the potentially expensive
  438 RCU magic from being invoked multiple times.
  439 
  440 Should some other single aspect of another task's credentials need to be
  441 accessed, then this can be used:
  442 
  443         task_cred_xxx(task, member)
  444 
  445 where 'member' is a non-pointer member of the cred struct.  For instance:
  446 
  447         uid_t task_cred_xxx(task, suid);
  448 
  449 will retrieve 'struct cred::suid' from the task, doing the appropriate RCU
  450 magic.  This may not be used for pointer members as what they point to may
  451 disappear the moment the RCU read lock is dropped.
  452 
  453 
  454 ALTERING CREDENTIALS
  455 --------------------
  456 
  457 As previously mentioned, a task may only alter its own credentials, and may not
  458 alter those of another task.  This means that it doesn't need to use any
  459 locking to alter its own credentials.
  460 
  461 To alter the current process's credentials, a function should first prepare a
  462 new set of credentials by calling:
  463 
  464         struct cred *prepare_creds(void);
  465 
  466 This locks current->cred_replace_mutex and then allocates and constructs a
  467 duplicate of the current process's credentials, returning with the mutex still
  468 held if successful.  It returns NULL if not successful (out of memory).
  469 
  470 The mutex prevents ptrace() from altering the ptrace state of a process whilst
  471 security checks on credentials construction and changing are taking place as
  472 the ptrace state may alter the outcome, particularly in the case of execve().
  473 
  474 The new credentials set should be altered appropriately, and any security
  475 checks and hooks done.  Both the current and the proposed sets of credentials
  476 are available for this purpose as current_cred() will return the current set
  477 still at this point.
  478 
  479 
  480 When the credential set is ready, it should be committed to the current process
  481 by calling:
  482 
  483         int commit_creds(struct cred *new);
  484 
  485 This will alter various aspects of the credentials and the process, giving the
  486 LSM a chance to do likewise.  It will then use rcu_assign_pointer() to actually
  487 commit the new credentials to current->cred, release
  488 current->cred_replace_mutex to allow ptrace() to take place, and notify
  489 the scheduler and others of the changes.
  490 
  491 This function is guaranteed to return 0, so that it can be tail-called at the
  492 end of such functions as sys_setresuid().
  493 
  494 Note that this function consumes the caller's reference to the new credentials.
  495 The caller should _not_ call put_cred() on the new credentials afterwards.
  496 
  497 Furthermore, once this function has been called on a new set of credentials,
  498 those credentials may _not_ be changed further.
  499 
  500 
  501 Should the security checks fail or some other error occur after prepare_creds()
  502 has been called, then the following function should be invoked:
  503 
  504         void abort_creds(struct cred *new);
  505 
  506 This releases the lock on current->cred_replace_mutex that prepare_creds() got
  507 and then releases the new credentials.
  508 
  509 
  510 A typical credentials alteration function would look something like this:
  511 
  512         int alter_suid(uid_t suid)
  513         {
  514                 struct cred *new;
  515                 int ret;
  516 
  517                 new = prepare_creds();
  518                 if (!new)
  519                         return -ENOMEM;
  520 
  521                 new->suid = suid;
  522                 ret = security_alter_suid(new);
  523                 if (ret < 0) {
  524                         abort_creds(new);
  525                         return ret;
  526                 }
  527 
  528                 return commit_creds(new);
  529         }
  530 
  531 
  532 MANAGING CREDENTIALS
  533 --------------------
  534 
  535 There are some functions to help manage credentials:
  536 
  537  (*) void put_cred(const struct cred *cred);
  538 
  539      This releases a reference to the given set of credentials.  If the
  540      reference count reaches zero, the credentials will be scheduled for
  541      destruction by the RCU system.
  542 
  543  (*) const struct cred *get_cred(const struct cred *cred);
  544 
  545      This gets a reference on a live set of credentials, returning a pointer to
  546      that set of credentials.
  547 
  548  (*) struct cred *get_new_cred(struct cred *cred);
  549 
  550      This gets a reference on a set of credentials that is under construction
  551      and is thus still mutable, returning a pointer to that set of credentials.
  552 
  553 
  554 =====================
  555 OPEN FILE CREDENTIALS
  556 =====================
  557 
  558 When a new file is opened, a reference is obtained on the opening task's
  559 credentials and this is attached to the file struct as 'f_cred' in place of
  560 'f_uid' and 'f_gid'.  Code that used to access file->f_uid and file->f_gid
  561 should now access file->f_cred->fsuid and file->f_cred->fsgid.
  562 
  563 It is safe to access f_cred without the use of RCU or locking because the
  564 pointer will not change over the lifetime of the file struct, and nor will the
  565 contents of the cred struct pointed to, barring the exceptions listed above
  566 (see the Task Credentials section).
  567 
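      The way an open file keeps the credentials it was opened with can be
      modelled as below.  This is a userspace sketch only: the struct names,
      open_file() and uid_presented_by() are hypothetical, and reference
      counting is omitted to keep the model short.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical stand-in for the relevant part of struct cred. */
struct cred_snap {
	uid_t fsuid;
	gid_t fsgid;
};

/* Hypothetical task: its cred pointer is replaced, never edited in place. */
struct task_snap {
	const struct cred_snap *cred;
};

/* Hypothetical file: f_cred is fixed for the lifetime of the file. */
struct file_snap {
	const struct cred_snap *f_cred;
};

/* Record the opener's credentials in the file at open time. */
struct file_snap open_file(const struct task_snap *opener)
{
	struct file_snap f;

	assert(opener && opener->cred);
	f.f_cred = opener->cred;
	return f;
}

/* Operations on the file use the recorded credentials, not the caller's. */
uid_t uid_presented_by(const struct file_snap *f)
{
	assert(f && f->f_cred);
	return f->f_cred->fsuid;
}
```

      If the task later commits a new set of credentials, the file in this
      model still points at the set that was current when it was opened.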
  568 
  569 =======================================
  570 OVERRIDING THE VFS'S USE OF CREDENTIALS
  571 =======================================
  572 
  573 Under some circumstances it is desirable to override the credentials used by
  574 the VFS, and that can be done by calling into functions such as vfs_mkdir() with a
  575 different set of credentials.  This is done in the following places:
  576 
  577  (*) sys_faccessat().
  578 
  579  (*) do_coredump().
  580 
  581  (*) nfs4recover.c.
