FreeBSD/Linux Kernel Cross Reference
sys/Documentation/mandatory.txt

        Mandatory File Locking For The Linux Operating System

                Andy Walker <andy@lysaker.kvaerner.no>

                           15 April 1996


1. What is mandatory locking?
-----------------------------

Mandatory locking is kernel enforced file locking, as opposed to the more usual
cooperative file locking used to guarantee sequential access to files among
processes. File locks are applied using the flock() and fcntl() system calls
(and the lockf() library routine, which is a wrapper around fcntl()). It is
normally a process's responsibility to check for locks on a file it wishes to
update, before applying its own lock, updating the file and unlocking it again.
The most commonly used example of this (and in the case of sendmail, the most
troublesome) is access to a user's mailbox. The mail user agent and the mail
transfer agent must guard against updating the mailbox at the same time, and
prevent reading the mailbox while it is being updated.

In a perfect world all processes would use and honour a cooperative, or
"advisory" locking scheme. However, the world isn't perfect, and there's
a lot of poorly written code out there.

In trying to address this problem, the designers of System V UNIX came up
with a "mandatory" locking scheme, whereby the operating system kernel would
block attempts by a process to write to a file that another process holds a
"read" -or- "shared" lock on, and block attempts to both read and write to a
file that a process holds a "write" -or- "exclusive" lock on.

The System V mandatory locking scheme was intended to have as little impact as
possible on existing user code. The scheme is based on marking individual files
as candidates for mandatory locking, and using the existing fcntl()/lockf()
interface for applying locks just as if they were normal, advisory locks.

Note 1: In saying "file" in the paragraphs above I am actually not telling
the whole truth. System V locking is based on fcntl(). The granularity of
fcntl() is such that it allows the locking of byte ranges in files, in addition
to entire files, so the mandatory locking rules also have byte level
granularity.
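
As a minimal sketch of the interface described above, the following C fragment
locks a byte range with fcntl(). The helper name lock_range() is hypothetical
and does not come from the kernel sources. The call is the same for advisory
and mandatory locks; whether the kernel enforces the lock depends on how the
file has been marked, not on anything the caller passes here.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not part of any real API):
 * apply or release a lock on the byte range [start, start + len) of an
 * open file. type is F_RDLCK, F_WRLCK or F_UNLCK. A len of 0 means
 * "from start to end of file". Returns 0 on success, or -1 with errno
 * set by fcntl().
 */
int lock_range(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* start is counted from the file's beginning */
    fl.l_start = start;
    fl.l_len = len;

    /* F_SETLK fails at once on a conflicting lock; F_SETLKW would wait. */
    return fcntl(fd, F_SETLK, &fl);
}
```

For example, lock_range(fd, F_WRLCK, 100, 50) requests a write lock on bytes
100 to 149, and lock_range(fd, F_UNLCK, 100, 50) releases it again.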

Note 2: POSIX.1 does not specify any scheme for mandatory locking, despite
borrowing the fcntl() locking scheme from System V. The mandatory locking
scheme is defined by the System V Interface Definition (SVID) Version 3.

2. Marking a file for mandatory locking
---------------------------------------

A file is marked as a candidate for mandatory locking by setting the group-id
bit in its file mode but removing the group-execute bit. This is an otherwise
meaningless combination, and was chosen by the System V implementors so as not
to break existing user programs.

Note that the group-id bit is usually automatically cleared by the kernel when
a setgid file is written to. This is a security measure. The kernel has been
modified to recognize the special case of a mandatory lock candidate and to
refrain from clearing this bit. Similarly the kernel has been modified not
to run mandatory lock candidates with setgid privileges.
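
The mode test described above can be sketched in C as follows. Both helper
names are hypothetical. The sketch only computes mode bits and leaves the
actual chmod() call to the caller.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 1 if the mode marks a mandatory lock
 * candidate, that is, setgid is set and group-execute is clear.
 */
int is_mandatory_candidate(mode_t mode)
{
    return (mode & (S_ISGID | S_IXGRP)) == S_ISGID;
}

/*
 * Hypothetical helper: returns the mode with setgid added and
 * group-execute removed, leaving all other bits untouched.
 */
mode_t mandatory_mode(mode_t mode)
{
    return (mode | S_ISGID) & ~(mode_t)S_IXGRP;
}
```

A caller would pass the result to chmod(), for instance
chmod(path, mandatory_mode(st.st_mode & 07777)) after a successful stat().
From the shell, the equivalent is "chmod g+s,g-x" on the file.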

3. Available implementations
----------------------------

I have considered the implementations of mandatory locking available with
SunOS 4.1.x, Solaris 2.x and HP-UX 9.x.

Generally I have tried to make the most sense out of the behaviour exhibited
by these three reference systems. There are many anomalies.

All the reference systems reject all calls to open() for a file on which
another process has outstanding mandatory locks. This is in direct
contravention of SVID 3, which states that only calls to open() with the
O_TRUNC flag set should be rejected. The Linux implementation follows the SVID
definition, which is the "Right Thing", since only calls with O_TRUNC can
modify the contents of the file.

HP-UX even disallows open() with O_TRUNC for a file with advisory locks, not
just mandatory locks. That would appear to contravene POSIX.1.

mmap() is another interesting case. All the operating systems mentioned
prevent mandatory locks from being applied to an mmap()'ed file, but HP-UX
also disallows advisory locks for such a file. SVID actually specifies the
paranoid HP-UX behaviour.

In my opinion only MAP_SHARED mappings should be immune from locking, and then
only from mandatory locks - that is what is currently implemented.

SunOS is so hopeless that it doesn't even honour the O_NONBLOCK flag for
mandatory locks, so reads and writes to locked files always block when they
should return EAGAIN.

I'm afraid that this is such an esoteric area that the semantics described
below are just as valid as any others, so long as the main points seem to
agree.
4. Semantics
------------

1. Mandatory locks can only be applied via the fcntl()/lockf() locking
   interface - in other words the System V/POSIX interface. BSD style
   locks using flock() never result in a mandatory lock.

2. If a process has locked a region of a file with a mandatory read lock, then
   other processes are permitted to read from that region. If any of these
   processes attempts to write to the region it will block until the lock is
   released, unless the process has opened the file with the O_NONBLOCK
   flag in which case the system call will return immediately with the error
   status EAGAIN.

3. If a process has locked a region of a file with a mandatory write lock, all
   attempts to read or write to that region block until the lock is released,
   unless a process has opened the file with the O_NONBLOCK flag in which case
   the system call will return immediately with the error status EAGAIN.

4. Calls to open() with O_TRUNC, or to creat(), on an existing file that has
   any mandatory locks owned by other processes will be rejected with the
   error status EAGAIN.

5. Attempts to apply a mandatory lock to a file that is memory mapped and
   shared (via mmap() with MAP_SHARED) will be rejected with the error status
   EAGAIN.

6. Attempts to create a shared memory map of a file (via mmap() with MAP_SHARED)
   that has any mandatory locks in effect will be rejected with the error status
   EAGAIN.
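
Rules 2 and 3 amount to a small decision table. The sketch below models that
table in user-space C. It is not the kernel's code, and the enum and function
names are made up for illustration. It covers only read and write calls, not
the open() and mmap() cases in rules 4 to 6.

```c
#include <fcntl.h>

/* Hypothetical names, used only in this illustration. */
enum mand_access  { MAND_READ, MAND_WRITE };
enum mand_outcome { MAND_PROCEED, MAND_BLOCK, MAND_EAGAIN };

/*
 * held:     type of mandatory lock another process holds on the region
 *           (F_RDLCK, F_WRLCK, or F_UNLCK if there is none)
 * want:     the access the calling process is attempting
 * nonblock: non-zero if the caller opened the file with O_NONBLOCK
 */
enum mand_outcome mand_check(short held, enum mand_access want, int nonblock)
{
    /* A write lock conflicts with everything; a read lock only with writes. */
    int conflict = (held == F_WRLCK) ||
                   (held == F_RDLCK && want == MAND_WRITE);

    if (!conflict)
        return MAND_PROCEED;

    /* Rules 2 and 3: block, or fail with EAGAIN under O_NONBLOCK. */
    return nonblock ? MAND_EAGAIN : MAND_BLOCK;
}
```

With these definitions, a read against a read lock proceeds, while a write
against the same lock either blocks or fails with EAGAIN depending on
O_NONBLOCK.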

5. Which system calls are affected?
-----------------------------------

Those which read or modify a file's contents, not just the inode. That gives
read(), write(), readv(), writev(), open(), creat(), mmap(), truncate() and
ftruncate(). truncate() and ftruncate() are considered to be "write" actions
for the purposes of mandatory locking.

The affected region is usually defined as stretching from the current position
for the total number of bytes read or written. For the truncate calls it is
defined as the bytes of a file removed or added (we must also consider bytes
added, as a lock can specify just "the whole file", rather than a specific
range of bytes.)
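
One possible way to express the affected-region test in C is sketched below.
The function names are hypothetical, all offsets are assumed to be
non-negative, and a lock length of 0 is taken to mean "to end of file", as in
struct flock.

```c
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if an I/O request covering
 * [pos, pos + count) touches a lock covering
 * [lock_start, lock_start + lock_len), where lock_len == 0 means the
 * lock runs to end of file. Returns 0 otherwise.
 */
int io_hits_lock(off_t pos, off_t count, off_t lock_start, off_t lock_len)
{
    if (count <= 0)
        return 0;                           /* nothing is transferred */
    if (pos + count <= lock_start)
        return 0;                           /* I/O ends before the lock starts */
    if (lock_len != 0 && lock_start + lock_len <= pos)
        return 0;                           /* lock ends before the I/O starts */
    return 1;
}

/*
 * Hypothetical helper for the truncate calls: the affected region is the
 * bytes between the old and the new size, whether removed or added.
 */
int truncate_hits_lock(off_t old_size, off_t new_size,
                       off_t lock_start, off_t lock_len)
{
    off_t lo = old_size < new_size ? old_size : new_size;
    off_t hi = old_size < new_size ? new_size : old_size;

    return io_hits_lock(lo, hi - lo, lock_start, lock_len);
}
```

Growing a 100 byte file to 200 bytes therefore conflicts with a whole-file
lock (start 0, length 0) but not with a lock on bytes 0 to 49.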

Note 3: I may have overlooked some system calls that need mandatory lock
checking in my eagerness to get this code out the door. Please let me know, or
better still fix the system calls yourself and submit a patch to me or Linus.

6. Warning!
-----------

Not even root can override a mandatory lock, so runaway processes can wreak
havoc if they lock crucial files. The way around it is to change the file
permissions (remove the setgid bit) before trying to read or write to it.
Of course, that might be a bit tricky if the system is hung :-(
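
A minimal sketch of that workaround in C, assuming a hypothetical helper named
unmark_mandatory(): it clears the setgid bit so that the file stops being a
mandatory lock candidate. Changing the mode touches only the inode, so the
call itself is not subject to the lock.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: clear the setgid bit on path, keeping the other
 * permission bits. Returns 0 on success, or -1 with errno set by
 * stat() or chmod().
 */
int unmark_mandatory(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    /* Keep only the permission bits, then drop setgid. */
    return chmod(path, st.st_mode & 07777 & ~(mode_t)S_ISGID);
}
```

From the shell, "chmod g-s" on the file has the same effect.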