The Design and Implementation of the FreeBSD Operating System, Second Edition
Now available: The Design and Implementation of the FreeBSD Operating System (Second Edition)


[ source navigation ] [ identifier search ] [ freetext search ] [ file search ] [ list types ] [ track identifier ]

FreeBSD/Linux Kernel Cross Reference
sys/ufs/lfs/

Version: -  FREEBSD  -  FREEBSD-13-STABLE  -  FREEBSD-13-0  -  FREEBSD-12-STABLE  -  FREEBSD-12-0  -  FREEBSD-11-STABLE  -  FREEBSD-11-0  -  FREEBSD-10-STABLE  -  FREEBSD-10-0  -  FREEBSD-9-STABLE  -  FREEBSD-9-0  -  FREEBSD-8-STABLE  -  FREEBSD-8-0  -  FREEBSD-7-STABLE  -  FREEBSD-7-0  -  FREEBSD-6-STABLE  -  FREEBSD-6-0  -  FREEBSD-5-STABLE  -  FREEBSD-5-0  -  FREEBSD-4-STABLE  -  FREEBSD-3-STABLE  -  FREEBSD22  -  l41  -  OPENBSD  -  linux-2.6  -  MK84  -  PLAN9  -  xnu-8792 
SearchContext: -  none  -  3  -  10 

Name Size Last modified (GMT) Description
Back Parent directory 2008-12-02 20:19:57
File CHANGES 8151 bytes 2005-12-11 12:25:26
File Makefile 137 bytes 1998-06-12 23:23:12
File README 5995 bytes 1999-03-15 00:46:47
File TODO 5377 bytes 2005-12-11 12:25:26
C file lfs.h 43641 bytes 2008-05-16 09:22:00
C file lfs_alloc.c 18982 bytes 2008-04-28 20:24:11
C file lfs_balloc.c 16203 bytes 2008-05-16 09:22:00
C file lfs_bio.c 23372 bytes 2008-05-06 18:43:45
C file lfs_cksum.c 4235 bytes 2008-04-28 20:24:11
C file lfs_debug.c 10108 bytes 2008-04-28 20:24:11
C file lfs_extern.h 10841 bytes 2008-06-28 01:34:05
C file lfs_inode.c 26607 bytes 2011-05-23 17:49:01
C file lfs_itimes.c 3642 bytes 2008-04-28 20:24:11
C file lfs_rfw.c 19976 bytes 2008-05-16 09:22:00
C file lfs_segment.c 76822 bytes 2008-06-02 16:25:34
C file lfs_subr.c 17661 bytes 2008-04-28 20:24:11
C file lfs_syscalls.c 32960 bytes 2008-05-16 09:22:01
C file lfs_vfsops.c 58471 bytes 2011-05-23 17:49:01
C file lfs_vnops.c 65811 bytes 2008-06-24 10:47:32

    1 #       $NetBSD: README,v 1.3 1999/03/15 00:46:47 perseant Exp $
    2 
    3 #       @(#)README      8.1 (Berkeley) 6/11/93
    4 
    5 The file system is reasonably stable...I think.
    6 
    7 For details on the implementation, performance and why garbage
    8 collection always wins, see Dr. Margo Seltzer's thesis available for
    9 anonymous ftp from toe.cs.berkeley.edu, in the directory
   10 pub/personal/margo/thesis.ps.Z, or the January 1993 USENIX paper.
   11 
   12 ----------
   13 The disk is laid out in segments.  The first segment starts 8K into the
   14 disk (the first 8K is used for boot information).  Each segment is composed
   15 of the following:
   16 
   17         An optional super block
   18         One or more groups of:
   19                 segment summary
   20                 0 or more data blocks
   21                 0 or more inode blocks
   22 
   23 The segment summary and inode/data blocks start after the super block (if
   24 present), and grow toward the end of the segment.
   25 
   26         _______________________________________________
   27         |         |            |         |            |
   28         | summary | data/inode | summary | data/inode |
   29         |  block  |   blocks   |  block  |   blocks   | ...
   30         |_________|____________|_________|____________|
   31 
   32 The data/inode blocks following a summary block are described by the
   33 summary block.  In order to permit the segment to be written in any order
   34 and in a forward direction only, a checksum is calculated across the
   35 blocks described by the summary.  Additionally, the summary is checksummed
   36 and timestamped.  Both of these are intended for recovery; the former is
   37 to make it easy to determine that it *is* a summary block and the latter
   38 is to make it easy to determine when recovery is finished for partially
   39 written segments.  These checksums are also used by the cleaner.
   40 
   41         Summary block (detail)
   42         ________________
   43         | sum cksum    |
   44         | data cksum   |
   45         | next segment |
   46         | timestamp    |
   47         | FINFO count  |
   48         | inode count  |
   49         | flags        |
   50         |______________|
   51         |   FINFO-1    | 0 or more file info structures, identifying the
   52         |     .        | blocks in the segment.
   53         |     .        |
   54         |     .        |
   55         |   FINFO-N    |
   56         |   inode-N    |
   57         |     .        |
   58         |     .        |
   59         |     .        | 0 or more inode daddr_t's, identifying the inode
   60         |   inode-1    | blocks in the segment.
   61         |______________|
   62 
   63 Inode blocks are blocks of on-disk inodes in the same format as those in
   64 the FFS.  However, spare[0] contains the inode number of the inode so we
   65 can find a particular inode on a page.  They are packed page_size /
   66 sizeof(inode) to a block.  Data blocks are exactly as in the FFS.  Both
   67 inodes and data blocks move around the file system at will.
   68 
   69 The file system is described by a super-block which is replicated and
   70 occurs as the first block of the first and other segments.  (The maximum
   71 number of super-blocks is MAXNUMSB).  Each super-block maintains a list
   72 of the disk addresses of all the super-blocks.  The super-block maintains
   73 a small amount of checkpoint information, essentially just enough to find
   74 the inode for the IFILE (fs->lfs_idaddr).
   75 
   76 The IFILE is visible in the file system, as inode number IFILE_INUM.  It
   77 contains information shared between the kernel and various user processes.
   78 
   79         Ifile (detail)
   80         ________________
   81         | cleaner info | Cleaner information per file system.  (Page
   82         |              | granularity.)
   83         |______________|
   84         | segment      | Space available and last modified times per
   85         | usage table  | segment.  (Page granularity.)
   86         |______________|
   87         |   IFILE-1    | Per inode status information: current version #,
   88         |     .        | if currently allocated, last access time and
   89         |     .        | current disk address of containing inode block.
   90         |     .        | If current disk address is LFS_UNUSED_DADDR, the
   91         |   IFILE-N    | inode is not in use, and it's on the free list.
   92         |______________|
   93 
   94 
   95 First Segment at Creation Time:
   96 _____________________________________________________________
   97 |        |       |         |       |       |       |       |
   98 | 8K pad | Super | summary | inode | ifile | root  | l + f |
   99 |        | block |         | block |       | dir   | dir   |
  100 |________|_______|_________|_______|_______|_______|_______|
  101           ^
  102            Segment starts here.
  103 
  104 Some differences from the Sprite LFS implementation.
  105 
  106 1. The LFS implementation placed the ifile metadata and the super block
  107    at fixed locations.  This implementation replicates the super block
  108    and puts each at a fixed location.  The checkpoint data is divided into
  109    two parts -- just enough information to find the IFILE is stored in
  110    two of the super blocks, although it is not toggled between them as in
  111    the Sprite implementation.  (This was deliberate, to avoid a single
  112    point of failure.)  The remaining checkpoint information is treated as
  113    a regular file, which means that the cleaner info, the segment usage
  114    table and the ifile meta-data are stored in normal log segments.
  115    (Tastes great, less filling...)
  116 
  117 2. The segment layout is radically different in Sprite; this implementation
  118    uses something a lot like network framing, where data/inode blocks are
  119    written asynchronously, and a checksum is used to validate any set of
  120    summary and data/inode blocks.  Sprite writes summary blocks synchronously
  121    after the data/inode blocks have been written and the existence of the
  122    summary block validates the data/inode blocks.  This permits us to write
  123    everything contiguously, even partial segments and their summaries, whereas
  124    Sprite is forced to seek (from the end of the data inode to the summary
  125    which lives at the end of the segment).  Additionally, writing the summary
  126    synchronously should cost about 1/2 a rotation per summary.
  127 
  128 3. Sprite LFS distinguishes between different types of blocks in the segment.
  129    Other than inode blocks and data blocks, we don't.
  130 
  131 4. Sprite LFS traverses the IFILE looking for free blocks.  We maintain a
  132    free list threaded through the IFILE entries.
  133 
  134 5. The cleaner runs in user space, as opposed to kernel space.  It shares
  135    information with the kernel by reading/writing the IFILE and through
  136    cleaner specific system calls.
  137 

[ source navigation ] [ identifier search ] [ freetext search ] [ file search ] [ list types ] [ track identifier ]


This page is part of the FreeBSD/Linux Linux Kernel Cross-Reference, and was automatically generated using a modified version of the LXR engine.