    1 
    2         --- GEOM BASED DISK SCHEDULERS FOR FREEBSD ---
    3 
    4 This code contains a framework for GEOM-based disk schedulers and a
    5 couple of sample scheduling algorithms that use the framework and
    6 implement two forms of "anticipatory scheduling" (see below for more
    7 details).
    8 
    9 As a quick example of what this code can give you, try to run "dd",
   10 "tar", or some other program with highly SEQUENTIAL access patterns,
   11 together with "cvs", "cvsup", "svn" or other highly RANDOM access patterns
   12 (this is not a made-up example: it is pretty common for developers
   13 to have one or more apps doing random accesses, and others that do
   14 sequential accesses e.g., loading large binaries from disk, checking
   15 the integrity of tarballs, watching media streams and so on).
   16 
   17 These are the results we get on a local machine (AMD BE2400 dual
   18 core CPU, SATA 250GB disk):
   19 
   20     /mnt is a partition mounted on /dev/ad0s1f
   21 
   22     cvs:        cvs -d /mnt/home/ncvs-local update -Pd /mnt/ports
   23     dd-read:    dd bs=128k of=/dev/null if=/dev/ad0 (or ad0.sched.)
   24     dd-write:   dd bs=128k if=/dev/zero of=/mnt/largefile
   25 
   26                         NO SCHEDULER            RR SCHEDULER
   27                         dd      cvs             dd      cvs
   28 
   29     dd-read only        72 MB/s ---             72 MB/s ---
   30     dd-write only       55 MB/s ---             55 MB/s ---
   31     dd-read+cvs          6 MB/s ok              30 MB/s ok
   32     dd-write+cvs        55 MB/s slooow          14 MB/s ok
   33 
   34 As you can see, when cvs is running concurrently with dd, the
   35 performance drops dramatically, and depending on read or write mode,
   36 one of the two is severely penalized.  The use of the RR scheduler
   37 in this example makes the dd-reader go much faster when competing
   38 with cvs, and lets cvs progress when competing with a writer.
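
To put numbers on the table above, the ratios can be computed directly.
The snippet below is only arithmetic on the dd figures reported in this
README; the dictionary layout and the names in it are ours, chosen for
illustration, and are not part of the gsched code.

```python
# dd throughput in MB/s, copied from the table above.
# Keys and structure are illustrative only.
results = {
    "dd-read+cvs":  {"none": 6,  "rr": 30},
    "dd-write+cvs": {"none": 55, "rr": 14},
}

def ratio(workload):
    """dd throughput with the RR scheduler divided by throughput without it."""
    row = results[workload]
    return row["rr"] / row["none"]

for name in results:
    print(f"{name}: RR/none = {ratio(name):.2f}")
```

The reader comes out five times faster with RR, while the writer gives up
roughly three quarters of its bandwidth, which is what lets cvs make
progress.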
   39 
   40 To try it out:
   41 
   42 1. USERS OF FREEBSD 7, PLEASE READ THE FOLLOWING CAREFULLY:
   43 
   44     On loading, this module patches one kernel function (g_io_request())
   45     so that I/O requests ("bio's") carry a classification tag, useful
   46     for scheduling purposes.
   47 
   48     ON FREEBSD 7, the tag is stored in an existing (though rarely used)
   49     field of the "struct bio", a solution which makes this module
   50     incompatible with other modules using it, such as ZFS and gjournal.
   51     Additionally, g_io_request() is patched in-memory to add a call
   52     to the function that initializes this field (i386/amd64 only;
   53     for other architectures you need to manually patch sys/geom/geom_io.c).
   54     See details in the file g_sched.c.
   55 
   56     On FreeBSD 8.0 and above, the above trick is not necessary,
   57     as the struct bio contains dedicated fields for the classifier,
   58     and hooks for request classifiers.
   59 
   60     If you don't like the above, don't run this code.
   61 
   62 2. PLEASE MAKE SURE THAT THE DISK THAT YOU WILL BE USING FOR TESTS
   63    DOES NOT CONTAIN PRECIOUS DATA.
   64     This is experimental code, so we make no guarantees, though
   65     I am routinely using it on my desktop and laptop.
   66 
   67 3. EXTRACT AND BUILD THE PROGRAMS
   68     A 'make install' in the directory should work (with root privs),
   69     or you can even try the binary modules.
   70     If you want to build the modules yourself, look at the Makefile.
   71 
   72 4. LOAD THE MODULE, CREATE A GEOM NODE, RUN TESTS
   73 
   74     The scheduler's module must be loaded first:
   75 
   76       # kldload gsched_rr
   77 
   78     Substitute gsched_as to test the AS scheduler.  Then, supposing that
   79     you are using /dev/ad0 for testing, a scheduler can be attached to it with:
   80 
   81       # geom sched insert ad0
   82 
   83     The scheduler is inserted transparently in the geom chain, so
   84     mounted partitions and filesystems will keep working, but
   85     now requests will go through the scheduler.
   86 
   87     To change scheduler on-the-fly, you can reconfigure the geom:
   88 
   89       # geom sched configure -a as ad0.sched.
   90 
   91     assuming that gsched_as was loaded previously.
   92 
   93 5. SCHEDULER REMOVAL
   94 
   95     In principle it is possible to remove the scheduler module
   96     even on an active chain by doing
   97 
   98         # geom sched destroy ad0.sched.
   99 
  100     However, there is a race in the geom subsystem that makes
  101     the removal unsafe if there are active requests on a chain.
  102     So, in order to reduce the risk of data loss, make sure
  103     you don't remove a scheduler from a chain with ongoing transactions.
  104 
  105 --- NOTES ON THE SCHEDULERS ---
  106 
  107 The important contribution of this code is the framework to experiment
  108 with different scheduling algorithms.  'Anticipatory scheduling'
  109 is a very powerful technique based on the following reasoning:
  110 
  111     The disk throughput is much better if it serves sequential requests.
  112     If we have a mix of sequential and random requests, and we see a
  113     non-sequential request, do not serve it immediately but instead wait
  114     a little bit (2..5ms) to see if there is another one coming that
  115     the disk can serve more efficiently.
  116 
  117 There are many details that should be added to make sure that the
  118 mechanism is effective with different workloads and systems, to
  119 gain a few extra percent in performance, to improve fairness and
  120 isolation among processes, etc.  A discussion of the vast literature
  121 on the subject is beyond the purpose of this short note.
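
The anticipatory reasoning described above can be sketched as a small
dispatch decision.  This is a minimal illustration under our own
assumptions, not the logic in gs_rr.c: the Request fields, the dispatch()
function and the 3 ms window (picked from the 2..5ms range mentioned above)
are all hypothetical names and values.

```python
from dataclasses import dataclass

@dataclass
class Request:
    flow: str     # classification tag, e.g. the issuing process (assumed)
    offset: int   # first block of the request
    length: int   # number of blocks

ANTICIPATION_MS = 3  # assumed value inside the 2..5 ms range

def is_sequential(req, last):
    """True if req continues exactly where the last served request ended."""
    return (last is not None
            and req.flow == last.flow
            and req.offset == last.offset + last.length)

def dispatch(pending, last, waited_ms):
    """Return the request to serve now, or None to leave the disk idle.

    pending   -- queued requests, oldest first
    last      -- the request served most recently, or None
    waited_ms -- how long the disk has already been kept idle
    """
    # A request that continues the last transfer is always served first.
    for req in pending:
        if is_sequential(req, last):
            return req
    # Only non-sequential work is queued: hold it back for a short
    # window, hoping that a sequential request arrives in the meantime.
    if pending and last is not None and waited_ms < ANTICIPATION_MS:
        return None
    # The window has expired (or there is no history): serve the oldest.
    return pending[0] if pending else None
```

With last = Request("dd", 0, 8) and only a cvs request queued, dispatch()
returns None until waited_ms reaches ANTICIPATION_MS; if a dd request at
offset 8 shows up in the meantime, it is served ahead of the cvs one.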
  122 
  123 --------------------------------------------------------------------------
  124 
  125 TRANSPARENT INSERT/DELETE
  126 
  127 geom_sched is an ordinary geom module; however, it is convenient
  128 to plug it transparently into the geom graph, so that one can
  129 enable or disable scheduling on a mounted filesystem, and the
  130 names in /etc/fstab do not depend on the presence of the scheduler.
  131 
  132 To understand how this works in practice, remember that in GEOM
  133 we have "providers" and "geom" objects.
  134 Say that we want to hook a scheduler on provider "ad0",
  135 accessible through pointer 'pp'. Originally, pp is attached to
  136 geom "ad0" (same name, different object) accessible through pointer old_gp.
  137 
  138   BEFORE        ---> [ pp    --> old_gp ...]
  139 
  140 A normal "geom sched create ad0" call would create a new geom node
  141 on top of provider ad0/pp, and export a newly created provider
  142 ("ad0.sched." accessible through pointer newpp).
  143 
  144   AFTER create  ---> [ newpp --> gp --> cp ] ---> [ pp    --> old_gp ... ]
  145 
  146 On top of newpp, a whole tree will be created automatically, and we
  147 can e.g. mount partitions on /dev/ad0.sched.s1d, and those requests
  148 will go through the scheduler, whereas any partition mounted on
  149 the pre-existing device entries will not go through the scheduler.
  150 
  151 With the transparent insert mechanism, the original provider "ad0"/pp
  152 is hooked to the newly created geom, as follows:
  153 
  154   AFTER insert  ---> [ pp    --> gp --> cp ] ---> [ newpp --> old_gp ... ]
  155 
  156 so anything that was previously using provider pp will now have
  157 the requests routed through the scheduler node.
  158 
  159 A removal ("geom sched destroy ad0.sched.") will restore the original
  160 configuration.
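
The difference between "create" and "insert" can be pictured with a toy
model of the pointers used in the diagrams above.  This is only a sketch
under our own assumptions: the Geom and Provider classes, the "below" field
(standing in for the consumer cp) and the helper functions are hypothetical
and ignore everything the real GEOM code has to deal with (consumer lists,
locking, in-flight requests).

```python
class Geom:
    def __init__(self, name):
        self.name = name
        self.below = None   # provider reached through this geom's consumer (cp)

class Provider:
    def __init__(self, name, geom):
        self.name = name
        self.geom = geom    # geom that requests on this provider are handed to

def create(pp):
    """Normal create: stack gp on top of pp and export a new provider."""
    gp = Geom("gp")
    gp.below = pp
    return Provider(pp.name + ".sched.", gp)

def insert(pp):
    """Transparent insert: pp now leads to gp, newpp takes over old_gp."""
    gp = Geom("gp")
    newpp = Provider(pp.name + ".sched.", pp.geom)
    gp.below = newpp
    pp.geom = gp
    return newpp

def destroy_inserted(pp):
    """Undo insert(): hook pp back to the geom that newpp was holding."""
    newpp = pp.geom.below
    pp.geom = newpp.geom

def chain(provider):
    """Names of the providers and geoms a request on 'provider' goes through."""
    names = []
    while provider is not None:
        names.append(provider.name)
        names.append(provider.geom.name)
        provider = provider.geom.below
    return names

pp = Provider("ad0", Geom("old_gp"))
print(chain(pp))        # BEFORE
insert(pp)
print(chain(pp))        # AFTER insert
destroy_inserted(pp)
print(chain(pp))        # back to the original configuration
```

In this model, users of "ad0" keep the same provider object across insert
and destroy; only the geom it leads to changes, which is why mounted
filesystems and the names in /etc/fstab are unaffected.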
  161 
  162 # $FreeBSD: releng/8.2/sys/geom/sched/README 206497 2010-04-12 16:37:45Z luigi $
