FreeBSD/Linux Kernel Cross Reference
sys/vm/vm_kern.c


    1 /* 
    2  * Mach Operating System
    3  * Copyright (c) 1993,1992,1991,1990,1989,1988,1987 Carnegie Mellon University
    4  * All Rights Reserved.
    5  * 
    6  * Permission to use, copy, modify and distribute this software and its
    7  * documentation is hereby granted, provided that both the copyright
    8  * notice and this permission notice appear in all copies of the
    9  * software, derivative works or modified versions, and any portions
   10  * thereof, and that both notices appear in supporting documentation.
   11  * 
   12  * CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
   13  * CONDITION.  CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR
   14  * ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
   15  * 
   16  * Carnegie Mellon requests users of this software to return to
   17  * 
   18  *  Software Distribution Coordinator  or  Software.Distribution@CS.CMU.EDU
   19  *  School of Computer Science
   20  *  Carnegie Mellon University
   21  *  Pittsburgh PA 15213-3890
   22  * 
   23  * any improvements or extensions that they make and grant Carnegie Mellon
   24  * the rights to redistribute these changes.
   25  */
   26 /*
   27  * HISTORY
   28  * $Log:        vm_kern.c,v $
   29  * Revision 2.25  93/11/17  18:54:21  dbg
   30  *      Conditionalized projected buffer support under NET_ATM.
   31  *      [93/09/10            dbg]
   32  *      Added ANSI function prototypes.
   33  *      [93/06/16            dbg]
   34  * 
   35  * Revision 2.24  93/08/10  15:12:33  mrt
   36  *      Included support for projected buffers: projected_buffer_allocate,
   37  *      projected_buffer_map, projected_buffer_deallocate,
   38  *      projected_buffer_collect
   39  *      and projected_buffer_in_range. Projected buffers are buffers shared
   40  *      between the kernel and user tasks, and which may be persistent or not
   41  *      (be deallocated from the kernel map automatically when the last user
   42  *      reference is deallocated). The intended use is for device drivers.
   43  *      The user is denied direct manipulation of these map entries.
   44  *      [93/02/16  09:28:21  jcb]
   45  * 
   46  * Revision 2.23  93/01/14  18:01:03  danner
   47  *      64bit cleanup.
   48  *      [92/12/01            af]
   49  * 
   50  * Revision 2.22  92/08/03  18:00:41  jfriedl
   51  *      removed silly prototypes
   52  *      [92/08/02            jfriedl]
   53  * 
   54  * Revision 2.21  92/05/21  17:26:02  jfriedl
   55  *      Added types for functions that didn't have them explicitly.
   56  *      [92/05/16            jfriedl]
   57  * 
   58  * Revision 2.20  92/04/01  19:36:35  rpd
   59  *      Change kmem_io_map_copyout to handle multiple page lists.
   60  *      Don't call pmap_change_wiring from kmem_io_map_deallocate.
   61  *      [92/03/20  14:13:50  dlb]
   62  * 
   63  * Revision 2.19  92/02/23  19:50:53  elf
   64  *      kmem_io_map_deallocate() now calls pmap_remove() -- VM
   65  *      code optimization makes this necessary.  Optimized
   66  *      out kmem_alloc_pageable() call from kmem_io_map_copyout().
   67  *      [92/01/07  16:37:38  dlb]
   68  * 
   69  * Revision 2.18  92/01/14  16:47:57  rpd
   70  *      Added copyinmap and copyoutmap.
   71  *      [91/12/16            rpd]
   72  * 
   73  * Revision 2.17  91/08/28  11:18:00  jsb
   74  *      Delete kmem_fault_wire, io_wire, io_unwire - Replaced by
   75  *      kmem_io_{copyout,deallocate}.
   76  *      [91/08/06  17:17:40  dlb]
   77  * 
   78  *      Make kmem_io_map_deallocate return void.
   79  *      [91/08/05  17:45:39  dlb]
   80  * 
   81  *      New interfaces for kmem_io_map routines.
   82  *      [91/08/02  17:07:40  dlb]
   83  * 
   84  *      New and improved io wiring support based on vm page lists:
   85  *      kmem_io_map_{copyout,deallocate}.  io_wire and io_unwire will
   86  *      go away when the device logic fully supports this.
   87  *      [91/07/31  15:12:02  dlb]
   88  * 
   89  * Revision 2.16  91/07/30  15:47:20  rvb
   90  *      Fixed io_wire to allocate an object when the entry doesn't have one.
   91  *      [91/06/27            rpd]
   92  * 
   93  * Revision 2.15  91/05/18  14:40:31  rpd
   94  *      Added kmem_alloc_aligned.
   95  *      [91/05/02            rpd]
   96  *      Added VM_FAULT_FICTITIOUS_SHORTAGE.
   97  *      Revised vm_map_find_entry to allow coalescing of entries.
   98  *      Fixed deadlock problem in kmem_alloc.
   99  *      [91/03/29            rpd]
  100  *      Fixed kmem_init to not create a zero-size entry.
  101  *      [91/03/25            rpd]
  102  * 
  103  * Revision 2.14  91/05/14  17:49:15  mrt
  104  *      Correcting copyright
  105  * 
  106  * Revision 2.13  91/03/16  15:05:20  rpd
  107  *      Fixed kmem_alloc_pages and kmem_remap_pages
  108  *      to not hold locks across pmap_enter.
  109  *      [91/03/11            rpd]
  110  *      Added kmem_realloc.  Changed kmem_alloc, kmem_alloc_wired, and
  111  *      kmem_alloc_pageable to return error codes.  Changed kmem_alloc
  112  *      to not use the kernel object and to not zero memory.
  113  *      Changed kmem_alloc_wired to use the kernel object.
  114  *      [91/03/07  16:49:52  rpd]
  115  * 
  116  *      Added resume, continuation arguments to vm_fault_page.
  117  *      Added continuation argument to VM_PAGE_WAIT.
  118  *      [91/02/05            rpd]
  119  * 
  120  * Revision 2.12  91/02/05  17:58:22  mrt
  121  *      Changed to new Mach copyright
  122  *      [91/02/01  16:32:19  mrt]
  123  * 
  124  * Revision 2.11  91/01/08  16:44:59  rpd
  125  *      Changed VM_WAIT to VM_PAGE_WAIT.
  126  *      [90/11/13            rpd]
  127  * 
  128  * Revision 2.10  90/10/12  13:05:35  rpd
  129  *      Only activate the page returned by vm_fault_page if it isn't
  130  *      already on a pageout queue.
  131  *      [90/10/09  22:33:09  rpd]
  132  * 
  133  * Revision 2.9  90/06/19  23:01:54  rpd
  134  *      Picked up vm_submap_object.
  135  *      [90/06/08            rpd]
  136  * 
  137  * Revision 2.8  90/06/02  15:10:43  rpd
  138  *      Purged MACH_XP_FPD.
  139  *      [90/03/26  23:12:33  rpd]
  140  * 
  141  * Revision 2.7  90/02/22  20:05:39  dbg
  142  *      Update to vm_map.h.
  143  *      Remove kmem_alloc_wait, kmem_free_wakeup, vm_move.
  144  *      Fix copy_user_to_physical_page to test for kernel tasks.
  145  *      Simplify v_to_p allocation.
  146  *      Change PAGE_WAKEUP to PAGE_WAKEUP_DONE to reflect the
  147  *      fact that it clears the busy flag.
  148  *      [90/01/25            dbg]
  149  * 
  150  * Revision 2.6  90/01/22  23:09:12  af
  151  *      Undone VM_PROT_DEFAULT change, moved to vm_prot.h
  152  *      [90/01/20  17:28:57  af]
  153  * 
  154  * Revision 2.5  90/01/19  14:35:57  rwd
  155  *      Get new version from rfr
  156  *      [90/01/10            rwd]
  157  * 
  158  * Revision 2.4  90/01/11  11:47:44  dbg
  159  *      Remove kmem_mb_alloc and mb_map.
  160  *      [89/12/11            dbg]
  161  * 
  162  * Revision 2.3  89/11/29  14:17:43  af
  163  *      Redefine VM_PROT_DEFAULT locally for mips.
   164  *      May migrate to its final place eventually.
  165  * 
  166  * Revision 2.2  89/09/08  11:28:19  dbg
  167  *      Add special wiring code for IO memory.
  168  *      [89/08/10            dbg]
  169  * 
  170  *      Add keep_wired argument to vm_move.
  171  *      [89/07/14            dbg]
  172  * 
  173  * 28-Apr-89  David Golub (dbg) at Carnegie-Mellon University
  174  *      Changes for MACH_KERNEL:
  175  *      . Optimize kmem_alloc.  Add kmem_alloc_wired.
  176  *      . Remove non-MACH include files.
  177  *      . Change vm_move to call vm_map_move.
  178  *      . Clean up fast_pager_data option.
  179  *
  180  * Revision 2.14  89/04/22  15:35:28  gm0w
  181  *      Added code in kmem_mb_alloc to verify that requested allocation
  182  *      will fit in the map.
  183  *      [89/04/14            gm0w]
  184  * 
  185  * Revision 2.13  89/04/18  21:25:45  mwyoung
  186  *      Recent history:
  187  *              Add call to vm_map_simplify to reduce kernel map fragmentation.
  188  *      History condensation:
  189  *              Added routines for copying user data to physical
  190  *               addresses.  [rfr, mwyoung]
  191  *              Added routines for sleep/wakeup forms, interrupt-time
  192  *               allocation. [dbg]
  193  *              Created.  [avie, mwyoung, dbg]
  194  * 
  195  */
  196 /*
  197  *      File:   vm/vm_kern.c
  198  *      Author: Avadis Tevanian, Jr., Michael Wayne Young
  199  *      Date:   1985
  200  *
  201  *      Kernel memory management.
  202  */
  203 #include <net_atm.h>            /* projected buffers */
  204 
  205 #include <mach/kern_return.h>
  206 #include <mach/vm_param.h>
  207 #include <kern/assert.h>
  208 #include <kern/memory.h>
  209 #include <kern/lock.h>
  210 #include <kern/thread.h>
  211 #include <vm/vm_fault.h>
  212 #include <vm/vm_kern.h>
  213 #include <vm/vm_map.h>
  214 #include <vm/vm_object.h>
  215 #include <vm/vm_page.h>
  216 #include <vm/vm_pageout.h>
  217 
  218 
  219 
  220 /*
  221  *      Variables exported by this module.
  222  */
  223 
  224 vm_map_t        kernel_map;
  225 vm_map_t        kernel_pageable_map;
  226 
  227 void kmem_alloc_pages(
  228         vm_object_t     object,
  229         vm_offset_t     offset,
  230         vm_offset_t     start,
  231         vm_offset_t     end,
  232         vm_prot_t       protection);    /* forward */
  233 void kmem_remap_pages(
  234         vm_object_t     object,
  235         vm_offset_t     offset,
  236         vm_offset_t     start,
  237         vm_offset_t     end,
  238         vm_prot_t       protection);    /* forward */
  239 
  240 
  241 #if     NET_ATM
  242 /*
  243  *      projected_buffer_allocate
  244  *
  245  *      Allocate a wired-down buffer shared between kernel and user task.  
  246  *      Fresh, zero-filled memory is allocated.
  247  *      If persistence is false, this buffer can only be deallocated from
  248  *      user task using projected_buffer_deallocate, and deallocation 
  249  *      from user task also deallocates the buffer from the kernel map.
  250  *      projected_buffer_collect is called from vm_map_deallocate to
  251  *      automatically deallocate projected buffers on task_deallocate.
  252  *      Sharing with more than one user task is achieved by using 
  253  *      projected_buffer_map for the second and subsequent tasks.
  254  *      The user is precluded from manipulating the VM entry of this buffer
  255  *      (i.e. changing protection, inheritance or machine attributes).
  256  */
  257 
  258 kern_return_t
  259 projected_buffer_allocate(
  260         vm_map_t map,
  261         vm_size_t size,
  262         int persistence,
  263         vm_offset_t *kernel_p,
  264         vm_offset_t *user_p,
  265         vm_prot_t protection,
  266         vm_inherit_t inheritance)  /*Currently only VM_INHERIT_NONE supported*/
  267 {
  268         vm_object_t object;
  269         vm_map_entry_t u_entry, k_entry;
  270         vm_offset_t addr;
  271         vm_size_t r_size;
  272         kern_return_t kr;
  273 
  274         if (map == VM_MAP_NULL || map == kernel_map)
  275           return(KERN_INVALID_ARGUMENT);
  276 
  277         /*
  278          *      Allocate a new object. 
  279          */
  280 
  281         size = round_page(size);
  282         object = vm_object_allocate(size);
  283 
  284         vm_map_lock(kernel_map);
  285         kr = vm_map_find_entry(kernel_map, &addr, size, (vm_offset_t) 0,
  286                                VM_OBJECT_NULL, &k_entry);
  287         if (kr != KERN_SUCCESS) {
  288           vm_map_unlock(kernel_map);
  289           vm_object_deallocate(object);
  290           return kr;
  291         }
  292 
  293         k_entry->object.vm_object = object;
  294         if (!persistence)
  295           k_entry->projected_on = (vm_map_entry_t) -1;
  296               /*Mark entry so as to automatically deallocate it when
  297                 last corresponding user entry is deallocated*/
  298         vm_map_unlock(kernel_map);
  299         *kernel_p = addr;
  300 
  301         vm_map_lock(map);
  302         kr = vm_map_find_entry(map, &addr, size, (vm_offset_t) 0,
  303                                VM_OBJECT_NULL, &u_entry);
  304         if (kr != KERN_SUCCESS) {
  305           vm_map_unlock(map);
  306           vm_map_lock(kernel_map);
  307           vm_map_entry_delete(kernel_map, k_entry);
  308           vm_map_unlock(kernel_map);
  309           vm_object_deallocate(object);
  310           return kr;
  311         }
  312 
  313         u_entry->object.vm_object = object;
  314         vm_object_reference(object);
  315         u_entry->projected_on = k_entry;
  316              /*Creates coupling with kernel mapping of the buffer, and
  317                also guarantees that user cannot directly manipulate
  318                buffer VM entry*/
  319         u_entry->protection = protection;
  320         u_entry->max_protection = protection;
  321         u_entry->inheritance = inheritance;
  322         vm_map_unlock(map);
  323         *user_p = addr;
  324 
  325         /*
  326          *      Allocate wired-down memory in the object,
  327          *      and enter it in the kernel pmap.
  328          */
  329         kmem_alloc_pages(object, 0,
  330                          *kernel_p, *kernel_p + size,
  331                          VM_PROT_READ | VM_PROT_WRITE);
   332         bzero((char *) *kernel_p, size);        /*Zero fill*/
  333 
  334         /* Set up physical mappings for user pmap */
  335 
  336         pmap_pageable(map->pmap, *user_p, *user_p + size, FALSE);
  337         for (r_size = 0; r_size < size; r_size += PAGE_SIZE) {
  338           addr = pmap_extract(kernel_pmap, *kernel_p + r_size);
  339           pmap_enter(map->pmap, *user_p + r_size, addr,
  340                      protection, TRUE);
  341         }
  342 
  343         return(KERN_SUCCESS);
  344 }
  345 
  346 
  347 /*
  348  *      projected_buffer_map
  349  *
  350  *      Map an area of kernel memory onto a task's address space.
  351  *      No new memory is allocated; the area must previously exist in the
  352  *      kernel memory map.
  353  */
  354 
  355 kern_return_t
  356 projected_buffer_map(
  357         vm_map_t map,
  358         vm_offset_t kernel_addr,
  359         vm_size_t size,
  360         vm_offset_t *user_p,
  361         vm_prot_t protection,
  362         vm_inherit_t inheritance)  /*Currently only VM_INHERIT_NONE supported*/
  363 {
  364         vm_object_t object;
  365         vm_map_entry_t u_entry, k_entry;
  366         vm_offset_t physical_addr, user_addr;
  367         vm_size_t r_size;
  368         kern_return_t kr;
  369 
  370         /*
  371          *      Find entry in kernel map 
  372          */
  373 
  374         size = round_page(size);
  375         if (map == VM_MAP_NULL || map == kernel_map ||
  376             !vm_map_lookup_entry(kernel_map, kernel_addr, &k_entry) ||
  377             kernel_addr + size > k_entry->vme_end)
  378           return(KERN_INVALID_ARGUMENT);
  379 
  380 
  381         /*
  382          *     Create entry in user task
  383          */
  384 
  385         vm_map_lock(map);
  386         kr = vm_map_find_entry(map, &user_addr, size, (vm_offset_t) 0,
  387                                VM_OBJECT_NULL, &u_entry);
  388         if (kr != KERN_SUCCESS) {
  389           vm_map_unlock(map);
  390           return kr;
  391         }
  392 
  393         u_entry->object.vm_object = k_entry->object.vm_object;
  394         vm_object_reference(k_entry->object.vm_object);
  395         u_entry->offset = kernel_addr - k_entry->vme_start + k_entry->offset;
  396         u_entry->projected_on = k_entry;
  397              /*Creates coupling with kernel mapping of the buffer, and
  398                also guarantees that user cannot directly manipulate
  399                buffer VM entry*/
  400         u_entry->protection = protection;
  401         u_entry->max_protection = protection;
  402         u_entry->inheritance = inheritance;
  403         u_entry->wired_count = k_entry->wired_count;
  404         vm_map_unlock(map);
  405         *user_p = user_addr;
  406 
  407         /* Set up physical mappings for user pmap */
  408 
  409         pmap_pageable(map->pmap, user_addr, user_addr + size,
  410                       !k_entry->wired_count);
  411         for (r_size = 0; r_size < size; r_size += PAGE_SIZE) {
  412           physical_addr = pmap_extract(kernel_pmap, kernel_addr + r_size);
  413           pmap_enter(map->pmap, user_addr + r_size, physical_addr,
  414                      protection, k_entry->wired_count);
  415         }
  416 
  417         return(KERN_SUCCESS);
  418 }
  419 
  420 
  421 /*
  422  *      projected_buffer_deallocate
  423  *
  424  *      Unmap projected buffer from task's address space.
  425  *      May also unmap buffer from kernel map, if buffer is not
  426  *      persistent and only the kernel reference remains.
  427  */
  428 
  429 kern_return_t
  430 projected_buffer_deallocate(
  431      vm_map_t map,
  432      vm_offset_t start,
  433      vm_offset_t end)
  434 {
  435         vm_map_entry_t entry, k_entry;
  436 
   437         if (map == VM_MAP_NULL || map == kernel_map)
   438           return(KERN_INVALID_ARGUMENT);
   439         vm_map_lock(map);
   440         if (!vm_map_lookup_entry(map, start, &entry) ||
   441             end > entry->vme_end ||
   442             (k_entry = entry->projected_on) == 0) {   /*Check kernel entry*/
   443           vm_map_unlock(map);
   444           return(KERN_INVALID_ARGUMENT);
   445         }
  446 
  447         /*Prepare for deallocation*/
  448         if (entry->vme_start < start)
  449           _vm_map_clip_start(map, entry, start);
  450         if (entry->vme_end > end)
  451           _vm_map_clip_end(map, entry, end);
  452         if (map->first_free == entry)   /*Adjust first_free hint*/
  453           map->first_free = entry->vme_prev;
  454         entry->projected_on = 0;        /*Needed to allow deletion*/
  455         entry->wired_count = 0;         /*Avoid unwire fault*/
  456         vm_map_entry_delete(map, entry);
  457         vm_map_unlock(map);
  458 
  459         /*Check if the buffer is not persistent and only the 
  460           kernel mapping remains, and if so delete it*/
  461         vm_map_lock(kernel_map);
  462         if (k_entry->projected_on == (vm_map_entry_t) -1 &&
  463             k_entry->object.vm_object->ref_count == 1) {
  464           if (kernel_map->first_free == k_entry)
  465             kernel_map->first_free = k_entry->vme_prev;
  466           k_entry->projected_on = 0;    /*Allow unwire fault*/
  467           vm_map_entry_delete(kernel_map, k_entry);
  468         }
  469         vm_map_unlock(kernel_map);
  470         return(KERN_SUCCESS);
  471 }
  472 
  473 
  474 /*
  475  *      projected_buffer_collect
  476  *
  477  *      Unmap all projected buffers from task's address space.
  478  */
  479 
  480 kern_return_t
  481 projected_buffer_collect(
  482         vm_map_t map)
  483 {
  484         vm_map_entry_t entry, next;
  485 
  486         if (map == VM_MAP_NULL || map == kernel_map)
  487           return(KERN_INVALID_ARGUMENT);
  488 
  489         for (entry = vm_map_first_entry(map);
  490              entry != vm_map_to_entry(map);
  491              entry = next) {
  492           next = entry->vme_next;
  493           if (entry->projected_on != 0)
  494             projected_buffer_deallocate(map, entry->vme_start, entry->vme_end);
  495         }
  496         return(KERN_SUCCESS);
  497 }
  498 
  499 
  500 /*
  501  *      projected_buffer_in_range
  502  *
  503  *      Verifies whether a projected buffer exists in the address range 
  504  *      given.
  505  */
  506 
  507 boolean_t
  508 projected_buffer_in_range(
  509         vm_map_t map,
  510         vm_offset_t start,
  511         vm_offset_t end)
  512 {
  513         vm_map_entry_t entry;
  514 
  515         if (map == VM_MAP_NULL || map == kernel_map)
  516           return(FALSE);
  517 
  518         /*Find first entry*/
  519         if (!vm_map_lookup_entry(map, start, &entry))
  520           entry = entry->vme_next;
  521 
  522         while (entry != vm_map_to_entry(map) && entry->projected_on == 0 &&
  523                entry->vme_start <= end) {
  524           entry = entry->vme_next;
  525         }
  526         return(entry != vm_map_to_entry(map) && entry->vme_start <= end);
  527 }
  528 #endif  /* NET_ATM */
  529 
  530 /*
  531  *      kmem_alloc:
  532  *
  533  *      Allocate wired-down memory in the kernel's address map
  534  *      or a submap.  The memory is not zero-filled.
  535  */
  536 
  537 kern_return_t
  538 kmem_alloc(
  539         vm_map_t map,
  540         vm_offset_t *addrp,
  541         vm_size_t size)
  542 {
  543         vm_object_t object;
  544         vm_map_entry_t entry;
  545         vm_offset_t addr;
  546         kern_return_t kr;
  547 
  548         /*
  549          *      Allocate a new object.  We must do this before locking
  550          *      the map, lest we risk deadlock with the default pager:
  551          *              device_read_alloc uses kmem_alloc,
  552          *              which tries to allocate an object,
  553          *              which uses kmem_alloc_wired to get memory,
  554          *              which blocks for pages.
  555          *              then the default pager needs to read a block
  556          *              to process a memory_object_data_write,
  557          *              and device_read_alloc calls kmem_alloc
  558          *              and deadlocks on the map lock.
  559          */
  560 
  561         size = round_page(size);
  562         object = vm_object_allocate(size);
  563 
  564         vm_map_lock(map);
  565         kr = vm_map_find_entry(map, &addr, size, (vm_offset_t) 0,
  566                                VM_OBJECT_NULL, &entry);
  567         if (kr != KERN_SUCCESS) {
  568                 vm_map_unlock(map);
  569                 vm_object_deallocate(object);
  570                 return kr;
  571         }
  572 
  573         entry->object.vm_object = object;
  574         entry->offset = 0;
  575 
  576         /*
  577          *      Since we have not given out this address yet,
  578          *      it is safe to unlock the map.
  579          */
  580         vm_map_unlock(map);
  581 
  582         /*
   583          *      Allocate wired-down memory in the new object,
  584          *      for this entry, and enter it in the kernel pmap.
  585          */
  586         kmem_alloc_pages(object, 0,
  587                          addr, addr + size,
  588                          VM_PROT_DEFAULT);
  589 
  590         /*
  591          *      Return the memory, not zeroed.
  592          */
  593         *addrp = addr;
  594         return KERN_SUCCESS;
  595 }
  596 
  597 /*
  598  *      kmem_realloc:
  599  *
  600  *      Reallocate wired-down memory in the kernel's address map
  601  *      or a submap.  Newly allocated pages are not zeroed.
  602  *      This can only be used on regions allocated with kmem_alloc.
  603  *
  604  *      If successful, the pages in the old region are mapped twice.
  605  *      The old region is unchanged.  Use kmem_free to get rid of it.
  606  */
  607 kern_return_t kmem_realloc(
  608         vm_map_t map,
  609         vm_offset_t oldaddr,
  610         vm_size_t oldsize,
  611         vm_offset_t *newaddrp,
  612         vm_size_t newsize)
  613 {
  614         vm_offset_t oldmin, oldmax;
  615         vm_offset_t newaddr;
  616         vm_object_t object;
  617         vm_map_entry_t oldentry, newentry;
  618         kern_return_t kr;
  619 
  620         oldmin = trunc_page(oldaddr);
  621         oldmax = round_page(oldaddr + oldsize);
  622         oldsize = oldmax - oldmin;
  623         newsize = round_page(newsize);
  624 
  625         /*
  626          *      Find space for the new region.
  627          */
  628 
  629         vm_map_lock(map);
  630         kr = vm_map_find_entry(map, &newaddr, newsize, (vm_offset_t) 0,
  631                                VM_OBJECT_NULL, &newentry);
  632         if (kr != KERN_SUCCESS) {
  633                 vm_map_unlock(map);
  634                 return kr;
  635         }
  636 
  637         /*
  638          *      Find the VM object backing the old region.
  639          */
  640 
  641         if (!vm_map_lookup_entry(map, oldmin, &oldentry))
  642                 panic("kmem_realloc");
  643         object = oldentry->object.vm_object;
  644 
  645         /*
  646          *      Increase the size of the object and
  647          *      fill in the new region.
  648          */
  649 
  650         vm_object_reference(object);
  651         vm_object_lock(object);
  652         if (object->size != oldsize)
  653                 panic("kmem_realloc");
  654         object->size = newsize;
  655         vm_object_unlock(object);
  656 
  657         newentry->object.vm_object = object;
  658         newentry->offset = 0;
  659 
  660         /*
  661          *      Since we have not given out this address yet,
  662          *      it is safe to unlock the map.  We are trusting
  663          *      that nobody will play with either region.
  664          */
  665 
  666         vm_map_unlock(map);
  667 
  668         /*
  669          *      Remap the pages in the old region and
  670          *      allocate more pages for the new region.
  671          */
  672 
  673         kmem_remap_pages(object, 0,
  674                          newaddr, newaddr + oldsize,
  675                          VM_PROT_DEFAULT);
  676         kmem_alloc_pages(object, oldsize,
  677                          newaddr + oldsize, newaddr + newsize,
  678                          VM_PROT_DEFAULT);
  679 
  680         *newaddrp = newaddr;
  681         return KERN_SUCCESS;
  682 }
  683 
  684 /*
  685  *      kmem_alloc_wired:
  686  *
  687  *      Allocate wired-down memory in the kernel's address map
  688  *      or a submap.  The memory is not zero-filled.
  689  *
  690  *      The memory is allocated in the kernel_object.
  691  *      It may not be copied with vm_map_copy, and
  692  *      it may not be reallocated with kmem_realloc.
  693  */
  694 
  695 kern_return_t
  696 kmem_alloc_wired(
  697         vm_map_t map,
  698         vm_offset_t *addrp,
  699         vm_size_t size)
  700 {
  701         vm_map_entry_t entry;
  702         vm_offset_t offset;
  703         vm_offset_t addr;
  704         kern_return_t kr;
  705 
  706         /*
  707          *      Use the kernel object for wired-down kernel pages.
  708          *      Assume that no region of the kernel object is
  709          *      referenced more than once.  We want vm_map_find_entry
  710          *      to extend an existing entry if possible.
  711          */
  712 
  713         size = round_page(size);
  714         vm_map_lock(map);
  715         kr = vm_map_find_entry(map, &addr, size, (vm_offset_t) 0,
  716                                kernel_object, &entry);
  717         if (kr != KERN_SUCCESS) {
  718                 vm_map_unlock(map);
  719                 return kr;
  720         }
  721 
  722         /*
  723          *      Since we didn't know where the new region would
  724          *      start, we couldn't supply the correct offset into
  725          *      the kernel object.  We only initialize the entry
  726          *      if we aren't extending an existing entry.
  727          */
  728 
  729         offset = addr - VM_MIN_KERNEL_ADDRESS;
  730 
  731         if (entry->object.vm_object == VM_OBJECT_NULL) {
  732                 vm_object_reference(kernel_object);
  733 
  734                 entry->object.vm_object = kernel_object;
  735                 entry->offset = offset;
  736         }
  737 
  738         /*
  739          *      Since we have not given out this address yet,
  740          *      it is safe to unlock the map.
  741          */
  742         vm_map_unlock(map);
  743 
  744         /*
  745          *      Allocate wired-down memory in the kernel_object,
  746          *      for this entry, and enter it in the kernel pmap.
  747          */
  748         kmem_alloc_pages(kernel_object, offset,
  749                          addr, addr + size,
  750                          VM_PROT_DEFAULT);
  751 
  752         /*
  753          *      Return the memory, not zeroed.
  754          */
  755         *addrp = addr;
  756         return KERN_SUCCESS;
  757 }
  758 
  759 /*
  760  *      kmem_alloc_aligned:
  761  *
  762  *      Like kmem_alloc_wired, except that the memory is aligned.
   763  *      The size must be a power of 2.
  764  */
  765 
  766 kern_return_t
  767 kmem_alloc_aligned(
  768         vm_map_t map,
  769         vm_offset_t *addrp,
  770         vm_size_t size)
  771 {
  772         vm_map_entry_t entry;
  773         vm_offset_t offset;
  774         vm_offset_t addr;
  775         kern_return_t kr;
  776 
  777         if ((size & (size - 1)) != 0)
  778                 panic("kmem_alloc_aligned");
  779 
  780         /*
  781          *      Use the kernel object for wired-down kernel pages.
  782          *      Assume that no region of the kernel object is
  783          *      referenced more than once.  We want vm_map_find_entry
  784          *      to extend an existing entry if possible.
  785          */
  786 
  787         size = round_page(size);
  788         vm_map_lock(map);
  789         kr = vm_map_find_entry(map, &addr, size, size - 1,
  790                                kernel_object, &entry);
  791         if (kr != KERN_SUCCESS) {
  792                 vm_map_unlock(map);
  793                 return kr;
  794         }
  795 
  796         /*
  797          *      Since we didn't know where the new region would
  798          *      start, we couldn't supply the correct offset into
  799          *      the kernel object.  We only initialize the entry
  800          *      if we aren't extending an existing entry.
  801          */
  802 
  803         offset = addr - VM_MIN_KERNEL_ADDRESS;
  804 
  805         if (entry->object.vm_object == VM_OBJECT_NULL) {
  806                 vm_object_reference(kernel_object);
  807 
  808                 entry->object.vm_object = kernel_object;
  809                 entry->offset = offset;
  810         }
  811 
  812         /*
  813          *      Since we have not given out this address yet,
  814          *      it is safe to unlock the map.
  815          */
  816         vm_map_unlock(map);
  817 
  818         /*
  819          *      Allocate wired-down memory in the kernel_object,
  820          *      for this entry, and enter it in the kernel pmap.
  821          */
  822         kmem_alloc_pages(kernel_object, offset,
  823                          addr, addr + size,
  824                          VM_PROT_DEFAULT);
  825 
  826         /*
  827          *      Return the memory, not zeroed.
  828          */
  829         *addrp = addr;
  830         return KERN_SUCCESS;
  831 }
  832 
  833 /*
  834  *      kmem_alloc_pageable:
  835  *
  836  *      Allocate pageable memory in the kernel's address map.
  837  */
  838 
  839 kern_return_t
  840 kmem_alloc_pageable(
  841         vm_map_t map,
  842         vm_offset_t *addrp,
  843         vm_size_t size)
  844 {
  845         vm_offset_t addr;
  846         kern_return_t kr;
  847 
  848         addr = vm_map_min(map);
  849         kr = vm_map_enter(map, &addr, round_page(size),
  850                           (vm_offset_t) 0, TRUE,
  851                           VM_OBJECT_NULL, (vm_offset_t) 0, FALSE,
  852                           VM_PROT_DEFAULT, VM_PROT_ALL, VM_INHERIT_DEFAULT);
  853         if (kr != KERN_SUCCESS)
  854                 return kr;
  855 
  856         *addrp = addr;
  857         return KERN_SUCCESS;
  858 }
  859 
  860 /*
  861  *      kmem_free:
  862  *
  863  *      Release a region of kernel virtual memory allocated
  864  *      with kmem_alloc, kmem_alloc_wired, or kmem_alloc_pageable,
  865  *      and return the physical pages associated with that region.
  866  */
  867 
  868 void
  869 kmem_free(
  870         vm_map_t map,
  871         vm_offset_t addr,
  872         vm_size_t size)
  873 {
  874         kern_return_t kr;
  875 
  876         kr = vm_map_remove(map, trunc_page(addr), round_page(addr + size));
  877         if (kr != KERN_SUCCESS)
  878                 panic("kmem_free");
  879 }
  880 
  881 /*
  882  *      Allocate new wired pages in an object.
  883  *      The object is assumed to be mapped into the kernel map or
  884  *      a submap.
  885  */
  886 void
  887 kmem_alloc_pages(
  888         register vm_object_t    object,
  889         register vm_offset_t    offset,
  890         register vm_offset_t    start,
  891         register vm_offset_t    end,
  892         vm_prot_t               protection)
  893 {
  894         /*
  895          *      Mark the pmap region as not pageable.
  896          */
  897         pmap_pageable(kernel_pmap, start, end, FALSE);
  898 
  899         while (start < end) {
  900             register vm_page_t  mem;
  901 
  902             vm_object_lock(object);
  903 
  904             /*
  905              *  Allocate a page
  906              */
  907             while ((mem = vm_page_alloc(object, offset))
  908                          == VM_PAGE_NULL) {
  909                 vm_object_unlock(object);
  910                 VM_PAGE_WAIT(CONTINUE_NULL);
  911                 vm_object_lock(object);
  912             }
  913 
  914             /*
  915              *  Wire it down
  916              */
  917             vm_page_lock_queues();
  918             vm_page_wire(mem);
  919             vm_page_unlock_queues();
  920             vm_object_unlock(object);
  921 
  922             /*
  923              *  Enter it in the kernel pmap
  924              */
  925             PMAP_ENTER(kernel_pmap, start, mem,
  926                        protection, TRUE);
  927 
  928             vm_object_lock(object);
  929             PAGE_WAKEUP_DONE(mem);
  930             vm_object_unlock(object);
  931 
  932             start += PAGE_SIZE;
  933             offset += PAGE_SIZE;
  934         }
  935 }
  936 
  937 /*
  938  *      Remap wired pages in an object into a new region.
  939  *      The object is assumed to be mapped into the kernel map or
  940  *      a submap.
  941  */
  942 void
  943 kmem_remap_pages(
  944         register vm_object_t    object,
  945         register vm_offset_t    offset,
  946         register vm_offset_t    start,
  947         register vm_offset_t    end,
  948         vm_prot_t               protection)
  949 {
  950         /*
  951          *      Mark the pmap region as not pageable.
  952          */
  953         pmap_pageable(kernel_pmap, start, end, FALSE);
  954 
  955         while (start < end) {
  956             register vm_page_t  mem;
  957 
  958             vm_object_lock(object);
  959 
  960             /*
  961              *  Find a page
  962              */
  963             if ((mem = vm_page_lookup(object, offset)) == VM_PAGE_NULL)
  964                 panic("kmem_remap_pages");
  965 
  966             /*
  967              *  Wire it down (again)
  968              */
  969             vm_page_lock_queues();
  970             vm_page_wire(mem);
  971             vm_page_unlock_queues();
  972             vm_object_unlock(object);
  973 
  974             /*
  975              *  Enter it in the kernel pmap.  The page isn't busy,
  976              *  but this shouldn't be a problem because it is wired.
  977              */
  978             PMAP_ENTER(kernel_pmap, start, mem,
  979                        protection, TRUE);
  980 
  981             start += PAGE_SIZE;
  982             offset += PAGE_SIZE;
  983         }
  984 }
  985 
  986 /*
  987  *      kmem_suballoc:
  988  *
  989  *      Allocates a map to manage a subrange
  990  *      of the kernel virtual address space.
  991  *
  992  *      Arguments are as follows:
  993  *
  994  *      parent          Map to take range from
  995  *      size            Size of range to find
  996  *      min, max        Returned endpoints of map
  997  *      pageable        Can the region be paged
  998  */
  999 
 1000 vm_map_t
 1001 kmem_suballoc(
 1002         vm_map_t        parent,
 1003         vm_offset_t     *min,
 1004         vm_offset_t     *max,
 1005         vm_size_t       size,
 1006         boolean_t       pageable)
 1007 {
 1008         vm_map_t map;
 1009         vm_offset_t addr;
 1010         kern_return_t kr;
 1011 
 1012         size = round_page(size);
 1013 
 1014         /*
 1015          *      Need reference on submap object because it is internal
 1016          *      to the vm_system.  vm_object_enter will never be called
 1017          *      on it (usual source of reference for vm_map_enter).
 1018          */
 1019         vm_object_reference(vm_submap_object);
 1020 
 1021         addr = (vm_offset_t) vm_map_min(parent);
 1022         kr = vm_map_enter(parent, &addr, size,
 1023                           (vm_offset_t) 0, TRUE,
 1024                           vm_submap_object, (vm_offset_t) 0, FALSE,
 1025                           VM_PROT_DEFAULT, VM_PROT_ALL, VM_INHERIT_DEFAULT);
 1026         if (kr != KERN_SUCCESS)
 1027                 panic("kmem_suballoc");
 1028 
 1029         pmap_reference(vm_map_pmap(parent));
 1030         map = vm_map_create(vm_map_pmap(parent), addr, addr + size, pageable);
 1031         if (map == VM_MAP_NULL)
 1032                 panic("kmem_suballoc");
 1033 
 1034         kr = vm_map_submap(parent, addr, addr + size, map);
 1035         if (kr != KERN_SUCCESS)
 1036                 panic("kmem_suballoc");
 1037 
 1038         *min = addr;
 1039         *max = addr + size;
 1040         return map;
 1041 }
 1042 
 1043 /*
 1044  *      kmem_init:
 1045  *
 1046  *      Initialize the kernel's virtual memory map, taking
 1047  *      into account all memory allocated up to this time.
 1048  */
 1049 void kmem_init(
 1050         vm_offset_t     start,
 1051         vm_offset_t     end)
 1052 {
 1053         kernel_map = vm_map_create(pmap_kernel(),
 1054                                    VM_MIN_KERNEL_ADDRESS, end,
 1055                                    FALSE);
 1056 
 1057         /*
 1058          *      Reserve virtual memory allocated up to this time.
 1059          */
 1060 
 1061         if (start != VM_MIN_KERNEL_ADDRESS) {
 1062                 vm_offset_t addr = VM_MIN_KERNEL_ADDRESS;
 1063                 (void) vm_map_enter(kernel_map,
 1064                                     &addr, start - VM_MIN_KERNEL_ADDRESS,
 1065                                     (vm_offset_t) 0, TRUE,
 1066                                     VM_OBJECT_NULL, (vm_offset_t) 0, FALSE,
 1067                                     VM_PROT_DEFAULT, VM_PROT_ALL,
 1068                                     VM_INHERIT_DEFAULT);
 1069         }
 1070 }
 1071 
 1072 /*
 1073  *      New and improved IO wiring support.
 1074  */
 1075 
 1076 /*
 1077  *      kmem_io_map_copyout:
 1078  *
 1079  *      Establish temporary mapping in designated map for the memory
 1080  *      passed in.  Memory format must be a page_list vm_map_copy.
 1081  *      Mapping is READ-ONLY.
 1082  */
 1083 
 1084 kern_return_t
 1085 kmem_io_map_copyout(
 1086      vm_map_t           map,
 1087      vm_offset_t        *addr,          /* actual addr of data */
 1088      vm_offset_t        *alloc_addr,    /* page aligned addr */
 1089      vm_size_t          *alloc_size,    /* size allocated */
 1090      vm_map_copy_t      copy,
 1091      vm_size_t          min_size)       /* Do at least this much */
 1092 {
 1093         vm_offset_t     myaddr, offset;
 1094         vm_size_t       mysize, copy_size;
 1095         kern_return_t   ret;
 1096         register
 1097         vm_page_t       *page_list;
 1098         vm_map_copy_t   new_copy;
 1099         register
 1100         int             i;
 1101 
 1102         assert(copy->type == VM_MAP_COPY_PAGE_LIST);
 1103         assert(min_size != 0);
 1104 
 1105         /*
 1106          *      Figure out the size in vm pages.
 1107          */
 1108         min_size += copy->offset - trunc_page(copy->offset);
 1109         min_size = round_page(min_size);
 1110         mysize = round_page(copy->offset + copy->size) -
 1111                 trunc_page(copy->offset);
 1112 
 1113         /*
  1114          *      If the total size spans more than one page list
  1115          *      but the first page list alone satisfies min_size,
  1116          *      map only the first page list.
 1117          *
 1118          * XXX  Could be much smarter about this ... like trimming length
 1119          * XXX  if we need more than one page list but not all of them.
 1120          */
 1121 
 1122         copy_size = ptoa(copy->cpy_npages);
 1123         if (mysize > copy_size && copy_size > min_size)
 1124                 mysize = copy_size;
 1125 
 1126         /*
 1127          *      Allocate some address space in the map (must be kernel
 1128          *      space).
 1129          */
 1130         myaddr = vm_map_min(map);
 1131         ret = vm_map_enter(map, &myaddr, mysize,
 1132                           (vm_offset_t) 0, TRUE,
 1133                           VM_OBJECT_NULL, (vm_offset_t) 0, FALSE,
 1134                           VM_PROT_DEFAULT, VM_PROT_ALL, VM_INHERIT_DEFAULT);
 1135 
 1136         if (ret != KERN_SUCCESS)
 1137                 return ret;
 1138 
 1139         /*
 1140          *      Tell the pmap module that this will be wired, and
 1141          *      enter the mappings.
 1142          */
 1143         pmap_pageable(vm_map_pmap(map), myaddr, myaddr + mysize, TRUE);
 1144 
 1145         *addr = myaddr + (copy->offset - trunc_page(copy->offset));
 1146         *alloc_addr = myaddr;
 1147         *alloc_size = mysize;
 1148 
 1149         offset = myaddr;
 1150         page_list = &copy->cpy_page_list[0];
 1151         while (TRUE) {
 1152                 for ( i = 0; i < copy->cpy_npages; i++, offset += PAGE_SIZE) {
 1153                         PMAP_ENTER(vm_map_pmap(map), offset, *page_list,
 1154                                    VM_PROT_READ, TRUE);
 1155                         page_list++;
 1156                 }
 1157 
 1158                 if (offset == (myaddr + mysize))
 1159                         break;
 1160 
 1161                 /*
 1162                  *      Onward to the next page_list.  The extend_cont
 1163                  *      leaves the current page list's pages alone; 
 1164                  *      they'll be cleaned up at discard.  Reset this
 1165                  *      copy's continuation to discard the next one.
 1166                  */
 1167                 vm_map_copy_invoke_extend_cont(copy, &new_copy, &ret);
 1168 
 1169                 if (ret != KERN_SUCCESS) {
 1170                         kmem_io_map_deallocate(map, myaddr, mysize);
 1171                         return ret;
 1172                 }
 1173                 copy->cpy_cont = vm_map_copy_discard_cont;
 1174                 copy->cpy_cont_args = (char *) new_copy;
 1175                 copy = new_copy;
 1176                 page_list = &copy->cpy_page_list[0];
 1177         }
 1178 
 1179         return(ret);
 1180 }
 1181 
 1182 /*
 1183  *      kmem_io_map_deallocate:
 1184  *
 1185  *      Get rid of the mapping established by kmem_io_map_copyout.
 1186  *      Assumes that addr and size have been rounded to page boundaries.
 1187  *      (e.g., the alloc_addr and alloc_size returned by kmem_io_map_copyout)
 1188  */
 1189 
 1190 void
 1191 kmem_io_map_deallocate(
 1192         vm_map_t        map,
 1193         vm_offset_t     addr,
 1194         vm_size_t       size)
 1195 {
 1196         /*
  1209          *      Remove the mappings.  The pmap_remove is needed since
  1210          *      the mappings were entered directly with PMAP_ENTER
  1211          *      and have no backing object.
 1200         pmap_remove(vm_map_pmap(map), addr, addr + size);
 1201         vm_map_remove(map, addr, addr + size);
 1202 }
 1203 
 1204 /*
 1205  *      Routine:        copyinmap
 1206  *      Purpose:
 1207  *              Like copyin, except that fromaddr is an address
 1208  *              in the specified VM map.  This implementation
 1209  *              is incomplete; it handles the current user map
 1210  *              and the kernel map/submaps.
 1211  */
 1212 
 1213 int copyinmap(
 1214         vm_map_t        map,
 1215         void *          fromaddr,
  1216         void *          toaddr,
 1217         unsigned int    length)
 1218 {
 1219         if (vm_map_pmap(map) == kernel_pmap) {
 1220                 /* assume a correct copy */
 1221                 bcopy(fromaddr, toaddr, length);
 1222                 return 0;
 1223         }
 1224 
 1225         if (current_map() == map)
  1226                 return copyin(fromaddr, toaddr, length);
 1227 
 1228         return 1;
 1229 }
 1230 
 1231 /*
 1232  *      Routine:        copyoutmap
 1233  *      Purpose:
 1234  *              Like copyout, except that toaddr is an address
 1235  *              in the specified VM map.  This implementation
 1236  *              is incomplete; it handles the current user map
 1237  *              and the kernel map/submaps.
 1238  */
 1239 
 1240 int copyoutmap(
 1241         vm_map_t        map,
 1242         void *          fromaddr,
  1243         void *          toaddr,
 1244         unsigned int    length)
 1245 {
 1246         if (vm_map_pmap(map) == kernel_pmap) {
 1247                 /* assume a correct copy */
 1248                 bcopy(fromaddr, toaddr, length);
 1249                 return 0;
 1250         }
 1251 
 1252         if (current_map() == map)
 1253                 return copyout(fromaddr, toaddr, length);
 1254 
 1255         return 1;
 1256 }

This page is part of the FreeBSD/Linux Linux Kernel Cross-Reference, and was automatically generated using a modified version of the LXR engine.