
FreeBSD/Linux Kernel Cross Reference
sys/bsd/net/bridge.c


    1 /*
    2  * Copyright (c) 2000 Apple Computer, Inc. All rights reserved.
    3  *
    4  * @APPLE_LICENSE_HEADER_START@
    5  * 
    6  * Copyright (c) 1999-2003 Apple Computer, Inc.  All Rights Reserved.
    7  * 
    8  * This file contains Original Code and/or Modifications of Original Code
    9  * as defined in and that are subject to the Apple Public Source License
   10  * Version 2.0 (the 'License'). You may not use this file except in
   11  * compliance with the License. Please obtain a copy of the License at
   12  * http://www.opensource.apple.com/apsl/ and read it before using this
   13  * file.
   14  * 
   15  * The Original Code and all software distributed under the License are
   16  * distributed on an 'AS IS' basis, WITHOUT WARRANTY OF ANY KIND, EITHER
   17  * EXPRESS OR IMPLIED, AND APPLE HEREBY DISCLAIMS ALL SUCH WARRANTIES,
   18  * INCLUDING WITHOUT LIMITATION, ANY WARRANTIES OF MERCHANTABILITY,
   19  * FITNESS FOR A PARTICULAR PURPOSE, QUIET ENJOYMENT OR NON-INFRINGEMENT.
   20  * Please see the License for the specific language governing rights and
   21  * limitations under the License.
   22  * 
   23  * @APPLE_LICENSE_HEADER_END@
   24  */
   25 /*
   26  * Copyright (c) 1998 Luigi Rizzo
   27  *
   28  * Redistribution and use in source and binary forms, with or without
   29  * modification, are permitted provided that the following conditions
   30  * are met:
   31  * 1. Redistributions of source code must retain the above copyright
   32  *    notice, this list of conditions and the following disclaimer.
   33  * 2. Redistributions in binary form must reproduce the above copyright
   34  *    notice, this list of conditions and the following disclaimer in the
   35  *    documentation and/or other materials provided with the distribution.
   36  *
   37  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND
   38  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   39  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   40  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
   41  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
   42  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
   43  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
   44  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
   45  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
   46  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
   47  * SUCH DAMAGE.
   48  *
   49  * $FreeBSD: src/sys/net/bridge.c,v 1.16.2.14 2001/02/09 23:13:41 luigi Exp $
   50  */
   51 
   52 /*
   53  * This code implements bridging in FreeBSD. It only acts on ethernet
   54  * type of interfaces (others are still usable for routing).
   55  * A bridging table holds the source MAC address/dest. interface for each
   56  * known node. The table is indexed using a hash of the source address.
   57  *
   58  * Input packets are tapped near the beginning of ether_input(), and
   59  * analysed by calling bridge_in(). Depending on the result, the packet
   60  * can be forwarded to one or more output interfaces using bdg_forward(),
   61  * and/or sent to the upper layer (e.g. in case of multicast).
   62  *
   63  * Output packets are intercepted near the end of ether_output(),
   64  * the correct destination is selected by calling bridge_dst_lookup(),
   65  * and then forwarding is done using bdg_forward().
   66  * Bridging is controlled by the sysctl variable net.link.ether.bridge.
   67  *
   68  * The ARP code is also modified to let a machine answer requests
   69  * irrespective of the port the request came from.
   70  *
   71  * In case of loops in the bridging topology, the bridge detects this
   72  * event and temporarily mutes output bridging on one of the ports.
   73  * Periodically, interfaces are unmuted by bdg_timeout().
   74  * Muting is only implemented as a safety measure, and also as
   75  * a mechanism to support a user-space implementation of the spanning
   76  * tree algorithm. In the final release, unmuting will only occur
   77  * because of explicit action of the user-level daemon.
   78  *
   79  * To build a bridging kernel, use the following option
   80  *    option BRIDGE
   81  * and then at runtime set the sysctl variable to enable bridging.
   82  *
   83  * Only one interface is supposed to have addresses set (but
   84  * there are no problems in practice if you set addresses for more
   85  * than one interface).
   86  * Bridging will act before routing, but nothing prevents a machine
   87  * from doing both (modulo bugs in the implementation...).
   88  *
   89  * THINGS TO REMEMBER
   90  *  - bridging is incompatible with multicast routing on the same
   91  *    machine. There is not an easy fix to this.
   92  *  - loop detection is still not very robust.
   93  *  - the interface of bdg_forward() could be improved.
   94  */
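The header comment above describes a table indexed by a hash of the source MAC address. A minimal user-space sketch of that learning scheme follows. The names `fdb_entry`, `fdb_hash`, `fdb_init`, `fdb_learn`, `fdb_lookup`, and `FDB_SIZE` are hypothetical and chosen only for illustration; the hash is not the kernel's `HASH_FN`, and ports are plain integers rather than `struct ifnet` pointers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FDB_SIZE    1024    /* hypothetical table size (power of two) */
#define FDB_NO_PORT (-1)    /* empty slot, or lookup miss */

struct fdb_entry {
    uint8_t mac[6];         /* learned source address */
    int     port;           /* port the address was last seen on */
};

/* Illustrative hash over the low-order address bytes. */
static unsigned
fdb_hash(const uint8_t mac[6])
{
    unsigned h = ((unsigned)(mac[2] ^ mac[3]) << 8) |
        (unsigned)(mac[4] ^ mac[5]);

    return h & (FDB_SIZE - 1);
}

/* Mark every slot empty. */
static void
fdb_init(struct fdb_entry *t)
{
    assert(t != NULL);
    for (int i = 0; i < FDB_SIZE; i++)
        t[i].port = FDB_NO_PORT;
}

/* Record that src was seen on port; a colliding entry is overwritten. */
static void
fdb_learn(struct fdb_entry *t, const uint8_t src[6], int port)
{
    struct fdb_entry *e = &t[fdb_hash(src)];

    memcpy(e->mac, src, 6);
    e->port = port;
}

/* Return the learned port for dst, or FDB_NO_PORT if it is unknown. */
static int
fdb_lookup(const struct fdb_entry *t, const uint8_t dst[6])
{
    const struct fdb_entry *e = &t[fdb_hash(dst)];

    if (e->port != FDB_NO_PORT && memcmp(e->mac, dst, 6) == 0)
        return e->port;
    return FDB_NO_PORT;
}
```

A caller would invoke `fdb_learn()` with the source address of each received frame and `fdb_lookup()` with its destination; a miss corresponds to the case where the frame has to be sent to every port.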
   95 
   96 #include <sys/param.h>
   97 #include <sys/mbuf.h>
   98 #include <sys/malloc.h>
   99 #include <sys/systm.h>
  100 #include <sys/socket.h> /* for net/if.h */
  101 #include <sys/kernel.h>
  102 #include <sys/sysctl.h>
  103 
  104 #include <net/if.h>
  105 #include <net/if_types.h>
  106 
  107 #include <netinet/in.h> /* for struct arpcom */
  108 #include <netinet/in_systm.h>
  109 #include <netinet/in_var.h>
  110 #include <netinet/ip.h>
  111 #include <netinet/if_ether.h> /* for struct arpcom */
  112 
  113 #include "opt_ipfw.h" 
  114 #include "opt_ipdn.h" 
  115 
  116 #if defined(IPFIREWALL)
  117 #include <net/route.h>
  118 #include <netinet/ip_fw.h>
  119 #if defined(DUMMYNET)
  120 #include <netinet/ip_dummynet.h>
  121 #endif
  122 #endif
  123 
  124 #include <net/bridge.h>
  125 
  126 /*
  127  * For debugging, you can use the following macros.
  128  * remember, rdtsc() only works on Pentium-class machines
  129 
  130     quad_t ticks;
  131     DDB(ticks = rdtsc();)
  132     ... interesting code ...
  133     DDB(bdg_fw_ticks += (u_long)(rdtsc() - ticks) ; bdg_fw_count++ ;)
  134 
  135  *
  136  */
  137 
  138 #define DDB(x) x
  139 #define DEB(x)
  140 
  141 static void bdginit(void *);
  142 static void bdgtakeifaces(void);
  143 static void flush_table(void);
  144 static void bdg_promisc_on(void);
  145 static void parse_bdg_cfg(void);
  146 
  147 static int bdg_ipfw = 0 ;
  148 int do_bridge = 0;
  149 bdg_hash_table *bdg_table = NULL ;
  150 
  151 /*
  152  * System initialization
  153  */
  154 
  155 SYSINIT(interfaces, SI_SUB_PROTO_IF, SI_ORDER_FIRST, bdginit, NULL)
  156 
  157 static struct bdg_stats bdg_stats ;
  158 struct bdg_softc *ifp2sc = NULL ;
  159 /* XXX make it static of size BDG_MAX_PORTS */
  160 
  161 #define IFP_CHK(ifp, x) \
  162         if (ifp2sc[ifp->if_index].magic != 0xDEADBEEF) { x ; }
  163 
  164 /*
  165  * turn off promisc mode, optionally clear the IFF_USED flag.
  166  * The flag is turned on by parse_bdg_cfg().
  167  */
  168 static void
  169 bdg_promisc_off(int clear_used)
  170 {
  171     struct ifnet *ifp ;
  172     TAILQ_FOREACH(ifp, &ifnet, if_link) {
  173         if ( (ifp2sc[ifp->if_index].flags & IFF_BDG_PROMISC) ) {
  174             int s, ret ;
  175             s = splimp();
  176             ret = ifpromisc(ifp, 0);
  177             splx(s);
  178             ifp2sc[ifp->if_index].flags &= ~(IFF_BDG_PROMISC|IFF_MUTE) ;
  179             DEB(printf(">> now %s%d promisc OFF if_flags 0x%x bdg_flags 0x%x\n",
  180                     ifp->if_name, ifp->if_unit,
  181                     ifp->if_flags, ifp2sc[ifp->if_index].flags);)
  182         }
  183         if (clear_used) {
  184             ifp2sc[ifp->if_index].flags &= ~(IFF_USED) ;
  185             bdg_stats.s[ifp->if_index].name[0] = '\0';
  186         }
  187     }
  188 }
  189 
  190 /*
  191  * set promisc mode on the interfaces we use.
  192  */
  193 static void
  194 bdg_promisc_on()
  195 {
  196     struct ifnet *ifp ;
  197     int s ;
  198 
  199     TAILQ_FOREACH(ifp, &ifnet, if_link) {
  200         if ( !BDG_USED(ifp) )
  201             continue ;
  202         if ( 0 == ( ifp->if_flags & IFF_UP) ) {
  203             s = splimp();
  204             if_up(ifp);
  205             splx(s);
  206         }
  207         if ( !(ifp2sc[ifp->if_index].flags & IFF_BDG_PROMISC) ) {
  208             int ret ;
  209             s = splimp();
  210             ret = ifpromisc(ifp, 1);
  211             splx(s);
  212             ifp2sc[ifp->if_index].flags |= IFF_BDG_PROMISC ;
  213             printf(">> now %s%d promisc ON if_flags 0x%x bdg_flags 0x%x\n",
  214                     ifp->if_name, ifp->if_unit,
  215                     ifp->if_flags, ifp2sc[ifp->if_index].flags);
  216         }
  217         if (BDG_MUTED(ifp)) {
  218             printf(">> unmuting %s%d\n", ifp->if_name, ifp->if_unit);
  219             BDG_UNMUTE(ifp) ;
  220        }
  221     }
  222 }
  223 
  224 static int
  225 sysctl_bdg(SYSCTL_HANDLER_ARGS)
  226 {
  227     int error, oldval = do_bridge ;
  228 
  229     error = sysctl_handle_int(oidp,
  230         oidp->oid_arg1, oidp->oid_arg2, req);
  231     DEB( printf("called sysctl for bridge name %s arg2 %d val %d->%d\n",
  232         oidp->oid_name, oidp->oid_arg2,
  233         oldval, do_bridge); )
  234 
  235     if (bdg_table == NULL)
  236         do_bridge = 0 ;
  237     if (oldval != do_bridge) {
  238         bdg_promisc_off( 1 ); /* reset previously used interfaces */
  239         flush_table();
  240         if (do_bridge) {
  241             parse_bdg_cfg();
  242             bdg_promisc_on();
  243         }
  244     }
  245     return error ;
  246 }
  247 
  248 static char bridge_cfg[256] = { "" } ;
  249 
  250 /*
  251  * parse the config string, set IFF_USED, name and cluster_id
  252  * for all interfaces found.
  253  */
  254 static void
  255 parse_bdg_cfg()
  256 {
  257     char *p, *beg ;
  258     int i, l, cluster;
  259     struct bdg_softc *b;
  260 
  261     for (p= bridge_cfg; *p ; p++) {
  262         /* interface names begin with [a-z]  and continue up to ':' */
  263         if (*p < 'a' || *p > 'z')
  264             continue ;
  265         for ( beg = p ; *p && *p != ':' ; p++ )
  266             ;
  267         if (*p == 0) /* end of string, ':' not found */
  268             return ;
  269         l = p - beg ; /* length of name string */
  270         p++ ;
  271         DEB(printf("-- match beg(%d) <%s> p <%s>\n", l, beg, p);)
  272         for (cluster = 0 ; *p && *p >= '0' && *p <= '9' ; p++)
  273             cluster = cluster*10 + (*p -'0');
  274         /*
  275          * now search in bridge strings
  276          */
  277         for (i=0, b = ifp2sc ; i < if_index ; i++, b++) {
  278             char buf[32];
  279             struct ifnet *ifp = b->ifp ;
  280 
  281             if (ifp == NULL)
  282                 continue;
  283             sprintf(buf, "%s%d", ifp->if_name, ifp->if_unit);
  284             if (!strncmp(beg, buf, l)) { /* XXX not correct for >10 if! */
  285                 b->cluster_id = htons(cluster) ;
  286                 b->flags |= IFF_USED ;
  287                 sprintf(bdg_stats.s[ifp->if_index].name,
  288                         "%s%d:%d", ifp->if_name, ifp->if_unit, cluster);
  289 
  290                 DEB(printf("--++  found %s\n",
  291                     bdg_stats.s[ifp->if_index].name);)
  292                 break ;
  293             }
  294         }
  295         if (*p == '\0')
  296             break ;
  297     }
  298 }
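The configuration string parsed above is a list of `name:cluster` entries, such as the `"%s%d:1,"` entries that bdgtakeifaces() generates. The following user-space sketch tokenizes such a string one entry at a time, under the same grammar (a name starts with a lowercase letter and runs up to `:`, followed by decimal digits). The function `cfg_next` and its parameters are hypothetical, not part of the kernel source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Extract the next "name:cluster" entry starting at *pp.
 * Returns 1 and fills name/cluster when an entry is found,
 * 0 when the end of the string is reached. *pp is advanced.
 */
static int
cfg_next(const char **pp, char *name, size_t namesz, int *cluster)
{
    const char *p, *beg;
    size_t len;

    assert(pp != NULL && name != NULL && cluster != NULL && namesz > 0);
    p = *pp;
    /* skip separators until a lowercase letter starts a name */
    while (*p && (*p < 'a' || *p > 'z'))
        p++;
    for (beg = p; *p && *p != ':'; p++)
        ;
    if (*p == '\0') {           /* end of string, ':' not found */
        *pp = p;
        return 0;
    }
    len = (size_t)(p - beg);
    if (len >= namesz)
        len = namesz - 1;       /* truncate overlong names */
    memcpy(name, beg, len);
    name[len] = '\0';
    p++;                        /* skip ':' */
    for (*cluster = 0; *p >= '0' && *p <= '9'; p++)
        *cluster = *cluster * 10 + (*p - '0');
    *pp = p;
    return 1;
}
```

Because the name is returned as a NUL-terminated string, a caller can compare it against an interface name with `strcmp()`. That gives an exact match, whereas the length-limited `strncmp()` in the listing is a prefix comparison, which is the limitation its XXX comment flags.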
  299 
  300 static int
  301 sysctl_bdg_cfg(SYSCTL_HANDLER_ARGS)
  302 {
  303     int error = 0 ;
  304     char oldval[256] ;
  305 
  306     strcpy(oldval, bridge_cfg) ;
  307 
  308     error = sysctl_handle_string(oidp,
  309             bridge_cfg, oidp->oid_arg2, req);
  310     DEB(
  311         printf("called sysctl for bridge name %s arg2 %d err %d val %s->%s\n",
  312                 oidp->oid_name, oidp->oid_arg2,
  313                 error,
  314                 oldval, bridge_cfg);
  315         )
  316     if (strcmp(oldval, bridge_cfg)) {
  317         bdg_promisc_off( 1 );  /* reset previously-used interfaces */
  318         flush_table();
  319         parse_bdg_cfg();        /* and set new ones... */
  320         if (do_bridge)
  321             bdg_promisc_on();   /* re-enable interfaces */
  322     }
  323     return error ;
  324 }
  325 
  326 static int
  327 sysctl_refresh(SYSCTL_HANDLER_ARGS)
  328 {
  329     if (req->newptr)
  330             bdgtakeifaces();
  331     
  332     return 0;
  333 }
  334 
  335 
  336 SYSCTL_DECL(_net_link_ether);
  337 SYSCTL_PROC(_net_link_ether, OID_AUTO, bridge_cfg, CTLTYPE_STRING|CTLFLAG_RW,
  338             &bridge_cfg, sizeof(bridge_cfg), &sysctl_bdg_cfg, "A",
  339             "Bridge configuration");
  340 
  341 SYSCTL_PROC(_net_link_ether, OID_AUTO, bridge, CTLTYPE_INT|CTLFLAG_RW,
  342             &do_bridge, 0, &sysctl_bdg, "I", "Bridging");
  343 
  344 SYSCTL_INT(_net_link_ether, OID_AUTO, bridge_ipfw, CTLFLAG_RW,
  345             &bdg_ipfw,0,"Pass bridged pkts through firewall");
  346 
  347 #define SY(parent, var, comment)                        \
  348         static int var ;                                \
  349         SYSCTL_INT(parent, OID_AUTO, var, CTLFLAG_RW, &(var), 0, comment);
  350 
  351 int bdg_ipfw_drops;
  352 SYSCTL_INT(_net_link_ether, OID_AUTO, bridge_ipfw_drop,
  353         CTLFLAG_RW, &bdg_ipfw_drops,0,"");
  354 
  355 int bdg_ipfw_colls;
  356 SYSCTL_INT(_net_link_ether, OID_AUTO, bridge_ipfw_collisions,
  357         CTLFLAG_RW, &bdg_ipfw_colls,0,"");
  358 
  359 SYSCTL_PROC(_net_link_ether, OID_AUTO, bridge_refresh, CTLTYPE_INT|CTLFLAG_WR,
  360             NULL, 0, &sysctl_refresh, "I", "iface refresh");
  361 
  362 #if 1 /* diagnostic vars */
  363 
  364 SY(_net_link_ether, verbose, "Be verbose");
  365 SY(_net_link_ether, bdg_split_pkts, "Packets split in bdg_forward");
  366 
  367 SY(_net_link_ether, bdg_thru, "Packets through bridge");
  368 
  369 SY(_net_link_ether, bdg_copied, "Packets copied in bdg_forward");
  370 
  371 SY(_net_link_ether, bdg_copy, "Force copy in bdg_forward");
  372 SY(_net_link_ether, bdg_predict, "Correctly predicted header location");
  373 
  374 SY(_net_link_ether, bdg_fw_avg, "Cycle counter avg");
  375 SY(_net_link_ether, bdg_fw_ticks, "Cycle counter item");
  376 SY(_net_link_ether, bdg_fw_count, "Cycle counter count");
  377 #endif
  378 
  379 SYSCTL_STRUCT(_net_link_ether, PF_BDG, bdgstats,
  380         CTLFLAG_RD, &bdg_stats , bdg_stats, "bridge statistics");
  381 
  382 static int bdg_loops ;
  383 
  384 /*
  385  * completely flush the bridge table.
  386  */
  387 static void
  388 flush_table()
  389 {   
  390     int s,i;
  391 
  392     if (bdg_table == NULL)
  393         return ;
  394     s = splimp();
  395     for (i=0; i< HASH_SIZE; i++)
  396         bdg_table[i].name= NULL; /* clear table */
  397     splx(s);
  398 }
  399 
  400 /* wrapper for funnel */
  401 void
  402 bdg_timeout_funneled(void * dummy)
  403 {
  404     boolean_t   funnel_state;
  405         
  406     funnel_state = thread_funnel_set(network_flock, TRUE);
  407     bdg_timeout(dummy);
  408     funnel_state = thread_funnel_set(network_flock, FALSE);
  409 }
  410 
  411 /*
  412  * called periodically to flush entries etc.
  413  */
  414 static void
  415 bdg_timeout(void *dummy)
  416 {
  417     static int slowtimer = 0 ;
  418 
  419     if (do_bridge) {
  420         static int age_index = 0 ; /* index of table position to age */
  421         int l = age_index + HASH_SIZE/4 ;
  422         /*
  423          * age entries in the forwarding table.
  424          */
  425         if (l > HASH_SIZE)
  426             l = HASH_SIZE ;
  427         for (; age_index < l ; age_index++)
  428             if (bdg_table[age_index].used)
  429                 bdg_table[age_index].used = 0 ;
  430             else if (bdg_table[age_index].name) {
  431                 /* printf("xx flushing stale entry %d\n", age_index); */
  432                 bdg_table[age_index].name = NULL ;
  433             }
  434         if (age_index >= HASH_SIZE)
  435             age_index = 0 ;
  436 
  437         if (--slowtimer <= 0 ) {
  438             slowtimer = 5 ;
  439 
  440             bdg_promisc_on() ; /* we just need unmute, really */
  441             bdg_loops = 0 ;
  442         }
  443     }
  444     timeout(bdg_timeout_funneled, (void *)0, 2*hz );
  445 }
  446 
  447 /*
  448  * local MAC addresses are held in a small array. This makes comparisons
  449  * much faster.
  450  */
  451 bdg_addr bdg_addresses[BDG_MAX_PORTS];
  452 int bdg_ports ;
  453 
  454 /*
  455  * initialization of bridge code. This needs to be done after all
  456  * interfaces have been configured.
  457  */
  458 static void
  459 bdginit(void *dummy)
  460 {
  461 
  462     if (bdg_table == NULL)
  463         bdg_table = (struct hash_table *)
  464                 _MALLOC(HASH_SIZE * sizeof(struct hash_table),
  465                     M_IFADDR, M_WAITOK);
  466     flush_table();
  467 
  468     ifp2sc = _MALLOC(BDG_MAX_PORTS * sizeof(struct bdg_softc),
  469                 M_IFADDR, M_WAITOK );
  470     bzero(ifp2sc, BDG_MAX_PORTS * sizeof(struct bdg_softc) );
  471 
  472     bzero(&bdg_stats, sizeof(bdg_stats) );
  473     bdgtakeifaces();
  474     bdg_timeout(0);
  475     do_bridge=0;
  476 }
  477     
  478 void
  479 bdgtakeifaces(void)
  480 {
  481     int i ;
  482     struct ifnet *ifp;
  483     struct arpcom *ac ;
  484     bdg_addr *p = bdg_addresses ;
  485     struct bdg_softc *bp;
  486 
  487     bdg_ports = 0 ;
  488     *bridge_cfg = '\0';
  489 
  490     printf("BRIDGE 010131, have %d interfaces\n", if_index);
  491     for (i = 0 , ifp = ifnet.tqh_first ; i < if_index ;
  492                 i++, ifp = TAILQ_NEXT(ifp, if_link) )
  493         if (ifp->if_type == IFT_ETHER) { /* ethernet ? */
  494             bp = &ifp2sc[ifp->if_index] ;
  495             ac = (struct arpcom *)ifp;
  496             sprintf(bridge_cfg + strlen(bridge_cfg),
  497                 "%s%d:1,", ifp->if_name, ifp->if_unit);
  498             printf("-- index %d %s type %d phy %d addrl %d addr %6D\n",
  499                     ifp->if_index,
  500                     bdg_stats.s[ifp->if_index].name,
  501                     (int)ifp->if_type, (int) ifp->if_physical,
  502                     (int)ifp->if_addrlen,
  503                     ac->ac_enaddr, "." );
  504             bcopy(ac->ac_enaddr, p->etheraddr, 6);
  505             p++ ;
  506             bp->ifp = ifp ;
  507             bp->flags = IFF_USED ;
  508             bp->cluster_id = htons(1) ;
  509             bp->magic = 0xDEADBEEF ;
  510 
  511             sprintf(bdg_stats.s[ifp->if_index].name,
  512                 "%s%d:%d", ifp->if_name, ifp->if_unit,
  513                 ntohs(bp->cluster_id));
  514             bdg_ports ++ ;
  515         }
  516 
  517 }
  518 
  519 /*
  520  * bridge_in() is invoked to perform bridging decision on input packets.
  521  *
  522  * On Input:
  523  *   eh         Ethernet header of the incoming packet.
  524  *
  525  * On Return: destination of packet, one of
  526  *   BDG_BCAST  broadcast
  527  *   BDG_MCAST  multicast
  528  *   BDG_LOCAL  is only for a local address (do not forward)
  529  *   BDG_DROP   drop the packet
  530  *   ifp        ifp of the destination interface.
  531  *
  532  * Forwarding is not done directly to give a chance to some drivers
  533  * to fetch more of the packet, or simply drop it completely.
  534  */
  535 
  536 struct ifnet *
  537 bridge_in(struct ifnet *ifp, struct ether_header *eh)
  538 {
  539     int index;
  540     struct ifnet *dst , *old ;
  541     int dropit = BDG_MUTED(ifp) ;
  542 
  543     /*
  544      * hash the source address
  545      */
  546     index= HASH_FN(eh->ether_shost);
  547     bdg_table[index].used = 1 ;
  548     old = bdg_table[index].name ;
  549     if ( old ) { /* the entry is valid. */
  550         IFP_CHK(old, printf("bridge_in-- reading table\n") );
  551 
  552         if (!BDG_MATCH( eh->ether_shost, bdg_table[index].etheraddr) ) {
  553             bdg_ipfw_colls++ ;
  554             bdg_table[index].name = NULL ;
  555         } else if (old != ifp) {
  556             /*
  557              * found a loop. Either a machine has moved, or there
  558              * is a misconfiguration/reconfiguration of the network.
  559              * First, do not forward this packet!
  560              * Record the relocation anyways; then, if loops persist,
  561              * suspect a reconfiguration and disable forwarding
  562              * from the old interface.
  563              */
  564             bdg_table[index].name = ifp ; /* relocate address */
  565             printf("-- loop (%d) %6D to %s%d from %s%d (%s)\n",
  566                         bdg_loops, eh->ether_shost, ".",
  567                         ifp->if_name, ifp->if_unit,
  568                         old->if_name, old->if_unit,
  569                         BDG_MUTED(old) ? "muted":"active");
  570             dropit = 1 ;
  571             if ( !BDG_MUTED(old) ) {
  572                 if (++bdg_loops > 10)
  573                     BDG_MUTE(old) ;
  574             }
  575         }
  576     }
  577 
  578     /*
  579      * now write the source address into the table
  580      */
  581     if (bdg_table[index].name == NULL) {
  582         DEB(printf("new addr %6D at %d for %s%d\n",
  583             eh->ether_shost, ".", index, ifp->if_name, ifp->if_unit);)
  584         bcopy(eh->ether_shost, bdg_table[index].etheraddr, 6);
  585         bdg_table[index].name = ifp ;
  586     }
  587     dst = bridge_dst_lookup(eh);
  588     /* Return values:
  589      *   BDG_BCAST, BDG_MCAST, BDG_LOCAL, BDG_UNKNOWN, BDG_DROP, ifp.
  590      * For muted interfaces, the first 3 are changed to BDG_LOCAL,
  591      * and others to BDG_DROP. Also, for incoming packets, ifp is changed
  592      * to BDG_DROP in case ifp == src . These mods are not necessary
  593      * for outgoing packets from ether_output().
  594      */
  595     BDG_STAT(ifp, BDG_IN);
  596     switch ((int)dst) {
  597     case (int)BDG_BCAST:
  598     case (int)BDG_MCAST:
  599     case (int)BDG_LOCAL:
  600     case (int)BDG_UNKNOWN:
  601     case (int)BDG_DROP:
  602         BDG_STAT(ifp, dst);
  603         break ;
  604     default :
  605         if (dst == ifp || dropit )
  606             BDG_STAT(ifp, BDG_DROP);
  607         else
  608             BDG_STAT(ifp, BDG_FORWARD);
  609         break ;
  610     }
  611 
  612     if ( dropit ) {
  613         if (dst == BDG_BCAST || dst == BDG_MCAST || dst == BDG_LOCAL)
  614             return BDG_LOCAL ;
  615         else
  616             return BDG_DROP ;
  617     } else {
  618         return (dst == ifp ? BDG_DROP : dst ) ;
  619     }
  620 }
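The final return logic of bridge_in() can be summarized as a small decision function. The sketch below restates it with an enum in place of the pointer-encoded `BDG_*` values; `enum verdict`, `final_verdict`, and the `V_*` constants are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Stand-ins for BDG_BCAST, BDG_MCAST, BDG_LOCAL, BDG_UNKNOWN, BDG_DROP, ifp. */
enum verdict { V_BCAST, V_MCAST, V_LOCAL, V_UNKNOWN, V_DROP, V_PORT };

/*
 * dst        result of the destination lookup
 * dst_is_src nonzero if the lookup returned the receiving port itself
 * muted      nonzero if the receiving port is muted (or a loop was seen)
 */
static enum verdict
final_verdict(enum verdict dst, int dst_is_src, int muted)
{
    assert(dst <= V_PORT);
    if (muted) {
        /* still deliver locally, but never forward */
        if (dst == V_BCAST || dst == V_MCAST || dst == V_LOCAL)
            return V_LOCAL;
        return V_DROP;
    }
    if (dst == V_PORT && dst_is_src)
        return V_DROP;          /* destination is on the receiving segment */
    return dst;
}
```

For example, a broadcast frame arriving on a muted port is still handed to the local stack (`V_LOCAL`), while a unicast frame for another port is dropped.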
  621 
  622 /*
  623  * Forward to dst, excluding src port and muted interfaces.
  624  * If src == NULL, the pkt comes from ether_output, and dst is the real
  625  * interface the packet is originally sent to. In this case we must forward
  626  * it to the whole cluster. We never call bdg_forward from ether_output on
  627  * interfaces which are not part of a cluster.
  628  *
  629  * The packet is freed if possible (i.e. surely not of interest for
  630  * the upper layer), otherwise a copy is left for use by the caller
  631  * (pointer in m0).
  632  *
  633  * It would be more efficient to make bdg_forward() always consume
  634  * the packet, leaving to the caller the task to check if it needs a copy
  635  * and get one in case. As it is now, bdg_forward() can sometimes make
  636  * a copy whereas it is not necessary.
  637  *
  638  * XXX be careful about eh, it can be a pointer into *m
  639  */
  640 struct mbuf *
  641 bdg_forward(struct mbuf *m0, struct ether_header *const eh, struct ifnet *dst)
  642 {
  643     struct ifnet *src = m0->m_pkthdr.rcvif; /* could be NULL in output */
  644     struct ifnet *ifp, *last = NULL ;
  645     int s ;
  646     int shared = bdg_copy ; /* someone else is using the mbuf */
  647     int once = 0;      /* loop only once */
  648     struct ifnet *real_dst = dst ; /* real dst from ether_output */
  649 #ifdef IPFIREWALL
  650     struct ip_fw_chain *rule = NULL ; /* did we match a firewall rule ? */
  651 #endif
  652 
  653     /*
  654      * XXX eh is usually a pointer within the mbuf (some ethernet drivers
  655      * do that), so we better copy it before doing anything with the mbuf,
  656      * or we might corrupt the header.
  657      */
  658     struct ether_header save_eh = *eh ;
  659 
  660 #if defined(IPFIREWALL) && defined(DUMMYNET)
  661     if (m0->m_type == MT_DUMMYNET) {
  662         /* extract info from dummynet header */
  663         rule = (struct ip_fw_chain *)(m0->m_data) ;
  664         m0 = m0->m_next ;
  665         src = m0->m_pkthdr.rcvif;
  666         shared = 0 ; /* For sure this is our own mbuf. */
  667     } else
  668 #endif
  669     bdg_thru++; /* only count once */
  670 
  671     if (src == NULL) /* packet from ether_output */
  672         dst = bridge_dst_lookup(eh);
  673     if (dst == BDG_DROP) { /* this should not happen */
  674         printf("xx bdg_forward for BDG_DROP\n");
  675         m_freem(m0);
  676         return NULL;
  677     }
  678     if (dst == BDG_LOCAL) { /* this should not happen either */
  679         printf("xx ouch, bdg_forward for local pkt\n");
  680         return m0;
  681     }
  682     if (dst == BDG_BCAST || dst == BDG_MCAST || dst == BDG_UNKNOWN) {
  683         ifp = ifnet.tqh_first ; /* scan all ports */
  684         once = 0 ;
  685         if (dst != BDG_UNKNOWN) /* need a copy for the local stack */
  686             shared = 1 ;
  687     } else {
  688         ifp = dst ;
  689         once = 1 ;
  690     }
  691     if ( (u_int)(ifp) <= (u_int)BDG_FORWARD )
  692         panic("bdg_forward: bad dst");
  693 
  694 #ifdef IPFIREWALL
  695     /*
  696      * Do filtering in a very similar way to what is done in ip_output.
  697      * Only if firewall is loaded, enabled, and the packet is not
  698      * from ether_output() (src==NULL, or we would filter it twice).
  699      * Additional restrictions may apply e.g. non-IP, short packets,
  700      * and pkts already gone through a pipe.
  701      */
  702     if (ip_fw_chk_ptr && bdg_ipfw != 0 && src != NULL) {
  703         struct ip *ip ;
  704         int i;
  705 
  706         if (rule != NULL) /* dummynet packet, already partially processed */
  707             goto forward; /* HACK! I should obey the fw_one_pass */
  708         if (ntohs(save_eh.ether_type) != ETHERTYPE_IP)
  709             goto forward ; /* not an IP packet, ipfw is not appropriate */
  710         if (m0->m_pkthdr.len < sizeof(struct ip) )
  711             goto forward ; /* header too short for an IP pkt, cannot filter */
  712         /*
  713          * We need some amount of data to be contiguous, and in case others need
  714          * the packet (shared==1) also better be in the first mbuf.
  715          */
  716         i = min(m0->m_pkthdr.len, max_protohdr) ;
  717         if ( shared || m0->m_len < i) {
  718             m0 = m_pullup(m0, i) ;
  719             if (m0 == NULL) {
  720                 printf("-- bdg: pullup failed.\n") ;
  721                 return NULL ;
  722             }
  723         }
  724 
  725         /*
  726          * before calling the firewall, swap fields the same as IP does.
  727          * here we assume the pkt is an IP one and the header is contiguous
  728          */
  729         ip = mtod(m0, struct ip *);
  730         NTOHS(ip->ip_len);
  731         NTOHS(ip->ip_off);
  732 
  733         /*
  734          * The third parameter to the firewall code is the dst. interface.
  735          * Since we apply checks only on input pkts we use NULL.
  736          * The firewall knows this is a bridged packet as the cookie ptr
  737          * is NULL.
  738          */
  739         i = (*ip_fw_chk_ptr)(&ip, 0, NULL, NULL /* cookie */, &m0, &rule, NULL);
  740         if ( (i & IP_FW_PORT_DENY_FLAG) || m0 == NULL) /* drop */
  741             return m0 ;
  742         /*
  743          * If we get here, the firewall has passed the pkt, but the mbuf
  744          * pointer might have changed. Restore ip and the fields NTOHS()'d.
  745          */
  746         ip = mtod(m0, struct ip *);
  747         HTONS(ip->ip_len);
  748         HTONS(ip->ip_off);
  749 
  750         if (i == 0) /* a PASS rule.  */
  751             goto forward ;
  752 #ifdef DUMMYNET
  753         if (i & IP_FW_PORT_DYNT_FLAG) {
  754             /*
  755              * Pass the pkt to dummynet, which consumes it.
  756              * If shared, make a copy and keep the original.
  757              * Need to prepend the ethernet header, optimize the common
  758              * case of eh pointing already into the original mbuf.
  759              */
  760             struct mbuf *m ;
  761             if (shared) {
  762                 m = m_copypacket(m0, M_DONTWAIT);
  763                 if (m == NULL) {
  764                     printf("bdg_fwd: copy(1) failed\n");
  765                     return m0;
  766                 }
  767             } else {
  768                 m = m0 ; /* pass the original to dummynet */
  769                 m0 = NULL ; /* and nothing back to the caller */
  770             }
  771             if ( (void *)(eh + 1) == (void *)m->m_data) {
  772                 m->m_data -= ETHER_HDR_LEN ;
  773                 m->m_len += ETHER_HDR_LEN ;
  774                 m->m_pkthdr.len += ETHER_HDR_LEN ;
  775                 bdg_predict++;
  776             } else {
  777                 M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT);
  778                 if (!m && verbose) printf("M_PREPEND failed\n");
  779                 if (m == NULL) /* nope... */
  780                     return m0 ;
  781                 bcopy(&save_eh, mtod(m, struct ether_header *), ETHER_HDR_LEN);
  782             }
  783             dummynet_io((i & 0xffff),DN_TO_BDG_FWD,m,real_dst,NULL,0,rule,0);
  784             return m0 ;
  785         }
  786 #endif
  787         /*
  788          * XXX add divert/forward actions...
  789          */
  790         /* if none of the above matches, we have to drop the pkt */
  791         bdg_ipfw_drops++ ;
  792         printf("bdg_forward: No rules match, so dropping packet!\n");
  793         return m0 ;
  794     }
  795 forward:
  796 #endif /* IPFIREWALL */
  797     /*
  798      * Again, bring up the headers in case of shared bufs to avoid
  799      * corruptions in the future.
  800      */
  801     if ( shared ) {
  802         int i = min(m0->m_pkthdr.len, max_protohdr) ;
  803 
  804         m0 = m_pullup(m0, i) ;
  805         if (m0 == NULL) {
  806             printf("-- bdg: pullup2 failed.\n") ;
  807             return NULL ;
  808         }
  809     }
  810     /* now real_dst is used to determine the cluster where to forward */
  811     if (src != NULL) /* pkt comes from ether_input */
  812         real_dst = src ;
  813     for (;;) {
  814         if (last) { /* need to forward packet leftover from previous loop */
  815             struct mbuf *m ;
  816             if (shared == 0 && once ) { /* no need to copy */
  817                 m = m0 ;
  818                 m0 = NULL ; /* original is gone */
  819             } else {
  820                 m = m_copypacket(m0, M_DONTWAIT);
  821                 if (m == NULL) {
  822                     printf("bdg_forward: sorry, m_copypacket failed!\n");
  823                     return m0 ; /* the original is still there... */
  824                 }
  825             }
  826             /*
  827              * Add header (optimized for the common case of eh pointing
  828              * already into the mbuf) and execute last part of ether_output:
  829              * queue pkt and start output if interface not yet active.
  830              */
  831             if ( (void *)(eh + 1) == (void *)m->m_data) {
  832                 m->m_data -= ETHER_HDR_LEN ;
  833                 m->m_len += ETHER_HDR_LEN ;
  834                 m->m_pkthdr.len += ETHER_HDR_LEN ;
  835                 bdg_predict++;
  836             } else {
  837                 M_PREPEND(m, ETHER_HDR_LEN, M_DONTWAIT);
  838                 if (!m && verbose) printf("M_PREPEND failed\n");
  839                 if (m == NULL)
  840                     return m0;
  841                 bcopy(&save_eh, mtod(m, struct ether_header *), ETHER_HDR_LEN);
  842             }
  843             s = splimp();
  844             if (IF_QFULL(&last->if_snd)) {
  845                 IF_DROP(&last->if_snd);
  846 #if 0
  847                 BDG_MUTE(last); /* should I also mute ? */
  848 #endif
  849                 splx(s);
  850                 m_freem(m); /* consume the pkt anyways */
  851             } else {
  852                 last->if_obytes += m->m_pkthdr.len ;
  853                 if (m->m_flags & M_MCAST)
  854                     last->if_omcasts++;
  855                 if (m->m_pkthdr.len != m->m_len) /* this pkt is on >1 bufs */
  856                     bdg_split_pkts++;
  857 
  858                 IF_ENQUEUE(&last->if_snd, m);
  859                 if ((last->if_flags & IFF_OACTIVE) == 0)
  860                     (*last->if_start)(last);
  861                 splx(s);
  862             }
  863             BDG_STAT(last, BDG_OUT);
  864             last = NULL ;
  865             if (once)
  866                 break ;
  867         }
  868         if (ifp == NULL)
  869             break ;
  870         /*
  871          * If the interface is used for bridging, not muted, not full,
  872          * up and running, is not the source interface, and belongs to
  873          * the same cluster as the 'real_dst', then send here.
  874          */
  875         if ( BDG_USED(ifp) && !BDG_MUTED(ifp) && !IF_QFULL(&ifp->if_snd)  &&
  876              (ifp->if_flags & (IFF_UP|IFF_RUNNING)) == (IFF_UP|IFF_RUNNING) &&
  877              ifp != src && BDG_SAMECLUSTER(ifp, real_dst) )
  878             last = ifp ;
  879         ifp = TAILQ_NEXT(ifp, if_link) ;
  880         if (ifp == NULL)
  881             once = 1 ;
  882     }
  883     DEB(bdg_fw_ticks += (u_long)(rdtsc() - ticks) ; bdg_fw_count++ ;
  884         if (bdg_fw_count != 0) bdg_fw_avg = bdg_fw_ticks/bdg_fw_count; )
  885     return m0 ;
  886 }


