FreeBSD/Linux Kernel Cross Reference
sys/netinet/tcp_subr.c


    1 /*      $NetBSD: tcp_subr.c,v 1.208.2.2 2008/03/30 15:27:49 jdc Exp $   */
    2 
    3 /*
    4  * Copyright (C) 1995, 1996, 1997, and 1998 WIDE Project.
    5  * All rights reserved.
    6  *
    7  * Redistribution and use in source and binary forms, with or without
    8  * modification, are permitted provided that the following conditions
    9  * are met:
   10  * 1. Redistributions of source code must retain the above copyright
   11  *    notice, this list of conditions and the following disclaimer.
   12  * 2. Redistributions in binary form must reproduce the above copyright
   13  *    notice, this list of conditions and the following disclaimer in the
   14  *    documentation and/or other materials provided with the distribution.
   15  * 3. Neither the name of the project nor the names of its contributors
   16  *    may be used to endorse or promote products derived from this software
   17  *    without specific prior written permission.
   18  *
   19  * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
   20  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   21  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   22  * ARE DISCLAIMED.  IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
   23  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
   24  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
   25  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
   26  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
   27  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
   28  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
   29  * SUCH DAMAGE.
   30  */
   31 
   32 /*-
   33  * Copyright (c) 1997, 1998, 2000, 2001 The NetBSD Foundation, Inc.
   34  * All rights reserved.
   35  *
   36  * This code is derived from software contributed to The NetBSD Foundation
   37  * by Jason R. Thorpe and Kevin M. Lahey of the Numerical Aerospace Simulation
   38  * Facility, NASA Ames Research Center.
   39  *
   40  * Redistribution and use in source and binary forms, with or without
   41  * modification, are permitted provided that the following conditions
   42  * are met:
   43  * 1. Redistributions of source code must retain the above copyright
   44  *    notice, this list of conditions and the following disclaimer.
   45  * 2. Redistributions in binary form must reproduce the above copyright
   46  *    notice, this list of conditions and the following disclaimer in the
   47  *    documentation and/or other materials provided with the distribution.
   48  * 3. All advertising materials mentioning features or use of this software
   49  *    must display the following acknowledgement:
   50  *      This product includes software developed by the NetBSD
   51  *      Foundation, Inc. and its contributors.
   52  * 4. Neither the name of The NetBSD Foundation nor the names of its
   53  *    contributors may be used to endorse or promote products derived
   54  *    from this software without specific prior written permission.
   55  *
   56  * THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
   57  * ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
   58  * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
   59  * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
   60  * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
   61  * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
   62  * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
   63  * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
   64  * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
   65  * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
   66  * POSSIBILITY OF SUCH DAMAGE.
   67  */
   68 
   69 /*
   70  * Copyright (c) 1982, 1986, 1988, 1990, 1993, 1995
   71  *      The Regents of the University of California.  All rights reserved.
   72  *
   73  * Redistribution and use in source and binary forms, with or without
   74  * modification, are permitted provided that the following conditions
   75  * are met:
   76  * 1. Redistributions of source code must retain the above copyright
   77  *    notice, this list of conditions and the following disclaimer.
   78  * 2. Redistributions in binary form must reproduce the above copyright
   79  *    notice, this list of conditions and the following disclaimer in the
   80  *    documentation and/or other materials provided with the distribution.
   81  * 3. Neither the name of the University nor the names of its contributors
   82  *    may be used to endorse or promote products derived from this software
   83  *    without specific prior written permission.
   84  *
   85  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
   86  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
   87  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
   88  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
   89  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
   90  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
   91  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
   92  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
   93  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
   94  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
   95  * SUCH DAMAGE.
   96  *
   97  *      @(#)tcp_subr.c  8.2 (Berkeley) 5/24/95
   98  */
   99 
  100 #include <sys/cdefs.h>
  101 __KERNEL_RCSID(0, "$NetBSD: tcp_subr.c,v 1.208.2.2 2008/03/30 15:27:49 jdc Exp $");
  102 
  103 #include "opt_inet.h"
  104 #include "opt_ipsec.h"
  105 #include "opt_tcp_compat_42.h"
  106 #include "opt_inet_csum.h"
  107 #include "opt_mbuftrace.h"
  108 #include "rnd.h"
  109 
  110 #include <sys/param.h>
  111 #include <sys/proc.h>
  112 #include <sys/systm.h>
  113 #include <sys/malloc.h>
  114 #include <sys/mbuf.h>
  115 #include <sys/socket.h>
  116 #include <sys/socketvar.h>
  117 #include <sys/protosw.h>
  118 #include <sys/errno.h>
  119 #include <sys/kernel.h>
  120 #include <sys/pool.h>
  121 #if NRND > 0
  122 #include <sys/md5.h>
  123 #include <sys/rnd.h>
  124 #endif
  125 
  126 #include <net/route.h>
  127 #include <net/if.h>
  128 
  129 #include <netinet/in.h>
  130 #include <netinet/in_systm.h>
  131 #include <netinet/ip.h>
  132 #include <netinet/in_pcb.h>
  133 #include <netinet/ip_var.h>
  134 #include <netinet/ip_icmp.h>
  135 
  136 #ifdef INET6
  137 #ifndef INET
  138 #include <netinet/in.h>
  139 #endif
  140 #include <netinet/ip6.h>
  141 #include <netinet6/in6_pcb.h>
  142 #include <netinet6/ip6_var.h>
  143 #include <netinet6/in6_var.h>
  144 #include <netinet6/ip6protosw.h>
  145 #include <netinet/icmp6.h>
  146 #include <netinet6/nd6.h>
  147 #endif
  148 
  149 #include <netinet/tcp.h>
  150 #include <netinet/tcp_fsm.h>
  151 #include <netinet/tcp_seq.h>
  152 #include <netinet/tcp_timer.h>
  153 #include <netinet/tcp_var.h>
  154 #include <netinet/tcp_congctl.h>
  155 #include <netinet/tcpip.h>
  156 
  157 #ifdef IPSEC
  158 #include <netinet6/ipsec.h>
  159 #include <netkey/key.h>
  160 #endif /*IPSEC*/
  161 
  162 #ifdef FAST_IPSEC
  163 #include <netipsec/ipsec.h>
  164 #include <netipsec/xform.h>
  165 #ifdef INET6
  166 #include <netipsec/ipsec6.h>
  167 #endif
   168 #include <netipsec/key.h>
  169 #endif  /* FAST_IPSEC*/
  170 
  171 
  172 struct  inpcbtable tcbtable;    /* head of queue of active tcpcb's */
  173 struct  tcpstat tcpstat;        /* tcp statistics */
  174 u_int32_t tcp_now;              /* for RFC 1323 timestamps */
  175 
  176 /* patchable/settable parameters for tcp */
  177 int     tcp_mssdflt = TCP_MSS;
  178 int     tcp_rttdflt = TCPTV_SRTTDFLT / PR_SLOWHZ;
  179 int     tcp_do_rfc1323 = 1;     /* window scaling / timestamps (obsolete) */
  180 #if NRND > 0
  181 int     tcp_do_rfc1948 = 0;     /* ISS by cryptographic hash */
  182 #endif
  183 int     tcp_do_sack = 1;        /* selective acknowledgement */
  184 int     tcp_do_win_scale = 1;   /* RFC1323 window scaling */
  185 int     tcp_do_timestamps = 1;  /* RFC1323 timestamps */
  186 int     tcp_ack_on_push = 0;    /* set to enable immediate ACK-on-PUSH */
  187 int     tcp_do_ecn = 0;         /* Explicit Congestion Notification */
  188 #ifndef TCP_INIT_WIN
  189 #define TCP_INIT_WIN    0       /* initial slow start window */
  190 #endif
  191 #ifndef TCP_INIT_WIN_LOCAL
  192 #define TCP_INIT_WIN_LOCAL 4    /* initial slow start window for local nets */
  193 #endif
  194 int     tcp_init_win = TCP_INIT_WIN;
  195 int     tcp_init_win_local = TCP_INIT_WIN_LOCAL;
  196 int     tcp_mss_ifmtu = 0;
  197 #ifdef TCP_COMPAT_42
  198 int     tcp_compat_42 = 1;
  199 #else
  200 int     tcp_compat_42 = 0;
  201 #endif
  202 int     tcp_rst_ppslim = 100;   /* 100pps */
  203 int     tcp_ackdrop_ppslim = 100;       /* 100pps */
  204 int     tcp_do_loopback_cksum = 0;
  205 int     tcp_do_abc = 1;         /* RFC3465 Appropriate byte counting. */
  206 int     tcp_abc_aggressive = 1; /* 1: L=2*SMSS  0: L=1*SMSS */
  207 int     tcp_sack_tp_maxholes = 32;
  208 int     tcp_sack_globalmaxholes = 1024;
  209 int     tcp_sack_globalholes = 0;
  210 int     tcp_ecn_maxretries = 1;
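
/*
 * Editorial sketch, not part of tcp_subr.c.  The tcp_do_abc and
 * tcp_abc_aggressive knobs above refer to RFC 3465 Appropriate Byte
 * Counting, where the slow-start increase per ACK is capped at L bytes.
 * The helper below is a minimal illustration of that cap only; the
 * name abc_slow_start_increment() and its parameters are hypothetical,
 * and the kernel's real congestion-control code lives elsewhere.
 */
```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes by which cwnd may grow for a
 * single ACK during slow start under Appropriate Byte Counting.
 */
uint32_t
abc_slow_start_increment(uint32_t bytes_acked, uint32_t smss, int aggressive)
{
	uint32_t limit;

	assert(smss > 0);
	/* Mirrors the knob's comment: 1 -> L = 2*SMSS, 0 -> L = 1*SMSS. */
	limit = aggressive ? 2 * smss : smss;
	return bytes_acked < limit ? bytes_acked : limit;
}
```
/*
 * With smss = 1460, an ACK covering 3000 bytes would be credited as
 * 2920 bytes in aggressive mode and 1460 bytes otherwise.
 */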
  211 
  212 /* tcb hash */
  213 #ifndef TCBHASHSIZE
  214 #define TCBHASHSIZE     128
  215 #endif
  216 int     tcbhashsize = TCBHASHSIZE;
  217 
  218 /* syn hash parameters */
  219 #define TCP_SYN_HASH_SIZE       293
  220 #define TCP_SYN_BUCKET_SIZE     35
  221 int     tcp_syn_cache_size = TCP_SYN_HASH_SIZE;
  222 int     tcp_syn_cache_limit = TCP_SYN_HASH_SIZE*TCP_SYN_BUCKET_SIZE;
  223 int     tcp_syn_bucket_limit = 3*TCP_SYN_BUCKET_SIZE;
  224 struct  syn_cache_head tcp_syn_cache[TCP_SYN_HASH_SIZE];
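
/*
 * Editorial sketch, not part of tcp_subr.c.  It spells out the
 * arithmetic behind the SYN cache defaults above: 293 buckets of
 * nominal size 35 give a global limit of 10255 entries, and the
 * per-bucket limit is 3 * 35 = 105.  The check function is
 * hypothetical; how the kernel reacts when a limit is reached is
 * not shown in this part of the file.
 */
```c
#include <assert.h>

/* Values copied from the #defines in the listing. */
enum {
	SKETCH_SYN_HASH_SIZE    = 293,
	SKETCH_SYN_BUCKET_SIZE  = 35,
	SKETCH_SYN_CACHE_LIMIT  = SKETCH_SYN_HASH_SIZE * SKETCH_SYN_BUCKET_SIZE,
	SKETCH_SYN_BUCKET_LIMIT = 3 * SKETCH_SYN_BUCKET_SIZE
};

/*
 * Hypothetical check: returns 1 if adding one more entry would exceed
 * either the per-bucket limit or the global limit, 0 otherwise.
 */
int
syn_cache_would_exceed(int bucket_len, int total_len)
{
	assert(bucket_len >= 0 && total_len >= bucket_len);
	return bucket_len >= SKETCH_SYN_BUCKET_LIMIT ||
	    total_len >= SKETCH_SYN_CACHE_LIMIT;
}
```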
  225 
  226 int     tcp_freeq(struct tcpcb *);
  227 
  228 #ifdef INET
  229 void    tcp_mtudisc_callback(struct in_addr);
  230 #endif
  231 #ifdef INET6
  232 void    tcp6_mtudisc_callback(struct in6_addr *);
  233 #endif
  234 
  235 #ifdef INET6
  236 void    tcp6_mtudisc(struct in6pcb *, int);
  237 #endif
  238 
  239 POOL_INIT(tcpcb_pool, sizeof(struct tcpcb), 0, 0, 0, "tcpcbpl", NULL);
  240 
  241 #ifdef TCP_CSUM_COUNTERS
  242 #include <sys/device.h>
  243 
  244 #if defined(INET)
  245 struct evcnt tcp_hwcsum_bad = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  246     NULL, "tcp", "hwcsum bad");
  247 struct evcnt tcp_hwcsum_ok = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  248     NULL, "tcp", "hwcsum ok");
  249 struct evcnt tcp_hwcsum_data = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  250     NULL, "tcp", "hwcsum data");
  251 struct evcnt tcp_swcsum = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  252     NULL, "tcp", "swcsum");
  253 
  254 EVCNT_ATTACH_STATIC(tcp_hwcsum_bad);
  255 EVCNT_ATTACH_STATIC(tcp_hwcsum_ok);
  256 EVCNT_ATTACH_STATIC(tcp_hwcsum_data);
  257 EVCNT_ATTACH_STATIC(tcp_swcsum);
  258 #endif /* defined(INET) */
  259 
  260 #if defined(INET6)
  261 struct evcnt tcp6_hwcsum_bad = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  262     NULL, "tcp6", "hwcsum bad");
  263 struct evcnt tcp6_hwcsum_ok = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  264     NULL, "tcp6", "hwcsum ok");
  265 struct evcnt tcp6_hwcsum_data = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  266     NULL, "tcp6", "hwcsum data");
  267 struct evcnt tcp6_swcsum = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  268     NULL, "tcp6", "swcsum");
  269 
  270 EVCNT_ATTACH_STATIC(tcp6_hwcsum_bad);
  271 EVCNT_ATTACH_STATIC(tcp6_hwcsum_ok);
  272 EVCNT_ATTACH_STATIC(tcp6_hwcsum_data);
  273 EVCNT_ATTACH_STATIC(tcp6_swcsum);
  274 #endif /* defined(INET6) */
  275 #endif /* TCP_CSUM_COUNTERS */
  276 
  277 
  278 #ifdef TCP_OUTPUT_COUNTERS
  279 #include <sys/device.h>
  280 
  281 struct evcnt tcp_output_bigheader = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  282     NULL, "tcp", "output big header");
  283 struct evcnt tcp_output_predict_hit = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  284     NULL, "tcp", "output predict hit");
  285 struct evcnt tcp_output_predict_miss = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  286     NULL, "tcp", "output predict miss");
  287 struct evcnt tcp_output_copysmall = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  288     NULL, "tcp", "output copy small");
  289 struct evcnt tcp_output_copybig = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  290     NULL, "tcp", "output copy big");
  291 struct evcnt tcp_output_refbig = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  292     NULL, "tcp", "output reference big");
  293 
  294 EVCNT_ATTACH_STATIC(tcp_output_bigheader);
  295 EVCNT_ATTACH_STATIC(tcp_output_predict_hit);
  296 EVCNT_ATTACH_STATIC(tcp_output_predict_miss);
  297 EVCNT_ATTACH_STATIC(tcp_output_copysmall);
  298 EVCNT_ATTACH_STATIC(tcp_output_copybig);
  299 EVCNT_ATTACH_STATIC(tcp_output_refbig);
  300 
  301 #endif /* TCP_OUTPUT_COUNTERS */
  302 
  303 #ifdef TCP_REASS_COUNTERS
  304 #include <sys/device.h>
  305 
  306 struct evcnt tcp_reass_ = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  307     NULL, "tcp_reass", "calls");
  308 struct evcnt tcp_reass_empty = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  309     &tcp_reass_, "tcp_reass", "insert into empty queue");
  310 struct evcnt tcp_reass_iteration[8] = {
  311     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", ">7 iterations"),
  312     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "1 iteration"),
  313     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "2 iterations"),
  314     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "3 iterations"),
  315     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "4 iterations"),
  316     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "5 iterations"),
  317     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "6 iterations"),
  318     EVCNT_INITIALIZER(EVCNT_TYPE_MISC, &tcp_reass_, "tcp_reass", "7 iterations"),
  319 };
  320 struct evcnt tcp_reass_prependfirst = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  321     &tcp_reass_, "tcp_reass", "prepend to first");
  322 struct evcnt tcp_reass_prepend = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  323     &tcp_reass_, "tcp_reass", "prepend");
  324 struct evcnt tcp_reass_insert = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  325     &tcp_reass_, "tcp_reass", "insert");
  326 struct evcnt tcp_reass_inserttail = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  327     &tcp_reass_, "tcp_reass", "insert at tail");
  328 struct evcnt tcp_reass_append = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  329     &tcp_reass_, "tcp_reass", "append");
  330 struct evcnt tcp_reass_appendtail = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  331     &tcp_reass_, "tcp_reass", "append to tail fragment");
  332 struct evcnt tcp_reass_overlaptail = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  333     &tcp_reass_, "tcp_reass", "overlap at end");
  334 struct evcnt tcp_reass_overlapfront = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  335     &tcp_reass_, "tcp_reass", "overlap at start");
  336 struct evcnt tcp_reass_segdup = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  337     &tcp_reass_, "tcp_reass", "duplicate segment");
  338 struct evcnt tcp_reass_fragdup = EVCNT_INITIALIZER(EVCNT_TYPE_MISC,
  339     &tcp_reass_, "tcp_reass", "duplicate fragment");
  340 
  341 EVCNT_ATTACH_STATIC(tcp_reass_);
  342 EVCNT_ATTACH_STATIC(tcp_reass_empty);
  343 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 0);
  344 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 1);
  345 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 2);
  346 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 3);
  347 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 4);
  348 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 5);
  349 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 6);
  350 EVCNT_ATTACH_STATIC2(tcp_reass_iteration, 7);
  351 EVCNT_ATTACH_STATIC(tcp_reass_prependfirst);
  352 EVCNT_ATTACH_STATIC(tcp_reass_prepend);
  353 EVCNT_ATTACH_STATIC(tcp_reass_insert);
  354 EVCNT_ATTACH_STATIC(tcp_reass_inserttail);
  355 EVCNT_ATTACH_STATIC(tcp_reass_append);
  356 EVCNT_ATTACH_STATIC(tcp_reass_appendtail);
  357 EVCNT_ATTACH_STATIC(tcp_reass_overlaptail);
  358 EVCNT_ATTACH_STATIC(tcp_reass_overlapfront);
  359 EVCNT_ATTACH_STATIC(tcp_reass_segdup);
  360 EVCNT_ATTACH_STATIC(tcp_reass_fragdup);
  361 
  362 #endif /* TCP_REASS_COUNTERS */
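
/*
 * Editorial sketch, not part of tcp_subr.c.  The labels of
 * tcp_reass_iteration[] above suggest that slots 1..7 count exact
 * iteration totals while slot 0 collects everything above 7.  The
 * mapping below is inferred from those labels only; the code that
 * actually increments the counters is in another file and may differ.
 * The function name is hypothetical.
 */
```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_REASS_SLOTS 8	/* matches tcp_reass_iteration[8] */

/* Map an iteration count (assumed >= 1) to a counter slot. */
size_t
reass_iteration_slot(unsigned int iterations)
{
	assert(iterations >= 1);
	return iterations < SKETCH_REASS_SLOTS ? iterations : 0;
}
```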
  363 
  364 #ifdef MBUFTRACE
  365 struct mowner tcp_mowner = MOWNER_INIT("tcp", "");
  366 struct mowner tcp_rx_mowner = MOWNER_INIT("tcp", "rx");
  367 struct mowner tcp_tx_mowner = MOWNER_INIT("tcp", "tx");
  368 #endif
  369 
  370 /*
   371  * TCP initialization.
  372  */
  373 void
  374 tcp_init(void)
  375 {
  376         int hlen;
  377 
  378         /* Initialize the TCPCB template. */
  379         tcp_tcpcb_template();
  380 
  381         in_pcbinit(&tcbtable, tcbhashsize, tcbhashsize);
  382 
  383         hlen = sizeof(struct ip) + sizeof(struct tcphdr);
  384 #ifdef INET6
  385         if (sizeof(struct ip) < sizeof(struct ip6_hdr))
  386                 hlen = sizeof(struct ip6_hdr) + sizeof(struct tcphdr);
  387 #endif
  388         if (max_protohdr < hlen)
  389                 max_protohdr = hlen;
  390         if (max_linkhdr + hlen > MHLEN)
  391                 panic("tcp_init");
  392 
  393 #ifdef INET
  394         icmp_mtudisc_callback_register(tcp_mtudisc_callback);
  395 #endif
  396 #ifdef INET6
  397         icmp6_mtudisc_callback_register(tcp6_mtudisc_callback);
  398 #endif
  399 
  400         /* Initialize timer state. */
  401         tcp_timer_init();
  402 
  403         /* Initialize the compressed state engine. */
  404         syn_cache_init();
  405 
  406         /* Initialize the congestion control algorithms. */
  407         tcp_congctl_init();
  408 
  409         MOWNER_ATTACH(&tcp_tx_mowner);
  410         MOWNER_ATTACH(&tcp_rx_mowner);
  411         MOWNER_ATTACH(&tcp_mowner);
  412 }
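
/*
 * Editorial sketch, not part of tcp_subr.c.  It restates the header
 * sizing done in tcp_init() in userland terms: pick the larger of the
 * IPv4 and IPv6 fixed headers, add the TCP header, and check that link
 * plus protocol headers fit in one header mbuf.  The byte counts are
 * the fixed header sizes without options (20, 40, 20).  MHLEN is
 * platform dependent, so it is passed in as a parameter here; both
 * function names are hypothetical.
 */
```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_IPV4_HDR = 20, SKETCH_IPV6_HDR = 40, SKETCH_TCP_HDR = 20 };

/* Largest IP + TCP header the stack must reserve room for. */
size_t
tcp_max_hdr_len(int have_inet6)
{
	size_t hlen = SKETCH_IPV4_HDR + SKETCH_TCP_HDR;

	if (have_inet6 && SKETCH_IPV4_HDR < SKETCH_IPV6_HDR)
		hlen = SKETCH_IPV6_HDR + SKETCH_TCP_HDR;
	return hlen;
}

/* Returns 1 if the headers fit; tcp_init() panics in the other case. */
int
headers_fit(size_t max_linkhdr, size_t hlen, size_t mhlen)
{
	return max_linkhdr + hlen <= mhlen;
}
```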
  413 
  414 /*
   415  * Create the template used to send TCP packets on a connection.
   416  * Call after the host entry is created.  Allocates an mbuf and
   417  * fills in a skeletal TCP/IP header, minimizing the amount of
   418  * work necessary when the connection is used.
  419  */
  420 struct mbuf *
  421 tcp_template(struct tcpcb *tp)
  422 {
  423         struct inpcb *inp = tp->t_inpcb;
  424 #ifdef INET6
  425         struct in6pcb *in6p = tp->t_in6pcb;
  426 #endif
  427         struct tcphdr *n;
  428         struct mbuf *m;
  429         int hlen;
  430 
  431         switch (tp->t_family) {
  432         case AF_INET:
  433                 hlen = sizeof(struct ip);
  434                 if (inp)
  435                         break;
  436 #ifdef INET6
  437                 if (in6p) {
  438                         /* mapped addr case */
  439                         if (IN6_IS_ADDR_V4MAPPED(&in6p->in6p_laddr)
  440                          && IN6_IS_ADDR_V4MAPPED(&in6p->in6p_faddr))
  441                                 break;
  442                 }
  443 #endif
  444                 return NULL;    /*EINVAL*/
  445 #ifdef INET6
  446         case AF_INET6:
  447                 hlen = sizeof(struct ip6_hdr);
  448                 if (in6p) {
   449                         /* more sanity check? */
  450                         break;
  451                 }
  452                 return NULL;    /*EINVAL*/
  453 #endif
  454         default:
  455                 hlen = 0;       /*pacify gcc*/
  456                 return NULL;    /*EAFNOSUPPORT*/
  457         }
  458 #ifdef DIAGNOSTIC
  459         if (hlen + sizeof(struct tcphdr) > MCLBYTES)
  460                 panic("mclbytes too small for t_template");
  461 #endif
  462         m = tp->t_template;
  463         if (m && m->m_len == hlen + sizeof(struct tcphdr))
  464                 ;
  465         else {
  466                 if (m)
  467                         m_freem(m);
  468                 m = tp->t_template = NULL;
  469                 MGETHDR(m, M_DONTWAIT, MT_HEADER);
  470                 if (m && hlen + sizeof(struct tcphdr) > MHLEN) {
  471                         MCLGET(m, M_DONTWAIT);
  472                         if ((m->m_flags & M_EXT) == 0) {
  473                                 m_free(m);
  474                                 m = NULL;
  475                         }
  476                 }
  477                 if (m == NULL)
  478                         return NULL;
  479                 MCLAIM(m, &tcp_mowner);
  480                 m->m_pkthdr.len = m->m_len = hlen + sizeof(struct tcphdr);
  481         }
  482 
  483         bzero(mtod(m, caddr_t), m->m_len);
  484 
  485         n = (struct tcphdr *)(mtod(m, caddr_t) + hlen);
  486 
  487         switch (tp->t_family) {
  488         case AF_INET:
  489             {
  490                 struct ipovly *ipov;
  491                 mtod(m, struct ip *)->ip_v = 4;
  492                 mtod(m, struct ip *)->ip_hl = hlen >> 2;
  493                 ipov = mtod(m, struct ipovly *);
  494                 ipov->ih_pr = IPPROTO_TCP;
  495                 ipov->ih_len = htons(sizeof(struct tcphdr));
  496                 if (inp) {
  497                         ipov->ih_src = inp->inp_laddr;
  498                         ipov->ih_dst = inp->inp_faddr;
  499                 }
  500 #ifdef INET6
  501                 else if (in6p) {
  502                         /* mapped addr case */
  503                         bcopy(&in6p->in6p_laddr.s6_addr32[3], &ipov->ih_src,
  504                                 sizeof(ipov->ih_src));
  505                         bcopy(&in6p->in6p_faddr.s6_addr32[3], &ipov->ih_dst,
  506                                 sizeof(ipov->ih_dst));
  507                 }
  508 #endif
  509                 /*
  510                  * Compute the pseudo-header portion of the checksum
  511                  * now.  We incrementally add in the TCP option and
  512                  * payload lengths later, and then compute the TCP
  513                  * checksum right before the packet is sent off onto
  514                  * the wire.
  515                  */
  516                 n->th_sum = in_cksum_phdr(ipov->ih_src.s_addr,
  517                     ipov->ih_dst.s_addr,
  518                     htons(sizeof(struct tcphdr) + IPPROTO_TCP));
  519                 break;
  520             }
  521 #ifdef INET6
  522         case AF_INET6:
  523             {
  524                 struct ip6_hdr *ip6;
  525                 mtod(m, struct ip *)->ip_v = 6;
  526                 ip6 = mtod(m, struct ip6_hdr *);
  527                 ip6->ip6_nxt = IPPROTO_TCP;
  528                 ip6->ip6_plen = htons(sizeof(struct tcphdr));
  529                 ip6->ip6_src = in6p->in6p_laddr;
  530                 ip6->ip6_dst = in6p->in6p_faddr;
  531                 ip6->ip6_flow = in6p->in6p_flowinfo & IPV6_FLOWINFO_MASK;
  532                 if (ip6_auto_flowlabel) {
  533                         ip6->ip6_flow &= ~IPV6_FLOWLABEL_MASK;
  534                         ip6->ip6_flow |=
  535                             (htonl(ip6_randomflowlabel()) & IPV6_FLOWLABEL_MASK);
  536                 }
  537                 ip6->ip6_vfc &= ~IPV6_VERSION_MASK;
  538                 ip6->ip6_vfc |= IPV6_VERSION;
  539 
  540                 /*
  541                  * Compute the pseudo-header portion of the checksum
  542                  * now.  We incrementally add in the TCP option and
  543                  * payload lengths later, and then compute the TCP
  544                  * checksum right before the packet is sent off onto
  545                  * the wire.
  546                  */
  547                 n->th_sum = in6_cksum_phdr(&in6p->in6p_laddr,
  548                     &in6p->in6p_faddr, htonl(sizeof(struct tcphdr)),
  549                     htonl(IPPROTO_TCP));
  550                 break;
  551             }
  552 #endif
  553         }
  554         if (inp) {
  555                 n->th_sport = inp->inp_lport;
  556                 n->th_dport = inp->inp_fport;
  557         }
  558 #ifdef INET6
  559         else if (in6p) {
  560                 n->th_sport = in6p->in6p_lport;
  561                 n->th_dport = in6p->in6p_fport;
  562         }
  563 #endif
  564         n->th_seq = 0;
  565         n->th_ack = 0;
  566         n->th_x2 = 0;
  567         n->th_off = 5;
  568         n->th_flags = 0;
  569         n->th_win = 0;
  570         n->th_urp = 0;
  571         return (m);
  572 }
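
/*
 * Editorial sketch, not part of tcp_subr.c.  tcp_template() stores a
 * partial checksum over the pseudo-header and, per its comments, the
 * option and payload lengths are added in later.  The code below shows
 * why that works: a ones-complement sum can be extended incrementally.
 * It is not the kernel's in_cksum_phdr(); it works on host-order values
 * for readability, and all function names here are hypothetical.
 */
```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
uint16_t
fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones-complement addition of two 16-bit partial sums. */
uint16_t
oc_add16(uint16_t a, uint16_t b)
{
	return fold16((uint32_t)a + b);
}

/*
 * Uncomplemented sum over the IPv4 pseudo-header fields: source,
 * destination, protocol, and TCP length (header plus data).
 */
uint16_t
pseudo_hdr_sum(uint32_t src, uint32_t dst, uint8_t proto, uint16_t tcp_len)
{
	uint32_t sum = 0;

	sum += src >> 16;
	sum += src & 0xffff;
	sum += dst >> 16;
	sum += dst & 0xffff;
	sum += proto;
	sum += tcp_len;
	return fold16(sum);
}
```
/*
 * Adding 12 bytes of options to a sum precomputed for a bare 20-byte
 * header gives the same result as summing with length 32 from scratch,
 * which is the property the template relies on.
 */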
  573 
  574 /*
   575  * Send a single message to the TCP at the address specified by
   576  * the given TCP/IP header.  If m == NULL, we copy the header
   577  * from template and send directly to the addressed host.
   578  * This is used to force keep-alive messages out using the TCP
   579  * template for a connection (tp->t_template).  If m != NULL,
   580  * we send a message with the given flags back to the TCP that
   581  * originated the segment whose header is th0, and the mbuf
   582  * chain m that carried it is either reused or freed.
  583  *
  584  * In any case the ack and sequence number of the transmitted
  585  * segment are as specified by the parameters.
  586  */
  587 int
  588 tcp_respond(struct tcpcb *tp, struct mbuf *template, struct mbuf *m,
  589     struct tcphdr *th0, tcp_seq ack, tcp_seq seq, int flags)
  590 {
  591         struct route *ro;
  592         int error, tlen, win = 0;
  593         int hlen;
  594         struct ip *ip;
  595 #ifdef INET6
  596         struct ip6_hdr *ip6;
  597 #endif
  598         int family;     /* family on packet, not inpcb/in6pcb! */
  599         struct tcphdr *th;
  600         struct socket *so;
  601 
  602         if (tp != NULL && (flags & TH_RST) == 0) {
  603 #ifdef DIAGNOSTIC
  604                 if (tp->t_inpcb && tp->t_in6pcb)
  605                         panic("tcp_respond: both t_inpcb and t_in6pcb are set");
  606 #endif
  607 #ifdef INET
  608                 if (tp->t_inpcb)
  609                         win = sbspace(&tp->t_inpcb->inp_socket->so_rcv);
  610 #endif
  611 #ifdef INET6
  612                 if (tp->t_in6pcb)
  613                         win = sbspace(&tp->t_in6pcb->in6p_socket->so_rcv);
  614 #endif
  615         }
  616 
  617         th = NULL;      /* Quell uninitialized warning */
  618         ip = NULL;
  619 #ifdef INET6
  620         ip6 = NULL;
  621 #endif
  622         if (m == 0) {
  623                 if (!template)
  624                         return EINVAL;
  625 
  626                 /* get family information from template */
  627                 switch (mtod(template, struct ip *)->ip_v) {
  628                 case 4:
  629                         family = AF_INET;
  630                         hlen = sizeof(struct ip);
  631                         break;
  632 #ifdef INET6
  633                 case 6:
  634                         family = AF_INET6;
  635                         hlen = sizeof(struct ip6_hdr);
  636                         break;
  637 #endif
  638                 default:
  639                         return EAFNOSUPPORT;
  640                 }
  641 
  642                 MGETHDR(m, M_DONTWAIT, MT_HEADER);
  643                 if (m) {
  644                         MCLAIM(m, &tcp_tx_mowner);
  645                         MCLGET(m, M_DONTWAIT);
  646                         if ((m->m_flags & M_EXT) == 0) {
  647                                 m_free(m);
  648                                 m = NULL;
  649                         }
  650                 }
  651                 if (m == NULL)
  652                         return (ENOBUFS);
  653 
  654                 if (tcp_compat_42)
  655                         tlen = 1;
  656                 else
  657                         tlen = 0;
  658 
  659                 m->m_data += max_linkhdr;
  660                 bcopy(mtod(template, caddr_t), mtod(m, caddr_t),
  661                         template->m_len);
  662                 switch (family) {
  663                 case AF_INET:
  664                         ip = mtod(m, struct ip *);
  665                         th = (struct tcphdr *)(ip + 1);
  666                         break;
  667 #ifdef INET6
  668                 case AF_INET6:
  669                         ip6 = mtod(m, struct ip6_hdr *);
  670                         th = (struct tcphdr *)(ip6 + 1);
  671                         break;
  672 #endif
  673 #if 0
  674                 default:
   675                         /* no one will visit here */
  676                         m_freem(m);
  677                         return EAFNOSUPPORT;
  678 #endif
  679                 }
  680                 flags = TH_ACK;
  681         } else {
  682 
  683                 if ((m->m_flags & M_PKTHDR) == 0) {
  684 #if 0
  685                         printf("non PKTHDR to tcp_respond\n");
  686 #endif
  687                         m_freem(m);
  688                         return EINVAL;
  689                 }
  690 #ifdef DIAGNOSTIC
  691                 if (!th0)
  692                         panic("th0 == NULL in tcp_respond");
  693 #endif
  694 
  695                 /* get family information from m */
  696                 switch (mtod(m, struct ip *)->ip_v) {
  697                 case 4:
  698                         family = AF_INET;
  699                         hlen = sizeof(struct ip);
  700                         ip = mtod(m, struct ip *);
  701                         break;
  702 #ifdef INET6
  703                 case 6:
  704                         family = AF_INET6;
  705                         hlen = sizeof(struct ip6_hdr);
  706                         ip6 = mtod(m, struct ip6_hdr *);
  707                         break;
  708 #endif
  709                 default:
  710                         m_freem(m);
  711                         return EAFNOSUPPORT;
  712                 }
  713                 /* clear h/w csum flags inherited from rx packet */
  714                 m->m_pkthdr.csum_flags = 0;
  715 
  716                 if ((flags & TH_SYN) == 0 || sizeof(*th0) > (th0->th_off << 2))
  717                         tlen = sizeof(*th0);
  718                 else
  719                         tlen = th0->th_off << 2;
  720 
  721                 if (m->m_len > hlen + tlen && (m->m_flags & M_EXT) == 0 &&
  722                     mtod(m, caddr_t) + hlen == (caddr_t)th0) {
  723                         m->m_len = hlen + tlen;
  724                         m_freem(m->m_next);
  725                         m->m_next = NULL;
  726                 } else {
  727                         struct mbuf *n;
  728 
  729 #ifdef DIAGNOSTIC
  730                         if (max_linkhdr + hlen + tlen > MCLBYTES) {
  731                                 m_freem(m);
  732                                 return EMSGSIZE;
  733                         }
  734 #endif
  735                         MGETHDR(n, M_DONTWAIT, MT_HEADER);
  736                         if (n && max_linkhdr + hlen + tlen > MHLEN) {
  737                                 MCLGET(n, M_DONTWAIT);
  738                                 if ((n->m_flags & M_EXT) == 0) {
  739                                         m_freem(n);
  740                                         n = NULL;
  741                                 }
  742                         }
  743                         if (!n) {
  744                                 m_freem(m);
  745                                 return ENOBUFS;
  746                         }
  747 
  748                         MCLAIM(n, &tcp_tx_mowner);
  749                         n->m_data += max_linkhdr;
  750                         n->m_len = hlen + tlen;
  751                         m_copyback(n, 0, hlen, mtod(m, caddr_t));
  752                         m_copyback(n, hlen, tlen, (caddr_t)th0);
  753 
  754                         m_freem(m);
  755                         m = n;
  756                         n = NULL;
  757                 }
  758 
  759 #define xchg(a,b,type) { type t; t=a; a=b; b=t; }
  760                 switch (family) {
  761                 case AF_INET:
  762                         ip = mtod(m, struct ip *);
  763                         th = (struct tcphdr *)(ip + 1);
  764                         ip->ip_p = IPPROTO_TCP;
  765                         xchg(ip->ip_dst, ip->ip_src, struct in_addr);
  766                         ip->ip_p = IPPROTO_TCP;
  767                         break;
  768 #ifdef INET6
  769                 case AF_INET6:
  770                         ip6 = mtod(m, struct ip6_hdr *);
  771                         th = (struct tcphdr *)(ip6 + 1);
  772                         ip6->ip6_nxt = IPPROTO_TCP;
  773                         xchg(ip6->ip6_dst, ip6->ip6_src, struct in6_addr);
  774                         ip6->ip6_nxt = IPPROTO_TCP;
  775                         break;
  776 #endif
  777 #if 0
  778                 default:
  779                         /* no one will visit here */
  780                         m_freem(m);
  781                         return EAFNOSUPPORT;
  782 #endif
  783                 }
  784                 xchg(th->th_dport, th->th_sport, u_int16_t);
  785 #undef xchg
  786                 tlen = 0;       /* be friendly with the following code */
  787         }
  788         th->th_seq = htonl(seq);
  789         th->th_ack = htonl(ack);
  790         th->th_x2 = 0;
  791         if ((flags & TH_SYN) == 0) {
  792                 if (tp)
  793                         win >>= tp->rcv_scale;
  794                 if (win > TCP_MAXWIN)
  795                         win = TCP_MAXWIN;
  796                 th->th_win = htons((u_int16_t)win);
  797                 th->th_off = sizeof (struct tcphdr) >> 2;
  798                 tlen += sizeof(*th);
  799         } else
  800                 tlen += th->th_off << 2;
  801         m->m_len = hlen + tlen;
  802         m->m_pkthdr.len = hlen + tlen;
  803         m->m_pkthdr.rcvif = (struct ifnet *) 0;
  804         th->th_flags = flags;
  805         th->th_urp = 0;
  806 
  807         switch (family) {
  808 #ifdef INET
  809         case AF_INET:
  810             {
  811                 struct ipovly *ipov = (struct ipovly *)ip;
  812                 bzero(ipov->ih_x1, sizeof ipov->ih_x1);
  813                 ipov->ih_len = htons((u_int16_t)tlen);
  814 
  815                 th->th_sum = 0;
  816                 th->th_sum = in_cksum(m, hlen + tlen);
  817                 ip->ip_len = htons(hlen + tlen);
  818                 ip->ip_ttl = ip_defttl;
  819                 break;
  820             }
  821 #endif
  822 #ifdef INET6
  823         case AF_INET6:
  824             {
  825                 th->th_sum = 0;
  826                 th->th_sum = in6_cksum(m, IPPROTO_TCP, sizeof(struct ip6_hdr),
  827                                 tlen);
  828                 ip6->ip6_plen = htons(tlen);
  829                 if (tp && tp->t_in6pcb) {
  830                         struct ifnet *oifp;
  831                         ro = (struct route *)&tp->t_in6pcb->in6p_route;
  832                         oifp = ro->ro_rt ? ro->ro_rt->rt_ifp : NULL;
  833                         ip6->ip6_hlim = in6_selecthlim(tp->t_in6pcb, oifp);
  834                 } else
  835                         ip6->ip6_hlim = ip6_defhlim;
  836                 ip6->ip6_flow &= ~IPV6_FLOWINFO_MASK;
  837                 if (ip6_auto_flowlabel) {
  838                         ip6->ip6_flow |=
  839                             (htonl(ip6_randomflowlabel()) & IPV6_FLOWLABEL_MASK);
  840                 }
  841                 break;
  842             }
  843 #endif
  844         }
  845 
  846         if (tp && tp->t_inpcb)
  847                 so = tp->t_inpcb->inp_socket;
  848 #ifdef INET6
  849         else if (tp && tp->t_in6pcb)
  850                 so = tp->t_in6pcb->in6p_socket;
  851 #endif
  852         else
  853                 so = NULL;
  854 
  855         if (tp != NULL && tp->t_inpcb != NULL) {
  856                 ro = &tp->t_inpcb->inp_route;
  857 #ifdef DIAGNOSTIC
  858                 if (family != AF_INET)
  859                         panic("tcp_respond: address family mismatch");
  860                 if (!in_hosteq(ip->ip_dst, tp->t_inpcb->inp_faddr)) {
  861                         panic("tcp_respond: ip_dst %x != inp_faddr %x",
  862                             ntohl(ip->ip_dst.s_addr),
  863                             ntohl(tp->t_inpcb->inp_faddr.s_addr));
  864                 }
  865 #endif
  866         }
  867 #ifdef INET6
  868         else if (tp != NULL && tp->t_in6pcb != NULL) {
  869                 ro = (struct route *)&tp->t_in6pcb->in6p_route;
  870 #ifdef DIAGNOSTIC
  871                 if (family == AF_INET) {
  872                         if (!IN6_IS_ADDR_V4MAPPED(&tp->t_in6pcb->in6p_faddr))
  873                                 panic("tcp_respond: not mapped addr");
  874                         if (bcmp(&ip->ip_dst,
  875                             &tp->t_in6pcb->in6p_faddr.s6_addr32[3],
  876                             sizeof(ip->ip_dst)) != 0) {
  877                                 panic("tcp_respond: ip_dst != in6p_faddr");
  878                         }
  879                 } else if (family == AF_INET6) {
  880                         if (!IN6_ARE_ADDR_EQUAL(&ip6->ip6_dst,
  881                             &tp->t_in6pcb->in6p_faddr))
  882                                 panic("tcp_respond: ip6_dst != in6p_faddr");
  883                 } else
  884                         panic("tcp_respond: address family mismatch");
  885 #endif
  886         }
  887 #endif
  888         else
  889                 ro = NULL;
  890 
  891         switch (family) {
  892 #ifdef INET
  893         case AF_INET:
  894                 error = ip_output(m, NULL, ro,
  895                     (tp && tp->t_mtudisc ? IP_MTUDISC : 0),
  896                     (struct ip_moptions *)0, so);
  897                 break;
  898 #endif
  899 #ifdef INET6
  900         case AF_INET6:
  901                 error = ip6_output(m, NULL, (struct route_in6 *)ro, 0,
  902                     (struct ip6_moptions *)0, so, NULL);
  903                 break;
  904 #endif
  905         default:
  906                 error = EAFNOSUPPORT;
  907                 break;
  908         }
  909 
  910         return (error);
  911 }
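
/*
 * Illustration (not part of tcp_subr.c): a minimal user-space sketch of
 * how tcp_respond() above derives the 16-bit window field for non-SYN
 * segments.  The window is shifted right by the receive scale and then
 * clamped to TCP_MAXWIN.  The helper name advertised_window(), the
 * constant EX_TCP_MAXWIN (assumed to be 65535) and the unsigned long
 * argument type are assumptions made for this example; the htons()
 * conversion done by the real code is left out.
 */

```c
#include <assert.h>
#include <stdint.h>

#define EX_TCP_MAXWIN	65535UL	/* assumed: largest value th_win can hold */

/*
 * Hypothetical helper: returns the value that would be stored in th_win
 * before conversion to network byte order.
 */
static uint16_t
advertised_window(unsigned long win, unsigned int rcv_scale)
{
	/* The window scale option limits the shift count to 14. */
	assert(rcv_scale <= 14);

	win >>= rcv_scale;		/* scale down to the on-wire units */
	if (win > EX_TCP_MAXWIN)	/* clamp to what fits in 16 bits */
		win = EX_TCP_MAXWIN;
	return (uint16_t)win;
}
```

/*
 * With a 1 MB buffer and a scale of 4 the scaled value is 65536, one more
 * than the field can hold, so the sketch returns the clamped value.
 */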
  912 
  913 /*
  914  * Template TCPCB.  Rather than zeroing a new TCPCB and initializing
  915  * a bunch of members individually, we maintain this template for the
  916  * static and mostly-static components of the TCPCB, and copy it into
  917  * the new TCPCB instead.
  918  */
  919 static struct tcpcb tcpcb_template = {
  920         /*
  921          * If TCP_NTIMERS ever changes, we'll need to update this
  922          * initializer.
  923          */
  924         .t_timer = {
  925                 CALLOUT_INITIALIZER,
  926                 CALLOUT_INITIALIZER,
  927                 CALLOUT_INITIALIZER,
  928                 CALLOUT_INITIALIZER,
  929         },
  930         .t_delack_ch = CALLOUT_INITIALIZER,
  931 
  932         .t_srtt = TCPTV_SRTTBASE,
  933         .t_rttmin = TCPTV_MIN,
  934 
  935         .snd_cwnd = TCP_MAXWIN << TCP_MAX_WINSHIFT,
  936         .snd_ssthresh = TCP_MAXWIN << TCP_MAX_WINSHIFT,
  937         .snd_numholes = 0,
  938 
  939         .t_partialacks = -1,
  940         .t_bytes_acked = 0,
  941 };
  942 
  943 /*
  944  * Updates the TCPCB template whenever a parameter that would affect
  945  * the template is changed.
  946  */
  947 void
  948 tcp_tcpcb_template(void)
  949 {
  950         struct tcpcb *tp = &tcpcb_template;
  951         int flags;
  952 
  953         tp->t_peermss = tcp_mssdflt;
  954         tp->t_ourmss = tcp_mssdflt;
  955         tp->t_segsz = tcp_mssdflt;
  956 
  957         flags = 0;
  958         if (tcp_do_rfc1323 && tcp_do_win_scale)
  959                 flags |= TF_REQ_SCALE;
  960         if (tcp_do_rfc1323 && tcp_do_timestamps)
  961                 flags |= TF_REQ_TSTMP;
  962         tp->t_flags = flags;
  963 
  964         /*
  965          * Init srtt to TCPTV_SRTTBASE (0), so we can tell that we have no
  966          * rtt estimate.  Set rttvar so that srtt + 2 * rttvar gives
  967          * reasonable initial retransmit time.
  968          */
  969         tp->t_rttvar = tcp_rttdflt * PR_SLOWHZ << (TCP_RTTVAR_SHIFT + 2 - 1);
  970         TCPT_RANGESET(tp->t_rxtcur, TCP_REXMTVAL(tp),
  971             TCPTV_MIN, TCPTV_REXMTMAX);
  972 }
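
/*
 * Illustration (not part of tcp_subr.c): a rough sketch of the arithmetic
 * in tcp_tcpcb_template() above, showing how the initial rttvar leads to
 * the first retransmit timeout while srtt is still 0.  Every EX_ constant
 * and the body of rexmt_ticks() are assumptions modelled on the usual BSD
 * definitions of PR_SLOWHZ, TCP_RTT_SHIFT, TCP_RTTVAR_SHIFT and
 * TCP_REXMTVAL(); check tcp_var.h and tcp_timer.h for the real values.
 */

```c
#include <assert.h>

#define EX_PR_SLOWHZ		2	/* assumed slow-timer ticks per second */
#define EX_TCP_RTT_SHIFT	3	/* assumed value of TCP_RTT_SHIFT */
#define EX_TCP_RTTVAR_SHIFT	2	/* assumed value of TCP_RTTVAR_SHIFT */

/* Initial rttvar, following the expression in tcp_tcpcb_template(). */
static int
initial_rttvar(int rttdflt_seconds)
{
	assert(rttdflt_seconds > 0);
	return (rttdflt_seconds * EX_PR_SLOWHZ) << (EX_TCP_RTTVAR_SHIFT + 2 - 1);
}

/* Assumed form of TCP_REXMTVAL(): retransmit value in slow-timer ticks. */
static int
rexmt_ticks(int srtt, int rttvar)
{
	return ((srtt >> EX_TCP_RTT_SHIFT) + rttvar) >> 2;
}

/* Clamp a value into [min, max], in the spirit of TCPT_RANGESET(). */
static int
range_set(int value, int min, int max)
{
	if (value < min)
		return min;
	if (value > max)
		return max;
	return value;
}
```

/*
 * Under these assumptions a default RTT of 3 seconds gives rttvar = 48 and
 * a retransmit value of 12 ticks, which is 6 seconds at 2 ticks per second.
 */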
  973 
  974 /*
  975  * Create a new TCP control block, making an
  976  * empty reassembly queue and hooking it to the argument
  977  * protocol control block.
  978  */
  979 /* family selects inpcb, or in6pcb */
  980 struct tcpcb *
  981 tcp_newtcpcb(int family, void *aux)
  982 {
  983         struct tcpcb *tp;
  984         int i;
  985 
  986         /* XXX Consider using a pool_cache for speed. */
  987         tp = pool_get(&tcpcb_pool, PR_NOWAIT);  /* splsoftnet via tcp_usrreq */
  988         if (tp == NULL)
  989                 return (NULL);
  990         memcpy(tp, &tcpcb_template, sizeof(*tp));
  991         TAILQ_INIT(&tp->segq);
  992         TAILQ_INIT(&tp->timeq);
  993         tp->t_family = family;          /* may be overridden later on */
  994         TAILQ_INIT(&tp->snd_holes);
  995         LIST_INIT(&tp->t_sc);           /* XXX can template this */
  996 
  997         /* Don't sweat this loop; hopefully the compiler will unroll it. */
  998         for (i = 0; i < TCPT_NTIMERS; i++)
  999                 TCP_TIMER_INIT(tp, i);
 1000 
 1001         switch (family) {
 1002         case AF_INET:
 1003             {
 1004                 struct inpcb *inp = (struct inpcb *)aux;
 1005 
 1006                 inp->inp_ip.ip_ttl = ip_defttl;
 1007                 inp->inp_ppcb = (caddr_t)tp;
 1008 
 1009                 tp->t_inpcb = inp;
 1010                 tp->t_mtudisc = ip_mtudisc;
 1011                 break;
 1012             }
 1013 #ifdef INET6
 1014         case AF_INET6:
 1015             {
 1016                 struct in6pcb *in6p = (struct in6pcb *)aux;
 1017 
 1018                 in6p->in6p_ip6.ip6_hlim = in6_selecthlim(in6p,
 1019                         in6p->in6p_route.ro_rt ? in6p->in6p_route.ro_rt->rt_ifp
 1020                                                : NULL);
 1021                 in6p->in6p_ppcb = (caddr_t)tp;
 1022 
 1023                 tp->t_in6pcb = in6p;
 1024                 /* for IPv6, always try to run path MTU discovery */
 1025                 tp->t_mtudisc = 1;
 1026                 break;
 1027             }
 1028 #endif /* INET6 */
 1029         default:
 1030                 pool_put(&tcpcb_pool, tp);      /* splsoftnet via tcp_usrreq */
 1031                 return (NULL);
 1032         }
 1033 
 1034         /*
 1035          * Initialize our timebase.  When we send timestamps, we take
 1036          * the delta from tcp_now -- this means each connection always
 1037          * gets a timebase of 0, which makes it, among other things,
 1038          * more difficult to determine how long a system has been up,
 1039          * and thus how many TCP sequence increments have occurred.
 1040          */
 1041         tp->ts_timebase = tcp_now;
 1042         
 1043         tp->t_congctl = tcp_congctl_global;
 1044         tp->t_congctl->refcnt++;
 1045         
 1046         return (tp);
 1047 }
 1048 
 1049 /*
 1050  * Drop a TCP connection, reporting
 1051  * the specified error.  If connection is synchronized,
 1052  * then send a RST to peer.
 1053  */
 1054 struct tcpcb *
 1055 tcp_drop(struct tcpcb *tp, int errno)
 1056 {
 1057         struct socket *so = NULL;
 1058 
 1059 #ifdef DIAGNOSTIC
 1060         if (tp->t_inpcb && tp->t_in6pcb)
 1061                 panic("tcp_drop: both t_inpcb and t_in6pcb are set");
 1062 #endif
 1063 #ifdef INET
 1064         if (tp->t_inpcb)
 1065                 so = tp->t_inpcb->inp_socket;
 1066 #endif
 1067 #ifdef INET6
 1068         if (tp->t_in6pcb)
 1069                 so = tp->t_in6pcb->in6p_socket;
 1070 #endif
 1071         if (!so)
 1072                 return NULL;
 1073 
 1074         if (TCPS_HAVERCVDSYN(tp->t_state)) {
 1075                 tp->t_state = TCPS_CLOSED;
 1076                 (void) tcp_output(tp);
 1077                 tcpstat.tcps_drops++;
 1078         } else
 1079                 tcpstat.tcps_conndrops++;
 1080         if (errno == ETIMEDOUT && tp->t_softerror)
 1081                 errno = tp->t_softerror;
 1082         so->so_error = errno;
 1083         return (tcp_close(tp));
 1084 }
 1085 
 1086 /*
 1087  * Return whether this tcpcb is marked as dead, indicating
 1088  * to the calling timer function that no further action should
 1089  * be taken, as we are about to release this tcpcb.  The release
 1090  * of the storage will be done if this is the last timer running.
 1091  *
 1092  * This should be called from the callout handler function after
 1093  * callout_ack() is done, so that the number of invoking timer
 1094  * functions is 0.
 1095  */
 1096 int
 1097 tcp_isdead(struct tcpcb *tp)
 1098 {
 1099         int dead = (tp->t_flags & TF_DEAD);
 1100 
 1101         if (__predict_false(dead)) {
 1102                 if (tcp_timers_invoking(tp) > 0)
 1103                         /* not quite there yet -- count separately? */
 1104                         return dead;
 1105                 tcpstat.tcps_delayed_free++;
 1106                 pool_put(&tcpcb_pool, tp);      /* splsoftnet via tcp_timer.c */
 1107         }
 1108         return dead;
 1109 }
 1110 
 1111 /*
 1112  * Close a TCP control block:
 1113  *      discard all space held by the tcp
 1114  *      discard internet protocol block
 1115  *      wake up any sleepers
 1116  */
 1117 struct tcpcb *
 1118 tcp_close(struct tcpcb *tp)
 1119 {
 1120         struct inpcb *inp;
 1121 #ifdef INET6
 1122         struct in6pcb *in6p;
 1123 #endif
 1124         struct socket *so;
 1125 #ifdef RTV_RTT
 1126         struct rtentry *rt;
 1127 #endif
 1128         struct route *ro;
 1129 
 1130         inp = tp->t_inpcb;
 1131 #ifdef INET6
 1132         in6p = tp->t_in6pcb;
 1133 #endif
 1134         so = NULL;
 1135         ro = NULL;
 1136         if (inp) {
 1137                 so = inp->inp_socket;
 1138                 ro = &inp->inp_route;
 1139         }
 1140 #ifdef INET6
 1141         else if (in6p) {
 1142                 so = in6p->in6p_socket;
 1143                 ro = (struct route *)&in6p->in6p_route;
 1144         }
 1145 #endif
 1146 
 1147 #ifdef RTV_RTT
 1148         /*
 1149          * If we sent enough data to get some meaningful characteristics,
 1150          * save them in the routing entry.  'Enough' is arbitrarily
 1151          * defined as the sendpipesize (default 4K) * 16.  This would
 1152          * give us 16 rtt samples assuming we only get one sample per
 1153          * window (the usual case on a long haul net).  16 samples is
 1154          * enough for the srtt filter to converge to within 5% of the correct
 1155          * value; fewer samples and we could save a very bogus rtt.
 1156          *
 1157          * Don't update the default route's characteristics and don't
 1158          * update anything that the user "locked".
 1159          */
 1160         if (SEQ_LT(tp->iss + so->so_snd.sb_hiwat * 16, tp->snd_max) &&
 1161             ro && (rt = ro->ro_rt) &&
 1162             !in_nullhost(satosin(rt_key(rt))->sin_addr)) {
 1163                 u_long i = 0;
 1164 
 1165                 if ((rt->rt_rmx.rmx_locks & RTV_RTT) == 0) {
 1166                         i = tp->t_srtt *
 1167                             ((RTM_RTTUNIT / PR_SLOWHZ) >> (TCP_RTT_SHIFT + 2));
 1168                         if (rt->rt_rmx.rmx_rtt && i)
 1169                                 /*
 1170                                  * filter this update to half the old & half
 1171                                  * the new values, converting scale.
 1172                                  * See route.h and tcp_var.h for a
 1173                                  * description of the scaling constants.
 1174                                  */
 1175                                 rt->rt_rmx.rmx_rtt =
 1176                                     (rt->rt_rmx.rmx_rtt + i) / 2;
 1177                         else
 1178                                 rt->rt_rmx.rmx_rtt = i;
 1179                 }
 1180                 if ((rt->rt_rmx.rmx_locks & RTV_RTTVAR) == 0) {
 1181                         i = tp->t_rttvar *
 1182                             ((RTM_RTTUNIT / PR_SLOWHZ) >> (TCP_RTTVAR_SHIFT + 2));
 1183                         if (rt->rt_rmx.rmx_rttvar && i)
 1184                                 rt->rt_rmx.rmx_rttvar =
 1185                                     (rt->rt_rmx.rmx_rttvar + i) / 2;
 1186                         else
 1187                                 rt->rt_rmx.rmx_rttvar = i;
 1188                 }
 1189                 /*
 1190                  * update the pipelimit (ssthresh) if it has been updated
 1191                  * already or if a pipesize was specified & the threshold
 1192                  * got below half the pipesize.  I.e., wait for bad news
 1193                  * before we start updating, then update on both good
 1194                  * and bad news.
 1195                  */
 1196                 if (((rt->rt_rmx.rmx_locks & RTV_SSTHRESH) == 0 &&
 1197                     (i = tp->snd_ssthresh) && rt->rt_rmx.rmx_ssthresh) ||
 1198                     i < (rt->rt_rmx.rmx_sendpipe / 2)) {
 1199                         /*
 1200                          * convert the limit from user data bytes to
 1201                          * packets then to packet data bytes.
 1202                          */
 1203                         i = (i + tp->t_segsz / 2) / tp->t_segsz;
 1204                         if (i < 2)
 1205                                 i = 2;
 1206                         i *= (u_long)(tp->t_segsz + sizeof (struct tcpiphdr));
 1207                         if (rt->rt_rmx.rmx_ssthresh)
 1208                                 rt->rt_rmx.rmx_ssthresh =
 1209                                     (rt->rt_rmx.rmx_ssthresh + i) / 2;
 1210                         else
 1211                                 rt->rt_rmx.rmx_ssthresh = i;
 1212                 }
 1213         }
 1214 #endif /* RTV_RTT */
 1215         /* free the reassembly queue, if any */
 1216         TCP_REASS_LOCK(tp);
 1217         (void) tcp_freeq(tp);
 1218         TCP_REASS_UNLOCK(tp);
 1219 
 1220         /* free the SACK holes list. */
 1221         tcp_free_sackholes(tp);
 1222         
 1223         tp->t_congctl->refcnt--;
 1224 
 1225         tcp_canceltimers(tp);
 1226         TCP_CLEAR_DELACK(tp);
 1227         syn_cache_cleanup(tp);
 1228 
 1229         if (tp->t_template) {
 1230                 m_free(tp->t_template);
 1231                 tp->t_template = NULL;
 1232         }
 1233         if (tcp_timers_invoking(tp))
 1234                 tp->t_flags |= TF_DEAD;
 1235         else
 1236                 pool_put(&tcpcb_pool, tp);
 1237 
 1238         if (inp) {
 1239                 inp->inp_ppcb = 0;
 1240                 soisdisconnected(so);
 1241                 in_pcbdetach(inp);
 1242         }
 1243 #ifdef INET6
 1244         else if (in6p) {
 1245                 in6p->in6p_ppcb = 0;
 1246                 soisdisconnected(so);
 1247                 in6_pcbdetach(in6p);
 1248         }
 1249 #endif
 1250         tcpstat.tcps_closed++;
 1251         return ((struct tcpcb *)0);
 1252 }
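
/*
 * Illustration (not part of tcp_subr.c): a small user-space sketch of two
 * calculations tcp_close() above performs when it saves connection
 * characteristics in the routing entry.  filter_metric() mirrors the
 * "half the old & half the new" update used for rmx_rtt and rmx_rttvar,
 * and ssthresh_to_metric() mirrors the conversion from user data bytes to
 * packets and back to packet data bytes.  Both function names are
 * hypothetical, and the header length is passed in explicitly (40 bytes
 * is assumed for sizeof(struct tcpiphdr) in the usage below).
 */

```c
#include <assert.h>

/*
 * Blend a stored route metric with a new sample.  If either value is 0
 * the sample simply replaces the stored value, as in the rtt update.
 */
static unsigned long
filter_metric(unsigned long old, unsigned long sample)
{
	if (old != 0 && sample != 0)
		return (old + sample) / 2;
	return sample;
}

/*
 * Convert ssthresh (user data bytes) to a whole number of packets,
 * rounded to nearest and at least 2, then to bytes including headers.
 */
static unsigned long
ssthresh_to_metric(unsigned long ssthresh, unsigned long segsz,
    unsigned long hdrlen)
{
	unsigned long pkts;

	assert(segsz > 0);		/* avoid division by zero */
	pkts = (ssthresh + segsz / 2) / segsz;
	if (pkts < 2)
		pkts = 2;
	return pkts * (segsz + hdrlen);
}
```

/*
 * For example, an ssthresh of 10000 bytes with a 1460-byte segment size
 * rounds to 7 packets, and 7 * (1460 + 40) gives 10500 bytes.
 */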
 1253 
 1254 int
 1255 tcp_freeq(struct tcpcb *tp)
 1257 {
 1258         struct ipqent *qe;
 1259         int rv = 0;
 1260 #ifdef TCPREASS_DEBUG
 1261         int i = 0;
 1262 #endif
 1263 
 1264         TCP_REASS_LOCK_CHECK(tp);
 1265 
 1266         while ((qe = TAILQ_FIRST(&tp->segq)) != NULL) {
 1267 #ifdef TCPREASS_DEBUG
 1268                 printf("tcp_freeq[%p,%d]: %u:%u(%u) 0x%02x\n",
 1269                         tp, i++, qe->ipqe_seq, qe->ipqe_seq + qe->ipqe_len,
 1270                         qe->ipqe_len, qe->ipqe_flags & (TH_SYN|TH_FIN|TH_RST));
 1271 #endif
 1272                 TAILQ_REMOVE(&tp->segq, qe, ipqe_q);
 1273                 TAILQ_REMOVE(&tp->timeq, qe, ipqe_timeq);
 1274                 m_freem(qe->ipqe_m);
 1275                 tcpipqent_free(qe);
 1276                 rv = 1;
 1277         }
 1278         tp->t_segqlen = 0;
 1279         KASSERT(TAILQ_EMPTY(&tp->timeq));
 1280         return (rv);
 1281 }
 1282 
 1283 /*
 1284  * Protocol drain routine.  Called when memory is in short supply.
 1285  */
 1286 void
 1287 tcp_drain(void)
 1288 {
 1289         struct inpcb_hdr *inph;
 1290         struct tcpcb *tp;
 1291 
 1292         /*
 1293          * Free the sequence queue of all TCP connections.
 1294          */
 1295         CIRCLEQ_FOREACH(inph, &tcbtable.inpt_queue, inph_queue) {
 1296                 switch (inph->inph_af) {
 1297                 case AF_INET:
 1298                         tp = intotcpcb((struct inpcb *)inph);
 1299                         break;
 1300 #ifdef INET6
 1301                 case AF_INET6:
 1302                         tp = in6totcpcb((struct in6pcb *)inph);
 1303                         break;
 1304 #endif
 1305                 default:
 1306                         tp = NULL;
 1307                         break;
 1308                 }
 1309                 if (tp != NULL) {
 1310                         /*
 1311                          * We may be called from a device's interrupt
 1312                          * context.  If the tcpcb is already busy,
 1313                          * just bail out now.
 1314                          */
 1315                         if (tcp_reass_lock_try(tp) == 0)
 1316                                 continue;
 1317                         if (tcp_freeq(tp))
 1318                                 tcpstat.tcps_connsdrained++;
 1319                         TCP_REASS_UNLOCK(tp);
 1320                 }
 1321         }
 1322 }
 1323 
 1324 /*
 1325  * Notify a tcp user of an asynchronous error;
 1326  * store error as soft error, but wake up user
 1327  * (for now, won't do anything until can select for soft error).
 1328  */
 1329 void
 1330 tcp_notify(struct inpcb *inp, int error)
 1331 {
 1332         struct tcpcb *tp = (struct tcpcb *)inp->inp_ppcb;
 1333         struct socket *so = inp->inp_socket;
 1334 
 1335         /*
 1336          * Ignore some errors if we are hooked up.
 1337          * If connection hasn't completed, has retransmitted several times,
 1338          * and receives a second error, give up now.  This is better
 1339          * than waiting a long time to establish a connection that
 1340          * can never complete.
 1341          */
 1342         if (tp->t_state == TCPS_ESTABLISHED &&
 1343              (error == EHOSTUNREACH || error == ENETUNREACH ||
 1344               error == EHOSTDOWN)) {
 1345                 return;
 1346         } else if (TCPS_HAVEESTABLISHED(tp->t_state) == 0 &&
 1347             tp->t_rxtshift > 3 && tp->t_softerror)
 1348                 so->so_error = error;
 1349         else
 1350                 tp->t_softerror = error;
 1351         wakeup((caddr_t) &so->so_timeo);
 1352         sorwakeup(so);
 1353         sowwakeup(so);
 1354 }
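
/*
 * Illustration (not part of tcp_subr.c): a user-space sketch of the
 * three-way decision tcp_notify() above makes for an asynchronous error.
 * The enum, the function name classify_error() and the flattened boolean
 * arguments are assumptions made for this example; the real code reads
 * these values from the tcpcb and then wakes up the socket.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum notify_action {
	NOTIFY_IGNORE,		/* return without recording anything */
	NOTIFY_HARD_ERROR,	/* store in so_error: give up now */
	NOTIFY_SOFT_ERROR	/* store in t_softerror only */
};

/*
 * is_established:   state is exactly ESTABLISHED
 * have_established: state is ESTABLISHED or later
 * rxtshift:         number of retransmit backoffs so far
 * softerror:        previously recorded soft error, 0 if none
 */
static enum notify_action
classify_error(bool is_established, bool have_established, int rxtshift,
    int softerror, int error)
{
	assert(error != 0);

	if (is_established &&
	    (error == EHOSTUNREACH || error == ENETUNREACH ||
	     error == EHOSTDOWN))
		return NOTIFY_IGNORE;
	if (!have_established && rxtshift > 3 && softerror != 0)
		return NOTIFY_HARD_ERROR;
	return NOTIFY_SOFT_ERROR;
}
```

/*
 * A connection still in the handshake that has backed off more than three
 * times and already holds a soft error is the only case reported as a
 * hard error.
 */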
 1355 
 1356 #ifdef INET6
 1357 void
 1358 tcp6_notify(struct in6pcb *in6p, int error)
 1359 {
 1360         struct tcpcb *tp = (struct tcpcb *)in6p->in6p_ppcb;
 1361         struct socket *so = in6p->in6p_socket;
 1362 
 1363         /*
 1364          * Ignore some errors if we are hooked up.
 1365          * If connection hasn't completed, has retransmitted several times,
 1366          * and receives a second error, give up now.  This is better
 1367          * than waiting a long time to establish a connection that
 1368          * can never complete.
 1369          */
 1370         if (tp->t_state == TCPS_ESTABLISHED &&
 1371              (error == EHOSTUNREACH || error == ENETUNREACH ||
 1372               error == EHOSTDOWN)) {
 1373                 return;
 1374         } else if (TCPS_HAVEESTABLISHED(tp->t_state) == 0 &&
 1375             tp->t_rxtshift > 3 && tp->t_softerror)
 1376                 so->so_error = error;
 1377         else
 1378                 tp->t_softerror = error;
 1379         wakeup((caddr_t) &so->so_timeo);
 1380         sorwakeup(so);
 1381         sowwakeup(so);
 1382 }
 1383 #endif
 1384 
 1385 #ifdef INET6
 1386 void
 1387 tcp6_ctlinput(int cmd, struct sockaddr *sa, void *d)
 1388 {
 1389         struct tcphdr th;
 1390         void (*notify)(struct in6pcb *, int) = tcp6_notify;
 1391         int nmatch;
 1392         struct ip6_hdr *ip6;
 1393         const struct sockaddr_in6 *sa6_src = NULL;
 1394         struct sockaddr_in6 *sa6 = (struct sockaddr_in6 *)sa;
 1395         struct mbuf *m;
 1396         int off;
 1397 
 1398         if (sa->sa_family != AF_INET6 ||
 1399             sa->sa_len != sizeof(struct sockaddr_in6))
 1400                 return;
 1401         if ((unsigned)cmd >= PRC_NCMDS)
 1402                 return;
 1403         else if (cmd == PRC_QUENCH) {
 1404                 /* 
 1405                  * Don't honor ICMP Source Quench messages meant for
 1406                  * TCP connections.
 1407                  */
 1408                 return;
 1409         } else if (PRC_IS_REDIRECT(cmd))
 1410                 notify = in6_rtchange, d = NULL;
 1411         else if (cmd == PRC_MSGSIZE)
 1412                 ; /* special code is present, see below */
 1413         else if (cmd == PRC_HOSTDEAD)
 1414                 d = NULL;
 1415         else if (inet6ctlerrmap[cmd] == 0)
 1416                 return;
 1417 
 1418         /* if the parameter is from icmp6, decode it. */
 1419         if (d != NULL) {
 1420                 struct ip6ctlparam *ip6cp = (struct ip6ctlparam *)d;
 1421                 m = ip6cp->ip6c_m;
 1422                 ip6 = ip6cp->ip6c_ip6;
 1423                 off = ip6cp->ip6c_off;
 1424                 sa6_src = ip6cp->ip6c_src;
 1425         } else {
 1426                 m = NULL;
 1427                 ip6 = NULL;
 1428                 sa6_src = &sa6_any;
 1429                 off = 0;
 1430         }
 1431 
 1432         if (ip6) {
 1433                 /*
 1434                  * XXX: We assume that when ip6 is non NULL,
 1435                  * M and OFF are valid.
 1436                  */
 1437 
 1438                 /* check if we can safely examine src and dst ports */
 1439                 if (m->m_pkthdr.len < off + sizeof(th)) {
 1440                         if (cmd == PRC_MSGSIZE)
 1441                                 icmp6_mtudisc_update((struct ip6ctlparam *)d, 0);
 1442                         return;
 1443                 }
 1444 
 1445                 bzero(&th, sizeof(th));
 1446                 m_copydata(m, off, sizeof(th), (caddr_t)&th);
 1447 
 1448                 if (cmd == PRC_MSGSIZE) {
 1449                         int valid = 0;
 1450 
 1451                         /*
 1452                          * Check to see if we have a valid TCP connection
 1453                          * corresponding to the address in the ICMPv6 message
 1454                          * payload.
 1455                          */
 1456                         if (in6_pcblookup_connect(&tcbtable, &sa6->sin6_addr,
 1457                             th.th_dport, (const struct in6_addr *)&sa6_src->sin6_addr,
 1458                             th.th_sport, 0))
 1459                                 valid++;
 1460 
 1461                         /*
 1462                          * Depending on the value of "valid" and routing table
 1463                          * size (mtudisc_{hi,lo}wat), we will:
 1464                          * - recalculate the new MTU and create the
 1465                          *   corresponding routing entry, or
 1466                          * - ignore the MTU change notification.
 1467                          */
 1468                         icmp6_mtudisc_update((struct ip6ctlparam *)d, valid);
 1469 
 1470                         /*
 1471                          * no need to call in6_pcbnotify, it should have been
 1472                          * called via callback if necessary
 1473                          */
 1474                         return;
 1475                 }
 1476 
 1477                 nmatch = in6_pcbnotify(&tcbtable, sa, th.th_dport,
 1478                     (const struct sockaddr *)sa6_src, th.th_sport, cmd, NULL, notify);
 1479                 if (nmatch == 0 && syn_cache_count &&
 1480                     (inet6ctlerrmap[cmd] == EHOSTUNREACH ||
 1481                      inet6ctlerrmap[cmd] == ENETUNREACH ||
 1482                      inet6ctlerrmap[cmd] == EHOSTDOWN))
 1483                         syn_cache_unreach((const struct sockaddr *)sa6_src,
 1484                                           sa, &th);
 1485         } else {
 1486                 (void) in6_pcbnotify(&tcbtable, sa, 0,
 1487                     (const struct sockaddr *)sa6_src, 0, cmd, NULL, notify);
 1488         }
 1489 }
 1490 #endif
 1491 
 1492 #ifdef INET
 1493 /* assumes that ip header and tcp header are contiguous on mbuf */
 1494 void *
 1495 tcp_ctlinput(int cmd, struct sockaddr *sa, void *v)
 1496 {
 1497         struct ip *ip = v;
 1498         struct tcphdr *th;
 1499         struct icmp *icp;
 1500         extern const int inetctlerrmap[];
 1501         void (*notify)(struct inpcb *, int) = tcp_notify;
 1502         int errno;
 1503         int nmatch;
 1504         struct tcpcb *tp;
 1505         u_int mtu;
 1506         tcp_seq seq;
 1507         struct inpcb *inp;
 1508 #ifdef INET6
 1509         struct in6pcb *in6p;
 1510         struct in6_addr src6, dst6;
 1511 #endif
 1512 
 1513         if (sa->sa_family != AF_INET ||
 1514             sa->sa_len != sizeof(struct sockaddr_in))
 1515                 return NULL;
 1516         if ((unsigned)cmd >= PRC_NCMDS)
 1517                 return NULL;
 1518         errno = inetctlerrmap[cmd];
 1519         if (cmd == PRC_QUENCH)
 1520                 /* 
 1521                  * Don't honor ICMP Source Quench messages meant for
 1522                  * TCP connections.
 1523                  */
 1524                 return NULL;
 1525         else if (PRC_IS_REDIRECT(cmd))
 1526                 notify = in_rtchange, ip = 0;
 1527         else if (cmd == PRC_MSGSIZE && ip && ip->ip_v == 4) {
 1528                 /*
 1529                  * Check to see if we have a valid TCP connection
 1530                  * corresponding to the address in the ICMP message
 1531                  * payload.
 1532                  *
 1533                  * Boundary check is made in icmp_input(), with ICMP_ADVLENMIN.
 1534                  */
 1535                 th = (struct tcphdr *)((caddr_t)ip + (ip->ip_hl << 2));
 1536 #ifdef INET6
 1537                 memset(&src6, 0, sizeof(src6));
 1538                 memset(&dst6, 0, sizeof(dst6));
 1539                 src6.s6_addr16[5] = dst6.s6_addr16[5] = 0xffff;
 1540                 memcpy(&src6.s6_addr32[3], &ip->ip_src, sizeof(struct in_addr));
 1541                 memcpy(&dst6.s6_addr32[3], &ip->ip_dst, sizeof(struct in_addr));
 1542 #endif
 1543                 if ((inp = in_pcblookup_connect(&tcbtable, ip->ip_dst,
 1544                     th->th_dport, ip->ip_src, th->th_sport)) != NULL)
 1545 #ifdef INET6
 1546                         in6p = NULL;
 1547 #else
 1548                         ;
 1549 #endif
 1550 #ifdef INET6
 1551                 else if ((in6p = in6_pcblookup_connect(&tcbtable, &dst6,
 1552                     th->th_dport, &src6, th->th_sport, 0)) != NULL)
 1553                         ;
 1554 #endif
 1555                 else
 1556                         return NULL;
 1557 
 1558                 /*
 1559                  * Now that we've validated that we are actually communicating
 1560                  * with the host indicated in the ICMP message, locate the
 1561                  * ICMP header, recalculate the new MTU, and create the
 1562                  * corresponding routing entry.
 1563                  */
 1564                 icp = (struct icmp *)((caddr_t)ip -
 1565                     offsetof(struct icmp, icmp_ip));
 1566                 if (inp) {
 1567                         if ((tp = intotcpcb(inp)) == NULL)
 1568                                 return NULL;
 1569                 }
 1570 #ifdef INET6
 1571                 else if (in6p) {
 1572                         if ((tp = in6totcpcb(in6p)) == NULL)
 1573                                 return NULL;
 1574                 }
 1575 #endif
 1576                 else
 1577                         return NULL;
 1578                 seq = ntohl(th->th_seq);
 1579                 if (SEQ_LT(seq, tp->snd_una) || SEQ_GT(seq, tp->snd_max))
 1580                         return NULL;
 1581                 /* 
 1582                  * If the ICMP message advertises a Next-Hop MTU
 1583                  * equal to or larger than the maximum packet size we have
 1584                  * ever sent, drop the message.
 1585                  */
 1586                 mtu = (u_int)ntohs(icp->icmp_nextmtu);
 1587                 if (mtu >= tp->t_pmtud_mtu_sent)
 1588                         return NULL;
 1589                 if (mtu >= tcp_hdrsz(tp) + tp->t_pmtud_mss_acked) {
 1590                         /* 
 1591                          * Calculate new MTU, and create corresponding
 1592                          * route (traditional PMTUD).
 1593                          */
 1594                         tp->t_flags &= ~TF_PMTUD_PEND;
 1595                         icmp_mtudisc(icp, ip->ip_dst);
 1596                 } else {
 1597                         /*
 1598                          * Record the information obtained from the ICMP
 1599                          * message; act on it later.
 1600                          * If we had already recorded an ICMP message,
 1601                          * replace the old one only if the new message
 1602                          * refers to an older TCP segment.
 1603                          */
 1604                         if (tp->t_flags & TF_PMTUD_PEND) {
 1605                                 if (SEQ_LT(tp->t_pmtud_th_seq, seq))
 1606                                         return NULL;
 1607                         } else
 1608                                 tp->t_flags |= TF_PMTUD_PEND;
 1609                         tp->t_pmtud_th_seq = seq;
 1610                         tp->t_pmtud_nextmtu = icp->icmp_nextmtu;
 1611                         tp->t_pmtud_ip_len = icp->icmp_ip.ip_len;
 1612                         tp->t_pmtud_ip_hl = icp->icmp_ip.ip_hl;
 1613                 }
 1614                 return NULL;
 1615         } else if (cmd == PRC_HOSTDEAD)
 1616                 ip = 0;
 1617         else if (errno == 0)
 1618                 return NULL;
 1619         if (ip && ip->ip_v == 4 && sa->sa_family == AF_INET) {
 1620                 th = (struct tcphdr *)((caddr_t)ip + (ip->ip_hl << 2));
 1621                 nmatch = in_pcbnotify(&tcbtable, satosin(sa)->sin_addr,
 1622                     th->th_dport, ip->ip_src, th->th_sport, errno, notify);
 1623                 if (nmatch == 0 && syn_cache_count &&
 1624                     (inetctlerrmap[cmd] == EHOSTUNREACH ||
 1625                     inetctlerrmap[cmd] == ENETUNREACH ||
 1626                     inetctlerrmap[cmd] == EHOSTDOWN)) {
 1627                         struct sockaddr_in sin;
 1628                         bzero(&sin, sizeof(sin));
 1629                         sin.sin_len = sizeof(sin);
 1630                         sin.sin_family = AF_INET;
 1631                         sin.sin_port = th->th_sport;
 1632                         sin.sin_addr = ip->ip_src;
 1633                         syn_cache_unreach((struct sockaddr *)&sin, sa, th);
 1634                 }
 1635 
 1636                 /* XXX mapped address case */
 1637         } else
 1638                 in_pcbnotifyall(&tcbtable, satosin(sa)->sin_addr, errno,
 1639                     notify);
 1640         return NULL;
 1641 }
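
The PRC_MSGSIZE branch above only trusts an ICMP message whose quoted TCP sequence number lies between snd_una and snd_max. The following is a minimal user-space sketch of that wraparound-safe window test. The names seq_lt, seq_gt and icmp_seq_in_window are hypothetical stand-ins for the kernel's SEQ_LT/SEQ_GT macros, and the sketch assumes the usual signed-difference definition of those macros.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for SEQ_LT/SEQ_GT: compare 32-bit sequence
 * numbers by the sign of their difference, so the comparison still
 * works when the sequence space wraps around.
 */
static int
seq_lt(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) < 0;
}

static int
seq_gt(uint32_t a, uint32_t b)
{
        return (int32_t)(a - b) > 0;
}

/*
 * Return 1 if seq refers to data that is sent but not yet acknowledged,
 * i.e. snd_una <= seq <= snd_max in sequence space; otherwise 0.
 */
static int
icmp_seq_in_window(uint32_t seq, uint32_t snd_una, uint32_t snd_max)
{
        return !(seq_lt(seq, snd_una) || seq_gt(seq, snd_max));
}
```

With snd_una = 0xfffffff0 and snd_max = 0x10 the window spans the wrap point, and a quoted sequence number of 0x5 is still accepted because only the signed difference matters.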
 1642 
 1643 /*
 1644  * When a source quench is received, we are being notified of congestion.
 1645  * Close the congestion window down to the Loss Window (one segment).
 1646  * We will gradually open it again as we proceed.
 1647  */
 1648 void
 1649 tcp_quench(struct inpcb *inp, int errno)
 1650 {
 1651         struct tcpcb *tp = intotcpcb(inp);
 1652 
 1653         if (tp) {
 1654                 tp->snd_cwnd = tp->t_segsz;
 1655                 tp->t_bytes_acked = 0;
 1656         }
 1657 }
 1658 #endif
 1659 
 1660 #ifdef INET6
 1661 void
 1662 tcp6_quench(struct in6pcb *in6p, int errno)
 1663 {
 1664         struct tcpcb *tp = in6totcpcb(in6p);
 1665 
 1666         if (tp) {
 1667                 tp->snd_cwnd = tp->t_segsz;
 1668                 tp->t_bytes_acked = 0;
 1669         }
 1670 }
 1671 #endif
 1672 
 1673 #ifdef INET
 1674 /*
 1675  * Path MTU Discovery handlers.
 1676  */
 1677 void
 1678 tcp_mtudisc_callback(struct in_addr faddr)
 1679 {
 1680 #ifdef INET6
 1681         struct in6_addr in6;
 1682 #endif
 1683 
 1684         in_pcbnotifyall(&tcbtable, faddr, EMSGSIZE, tcp_mtudisc);
 1685 #ifdef INET6
 1686         memset(&in6, 0, sizeof(in6));
 1687         in6.s6_addr16[5] = 0xffff;
 1688         memcpy(&in6.s6_addr32[3], &faddr, sizeof(struct in_addr));
 1689         tcp6_mtudisc_callback(&in6);
 1690 #endif
 1691 }
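
tcp_mtudisc_callback() builds an IPv4-mapped IPv6 address (::ffff:a.b.c.d) so that IPv6 PCBs carrying mapped traffic are notified as well. The sketch below shows the same construction in user space. It is only an illustration: sketch_v4_to_mapped is a hypothetical name, and it writes the portable s6_addr byte array rather than the s6_addr16/s6_addr32 members used in the kernel.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Build the IPv4-mapped IPv6 address for v4 (network byte order):
 * 80 zero bits, 16 one bits, then the 32-bit IPv4 address.
 */
static struct in6_addr
sketch_v4_to_mapped(struct in_addr v4)
{
        struct in6_addr in6;

        memset(&in6, 0, sizeof(in6));
        in6.s6_addr[10] = 0xff;         /* same bits as s6_addr16[5] = 0xffff */
        in6.s6_addr[11] = 0xff;
        memcpy(&in6.s6_addr[12], &v4, sizeof(v4));
        return in6;
}
```

For 192.0.2.1 this yields ::ffff:192.0.2.1, which the standard IN6_IS_ADDR_V4MAPPED() macro recognizes.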
 1692 
 1693 /*
 1694  * On receipt of path MTU corrections, flush old route and replace it
 1695  * with the new one.  Retransmit all unacknowledged packets, to ensure
 1696  * that all packets will be received.
 1697  */
 1698 void
 1699 tcp_mtudisc(struct inpcb *inp, int errno)
 1700 {
 1701         struct tcpcb *tp = intotcpcb(inp);
 1702         struct rtentry *rt = in_pcbrtentry(inp);
 1703 
 1704         if (tp != 0) {
 1705                 if (rt != 0) {
 1706                         /*
 1707                          * If this was not a host route, remove and realloc.
 1708                          */
 1709                         if ((rt->rt_flags & RTF_HOST) == 0) {
 1710                                 in_rtchange(inp, errno);
 1711                                 if ((rt = in_pcbrtentry(inp)) == 0)
 1712                                         return;
 1713                         }
 1714 
 1715                         /*
 1716                          * Slow start out of the error condition.  We
 1717                          * use the MTU because we know it's smaller
 1718                          * than the previously transmitted segment.
 1719                          *
 1720                          * Note: This is more conservative than the
 1721                          * suggestion in draft-floyd-incr-init-win-03.
 1722                          */
 1723                         if (rt->rt_rmx.rmx_mtu != 0)
 1724                                 tp->snd_cwnd =
 1725                                     TCP_INITIAL_WINDOW(tcp_init_win,
 1726                                     rt->rt_rmx.rmx_mtu);
 1727                 }
 1728 
 1729                 /*
 1730                  * Resend unacknowledged packets.
 1731                  */
 1732                 tp->snd_nxt = tp->sack_newdata = tp->snd_una;
 1733                 tcp_output(tp);
 1734         }
 1735 }
 1736 #endif
 1737 
 1738 #ifdef INET6
 1739 /*
 1740  * Path MTU Discovery handlers.
 1741  */
 1742 void
 1743 tcp6_mtudisc_callback(struct in6_addr *faddr)
 1744 {
 1745         struct sockaddr_in6 sin6;
 1746 
 1747         bzero(&sin6, sizeof(sin6));
 1748         sin6.sin6_family = AF_INET6;
 1749         sin6.sin6_len = sizeof(struct sockaddr_in6);
 1750         sin6.sin6_addr = *faddr;
 1751         (void) in6_pcbnotify(&tcbtable, (struct sockaddr *)&sin6, 0,
 1752             (const struct sockaddr *)&sa6_any, 0, PRC_MSGSIZE, NULL, tcp6_mtudisc);
 1753 }
 1754 
 1755 void
 1756 tcp6_mtudisc(struct in6pcb *in6p, int errno)
 1757 {
 1758         struct tcpcb *tp = in6totcpcb(in6p);
 1759         struct rtentry *rt = in6_pcbrtentry(in6p);
 1760 
 1761         if (tp != 0) {
 1762                 if (rt != 0) {
 1763                         /*
 1764                          * If this was not a host route, remove and realloc.
 1765                          */
 1766                         if ((rt->rt_flags & RTF_HOST) == 0) {
 1767                                 in6_rtchange(in6p, errno);
 1768                                 if ((rt = in6_pcbrtentry(in6p)) == 0)
 1769                                         return;
 1770                         }
 1771 
 1772                         /*
 1773                          * Slow start out of the error condition.  We
 1774                          * use the MTU because we know it's smaller
 1775                          * than the previously transmitted segment.
 1776                          *
 1777                          * Note: This is more conservative than the
 1778                          * suggestion in draft-floyd-incr-init-win-03.
 1779                          */
 1780                         if (rt->rt_rmx.rmx_mtu != 0)
 1781                                 tp->snd_cwnd =
 1782                                     TCP_INITIAL_WINDOW(tcp_init_win,
 1783                                     rt->rt_rmx.rmx_mtu);
 1784                 }
 1785 
 1786                 /*
 1787                  * Resend unacknowledged packets.
 1788                  */
 1789                 tp->snd_nxt = tp->sack_newdata = tp->snd_una;
 1790                 tcp_output(tp);
 1791         }
 1792 }
 1793 #endif /* INET6 */
 1794 
 1795 /*
 1796  * Compute the MSS to advertise to the peer.  Called only during
 1797  * the 3-way handshake.  If we are the server (peer initiated
 1798  * connection), we are called with a pointer to the interface
 1799  * on which the SYN packet arrived.  If we are the client (we
 1800  * initiated connection), we are called with a pointer to the
 1801  * interface through which this connection should go out.
 1802  *
 1803  * NOTE: Do not subtract IP option/extension header size nor IPsec
 1804  * header size from MSS advertisement.  MSS option must hold the maximum
 1805  * segment size we can accept, so it must always be:
 1806  *       max(if mtu) - ip header - tcp header
 1807  */
 1808 u_long
 1809 tcp_mss_to_advertise(const struct ifnet *ifp, int af)
 1810 {
 1811         extern u_long in_maxmtu;
 1812         u_long mss = 0;
 1813         u_long hdrsiz;
 1814 
 1815         /*
 1816          * In order to avoid defeating path MTU discovery on the peer,
 1817          * we advertise the max MTU of all attached networks as our MSS,
 1818          * per RFC 1191, section 3.1.
 1819          *
 1820          * We provide the option to advertise just the MTU of
 1821          * the interface on which we hope this connection will
 1822          * be receiving.  If we are responding to a SYN, we
 1823          * will have a pretty good idea about this, but when
 1824          * initiating a connection there is a bit more doubt.
 1825          *
 1826          * We also need to ensure that loopback has a large enough
 1827          * MSS, as the loopback MTU is never included in in_maxmtu.
 1828          */
 1829 
 1830         if (ifp != NULL)
 1831                 switch (af) {
 1832                 case AF_INET:
 1833                         mss = ifp->if_mtu;
 1834                         break;
 1835 #ifdef INET6
 1836                 case AF_INET6:
 1837                         mss = IN6_LINKMTU(ifp);
 1838                         break;
 1839 #endif
 1840                 }
 1841 
 1842         if (tcp_mss_ifmtu == 0)
 1843                 switch (af) {
 1844                 case AF_INET:
 1845                         mss = max(in_maxmtu, mss);
 1846                         break;
 1847 #ifdef INET6
 1848                 case AF_INET6:
 1849                         mss = max(in6_maxmtu, mss);
 1850                         break;
 1851 #endif
 1852                 }
 1853 
 1854         switch (af) {
 1855         case AF_INET:
 1856                 hdrsiz = sizeof(struct ip);
 1857                 break;
 1858 #ifdef INET6
 1859         case AF_INET6:
 1860                 hdrsiz = sizeof(struct ip6_hdr);
 1861                 break;
 1862 #endif
 1863         default:
 1864                 hdrsiz = 0;
 1865                 break;
 1866         }
 1867         hdrsiz += sizeof(struct tcphdr);
 1868         if (mss > hdrsiz)
 1869                 mss -= hdrsiz;
 1870 
 1871         mss = max(tcp_mssdflt, mss);
 1872         return (mss);
 1873 }
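
The arithmetic in tcp_mss_to_advertise() can be checked with a small user-space sketch. The header sizes (20 bytes for IPv4, 40 for IPv6, 20 for TCP) are the option-free base header sizes. The default MSS of 536 is an assumption standing in for tcp_mssdflt, and sketch_mss_to_advertise is a hypothetical name.

```c
/* Assumed constants; the kernel uses sizeof() on the header structs. */
enum {
        SK_IPV4_HDR = 20,
        SK_IPV6_HDR = 40,
        SK_TCP_HDR = 20,
        SK_MSS_DEFAULT = 536    /* assumed value of tcp_mssdflt */
};

/*
 * MSS to advertise for a given MTU: subtract the fixed IP and TCP
 * headers (never options), then floor the result at the default MSS.
 */
static unsigned long
sketch_mss_to_advertise(unsigned long mtu, int is_ipv6)
{
        unsigned long hdrsiz;
        unsigned long mss = mtu;

        hdrsiz = (is_ipv6 ? SK_IPV6_HDR : SK_IPV4_HDR) + SK_TCP_HDR;
        if (mss > hdrsiz)
                mss -= hdrsiz;
        if (mss < SK_MSS_DEFAULT)
                mss = SK_MSS_DEFAULT;
        return mss;
}
```

An Ethernet MTU of 1500 gives 1460 for IPv4 and 1440 for IPv6. An unknown MTU of 0 falls back to the assumed default of 536.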
 1874 
 1875 /*
 1876  * Set connection variables based on the peer's advertised MSS.
 1877  * We are passed the TCPCB for the actual connection.  If we
 1878  * are the server, we are called by the compressed state engine
 1879  * when the 3-way handshake is complete.  If we are the client,
 1880  * we are called when we receive the SYN,ACK from the server.
 1881  *
 1882  * NOTE: Our advertised MSS value must be initialized in the TCPCB
 1883  * before this routine is called!
 1884  */
 1885 void
 1886 tcp_mss_from_peer(struct tcpcb *tp, int offer)
 1887 {
 1888         struct socket *so;
 1889 #if defined(RTV_SPIPE) || defined(RTV_SSTHRESH)
 1890         struct rtentry *rt;
 1891 #endif
 1892         u_long bufsize;
 1893         int mss;
 1894 
 1895 #ifdef DIAGNOSTIC
 1896         if (tp->t_inpcb && tp->t_in6pcb)
 1897                 panic("tcp_mss_from_peer: both t_inpcb and t_in6pcb are set");
 1898 #endif
 1899         so = NULL;
 1900         rt = NULL;
 1901 #ifdef INET
 1902         if (tp->t_inpcb) {
 1903                 so = tp->t_inpcb->inp_socket;
 1904 #if defined(RTV_SPIPE) || defined(RTV_SSTHRESH)
 1905                 rt = in_pcbrtentry(tp->t_inpcb);
 1906 #endif
 1907         }
 1908 #endif
 1909 #ifdef INET6
 1910         if (tp->t_in6pcb) {
 1911                 so = tp->t_in6pcb->in6p_socket;
 1912 #if defined(RTV_SPIPE) || defined(RTV_SSTHRESH)
 1913                 rt = in6_pcbrtentry(tp->t_in6pcb);
 1914 #endif
 1915         }
 1916 #endif
 1917 
 1918         /*
 1919          * As per RFC1122, use the default MSS value, unless they
 1920          * sent us an offer.  Do not accept offers less than 256 bytes.
 1921          */
 1922         mss = tcp_mssdflt;
 1923         if (offer)
 1924                 mss = offer;
 1925         mss = max(mss, 256);            /* sanity */
 1926         tp->t_peermss = mss;
 1927         mss -= tcp_optlen(tp);
 1928 #ifdef INET
 1929         if (tp->t_inpcb)
 1930                 mss -= ip_optlen(tp->t_inpcb);
 1931 #endif
 1932 #ifdef INET6
 1933         if (tp->t_in6pcb)
 1934                 mss -= ip6_optlen(tp->t_in6pcb);
 1935 #endif
 1936 
 1937         /*
 1938          * If there's a pipesize, change the socket buffer to that size.
 1939          * Make the socket buffer an integral number of MSS units.  If
 1940          * the MSS is larger than the socket buffer, artificially decrease
 1941          * the MSS.
 1942          */
 1943 #ifdef RTV_SPIPE
 1944         if (rt != NULL && rt->rt_rmx.rmx_sendpipe != 0)
 1945                 bufsize = rt->rt_rmx.rmx_sendpipe;
 1946         else
 1947 #endif
 1948         {
 1949                 KASSERT(so != NULL);
 1950                 bufsize = so->so_snd.sb_hiwat;
 1951         }
 1952         if (bufsize < mss)
 1953                 mss = bufsize;
 1954         else {
 1955                 bufsize = roundup(bufsize, mss);
 1956                 if (bufsize > sb_max)
 1957                         bufsize = sb_max;
 1958                 (void) sbreserve(&so->so_snd, bufsize, so);
 1959         }
 1960         tp->t_segsz = mss;
 1961 
 1962 #ifdef RTV_SSTHRESH
 1963         if (rt != NULL && rt->rt_rmx.rmx_ssthresh) {
 1964                 /*
 1965                  * There's some sort of gateway or interface buffer
 1966                  * limit on the path.  Use this to set the slow
 1967                  * start threshold, but set the threshold to no less
 1968                  * than 2 * MSS.
 1969                  */
 1970                 tp->snd_ssthresh = max(2 * mss, rt->rt_rmx.rmx_ssthresh);
 1971         }
 1972 #endif
 1973 }
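
The send-buffer sizing rule in tcp_mss_from_peer() (round the buffer up to a whole number of segments, cap it at sb_max, and shrink the MSS when the buffer is smaller than one segment) can be sketched in isolation. The struct and function names are hypothetical, and the limit of 262144 used in the examples is an assumed value for sb_max, not one taken from this file.

```c
struct sk_sndsize {
        unsigned long mss;      /* segment size to use */
        unsigned long bufsize;  /* send buffer size to reserve */
};

/*
 * Make the send buffer an integral number of MSS units, capped at
 * sb_max.  If the buffer cannot hold one segment, shrink the MSS to
 * the buffer size and leave the buffer alone.
 */
static struct sk_sndsize
sketch_size_sndbuf(unsigned long bufsize, unsigned long mss,
    unsigned long sb_max)
{
        struct sk_sndsize r;

        if (bufsize < mss) {
                mss = bufsize;
        } else {
                /* same effect as the kernel's roundup(bufsize, mss) */
                bufsize = ((bufsize + mss - 1) / mss) * mss;
                if (bufsize > sb_max)
                        bufsize = sb_max;
        }
        r.mss = mss;
        r.bufsize = bufsize;
        return r;
}
```

A 32768-byte buffer with a 1460-byte MSS is rounded up to 23 segments (33580 bytes). A 1000-byte buffer instead forces the MSS down to 1000.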
 1974 
 1975 /*
 1976  * Processing necessary when a TCP connection is established.
 1977  */
 1978 void
 1979 tcp_established(struct tcpcb *tp)
 1980 {
 1981         struct socket *so;
 1982 #ifdef RTV_RPIPE
 1983         struct rtentry *rt;
 1984 #endif
 1985         u_long bufsize;
 1986 
 1987 #ifdef DIAGNOSTIC
 1988         if (tp->t_inpcb && tp->t_in6pcb)
 1989                 panic("tcp_established: both t_inpcb and t_in6pcb are set");
 1990 #endif
 1991         so = NULL;
 1992         rt = NULL;
 1993 #ifdef INET
 1994         if (tp->t_inpcb) {
 1995                 so = tp->t_inpcb->inp_socket;
 1996 #if defined(RTV_RPIPE)
 1997                 rt = in_pcbrtentry(tp->t_inpcb);
 1998 #endif
 1999         }
 2000 #endif
 2001 #ifdef INET6
 2002         if (tp->t_in6pcb) {
 2003                 so = tp->t_in6pcb->in6p_socket;
 2004 #if defined(RTV_RPIPE)
 2005                 rt = in6_pcbrtentry(tp->t_in6pcb);
 2006 #endif
 2007         }
 2008 #endif
 2009 
 2010         tp->t_state = TCPS_ESTABLISHED;
 2011         TCP_TIMER_ARM(tp, TCPT_KEEP, tcp_keepidle);
 2012 
 2013 #ifdef RTV_RPIPE
 2014         if (rt != NULL && rt->rt_rmx.rmx_recvpipe != 0)
 2015                 bufsize = rt->rt_rmx.rmx_recvpipe;
 2016         else
 2017 #endif
 2018         {
 2019                 KASSERT(so != NULL);
 2020                 bufsize = so->so_rcv.sb_hiwat;
 2021         }
 2022         if (bufsize > tp->t_ourmss) {
 2023                 bufsize = roundup(bufsize, tp->t_ourmss);
 2024                 if (bufsize > sb_max)
 2025                         bufsize = sb_max;
 2026                 (void) sbreserve(&so->so_rcv, bufsize, so);
 2027         }
 2028 }
 2029 
 2030 /*
 2031  * Check if there's an initial rtt or rttvar.  Convert from the
 2032  * route-table units to scaled multiples of the slow timeout timer.
 2033  * Called only during the 3-way handshake.
 2034  */
 2035 void
 2036 tcp_rmx_rtt(struct tcpcb *tp)
 2037 {
 2038 #ifdef RTV_RTT
 2039         struct rtentry *rt = NULL;
 2040         int rtt;
 2041 
 2042 #ifdef DIAGNOSTIC
 2043         if (tp->t_inpcb && tp->t_in6pcb)
 2044                 panic("tcp_rmx_rtt: both t_inpcb and t_in6pcb are set");
 2045 #endif
 2046 #ifdef INET
 2047         if (tp->t_inpcb)
 2048                 rt = in_pcbrtentry(tp->t_inpcb);
 2049 #endif
 2050 #ifdef INET6
 2051         if (tp->t_in6pcb)
 2052                 rt = in6_pcbrtentry(tp->t_in6pcb);
 2053 #endif
 2054         if (rt == NULL)
 2055                 return;
 2056 
 2057         if (tp->t_srtt == 0 && (rtt = rt->rt_rmx.rmx_rtt)) {
 2058                 /*
 2059                  * XXX The lock bit for RTT indicates that the value
 2060                  * is also a minimum value; this is subject to time.
 2061                  */
 2062                 if (rt->rt_rmx.rmx_locks & RTV_RTT)
 2063                         TCPT_RANGESET(tp->t_rttmin,
 2064                             rtt / (RTM_RTTUNIT / PR_SLOWHZ),
 2065                             TCPTV_MIN, TCPTV_REXMTMAX);
 2066                 tp->t_srtt = rtt /
 2067                     ((RTM_RTTUNIT / PR_SLOWHZ) >> (TCP_RTT_SHIFT + 2));
 2068                 if (rt->rt_rmx.rmx_rttvar) {
 2069                         tp->t_rttvar = rt->rt_rmx.rmx_rttvar /
 2070                             ((RTM_RTTUNIT / PR_SLOWHZ) >>
 2071                                 (TCP_RTTVAR_SHIFT + 2));
 2072                 } else {
 2073                         /* Default variation is +- 1 rtt */
 2074                         tp->t_rttvar =
 2075                             tp->t_srtt >> (TCP_RTT_SHIFT - TCP_RTTVAR_SHIFT);
 2076                 }
 2077                 TCPT_RANGESET(tp->t_rxtcur,
 2078                     ((tp->t_srtt >> 2) + tp->t_rttvar) >> (1 + 2),
 2079                     tp->t_rttmin, TCPTV_REXMTMAX);
 2080         }
 2081 #endif
 2082 }
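
The unit conversion in tcp_rmx_rtt() is easier to follow with concrete numbers. The sketch below assumes the usual BSD constants: RTM_RTTUNIT = 1000000 (route metrics in microseconds), PR_SLOWHZ = 2 (slow timer ticks per second), TCP_RTT_SHIFT = 3 and TCP_RTTVAR_SHIFT = 2. These values are assumptions and should be checked against the headers of the tree being read; the function names are hypothetical.

```c
/* Assumed constants, see the note above. */
enum {
        SK_RTM_RTTUNIT = 1000000,
        SK_PR_SLOWHZ = 2,
        SK_TCP_RTT_SHIFT = 3,
        SK_TCP_RTTVAR_SHIFT = 2
};

/* Convert a route-metric RTT into the scaled t_srtt representation. */
static int
sketch_srtt_from_rmx(int rmx_rtt)
{
        return rmx_rtt /
            ((SK_RTM_RTTUNIT / SK_PR_SLOWHZ) >> (SK_TCP_RTT_SHIFT + 2));
}

/* Default variance when the route carries none: +- 1 rtt. */
static int
sketch_default_rttvar(int srtt)
{
        return srtt >> (SK_TCP_RTT_SHIFT - SK_TCP_RTTVAR_SHIFT);
}
```

Under these assumptions a route RTT of one second (1000000) becomes t_srtt = 64, that is, two slow-timer ticks scaled by 2^(TCP_RTT_SHIFT + 2) = 32, and the default t_rttvar is 32.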
 2083 
 2084 tcp_seq  tcp_iss_seq = 0;       /* tcp initial seq # */
 2085 #if NRND > 0
 2086 u_int8_t tcp_iss_secret[16];    /* 128 bits; should be plenty */
 2087 #endif
 2088 
 2089 /*
 2090  * Get a new sequence value given a tcp control block
 2091  */
 2092 tcp_seq
 2093 tcp_new_iss(struct tcpcb *tp, tcp_seq addin)
 2094 {
 2095 
 2096 #ifdef INET
 2097         if (tp->t_inpcb != NULL) {
 2098                 return (tcp_new_iss1(&tp->t_inpcb->inp_laddr,
 2099                     &tp->t_inpcb->inp_faddr, tp->t_inpcb->inp_lport,
 2100                     tp->t_inpcb->inp_fport, sizeof(tp->t_inpcb->inp_laddr),
 2101                     addin));
 2102         }
 2103 #endif
 2104 #ifdef INET6
 2105         if (tp->t_in6pcb != NULL) {
 2106                 return (tcp_new_iss1(&tp->t_in6pcb->in6p_laddr,
 2107                     &tp->t_in6pcb->in6p_faddr, tp->t_in6pcb->in6p_lport,
 2108                     tp->t_in6pcb->in6p_fport, sizeof(tp->t_in6pcb->in6p_laddr),
 2109                     addin));
 2110         }
 2111 #endif
 2112         /* Not possible. */
 2113         panic("tcp_new_iss");
 2114 }
 2115 
 2116 /*
 2117  * This routine actually generates a new TCP initial sequence number.
 2118  */
 2119 tcp_seq
 2120 tcp_new_iss1(void *laddr, void *faddr, u_int16_t lport, u_int16_t fport,
 2121     size_t addrsz, tcp_seq addin)
 2122 {
 2123         tcp_seq tcp_iss;
 2124 
 2125 #if NRND > 0
 2126         static int beenhere;
 2127 
 2128         /*
 2129          * If we haven't been here before, initialize our cryptographic
 2130          * hash secret.
 2131          */
 2132         if (beenhere == 0) {
 2133                 rnd_extract_data(tcp_iss_secret, sizeof(tcp_iss_secret),
 2134                     RND_EXTRACT_ANY);
 2135                 beenhere = 1;
 2136         }
 2137 
 2138         if (tcp_do_rfc1948) {
 2139                 MD5_CTX ctx;
 2140                 u_int8_t hash[16];      /* XXX MD5 knowledge */
 2141 
 2142                 /*
 2143                  * Compute the base value of the ISS.  It is a hash
 2144                  * of (saddr, sport, daddr, dport, secret).
 2145                  */
 2146                 MD5Init(&ctx);
 2147 
 2148                 MD5Update(&ctx, (u_char *) laddr, addrsz);
 2149                 MD5Update(&ctx, (u_char *) &lport, sizeof(lport));
 2150 
 2151                 MD5Update(&ctx, (u_char *) faddr, addrsz);
 2152                 MD5Update(&ctx, (u_char *) &fport, sizeof(fport));
 2153 
 2154                 MD5Update(&ctx, tcp_iss_secret, sizeof(tcp_iss_secret));
 2155 
 2156                 MD5Final(hash, &ctx);
 2157 
 2158                 memcpy(&tcp_iss, hash, sizeof(tcp_iss));
 2159 
 2160                 /*
 2161                  * Now increment our "timer", and add it in to
 2162                  * the computed value.
 2163                  *
 2164                  * XXX Use `addin'?
 2165                  * XXX TCP_ISSINCR too large to use?
 2166                  */
 2167                 tcp_iss_seq += TCP_ISSINCR;
 2168 #ifdef TCPISS_DEBUG
 2169                 printf("ISS hash 0x%08x, ", tcp_iss);
 2170 #endif
 2171                 tcp_iss += tcp_iss_seq + addin;
 2172 #ifdef TCPISS_DEBUG
 2173                 printf("new ISS 0x%08x\n", tcp_iss);
 2174 #endif
 2175         } else
 2176 #endif /* NRND > 0 */
 2177         {
 2178                 /*
 2179                  * Randomize.
 2180                  */
 2181 #if NRND > 0
 2182                 rnd_extract_data(&tcp_iss, sizeof(tcp_iss), RND_EXTRACT_ANY);
 2183 #else
 2184                 tcp_iss = arc4random();
 2185 #endif
 2186 
 2187                 /*
 2188                  * If we were asked to add some amount to a known value,
 2189                  * we will take a random value obtained above, mask off
 2190                  * the upper bits, and add in the known value.  We also
 2191                  * add in a constant to ensure that we are at least a
 2192                  * certain distance from the original value.
 2193                  *
 2194                  * This is used when an old connection is in timed wait
 2195                  * and we have a new one coming in, for instance.
 2196                  */
 2197                 if (addin != 0) {
 2198 #ifdef TCPISS_DEBUG
 2199                         printf("Random %08x, ", tcp_iss);
 2200 #endif
 2201                         tcp_iss &= TCP_ISS_RANDOM_MASK;
 2202                         tcp_iss += addin + TCP_ISSINCR;
 2203 #ifdef TCPISS_DEBUG
 2204                         printf("Old ISS %08x, ISS %08x\n", addin, tcp_iss);
 2205 #endif
 2206                 } else {
 2207                         tcp_iss &= TCP_ISS_RANDOM_MASK;
 2208                         tcp_iss += tcp_iss_seq;
 2209                         tcp_iss_seq += TCP_ISSINCR;
 2210 #ifdef TCPISS_DEBUG
 2211                         printf("ISS %08x\n", tcp_iss);
 2212 #endif
 2213                 }
 2214         }
 2215 
 2216         if (tcp_compat_42) {
 2217                 /*
 2218                  * Limit it to the positive range for really old TCP
 2219                  * implementations.
 2220                  * Just AND off the top bit instead of checking if
 2221                  * it is set first - saves a branch 50% of the time.
 2222                  */
 2223                 tcp_iss &= 0x7fffffff;          /* XXX */
 2224         }
 2225 
 2226         return (tcp_iss);
 2227 }
 2228 
 2229 #if defined(IPSEC) || defined(FAST_IPSEC)
 2230 /* compute ESP/AH header size for TCP, including outer IP header. */
 2231 size_t
 2232 ipsec4_hdrsiz_tcp(struct tcpcb *tp)
 2233 {
 2234         struct inpcb *inp;
 2235         size_t hdrsiz;
 2236 
 2237         /* XXX mapped addr case (tp->t_in6pcb) */
 2238         if (!tp || !tp->t_template || !(inp = tp->t_inpcb))
 2239                 return 0;
 2240         switch (tp->t_family) {
 2241         case AF_INET:
 2242                 /* XXX: should use correct direction. */
 2243                 hdrsiz = ipsec4_hdrsiz(tp->t_template, IPSEC_DIR_OUTBOUND, inp);
 2244                 break;
 2245         default:
 2246                 hdrsiz = 0;
 2247                 break;
 2248         }
 2249 
 2250         return hdrsiz;
 2251 }
 2252 
 2253 #ifdef INET6
 2254 size_t
 2255 ipsec6_hdrsiz_tcp(struct tcpcb *tp)
 2256 {
 2257         struct in6pcb *in6p;
 2258         size_t hdrsiz;
 2259 
 2260         if (!tp || !tp->t_template || !(in6p = tp->t_in6pcb))
 2261                 return 0;
 2262         switch (tp->t_family) {
 2263         case AF_INET6:
 2264                 /* XXX: should use correct direction. */
 2265                 hdrsiz = ipsec6_hdrsiz(tp->t_template, IPSEC_DIR_OUTBOUND, in6p);
 2266                 break;
 2267         case AF_INET:
 2268                 /* mapped address case - tricky */
 2269         default:
 2270                 hdrsiz = 0;
 2271                 break;
 2272         }
 2273 
 2274         return hdrsiz;
 2275 }
 2276 #endif
 2277 #endif /*IPSEC*/
 2278 
 2279 /*
 2280  * Determine the length of the TCP options for this connection.
 2281  *
 2282  * XXX:  What do we do for SACK, when we add that?  Just reserve
 2283  *       all of the space?  Otherwise we can't exactly be incrementing
 2284  *       cwnd by an amount that varies depending on the amount we last
 2285  *       had to SACK!
 2286  */
 2287 
 2288 u_int
 2289 tcp_optlen(struct tcpcb *tp)
 2290 {
 2291         u_int optlen;
 2292 
 2293         optlen = 0;
 2294         if ((tp->t_flags & (TF_REQ_TSTMP|TF_RCVD_TSTMP|TF_NOOPT)) ==
 2295             (TF_REQ_TSTMP | TF_RCVD_TSTMP))
 2296                 optlen += TCPOLEN_TSTAMP_APPA;
 2297 
 2298 #ifdef TCP_SIGNATURE
 2299         if (tp->t_flags & TF_SIGNATURE)
 2300                 optlen += TCPOLEN_SIGNATURE + 2;
 2301 #endif /* TCP_SIGNATURE */
 2302 
 2303         return optlen;
 2304 }
 2305 
 2306 u_int
 2307 tcp_hdrsz(struct tcpcb *tp)
 2308 {
 2309         u_int hlen;
 2310 
 2311         switch (tp->t_family) {
 2312 #ifdef INET6
 2313         case AF_INET6:
 2314                 hlen = sizeof(struct ip6_hdr);
 2315                 break;
 2316 #endif
 2317         case AF_INET:
 2318                 hlen = sizeof(struct ip);
 2319                 break;
 2320         default:
 2321                 hlen = 0;
 2322                 break;
 2323         }
 2324         hlen += sizeof(struct tcphdr);
 2325 
 2326         if ((tp->t_flags & (TF_REQ_TSTMP|TF_NOOPT)) == TF_REQ_TSTMP &&
 2327             (tp->t_flags & TF_RCVD_TSTMP) == TF_RCVD_TSTMP)
 2328                 hlen += TCPOLEN_TSTAMP_APPA;
 2329 #ifdef TCP_SIGNATURE
 2330         if (tp->t_flags & TF_SIGNATURE)
 2331                 hlen += TCPOLEN_SIGLEN;
 2332 #endif
 2333         return hlen;
 2334 }
