TCP(4) BSD Programmer's Manual TCP(4)
NAME
tcp - Internet Transmission Control Protocol
SYNOPSIS
#include <sys/socket.h>
#include <netinet/in.h>
int
socket(AF_INET, SOCK_STREAM, 0);
DESCRIPTION
The TCP protocol provides reliable, flow-controlled, two-way transmission
of data. It is a byte-stream protocol used to support the SOCK_STREAM
abstraction. TCP uses the standard Internet address format and, in addi-
tion, provides a per-host collection of ``port addresses''. Thus, each
address is composed of an Internet address specifying the host and net-
work, with a specific TCP port on the host identifying the peer entity.
Sockets utilizing the tcp protocol are either ``active'' or ``passive''.
Active sockets initiate connections to passive sockets. By default TCP
sockets are created active; to create a passive socket the listen(2) sys-
tem call must be used after binding the socket with the bind(2) system
call. Only passive sockets may use the accept(2) call to accept incoming
connections. Only active sockets may use the connect(2) call to initiate
connections.
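The distinction between passive and active sockets can be sketched in C
as follows. This is a minimal illustration rather than part of the
original manual: the helper names make_passive() and make_active() are
hypothetical, the loopback address is assumed for both ends, and error
handling is reduced to returning -1.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a passive socket on the loopback address.
 * If *port is 0 the system chooses a port; the port in use is stored
 * back into *port (host byte order).
 */
int
make_passive(unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(*port);
	/* bind(2) followed by listen(2) makes the socket passive. */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(s, 5) < 0 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
		close(s);
		return (-1);
	}
	*port = ntohs(sin.sin_port);
	return (s);
}

/*
 * Hypothetical helper: create an active socket and connect it to the
 * given port on the loopback address.
 */
int
make_active(unsigned short port)
{
	struct sockaddr_in sin;
	int s;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	sin.sin_port = htons(port);
	/* A socket that was never passed to listen(2) stays active. */
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		close(s);
		return (-1);
	}
	return (s);
}
```

A server would call make_passive() once and then accept(2) on the
returned descriptor; a client would call make_active() with the port
the server reported.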
Passive sockets may ``underspecify'' their location to match incoming
connection requests from multiple networks. This technique, termed
``wildcard addressing'', allows a single server to provide service to
clients on multiple networks. To create a socket which listens on all
networks, the Internet address INADDR_ANY must be bound. The TCP port
may still be specified at this time; if the port is not specified the
system will assign one. Once a connection has been established, the
socket's address is fixed by the peer entity's location. The address
assigned to the socket is the address associated with the network
interface through which packets are being transmitted and received.
Normally this address corresponds to the peer entity's network.
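Wildcard addressing can be illustrated with a short C sketch. The helper
names listen_wildcard() and local_address() are hypothetical and not
part of any system interface; the sketch assumes an IPv4 socket and
leaves the port to the system when 0 is passed in.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a passive socket bound to INADDR_ANY.
 * If *port is 0 the system assigns a port, which is stored in *port
 * (host byte order).
 */
int
listen_wildcard(unsigned short *port)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);
	int s;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
		return (-1);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);	/* all networks */
	sin.sin_port = htons(*port);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
	    listen(s, 5) < 0 ||
	    getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
		close(s);
		return (-1);
	}
	*port = ntohs(sin.sin_port);
	return (s);
}

/*
 * Hypothetical helper: return the local Internet address of a socket
 * in host byte order, or INADDR_NONE if getsockname(2) fails.
 */
unsigned long
local_address(int s)
{
	struct sockaddr_in sin;
	socklen_t len = sizeof(sin);

	memset(&sin, 0, sizeof(sin));
	if (getsockname(s, (struct sockaddr *)&sin, &len) < 0)
		return (INADDR_NONE);
	return (ntohl(sin.sin_addr.s_addr));
}
```

Calling local_address() on the listening socket is expected to report
INADDR_ANY, while calling it on a descriptor returned by accept(2)
reports the specific address on which that connection arrived.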
TCP supports several socket options which are set with setsockopt(2) and
tested with getsockopt(2). They are all defined in <netinet/tcp.h>.
TCP_NODELAY Under most circumstances, TCP sends data when it is pre-
sented; when outstanding data has not yet been acknowl-
edged, it gathers small amounts of output to be sent in a
single packet once an acknowledgement is received. For a
small number of clients, such as window systems that send
a stream of mouse events which receive no replies, this
packetization may cause significant delays. Therefore,
TCP provides a boolean option, TCP_NODELAY, to defeat this
algorithm.
TCP_MAXSEG TCP maintains the maximum size packet that can be sent for
each connection. With getsockopt(2), the TCP_MAXSEG op-
tion can be used to get the current value. Using setsock-
opt(2) it can be set to a smaller, but not larger, value.
TCP_STDURG There is an error in the TCP specification which incor-
rectly states that the TCP Urgent pointer points to the
first byte of non-urgent data, and that is how BSD/OS has
traditionally implemented it. When the boolean option
TCP_STDURG is set, TCP will set the Urgent pointer to the
last byte of urgent data.
TCP_WINSHIFT TCP has a 64k byte window, but through the TCP Window
Scale option the window can be shifted to allow larger
values. The TCP_WINSHIFT option can be used to set the
number of bits by which the window should be shifted.
When set to 0 (the default value), the kernel will auto-
matically determine the correct value to send in the Win-
dow Scale option by looking at the size of the socket re-
ceive buffer, and choosing the smallest value that will
still allow the entire receive buffer to be advertised.
A value of -1 will disable both the TCP Window Scale and
Timestamps options.
Values of 1 through 14 will set the shift value to be used
in the Window Scale option, regardless of the size of the
socket receive buffer. This is useful when the initial
socket receive buffer is small (less than 64K bytes), but
may at some later point be increased to a larger value
(more than 64K bytes).
Since the Window Scale option is only sent when a TCP con-
nection is first being established, the TCP_WINSHIFT op-
tion must be used before issuing a connect(2) or accept(2)
call.
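The automatic choice described for a TCP_WINSHIFT value of 0 can be
approximated by the following C sketch. It is not the kernel's code: the
function name winshift_for_buffer() and the two constants are
hypothetical, and the sketch simply assumes that an unscaled window
advertises at most 65535 bytes and that 14 is the largest shift.

```c
#define UNSCALED_MAX_WINDOW	65535UL	/* largest window without scaling */
#define MAX_WINDOW_SHIFT	14	/* largest shift value listed above */

/*
 * Hypothetical helper: return the smallest shift that still allows a
 * receive buffer of rcvbuf bytes to be advertised in full.
 */
int
winshift_for_buffer(unsigned long rcvbuf)
{
	int shift = 0;

	while (shift < MAX_WINDOW_SHIFT &&
	    (UNSCALED_MAX_WINDOW << shift) < rcvbuf)
		shift++;
	return (shift);
}
```

Under these assumptions a buffer of 65535 bytes or less needs no shift,
a buffer of 65536 bytes needs a shift of 1, and a 1 megabyte buffer
needs a shift of 5.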
The option level for the setsockopt(2) call is the protocol number for
TCP, available from getprotobyname(3).
Options at the IP transport level may be used with TCP; see ip(4). In-
coming connection requests that are source-routed are noted, and the re-
verse source route is used in responding.
SYSCTL VARIABLES
Some TCP options can be read or written via the sysctl(3) facility.
Variables specific to TCP are:
CTL_NET, PF_INET, IPPROTO_TCP
These variables are used to get or set various global TCP options.
Fourth level name          Type      Changeable
TCPCTL_43MAXSEG            integer   yes
TCPCTL_CONNTIMEO           integer   yes
TCPCTL_DO_RFC1323          integer   yes
TCPCTL_IFP_MAXSEG          integer   yes
TCPCTL_KEEPCNT             integer   yes
TCPCTL_KEEPIDLE            integer   yes
TCPCTL_KEEPINTVL           integer   yes
TCPCTL_MAXPERSISTIDLE      integer   yes
TCPCTL_MSSDFLT             integer   yes
TCPCTL_PMTU                integer   yes
TCPCTL_PMTU_EXPIRE         integer   yes
TCPCTL_PMTU_PROBE          integer   yes
TCPCTL_PMTU_VALID_LEVEL    integer   yes
TCPCTL_RECVSPACE           integer   yes
TCPCTL_SENDSPACE           integer   yes
TCPCTL_STATS               struct    no
TCPCTL_SYN_BUCKET_LIMIT    integer   yes
TCPCTL_SYN_CACHE_INTER     integer   yes
TCPCTL_SYN_CACHE_LIMIT     integer   yes
TCPCTL_43MAXSEG
If set, TCP limits the value sent in maximum-segment-size options
to the value that would be used by 4.3BSD (see TCPCTL_MSSDFLT).
Otherwise, TCP uses the largest MTU for a local network inter-
face, and utilizes Path MTU Discovery to determine the largest
packet size suitable for the path in use.
TCPCTL_CONNTIMEO
Returns the length of time (in seconds) before an attempt at es-
tablishing a connection fails.
TCPCTL_DO_RFC1323
Returns non-zero when negotiation of the TCP Timestamps and Win-
dow Scaling options is enabled, zero when negotiation is not en-
abled. Normally the use of these TCP options is negotiated for
each connection, thus this value should normally be non-zero.
Setting it to zero disables negotiation of the options, which
might be needed if peer systems or routers cause problems when
the options are negotiated.
TCPCTL_IFP_MAXSEG
If set, use the MTU of the interface associated with the route to
the destination for filling in the MSS option, rather than using
the maximum MTU across all configured interfaces.
TCPCTL_KEEPCNT
Returns the maximum number of keepalive probes sent with no re-
sponse before a connection is timed out (when keepalives are en-
abled for the socket).
TCPCTL_KEEPIDLE
Returns the number of seconds of inactivity required before a TCP
connection is considered ``idle'', at which time keepalive probes
begin if they are enabled for the socket.
TCPCTL_KEEPINTVL
Returns the interval between keepalive probes, in seconds.
TCPCTL_MAXPERSISTIDLE
Returns the maximum amount of time that a TCP connection attempts
to deliver data if flow control prevents data from being sent and
the peer host does not respond.
TCPCTL_MSSDFLT
The default TCP segment size used for remote networks when the
routing table does not provide a value. Use of a value larger
than 536 may violate Internet standards unless Path MTU Discovery
is enabled.
TCPCTL_PMTU
If set, enables the use of TCP Path MTU Discovery.
TCPCTL_PMTU_EXPIRE
TCPCTL_PMTU_PROBE
The number of seconds that TCP will use the current segment size
before probing whether a larger size could be used (if the cur-
rent segment size is smaller than could be supported by the out-
going network interface). TCPCTL_PMTU_EXPIRE controls how many
seconds to wait after an unsuccessful probe for a larger size,
and TCPCTL_PMTU_PROBE controls how many seconds to wait after a
successful probe for a larger size.
TCPCTL_PMTU_VALID_LEVEL
This controls how "ICMP Destination Unreachable - Fragmentation Needed"
messages are verified before they are used to modify route MTUs.
This should normally be left at its default value of 3. By set-
ting it to 2, TCP Sequence numbers in the returned IP/TCP header
will not be verified. A value of 1 will also skip verifying that
the returned IP/TCP header is associated with a valid TCP connec-
tion. A value of 0 will also skip verifying that the returned
header is a TCP packet, i.e. it disables all checks.
TCPCTL_RECVSPACE
Returns the default value (in bytes) of buffering available per
socket for received TCP data. This value may be changed on a
per-socket basis by manipulating the SO_RCVBUF parameter with
setsockopt(2).
TCPCTL_SENDSPACE
Returns the default value (in bytes) of buffering available per
socket for transmitted TCP data. This value may be changed on a
per-socket basis by manipulating the SO_SNDBUF parameter with
setsockopt(2).
TCPCTL_STATS
Returns the struct tcpstat containing statistics about the TCP
layer.
TCPCTL_SYN_CACHE_LIMIT
Returns the upper limit on the total number of entries that can
be kept in the TCP SYN cache. When set to 0, it disables the TCP
SYN cache.
TCPCTL_SYN_BUCKET_LIMIT
Returns the upper limit on the number of entries that can be kept
in any single bucket of the TCP SYN cache.
TCPCTL_SYN_CACHE_INTER
Returns the number of half seconds between scans of the TCP SYN
cache.
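The three keepalive variables combine into a rough bound on how long a
dead peer can go unnoticed. The C sketch below is only an estimate
derived from the descriptions of TCPCTL_KEEPIDLE, TCPCTL_KEEPINTVL and
TCPCTL_KEEPCNT above; the function name is hypothetical, and the exact
moment at which the kernel drops the connection is an assumption.

```c
/*
 * Hypothetical helper: estimate, in seconds, how long an unresponsive
 * connection may persist when keepalives are enabled. The connection
 * must first be idle for keepidle seconds; the estimate then allows one
 * keepintvl interval for each of the keepcnt unanswered probes.
 */
long
keepalive_drop_estimate(long keepidle, long keepintvl, long keepcnt)
{
	return (keepidle + keepcnt * keepintvl);
}
```

With illustrative values of 7200 seconds idle, 75 seconds between
probes, and 8 probes (not necessarily the system defaults), the estimate
is 7800 seconds.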
DIAGNOSTICS
A socket operation may fail with one of the following errors returned:
[EISCONN] when trying to establish a connection on a socket which
already has one;
[ENOBUFS] when the system runs out of memory for an internal data
structure;
[ETIMEDOUT] when a connection was dropped due to excessive retrans-
missions;
[ECONNRESET] when the remote peer forces the connection to be closed;
[ECONNREFUSED] when the remote peer actively refuses connection estab-
lishment (usually because no process is listening to the
port);
[EADDRINUSE] when an attempt is made to create a socket with a port
which has already been allocated;
[EADDRNOTAVAIL] when an attempt is made to create a socket with a net-
work address for which no network interface exists.
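Several of these errors can be provoked deliberately, which is useful
when testing error handling. The C sketch below wraps bind(2) and
connect(2) so that each returns 0 or the errno value; the helper names
try_bind() and try_connect() are hypothetical, and addresses are given
as dotted-quad strings.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Fill in an Internet address; port is in host byte order. */
static void
fill_addr(struct sockaddr_in *sin, const char *addr, unsigned short port)
{
	memset(sin, 0, sizeof(*sin));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = inet_addr(addr);
	sin->sin_port = htons(port);
}

/* Hypothetical helper: bind s; return 0 on success, else errno. */
int
try_bind(int s, const char *addr, unsigned short port)
{
	struct sockaddr_in sin;

	fill_addr(&sin, addr, port);
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		return (errno);
	return (0);
}

/* Hypothetical helper: connect s; return 0 on success, else errno. */
int
try_connect(int s, const char *addr, unsigned short port)
{
	struct sockaddr_in sin;

	fill_addr(&sin, addr, port);
	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) < 0)
		return (errno);
	return (0);
}
```

Binding a second socket to an address and port already bound by another
socket is expected to yield EADDRINUSE, connecting to a port on which
nothing is listening ECONNREFUSED, and calling try_connect() twice on
the same socket EISCONN.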
SEE ALSO
getsockopt(2), socket(2), intro(4), inet(4), ip(4)
HISTORY
The tcp protocol stack appeared in 4.2BSD.
4.2 Berkeley Distribution June 5, 1993