The tcp provider provides probes for tracing the TCP protocol.
This provider was integrated into Solaris Nevada build 142.
The tcp probes are described in the table below.
|state-change||Probe that fires when a TCP session changes its TCP state. The previous state is noted in the tcplsinfo_t * probe argument. The tcpinfo_t * and ipinfo_t * arguments are NULL.|
|send||Probe that fires whenever TCP sends a segment (either control or data).|
|receive||Probe that fires whenever TCP receives a segment (either control or data).|
|connect-request||Probe that fires when a TCP active open is initiated by sending an initial SYN segment. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the initial SYN segment sent.|
|connect-established||Probe that fires when an active-open connection is established, in either of two cases. (1) A TCP active OPEN succeeds: the initial SYN has been sent and a valid SYN,ACK segment has been received in response. TCP enters the ESTABLISHED state, and the tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers of the SYN,ACK segment received. (2) A simultaneous active OPEN succeeds and a final ACK is received from the peer TCP. TCP enters the ESTABLISHED state, and the tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers of the final ACK received. In both cases, the segment presented via the tcpinfo_t * is the one that triggers the transition to ESTABLISHED: the received SYN,ACK in the first case and the final ACK in the second. This contrasts with tcp:::accept-established, which fires on passive connection establishment.|
|connect-refused||A TCP active OPEN connection attempt was refused by the peer - a RST segment was received in acknowledgment of the initial SYN. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the RST,ACK segment received.|
|accept-established||A passive open has succeeded - an initial active OPEN initiation SYN has been received, TCP responded with a SYN,ACK and a final ACK has been received. TCP has entered the ESTABLISHED state. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the final ACK segment received.|
|accept-refused||An incoming SYN has arrived for a destination port with no listening connection, so the connection initiation request is rejected by sending a RST segment ACKing the SYN. The tcpinfo_t * and ipinfo_t * probe arguments represent the TCP and IP headers associated with the RST segment sent.|
The send and receive probes trace packets on physical interfaces and also packets on loopback interfaces that are processed by tcp. On Solaris, loopback TCP connections can bypass the TCP layer when transferring data packets - this is a performance feature called tcp fusion; these packets are also traced by the tcp provider.
The argument types for the tcp probes are listed in the table below. The arguments are described in the following section. All probes except state-change have five arguments; state-change has six.
|Probe||args[0]||args[1]||args[2]||args[3]||args[4]||args[5]|
|state-change||null||csinfo_t *||null||tcpsinfo_t *||null||tcplsinfo_t *|
|send||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
|receive||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
|connect-request||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
|connect-established||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
|connect-refused||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
|accept-established||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
|accept-refused||pktinfo_t *||csinfo_t *||ipinfo_t *||tcpsinfo_t *||tcpinfo_t *|
The pktinfo_t structure is where packet ID info can be made available for deeper analysis if packet IDs become supported by the kernel in the future.
The pkt_addr member is currently always NULL.
The csinfo_t structure is where connection state info is made available. It contains a unique (system-wide) connection ID, and the process ID and zone ID associated with the connection.
|cs_addr||Address of translated ip_xmit_attr_t *.|
|cs_cid||Connection id. A unique per-connection identifier which identifies the connection during its lifetime.|
|cs_pid||Process ID associated with the connection.|
|cs_zoneid||Zone ID associated with the connection.|
The ipinfo_t structure contains common IP info for both IPv4 and IPv6.
These values are read at the time the probe fired in TCP, and so ip_plength is the expected IP payload length - however the IP layer may add headers (such as AH and ESP) which will increase the actual payload length. To examine this, also trace packets using the ip provider.
|ip_ver||IP version number. Currently either 4 or 6.|
|ip_plength||Payload length in bytes. This is the length of the packet at the time of tracing, excluding the IP header.|
|ip_saddr||Source IP address, as a string. For IPv4 this is a dotted decimal quad, IPv6 follows RFC-1884 convention 2 with lower case hexadecimal digits.|
|ip_daddr||Destination IP address, as a string. For IPv4 this is a dotted decimal quad, IPv6 follows RFC-1884 convention 2 with lower case hexadecimal digits.|
The tcpsinfo_t structure contains tcp state info.
It may seem redundant to supply the local and remote ports and addresses here as well as in the tcpinfo_t below, but the tcp:::state-change probes do not have associated tcpinfo_t data, so in order to map the state change to a specific port, we need this data here.
|tcps_addr||Address of translated tcp_t *.|
|tcps_local||Is local, boolean. 0: is not delivered locally (uses a physical network interface). 1: is delivered locally (including loopback interfaces, e.g. lo0).|
|tcps_active||is an active open, boolean. 0: TCP connection was created from a remote host, 1: TCP connection was created from this host.|
|tcps_lport||local port associated with the TCP connection.|
|tcps_rport||remote port associated with the TCP connection.|
|tcps_laddr||local address associated with the TCP connection, as a string.|
|tcps_raddr||remote address associated with the TCP connection, as a string.|
|tcps_state||TCP state. Inline definitions are provided for the various TCP states: TCP_STATE_CLOSED, TCP_STATE_SYN_SENT, etc. Use inline tcp_state_string to convert state to a string.|
|tcps_iss||Initial sequence number sent.|
|tcps_suna||Lowest sequence number for which we have sent data but not received acknowledgement.|
|tcps_snxt||Next sequence number to send. tcps_snxt - tcps_suna gives the number of bytes pending acknowledgement for the TCP connection.|
|tcps_rack||Highest sequence number for which we have received and sent acknowledgement.|
|tcps_rnxt||Next sequence number expected on receive side. tcps_rnxt - tcps_rack gives the number of bytes we have received but not yet acknowledged for the TCP connection.|
|tcps_swnd||TCP send window size.|
|tcps_snd_ws||TCP send window scale. tcps_swnd << tcps_snd_ws gives the scaled window size if window scaling options are in use.|
|tcps_rwnd||TCP receive window size.|
|tcps_rcv_ws||TCP receive window scale. tcps_rwnd << tcps_rcv_ws gives the scaled window size if window scaling options are in use.|
|tcps_cwnd||TCP congestion window size.|
|tcps_cwnd_ssthresh||TCP congestion window threshold. When the congestion window is greater than ssthresh, congestion avoidance begins.|
|tcps_sack_fack||Highest SACK-acked sequence number.|
|tcps_sack_snxt||Next sequence num to be retransmitted using SACK.|
|tcps_rto||Retransmission timeout. If we do not receive acknowledgement of data sent tcps_rto msec ago, retransmit is required.|
|tcps_mss||Maximum segment size.|
|tcps_retransmit||Send is a retransmit, boolean. 1 for tcp:::send events that are retransmissions; 0 for all other events, including send events that are not retransmissions.|
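As an illustration of how the tcpsinfo_t members can be combined, the following is a minimal sketch rather than a listing from the original text. It assumes the members behave as described in the table above, and the aggregation names are arbitrary:

```dtrace
#!/usr/sbin/dtrace -s

tcp:::send
{
	/* Bytes sent but not yet acknowledged, distributed per remote address. */
	@unacked[args[3]->tcps_raddr] =
	    quantize(args[3]->tcps_snxt - args[3]->tcps_suna);
}

tcp:::send
/args[3]->tcps_retransmit/
{
	/* Count retransmitted segments per remote address and port. */
	@retransmits[args[3]->tcps_raddr, args[3]->tcps_rport] = count();
}
```

Both aggregations are printed when the script exits (for example, on Ctrl-C).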
The tcplsinfo_t structure contains the previous tcp state during a state change.
|tcps_state||Previous TCP state. Inline definitions are provided for the various TCP states: TCP_STATE_CLOSED, TCP_STATE_SYN_SENT, etc. Use inline tcp_state_string to convert state to a string.|
The tcpinfo_t structure is a DTrace translated version of the TCP header.
|tcp_sport||TCP source port.|
|tcp_dport||TCP destination port.|
|tcp_seq||TCP sequence number.|
|tcp_ack||TCP acknowledgment number.|
|tcp_offset||Payload data offset, in bytes (not 32-bit words).|
|tcp_flags||TCP flags. See the tcp_flags table below for available macros.|
|tcp_window||TCP window size, bytes.|
|tcp_checksum||Checksum of TCP header and payload.|
|tcp_urgent||TCP urgent data pointer, bytes.|
|tcp_hdr||Pointer to raw TCP header at time of tracing.|
|TH_FIN||No more data from sender (finish).|
|TH_SYN||Synchronize sequence numbers (connect).|
|TH_RST||Reset the connection.|
|TH_PUSH||TCP push function.|
|TH_ACK||Acknowledgment field is set.|
|TH_URG||Urgent pointer field is set.|
|TH_ECE||Explicit congestion notification echo (see RFC-3168).|
|TH_CWR||Congestion window reduction.|
See RFC-793 for a detailed explanation of the standard TCP header fields and flags.
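The flag macros can be used in predicates to select particular segment types. The following sketch, which is an illustration and not part of the original examples, counts received segments that have SYN set but not ACK, keyed by source address and destination port:

```dtrace
dtrace -n 'tcp:::receive /(args[4]->tcp_flags & (TH_SYN|TH_ACK)) == TH_SYN/ { @[args[2]->ip_saddr, args[4]->tcp_dport] = count(); }'
```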
Some simple examples of tcp provider usage follow.
This DTrace one-liner counts inbound TCP connections by source IP address:
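The original listing is not reproduced in this text. A one-liner along the following lines, reconstructed from the probe and argument tables above, would produce such a count; it assumes the remote address is read from tcps_raddr in args[3]:

```dtrace
dtrace -n 'tcp:::accept-established { @[args[3]->tcps_raddr] = count(); }'
```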
The output above shows there were 3 TCP connections from 192.168.1.109, a single TCP connection from the IPv6 host fe80::214:4fff:fe8d:59aa, etc.
This DTrace one-liner counts inbound TCP connections by local TCP port:
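A possible form of this one-liner, again reconstructed from the argument tables rather than copied from the original, keys the aggregation on tcps_lport:

```dtrace
dtrace -n 'tcp:::accept-established { @[args[3]->tcps_lport] = count(); }'
```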
The output above shows there were 3 TCP connections for port 22 (ssh), and a single TCP connection for port 40648 (an RPC port).
Combining the previous two examples produces a useful one-liner to quickly identify who is connecting to what:
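One way to express the combination, assuming the same tcpsinfo_t members as in the two previous examples, is to use both values as aggregation keys:

```dtrace
dtrace -n 'tcp:::accept-established { @[args[3]->tcps_raddr, args[3]->tcps_lport] = count(); }'
```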
The output above shows there were 3 TCP connections from 192.168.1.109 to port 22 (ssh), etc.
It may be useful when troubleshooting connection issues to see who is failing to connect to their requested ports. This is equivalent to seeing where incoming SYNs arrive when no listener is present, as per RFC-793:
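A sketch of such a one-liner follows. Because the accept-refused arguments describe the RST segment that was sent, it assumes the remote host is the IP destination address and the refused local port is the TCP source port:

```dtrace
dtrace -n 'tcp:::accept-refused { @[args[2]->ip_daddr, args[4]->tcp_sport] = count(); }'
```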
Here we traced two failed attempts by host 192.168.1.109 to connect to port 23 (telnet).
This DTrace one-liner counts TCP received packets by host address:
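A one-liner of roughly this shape would do so, keying on the source address from the ipinfo_t argument (a reconstruction, not the original listing):

```dtrace
dtrace -n 'tcp:::receive { @[args[2]->ip_saddr] = count(); }'
```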
The output above shows that 7 TCP packets were received from 127.0.0.1, 14 TCP packets from the IPv6 host fe80::214:4fff:fe8d:59aa, etc.
This DTrace one-liner counts TCP received packets by the local TCP port:
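For received segments the local port is the TCP destination port, so a plausible form of this one-liner is:

```dtrace
dtrace -n 'tcp:::receive { @[args[4]->tcp_dport] = count(); }'
```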
The output above shows that 162 packets were received for port 22 (ssh), 36 packets were received for port 40648 (an RPC port), 27 packets for 2049 (NFS), and a few packets to high numbered client ports.
This DTrace one-liner prints distribution plots of IP payload size by destination, for TCP sends:
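A minimal sketch, assuming quantize() is applied to ip_plength and the aggregation is keyed on the destination address:

```dtrace
dtrace -n 'tcp:::send { @[args[2]->ip_daddr] = quantize(args[2]->ip_plength); }'
```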
This DTrace script demonstrates the capability to trace TCP state changes:
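The script itself is not reproduced in this text. The following is a sketch of what such a script could look like, based on the state-change argument table above. The array name last is an arbitrary choice, and the indexing of tcp_state_string follows its definition as an inline in the tcp translator file, which is an assumption here:

```dtrace
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

/* Timestamp of the previous state change, keyed by connection ID. */
uint64_t last[uint64_t];

dtrace:::BEGIN
{
	printf(" %3s %12s  %-20s    %-20s\n", "CPU", "DELTA(us)", "OLD", "NEW");
}

/* A previous state change was seen for this connection: print the delta. */
tcp:::state-change
/last[args[1]->cs_cid] != 0/
{
	this->elapsed = (timestamp - last[args[1]->cs_cid]) / 1000;
	printf(" %3d %12d  %-20s -> %-20s\n", cpu, this->elapsed,
	    tcp_state_string[args[5]->tcps_state],
	    tcp_state_string[args[3]->tcps_state]);
	last[args[1]->cs_cid] = timestamp;
}

/* First state change seen for this connection: no delta to report. */
tcp:::state-change
/last[args[1]->cs_cid] == 0/
{
	printf(" %3d %12s  %-20s -> %-20s\n", cpu, "-",
	    tcp_state_string[args[5]->tcps_state],
	    tcp_state_string[args[3]->tcps_state]);
	last[args[1]->cs_cid] = timestamp;
}
```

The clause order matters: the clause that handles a known connection comes first, so that a connection's first event does not also match it after the timestamp has been stored.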
This script was run on a system for a couple of minutes:
In the above example output, an inbound connection is traced; it takes 613 us to go from syn-received to established. An outbound connection attempt is also made to a closed port: it takes 63 us to go from bound to syn-sent, 685 us to go from syn-sent to bound, and so on.
The fields printed are:
|CPU||CPU id for the event|
|DELTA(us)||time since previous event for that connection, microseconds|
|OLD||old TCP state|
|NEW||new TCP state|
The following DTrace script traces TCP packets and prints various details:
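The script is not reproduced in this text. A sketch along these lines would print the fields described below. It assumes the TCP payload size can be derived as ip_plength minus tcp_offset, and it adds a FLAGS column built from the TH_* macros:

```dtrace
#!/usr/sbin/dtrace -s

#pragma D option quiet
#pragma D option switchrate=10hz

dtrace:::BEGIN
{
	printf(" %3s %15s:%-5s    %15s:%-5s %6s  %s\n", "CPU",
	    "LADDR", "LPORT", "RADDR", "RPORT", "BYTES", "FLAGS");
}

/* Outbound segment: the local endpoint is the source. */
tcp:::send
{
	this->bytes = args[2]->ip_plength - args[4]->tcp_offset;
	printf(" %3d %15s:%-5d -> %15s:%-5d %6d  (", cpu,
	    args[2]->ip_saddr, args[4]->tcp_sport,
	    args[2]->ip_daddr, args[4]->tcp_dport, this->bytes);
}

/* Inbound segment: the local endpoint is the destination. */
tcp:::receive
{
	this->bytes = args[2]->ip_plength - args[4]->tcp_offset;
	printf(" %3d %15s:%-5d <- %15s:%-5d %6d  (", cpu,
	    args[2]->ip_daddr, args[4]->tcp_dport,
	    args[2]->ip_saddr, args[4]->tcp_sport, this->bytes);
}

/* Runs after the matching clause above and completes the output line. */
tcp:::send,
tcp:::receive
{
	printf("%s", (args[4]->tcp_flags & TH_FIN) ? " FIN" : "");
	printf("%s", (args[4]->tcp_flags & TH_SYN) ? " SYN" : "");
	printf("%s", (args[4]->tcp_flags & TH_RST) ? " RST" : "");
	printf("%s", (args[4]->tcp_flags & TH_PUSH) ? " PUSH" : "");
	printf("%s", (args[4]->tcp_flags & TH_ACK) ? " ACK" : "");
	printf("%s", (args[4]->tcp_flags & TH_URG) ? " URG" : "");
	printf(" )\n");
}
```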
This example output has captured a TCP handshake:
The fields printed are:
|CPU||CPU id that event occurred on|
|LADDR||local IP address|
|LPORT||local TCP port|
|RADDR||remote IP address|
|RPORT||remote TCP port|
|BYTES||TCP payload bytes|
Note: The output may be shuffled slightly on multi-CPU servers due to DTrace per-CPU buffering, and events such as the TCP handshake can be printed out of order. Keep an eye on changes in the CPU column, or add a timestamp column to this script and sort the output afterwards.
The tcp provider uses DTrace's stability mechanism to describe its stabilities, as shown in the following table. For more information about the stability mechanism, see Chapter 39, Stability.
|Element||Name stability||Data stability||Dependency class|