ibd(7D) Devices ibd(7D)
NAME
ibd - Infiniband IPoIB device driver
SYNOPSIS
/dev/ibd*
DESCRIPTION
The ibd driver implements the IETF IP over Infiniband protocol and pro‐
vides IPoIB service for all IBA ports present in the system.
The ibd driver is a multi-threaded, loadable, clonable, STREAMS hard‐
ware driver supporting the connection-less Data Link Provider Inter‐
face, dlpi(7P). The ibd driver provides basic support for both the IBA
Unreliable Datagram Queue Pair hardware and the IBA Reliable Connected
Queue Pair hardware. Functions include QP initialization, frame trans‐
mit and receive, multicast and promiscuous mode support, and statistics
reporting.
By default, each ibd instance uses datagram mode unless enable_rc is
set to 1 for that instance in the .conf file. The setting is made on a
per-instance basis: the Nth value of enable_rc applies to the Nth
instance of ibd. Any value other than 1, or the absence of a .conf
file, is equivalent to specifying datagram mode.
Because ibd over connected mode attempts to use a large MTU (65520
bytes), your application should adapt to the large MTU to get better
performance. For example, you should adopt a large TCP window size.
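One way an application might request a larger TCP window is to raise its
socket buffer sizes before connecting. The following Python sketch is
illustrative only: the 1 MiB figure and the function name are assumptions,
not values taken from this manual page, and the system may round or cap
the request according to its own limits.

```python
import socket

# Assumed buffer size for illustration; tune for the actual workload.
ASSUMED_BUF_BYTES = 1024 * 1024


def open_large_window_socket(buf_bytes=ASSUMED_BUF_BYTES):
    """Create a TCP socket and request larger send/receive buffers.

    The kernel may round or cap the requested sizes, so callers that
    care should read the effective values back with getsockopt().
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    return sock
```

The buffers are set before any connect() call because the window offered
during connection setup is typically derived from the receive buffer size
in effect at that time.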
Use the cloning, character-special device /dev/ibd to access all ibd
devices installed within the system. The ibd driver is dependent on
GLD, a loadable kernel module that provides the ibd driver with the
DLPI and STREAMS functionality required of a LAN driver. Except as
noted in the Application Programming Interface section of this manual
page, see gld(7D), for more details on the primitives supported by the
driver. The GLD module is located at /kernel/misc/sparcv9/gld on 64-bit
systems and at /kernel/misc/gld on 32-bit systems.
The ibd driver expects certain configuration of the IBA fabric prior to
operation (which also implies the SM must be active and managing the
fabric). Specifically, the IBA multicast group representing the IPv4
limited broadcast address 255.255.255.255 (also defined as broadcast-
GID in IETF documents) should be created prior to initializing the
device. IBA properties (including mtu, qkey and sl) of this group are
used by the driver to create any other IBA multicast group as
instructed by higher level (IP) software. The driver probes for the
existence of this broadcast-GID during attach(9E).
APPLICATION PROGRAMMING INTERFACE (DLPI)
The values returned by the driver in the DL_INFO_ACK primitive in
response to your DL_INFO_REQ are:
o Maximum SDU is the MTU associated with the broadcast-GID
group, less the 4 byte IPoIB header.
o Minimum SDU is 0.
o dlsap address length is 22.
o MAC type is DL_IB.
o The sap length value is -2, meaning the physical address
component is followed immediately by a 2-byte sap component
within the DLSAP address.
o Broadcast address value is the MAC address consisting of the
4 bytes of QPN 00:FF:FF:FF prepended to the IBA multicast
address of the broadcast-GID.
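The arithmetic behind the values listed above can be sketched in Python.
This is a minimal illustration, not driver code: the function names are
hypothetical, and the 16-byte GID length is the standard IBA GID size
rather than a figure stated in this page. The 20-byte physical address
length follows from the 22-byte DLSAP length minus the 2-byte sap.

```python
IPOIB_HEADER_LEN = 4                       # IPoIB header size, per the text
GID_LEN = 16                               # IBA GIDs are 128 bits
BROADCAST_QPN = bytes([0x00, 0xFF, 0xFF, 0xFF])
PHYS_ADDR_LEN = len(BROADCAST_QPN) + GID_LEN   # 20
SAP_LEN = 2                                # sap length of -2: sap follows address
DLSAP_ADDR_LEN = PHYS_ADDR_LEN + SAP_LEN   # 22


def max_sdu(broadcast_group_mtu):
    """Maximum SDU: broadcast-GID group MTU less the IPoIB header."""
    return broadcast_group_mtu - IPOIB_HEADER_LEN


def broadcast_address(broadcast_gid):
    """Prepend the 4 bytes of QPN 00:FF:FF:FF to the broadcast-GID."""
    if len(broadcast_gid) != GID_LEN:
        raise ValueError("a GID must be 16 bytes")
    return BROADCAST_QPN + broadcast_gid


def split_dlsap(dlsap):
    """Split a DLSAP address into (physical address, sap) components."""
    if len(dlsap) != DLSAP_ADDR_LEN:
        raise ValueError("a DLSAP address must be 22 bytes")
    return dlsap[:PHYS_ADDR_LEN], dlsap[PHYS_ADDR_LEN:]
```

For example, a broadcast group MTU of 2048 would give a maximum SDU of
2044 under this reading.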
Due to the nature of link address definition for IPoIB, the
DL_SET_PHYS_ADDR_REQ DLPI primitive is not supported.
In the transmit case for streams that have been put in raw
mode via the DLIOCRAW ioctl, the DLPI application must
prepend the 20 byte IPoIB destination address to the data it
wants to transmit over-the-wire. In the receive case, appli‐
cations receive the IP/ARP datagram along with the IETF
defined 4 byte header.
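The raw-mode framing described above can be sketched as follows. This is
an illustration under stated assumptions, not driver code: the function
names are hypothetical, the data passed for transmit is treated as opaque
bytes, and the receive side assumes the 4-byte header precedes the
datagram and consists of a 2-byte type followed by 2 reserved bytes, as
laid out in the IETF IPoIB encapsulation.

```python
import struct

DEST_ADDR_LEN = 20     # IPoIB destination address length, per the text
IPOIB_HEADER_LEN = 4   # IETF-defined header length, per the text


def build_raw_tx(dest_addr, data):
    """Prepend the 20-byte IPoIB destination address to outgoing data."""
    if len(dest_addr) != DEST_ADDR_LEN:
        raise ValueError("destination address must be 20 bytes")
    return dest_addr + data


def split_raw_rx(message):
    """Split a received raw message into (type, datagram).

    Assumes the header is a big-endian 2-byte type field followed by
    2 reserved bytes, placed ahead of the IP/ARP datagram.
    """
    if len(message) < IPOIB_HEADER_LEN:
        raise ValueError("message shorter than the 4-byte header")
    hdr_type, _reserved = struct.unpack("!HH", message[:IPOIB_HEADER_LEN])
    return hdr_type, message[IPOIB_HEADER_LEN:]
```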
WARNING
This section describes warning messages that might be generated by the
driver. Although the format of these messages may change in future
versions, they convey the same general information.
While joining IBA multicast groups corresponding to IP multicast groups
as part of multicast promiscuous operations as required by IP multicast
routers, or as part of running snoop(1M), it is possible that joins to
some multicast groups can fail due to inherent resource constraints in
the IBA components. In such cases, warning messages similar to the
following appear in the system log, indicating the interface on which
the failure occurred:
NOTICE: ibd0: Could not get list of IBA multicast groups
NOTICE: ibd0: IBA promiscuous mode missed multicast group
NOTICE: ibd0: IBA promiscuous mode missed new multicast gid
Also, if the IBA SM indicates that multicast trap support is suspended
or unavailable, the system log contains a message similar to:
NOTICE: ibd0: IBA multicast support degraded due to
unavailability of multicast traps
When the SM indicates trap support is restored:
NOTICE: ibd0: IBA multicast support restored due to
availability of multicast traps
Additionally, if the IBA link transitions to an unavailable state (that
is, the IBA link state becomes Down, Initialize or Armed) and then
becomes active again, the driver tries to rejoin previously joined
groups if required. Failure to rejoin multicast groups triggers mes‐
sages like:
NOTICE: ibd0: Failure on port up to rejoin multicast gid
If the corresponding HCA port is in the unavailable state defined above
when initializing an ibd interface using ifconfig(1M), a message is
emitted by the driver:
NOTICE: ibd0: Port is not active
Further, as described above, if the broadcast-GID is not found, or the
associated MTU is higher than what the HCA port can support, the fol‐
lowing messages are printed to the system log:
NOTICE: ibd0: IPoIB broadcast group absent
NOTICE: ibd0: IPoIB broadcast group MTU 4096 greater than port's
maximum MTU 2048
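The two broadcast group conditions above can be modeled as a simple
check. The Python sketch below is a hypothetical helper for reasoning
about the messages, not the driver's implementation; representing a
missing group as None is an assumption made for illustration.

```python
def broadcast_group_problem(group_mtu, port_max_mtu):
    """Return the matching diagnostic text, or None if no problem.

    group_mtu is None when the broadcast-GID was not found (an
    illustrative convention, not a driver interface).
    """
    if group_mtu is None:
        return "IPoIB broadcast group absent"
    if group_mtu > port_max_mtu:
        return ("IPoIB broadcast group MTU %d greater than port's "
                "maximum MTU %d" % (group_mtu, port_max_mtu))
    return None
```

With a group MTU of 4096 and a port maximum of 2048, the helper returns
the text of the second message shown above.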
Whenever any of these problems is reported while running ifconfig(1M),
check that the IBA cabling is intact, an SM is running on the fabric,
and the broadcast-GID with appropriate properties has been created in
the IBA partition.
The MTU of Reliable Connected mode can be larger than the MTU of Unre‐
liable Datagram mode.
When Reliable Connected mode is enabled, ibd still uses Unreliable
Datagram mode to transmit and receive multicast packets. If the payload
size (excluding 4 byte IPoIB header) of a multicast packet is larger
than the IP link MTU specified by the broadcast group, ibd drops it. A
message appears in the system log when drops occur:
NOTICE: ibd0: Reliable Connected mode is on. Multicast packet
length (<packet length> ><IP_LINK_MTU>) is too long to send
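The drop rule for multicast packets in Reliable Connected mode can be
expressed as a short predicate. This Python sketch only restates the rule
in the text; the function name is hypothetical, and it assumes the packet
length passed in includes the 4 byte IPoIB header.

```python
IPOIB_HEADER_LEN = 4  # IPoIB header size, per the text


def multicast_would_drop(packet_len, ip_link_mtu):
    """True if a multicast packet exceeds the broadcast group's IP link MTU.

    packet_len is assumed to include the IPoIB header; the comparison
    is made on the payload size, which excludes it.
    """
    payload_len = packet_len - IPOIB_HEADER_LEN
    return payload_len > ip_link_mtu
```

A payload exactly equal to the IP link MTU is not dropped, since the text
describes only payloads larger than the MTU.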
If only one side has enabled Reliable Connected mode, communication
falls back to datagram mode. The connected mode instance uses Path MTU
discovery to automatically adjust the MTU of a unicast packet if an MTU
difference exists. Before Path MTU discovery reduces the MTU for a
specific destination, several packets whose size exceeds the MTU of
Unreliable Datagram mode are dropped.
CONFIGURATION
The IPoIB service comes preconfigured on all HCA ports in the system.
To turn the service off, or back on after turning it off, refer to doc‐
umentation in cfgadm_ib(1M).
EXAMPLES
Example 1 An Example Driver .conf File
# 1: unicast packets will be sent over Reliable Connected Mode
# 0: unicast packets will be sent over Unreliable Datagram Mode
#
# Each element in the list below maps to the corresponding ibd
# instance; the first element is for ibd instance 0, the second
# element is for instance 1 and so on.
#
enable_rc=1,1,0,0;
This example driver .conf file enables Connected Mode for ibd instances
0 and 1. Instances 2 and 3 use datagram mode.
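The per-instance interpretation of enable_rc can be sketched in Python.
This is a simplified reading of a single enable_rc line, not a parser
for the full driver .conf grammar, and the function names are
hypothetical. It follows the rule stated earlier: only a value of 1
selects connected mode, and a missing value or missing file means
datagram mode.

```python
def parse_enable_rc(conf_text):
    """Return the per-instance enable_rc values as strings.

    Returns an empty list when no enable_rc line is present. Comments
    starting with '#' are ignored.
    """
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        name, sep, value = line.partition("=")
        if sep and name.strip() == "enable_rc":
            value = value.strip().rstrip(";")
            return [v.strip() for v in value.split(",")]
    return []


def uses_connected_mode(values, instance):
    """True only if the Nth value exists and is exactly 1."""
    return instance < len(values) and values[instance] == "1"
```

Applied to the example file above, instances 0 and 1 report connected
mode, while instances 2 and 3, and any instance beyond the end of the
list, report datagram mode.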
FILES
/dev/ibd* Special character device
/kernel/drv/ib.conf Configuration file to start IPoIB service
/kernel/drv/ibd.conf Configuration file of IPoIB driver
/kernel/drv/sparcv9/ibd 64-bit SPARC device driver
/kernel/drv/amd64/ibd 64-bit x86 device driver
/kernel/drv/ibd 32-bit x86 device driver
SEE ALSO
cfgadm(1M), cfgadm_ib(1M), ifconfig(1M), syslogd(1M), gld(7D), ib(7D),
kstat(7D), streamio(7I), dlpi(7P), attributes(5), attach(9E)
NOTES
IBD is a GLD-based driver and provides the statistics described by
gld(7D). The unknowns statistic, which counts valid received packets
not accepted by any stream, increases when IBD transmits broadcast IP
packets. This happens because the Infiniband hardware copies and loops
back the transmitted broadcast packets to the source. GLD discards
these packets and records them as unknowns.
SunOS 5.10 7 Jun 2010 ibd(7D)