NVIDIA ConnectX-6 VPI InfiniBand Adapter Card User Manual
- August 14, 2024
- NVIDIA
Table of Contents
- NVIDIA ConnectX-6 VPI InfiniBand Adapter Card
- Specifications
- Hardware Requirements
- Airflow Requirement
- Installing PCIe x8/16 Cards
- Uninstalling the Card
- Socket Direct (2x PCIe x16) Cards Installation
- FAQs
- Product Overview
- Installing WinOF-2 Driver
- Troubleshooting
- References
NVIDIA ConnectX-6 VPI InfiniBand Adapter Card
Specifications
- Model: NVIDIA ConnectX-6 InfiniBand/Ethernet Adapter Cards
- SKU Numbers:
- 900-9X603-0016-DT0
- 900-9X603-0056-DT0
- 900-9X628-0016-ST0
- Interfaces: InfiniBand, Ethernet
- Features:
- HDR100, EDR IB and 100GbE support
- Single-port QSFP56
- PCIe4.0 x8
- Dual-port QSFP56 (specific SKUs)
- PCIe3.0/4.0 x16
Hardware Requirements
Ensure your system meets the hardware requirements specified for the NVIDIA ConnectX-6 Adapter Cards.
Airflow Requirement
Proper airflow is crucial for the optimal performance of the adapter cards. Maintain adequate airflow as per the provided specifications.
Installing PCIe x8/16 Cards
- Identify the card in your system using the provided guidelines.
- Follow the detailed installation instructions for PCIe x8/16 cards as outlined in the manual.
- Securely install the card into the appropriate PCIe slot.
Uninstalling the Card
When necessary, follow the steps outlined in the manual to safely uninstall the adapter card from your system.
Socket Direct (2x PCIe x16) Cards Installation
- Refer to the specific installation instructions provided for Socket Direct cards.
- Carefully install the card ensuring it is securely seated in the designated slot.
FAQs
Q: Who is the intended audience for this manual?
A: This manual is intended for installers and users of NVIDIA ConnectX-6
InfiniBand/Ethernet Adapter Cards who have basic familiarity with InfiniBand
and Ethernet network specifications.
Q: How can customers contact technical support?
A: Customers who purchased NVIDIA products directly from NVIDIA can contact
technical support through the provided methods in the manual.
About This Manual
This User Manual describes NVIDIA® ConnectX®-6 InfiniBand/Ethernet adapter cards. It provides details as to the interfaces of the board, specifications, required software and firmware for operating the board, and relevant documentation.
Ordering Part Numbers
The table below provides the ordering part numbers (OPN) for the available ConnectX-6 InfiniBand/Ethernet adapter cards.
NVIDIA SKU | Legacy OPN | Marketing Description | Lifecycle
900-9X603-0016-DT0 | MCX653105A-EFAT | ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket | Mass Production
900-9X603-0056-DT0 | MCX653106A-EFAT | ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket | Mass Production
900-9X628-0016-ST0 | MCX651105A-EDAT | ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe4.0 x8, tall bracket | Mass Production
900-9X6AF-0016-ST1 | MCX653105A-ECAT | ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 x16, tall bracket | Mass Production
900-9X6AF-0056-MT1 | MCX653106A-ECAT | ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket | Mass Production
900-9X6AF-0018-MT2 | MCX653105A-HDAT | ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe3.0/4.0 x16, tall bracket | Mass Production
900-9X6AF-0058-ST1 | MCX653106A-HDAT | ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket | Mass Production
900-9X6B4-0058-DT0 | MCX654106A-HCAT | ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket | Mass Production
900-9X6AF-0018-SS0 | MCX653105A-HDAL | ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe3.0/4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6 | Mass Production
900-9X6AF-0058-SS0 | MCX653106A-HDAL | ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe3.0/4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6 | Mass Production
900-9X0BC-001H-ST1 | MCX683105AN-HDAT | ConnectX®-6 DE adapter card, HDR IB (200Gb/s), single-port QSFP, PCIe4.0 x16, no crypto, tall bracket | Mass Production
EOL'ed (End of Life) Ordering Part Numbers

NVIDIA SKU | Legacy OPN | Marketing Description | Lifecycle
900-9X6B4-0056-DT0 | MCX654106A-ECAT | ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR InfiniBand and 100GbE), dual-port QSFP56, Socket Direct 2x PCIe 3.0/4.0 x16, tall bracket | End of Life
900-9X6B4-0018-DT2 | MCX654105A-HCAT | ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket | End of Life
Intended Audience
This manual is intended for the installer and user of these cards. The manual assumes basic familiarity with InfiniBand and Ethernet network and architecture specifications.
Technical Support
Customers who purchased NVIDIA products directly from NVIDIA are invited to contact us through the following methods:
· URL: https://www.nvidia.com > Support
· E-mail: enterprisesupport@nvidia.com
Customers who purchased NVIDIA M-1 Global Support Services, please see your contract for details regarding Technical Support. Customers who purchased NVIDIA products through an NVIDIA-approved reseller should first seek assistance through their reseller.
Related Documentation
· MLNX_OFED for Linux User Manual and Release Notes: User Manual describing OFED features, performance, InfiniBand diagnostic tools content, and configuration. See MLNX_OFED for Linux Documentation.
· WinOF-2 for Windows User Manual and Release Notes: User Manual describing WinOF-2 features, performance, Ethernet diagnostic tools content, and configuration. See WinOF-2 for Windows Documentation.
· NVIDIA VMware for Ethernet User Manual: User Manual and release notes describing the various components of the NVIDIA ConnectX® NATIVE ESXi stack. See VMware® ESXi Drivers Documentation.
· NVIDIA Firmware Utility (mlxup) User Manual and Release Notes: NVIDIA firmware update and query utility used to update the firmware. Refer to Firmware Utility (mlxup) Documentation.
· NVIDIA Firmware Tools (MFT) User Manual: User Manual describing the set of MFT firmware management tools for a single node. See MFT User Manual.
· InfiniBand Architecture Specification Release 1.2.1, Vol 2 – Release 1.3: InfiniBand Specifications.
· IEEE Std 802.3 Specification: IEEE Ethernet Specifications.
· PCI Express Specifications: Industry Standard PCI Express Base and Card Electromechanical Specifications. Refer to PCI-SIG Specifications.
· LinkX Interconnect Solutions: LinkX InfiniBand cables and transceivers are designed to maximize the performance of High-Performance Computing networks, requiring high-bandwidth, low-latency connections between compute nodes and switch nodes. NVIDIA offers one of the industry's broadest portfolios of QDR/FDR10 (40Gb/s), FDR (56Gb/s), EDR/HDR100 (100Gb/s), HDR (200Gb/s) and NDR (400Gb/s) cables, including Direct Attach Copper cables (DACs), copper splitter cables, Active Optical Cables (AOCs) and transceivers in a wide range of lengths from 0.5m to 10km. In addition to meeting IBTA standards, NVIDIA tests every product in an end-to-end environment ensuring a Bit Error Rate of less than 1E-15. Read more at LinkX Cables and Transceivers.
Document Conventions
When discussing memory sizes, MB and MBytes are used in this document to mean
size in MegaBytes. The use of Mb or Mbits (small b) indicates the size in
MegaBits. In this document, PCIe is used to mean PCI Express.
Revision History
A list of the changes made to this document is provided in Document Revision History.
Introduction
Product Overview
This is the user guide for InfiniBand/Ethernet adapter cards based on the
ConnectX-6 integrated circuit device. ConnectX-6 connectivity provides the
highest performing low latency and most flexible interconnect solution for PCI
Express Gen 3.0/4.0 servers used in enterprise datacenters and high-performance
computing environments.
ConnectX-6 Virtual Protocol Interconnect® adapter cards provide up to two
ports of 200Gb/s for InfiniBand and Ethernet connectivity, sub-600ns latency
and 200 million messages per second, enabling the highest performance and most
flexible solution for the most demanding High-Performance Computing (HPC),
storage, and datacenter applications.
ConnectX-6 is a groundbreaking addition to the NVIDIA ConnectX series of
industry-leading adapter cards. In addition to all the existing innovative
features of past ConnectX versions, ConnectX-6 offers a number of enhancements
that further improve the performance and scalability of datacenter
applications. In addition, specific PCIe stand-up cards are available with a
cold plate for insertion into liquid-cooled Intel® Server System D50TNP
platforms.
ConnectX-6 is available in two form factors: low-profile stand-up PCIe and
Open Compute Project (OCP) Spec 3.0 cards with QSFP connectors. Single-port,
HDR, stand-up PCIe adapters are available based on either ConnectX-6 or
ConnectX-6 DE (ConnectX-6 Dx enhanced for HPC applications).
Make sure to use a PCIe slot that is capable of supplying the required power and airflow to the ConnectX-6 as stated in Specifications.
Configuration | OPN | Marketing Description
ConnectX-6 PCIe x8 Card | MCX651105A-EDAT | ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe4.0 x8, tall bracket
ConnectX-6 PCIe x16 Card | MCX653105A-HDAT | ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe4.0 x16, tall bracket
ConnectX-6 PCIe x16 Card | MCX653106A-HDAT | ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket
ConnectX-6 PCIe x16 Card | MCX653105A-ECAT | ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 x16, tall bracket
ConnectX-6 PCIe x16 Card | MCX653106A-ECAT | ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket
ConnectX-6 DE PCIe x16 Card | MCX683105AN-HDAT | ConnectX-6 DE InfiniBand adapter card, HDR, single-port QSFP, PCIe 3.0/4.0 x16, No Crypto, Tall Bracket
ConnectX-6 PCIe x16 Cards for liquid-cooled Intel® Server System D50TNP platforms | MCX653105A-HDAL | ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6
ConnectX-6 PCIe x16 Cards for liquid-cooled Intel® Server System D50TNP platforms | MCX653106A-HDAL | ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6
ConnectX-6 Dual-slot Socket Direct Cards (2x PCIe x16) | MCX654105A-HCAT | ConnectX-6 InfiniBand/Ethernet adapter card kit, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, Socket Direct 2x PCIe3.0 x16, tall brackets
ConnectX-6 Dual-slot Socket Direct Cards (2x PCIe x16) | MCX654106A-HCAT | ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket
ConnectX-6 Dual-slot Socket Direct Cards (2x PCIe x16) | MCX654106A-ECAT | ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR InfiniBand and 100GbE), dual-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket
ConnectX-6 Single-slot Socket Direct Cards (2x PCIe x8 in a row) | MCX653105A-EFAT | ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket
ConnectX-6 Single-slot Socket Direct Cards (2x PCIe x8 in a row) | MCX653106A-EFAT | ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket
ConnectX-6 PCIe x8 Card
ConnectX-6 with a single PCIe x8 slot can support a bandwidth of up to 100Gb/s
in a PCIe Gen 4.0 slot.
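As a rough sanity check of this claim (an illustrative calculation, not taken from the manual): PCIe Gen 4.0 signals at 16 GT/s per lane with 128b/130b encoding, so an x8 link offers about 8 x 16 GT/s x 128/130 ≈ 126 Gb/s of payload bandwidth per direction, comfortably above a 100Gb/s port, while a Gen 3.0 x8 link (8 x 8 GT/s x 128/130 ≈ 63 Gb/s) would not be sufficient.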
Part Number: MCX651105A-EDAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
Network Connector Type: Single-port QSFP56
PCIe x8 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM
ConnectX-6 PCIe x16 Card
ConnectX-6 with a single PCIe x16 slot can support a bandwidth of up to 100Gb/s in a PCIe Gen 3.0 slot, or up to 200Gb/s in a PCIe Gen 4.0 slot. This form factor is also available for Intel® Server System D50TNP Platforms, where an Intel liquid-cooled cold plate is used as the adapter cooling mechanism.
Part Number: MCX653105A-ECAT / MCX653106A-ECAT / MCX653105A-HDAT / MCX653106A-HDAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: MCX653105A-ECAT / MCX653106A-ECAT – Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100. MCX653105A-HDAT / MCX653106A-HDAT – Ethernet: 10/25/40/50/100/200 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
Network Connector Type: Single-port QSFP56 (MCX653105A-ECAT, MCX653105A-HDAT) / Dual-port QSFP56 (MCX653106A-ECAT, MCX653106A-HDAT)
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM
ConnectX-6 DE PCIe x16 Card
ConnectX-6 DE (ConnectX-6 Dx enhanced for HPC applications) with a single PCIe
x16 slot can support a bandwidth of up to 100Gb/s in a PCIe Gen 3.0 slot, or
up to 200Gb/s in a PCIe Gen 4.0 slot.
Part Number: MCX683105AN-HDAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
Network Connector Type: Single-port QSFP56
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28924A0-NCCF-VE
ConnectX-6 for Liquid-Cooled Intel® Server System D50TNP Platforms
The following cards are available with a cold plate for insertion into liquid-cooled Intel® Server System D50TNP platforms.
Part Number: MCX653105A-HDAL / MCX653106A-HDAL
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: Ethernet: 10/25/40/50/100/200 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
Network Connector Type: Single-port QSFP56 (MCX653105A-HDAL) / Dual-port QSFP56 (MCX653106A-HDAL)
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM
ConnectX-6 Socket Direct™ Cards
The Socket Direct technology offers improved performance to dual-socket
servers by enabling direct access from each CPU in a dual-socket server to the
network through its dedicated PCIe interface. Please note that ConnectX-6
Socket Direct cards do not support Multi-Host functionality (i.e. connectivity
to two independent CPUs). For ConnectX-6 Socket Direct card with Multi-Host
functionality, please contact NVIDIA.
ConnectX-6 Socket Direct cards are available in two configurations: Dual-slot
Configuration (2x PCIe x16) and Single-slot Configuration (2x PCIe x8).
ConnectX-6 Dual-slot Socket Direct Cards (2x PCIe x16)
In order to obtain 200Gb/s speed, NVIDIA offers ConnectX-6 Socket Direct cards that enable 200Gb/s connectivity also for servers with PCIe Gen 3.0 capability. The adapter's 32-lane PCIe bus is split into two 16-lane buses, with one bus accessible through a PCIe x16 edge connector and the other bus through an x16 Auxiliary PCIe Connection card. The two cards should be installed into two PCIe x16 slots and connected using two Cabline CA-II Plus harnesses, as shown in the figure below.
Part Number: MCX654105A-HCAT / MCX654106A-HCAT / MCX654106A-ECAT
Form Factor/Dimensions: Adapter Card: PCIe Half Height, Half Length / 167.65mm x 68.90mm; Auxiliary PCIe Connection Card: 5.09 in. x 2.32 in. (129.30mm x 59.00mm); Two 35cm Cabline CA-II Plus harnesses
Data Transmission Rate: MCX654105A-HCAT / MCX654106A-HCAT – Ethernet: 10/25/40/50/100/200 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR. MCX654106A-ECAT – Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
Network Connector Type: Single-port QSFP56 (MCX654105A-HCAT) / Dual-port QSFP56 (MCX654106A-HCAT, MCX654106A-ECAT)
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
PCIe x16 through Auxiliary Card: PCIe Gen 3.0 SERDES @ 8.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM
ConnectX-6 Single-slot Socket Direct Cards (2x PCIe x8 in a row)
The PCIe x16 interface comprises two PCIe x8 interfaces in a row, such that each
PCIe x8 interface can be connected to a dedicated CPU in a dual-socket server. In
such a configuration, Socket Direct brings lower latency and lower CPU
utilization as the direct connection from each CPU to the network means the
interconnect can bypass a QPI (UPI) and the other CPU, optimizing performance
and improving latency. CPU utilization is improved as each CPU handles only
its own traffic and not traffic from the other CPU.
A system with a custom PCI Express x16 slot that includes special signals is
required for installing the card. Please refer to PCI Express Pinouts
Description for Single-Slot Socket Direct Card for pinout definitions.
Part Number: MCX653105A-EFAT / MCX653106A-EFAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
Network Connector Type: Single-port QSFP56 (MCX653105A-EFAT) / Dual-port QSFP56 (MCX653106A-EFAT)
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s, Socket Direct 2x8 in a row
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM
Package Contents
ConnectX-6 PCIe x8/x16 Adapter Cards
Applies to MCX651105A-EDAT, MCX653105A-ECAT, MCX653106A-ECAT, MCX653105A-HDAT,
MCX653106A-HDAT, MCX653105A-EFAT, MCX653106A-EFAT,
and MCX683105AN-HDAT.
Category | Item | Qty
Cards | ConnectX-6 adapter card | 1
Accessories | Adapter card short bracket | 1
Accessories | Adapter card tall bracket (shipped assembled on the card) | 1
ConnectX-6 PCIe x16 Adapter Card for liquid-cooled Intel® Server System D50TNP Platforms
Applies to MCX653105A-HDAL and MCX653106A-HDAL.
Category | Item | Qty
Cards | ConnectX-6 adapter card | 1
Accessories | Adapter card short bracket | 1
Accessories | Adapter card tall bracket (shipped assembled on the card) | 1
Accessories | Accessory kit with two TIMs (MEB000386) | 1
ConnectX-6 Socket Direct Cards (2x PCIe x16)
Applies to MCX654105A-HCAT, MCX654106A-HCAT and MCX654106A-ECAT.
Category | Item | Qty
Cards | ConnectX-6 adapter card | 1
Cards | PCIe Auxiliary Card | 1
Harnesses | 35cm Cabline CA-II Plus harness (white) | 1
Harnesses | 35cm Cabline CA-II Plus harness (black) | 1
Harnesses | Retention clip for Cabline harness (optional accessory) | 2
Accessories | Adapter card short bracket | 1
Accessories | Adapter card tall bracket (shipped assembled on the card) | 1
Accessories | PCIe Auxiliary card short bracket | 1
Accessories | PCIe Auxiliary card tall bracket (shipped assembled on the card) | 1
Features and Benefits
Make sure to use a PCIe slot that is capable of supplying the required power and airflow to the ConnectX-6 cards as stated in Specifications.
PCI Express (PCIe)
Uses the following PCIe interfaces:
· PCIe x8/x16 configurations: PCIe Gen 3.0 (8GT/s) and Gen 4.0 (16GT/s)
through an x8/x16 edge connector.
· 2x PCIe x16 configurations: PCIe Gen 3.0/4.0 SERDES @ 8.0/16.0 GT/s through the edge connector, and PCIe Gen 3.0 SERDES @ 8.0GT/s through the PCIe Auxiliary Connection Card
200Gb/s InfiniBand/Ethernet Adapter
ConnectX-6 offers the highest throughput InfiniBand/Ethernet adapter, supporting HDR 200Gb/s InfiniBand and 200Gb/s Ethernet and enabling any standard networking, clustering, or storage to operate seamlessly over any converged network leveraging a consolidated software stack.
InfiniBand Architecture Specification v1.3 compliant
ConnectX-6 delivers low latency, high bandwidth, and computing efficiency for performance-driven server and storage clustering applications. ConnectX-6 is InfiniBand Architecture Specification v1.3 compliant.
Up to 200 Gigabit Ethernet
NVIDIA adapters comply with the following IEEE 802.3 standards:
· 200GbE / 100GbE / 50GbE / 40GbE / 25GbE / 10GbE / 1GbE
· IEEE 802.3bj, 802.3bm 100 Gigabit Ethernet
· IEEE 802.3by, Ethernet Consortium 25, 50 Gigabit Ethernet, supporting all FEC modes
· IEEE 802.3ba 40 Gigabit Ethernet
· IEEE 802.3by 25 Gigabit Ethernet
· IEEE 802.3ae 10 Gigabit Ethernet
· IEEE 802.3ap based auto-negotiation and KR startup
· IEEE 802.3ad, 802.1AX Link Aggregation
· IEEE 802.1Q, 802.1P VLAN tags and priority
· IEEE 802.1Qau (QCN) – Congestion Notification
· IEEE 802.1Qaz (ETS)
· IEEE 802.1Qbb (PFC)
· IEEE 802.1Qbg
· IEEE 1588v2
· Jumbo frame support (9.6KB)
InfiniBand HDR100
A standard InfiniBand data rate, where each lane of a 2X port runs a bit rate of 53.125Gb/s with a 64b/66b encoding, resulting in an effective bandwidth of 100Gb/s.
InfiniBand HDR A standard InfiniBand data rate, where each lane of a 4X port runs a bit rate of 53.125Gb/s with a 64b/66b encoding, resulting in an effective bandwidth of 200Gb/s.
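As a quick worked example of these definitions (illustration only): 53.125 Gb/s per lane x 64/66 ≈ 51.5 Gb/s of effective bandwidth per lane, so a 2X (two-lane) port yields ≈ 103 Gb/s, marketed as HDR100 (100Gb/s), and a 4X (four-lane) port yields ≈ 206 Gb/s, marketed as HDR (200Gb/s).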
Memory Components
· SPI Quad – includes 256Mbit SPI Quad Flash device (MX25L25645GXDI-08G device by Macronix)
· FRU EEPROM – Stores the parameters and personality of the card. The EEPROM capacity is 128Kbit. FRU I2C address is (0x50) and is accessible through the PCIe SMBus. (Note: Address 0x58 is reserved.)
Overlay Networks
In order to better scale their networks, datacenter operators often create overlay networks that carry traffic from individual virtual machines over logical tunnels in encapsulated formats such as NVGRE and VXLAN. While this solves network scalability issues, it hides the TCP packet from the hardware offloading engines, placing higher loads on the host CPU. ConnectX-6 effectively addresses this by providing advanced NVGRE and VXLAN hardware offloading engines that encapsulate and decapsulate the overlay protocol.
RDMA and RDMA over Converged Ethernet (RoCE)
ConnectX-6, utilizing IBTA RDMA (Remote Direct Memory Access) and RoCE (RDMA over Converged Ethernet) technology, delivers low-latency and high-performance over InfiniBand and Ethernet networks. Leveraging datacenter bridging (DCB) capabilities as well as ConnectX-6 advanced congestion control hardware mechanisms, RoCE provides efficient low-latency RDMA services over Layer 2 and Layer 3 networks.
NVIDIA PeerDirectTM
PeerDirectTM communication provides high efficiency RDMA access by eliminating unnecessary internal data copies between components on the PCIe bus (for example, from GPU to CPU), and therefore significantly reduces application run time. ConnectX-6 advanced acceleration technology enables higher cluster efficiency and scalability to tens of thousands of nodes.
CPU Offload
Adapter functionality enables reduced CPU overhead leaving more CPU resources available for computation tasks.
· Open vSwitch (OVS) offload using ASAP2™ (see the sketch after this list)
· Flexible match-action flow tables
· Tunneling encapsulation/decapsulation
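For context, the following is a minimal, generic sketch of enabling kernel OVS hardware offload on a Linux host. It is not a step from this manual; the PCI address 0000:03:00.0 is illustrative and the exact service name varies by distribution:

devlink dev eswitch set pci/0000:03:00.0 mode switchdev    # put the embedded switch into switchdev mode (illustrative PCI address)
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true  # ask OVS to offload datapath flows to hardware
systemctl restart openvswitch-switch                       # service name differs between distributions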
Quality of Service (QoS)
Support for port-based Quality of Service enabling various application requirements for latency and SLA.
Hardware-based I/O Virtualization
ConnectX-6 provides dedicated adapter resources and guaranteed isolation and protection for virtual machines within the server.
Storage Acceleration
A consolidated compute and storage network achieves significant cost-performance advantages over multi-fabric networks. Standard block and file access protocols can leverage:
· RDMA for high-performance storage access
· NVMe over Fabric offloads for target machine
· Erasure Coding
· T10-DIF Signature Handover

SR-IOV
ConnectX-6 SR-IOV technology provides dedicated adapter resources and guaranteed isolation and protection for virtual machines (VM) within the server. A generic example of creating virtual functions is sketched below.
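As a generic illustration of creating SR-IOV virtual functions on Linux (not a procedure from this manual; the interface name enp3s0f0 and the VF count are placeholders, and SR-IOV must already be enabled in the adapter firmware and system BIOS):

cat /sys/class/net/enp3s0f0/device/sriov_totalvfs     # maximum number of VFs the device exposes
echo 4 > /sys/class/net/enp3s0f0/device/sriov_numvfs  # instantiate 4 virtual functions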
High-Performance Accelerations
· Tag Matching and Rendezvous Offloads
· Adaptive Routing on Reliable Transport
· Burst Buffer Offloads for Background Checkpointing
Operating Systems/Distributions
ConnectX-6 Socket Direct cards 2x PCIe x16 (OPNs: MCX654105A-HCAT, MCX654106A-HCAT and MCX654106A-ECAT) are not supported in Windows and WinOF-2.
· OpenFabrics Enterprise Distribution (OFED)
· RHEL/CentOS
· Windows
· FreeBSD
· VMware
· OpenFabrics Windows Distribution (WinOF-2)
Connectivity
· Interoperable with 1/10/25/40/50/100/200 Gb/s InfiniBand and Ethernet switches
· Passive copper cable with ESD protection
· Powered connectors for optical and active cable support
Manageability
ConnectX-6 technology maintains support for manageability through a BMC.
ConnectX-6 PCIe stand-up adapter can be connected to a BMC using MCTP over
SMBus or MCTP over PCIe protocols as if it is a standard NVIDIA PCIe stand-up
adapter. For configuring the adapter for the specific manageability solution
in use by the server, please contact NVIDIA Support.
Interfaces
InfiniBand Interface
The network ports of the ConnectX®-6 adapter cards are compliant with the
InfiniBand Architecture Specification, Release 1.3. InfiniBand traffic is
transmitted through the cards’ QSFP56 connectors.
Ethernet Interfaces
The adapter card includes special circuits to protect from ESD shocks to the
card/server when plugging copper cables.
The network ports of the ConnectX-6 adapter card are compliant with the IEEE
802.3 Ethernet standards listed in Features and Benefits. Ethernet traffic is
transmitted through the QSFP56/QSFP connectors on the adapter card.
PCI Express Interface
ConnectX®-6 adapter cards support PCI Express Gen 3.0/4.0 (1.1 and 2.0
compatible) through x8/x16 edge connectors. The device can be either a master
initiating the PCI Express bus operations or a subordinate responding to PCI
bus operations. The following lists PCIe interface features:
· PCIe Gen 3.0 and 4.0 compliant, 2.0 and 1.1 compatible
· 2.5, 5.0, 8.0, or 16.0 GT/s link rate x16/x32 (see the link check after this list)
· Auto-negotiates to x32, x16, x8, x4, x2, or x1
· Support for MSI/MSI-X mechanisms
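To verify the link that was actually negotiated, a generic Linux check can be used (the PCI address 03:00.0 is illustrative; take it from the lspci output for your card):

sudo lspci -s 03:00.0 -vv | grep -E 'LnkCap|LnkSta'   # compare supported vs. negotiated speed and width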
LED Interface
The adapter card includes special circuits to protect from ESD shocks to the
card/server when plugging copper cables.
There are two I/O LEDs per port:
· LED 1 and 2: Bi-color I/O LED which indicates link status. LED behavior is described below for Ethernet and InfiniBand port configurations.
· LED 3 and 4: Reserved for future use.

LED1 and LED2 Link Status Indications – Ethernet Protocol:

LED Color and State | Description
Off | A link has not been established
1Hz blinking Yellow | Beacon command for locating the adapter card
4Hz blinking Yellow | Indicates an error with the link. The error can be one of the following:
  I2C – I2C access to the networking ports fails (blinks until the error is fixed)
  Over-current – Over-current condition of the networking ports (blinks until the error is fixed)
Solid green | Indicates a valid link with no active traffic
Blinking green | Indicates a valid link with active traffic

LED1 and LED2 Link Status Indications – InfiniBand Protocol:

LED Color and State | Description
Off | A link has not been established
1Hz blinking Yellow | Beacon command for locating the adapter card
4Hz blinking Yellow | Indicates an error with the link. The error can be one of the following:
  I2C – I2C access to the networking ports fails (blinks until the error is fixed)
  Over-current – Over-current condition of the networking ports (blinks until the error is fixed)
Solid amber | Indicates an active link
Solid green | Indicates a valid (data activity) link with no active traffic
Blinking green | Indicates a valid link with active traffic
Heatsink Interface
The heatsink is attached to the ConnectX-6 IC to dissipate the heat from the
ConnectX-6 IC. It is attached either by using four spring-loaded push pins
that insert into four mounting holes or by screws. ConnectX-6 IC has a thermal
shutdown safety mechanism that automatically shuts down the ConnectX-6 card in
cases of high-temperature events, improper thermal coupling or heatsink
removal. For the required airflow (LFM) per OPN, please refer to
Specifications. For MCX653105A-HDAL and MCX653106A-HDAL cards, the heatsink is
compatible with a cold plate for liquid-cooled Intel® Server System D50TNP
platforms only.
SMBus Interface
ConnectX-6 technology maintains support for manageability through a BMC.
ConnectX-6 PCIe stand-up adapter can be connected to a BMC using MCTP over
SMBus protocol as if it is a standard NVIDIA PCIe stand-up adapter. For
configuring the adapter for the specific manageability solution in use by the
server, please contact NVIDIA Support.
Voltage Regulators
The voltage regulator power is derived from the PCI Express edge connector 12V
supply pins. These voltage supply pins feed on-board regulators that provide
the necessary power to the various components on the card.
Hardware Installation
Installation and initialization of ConnectX-6 adapter cards require attention
to the mechanical attributes, power specification, and precautions for
electronic equipment.
Safety Warnings
Safety warnings are provided here in the English language. For safety warnings
in other languages, refer to the Adapter Installation Safety
Instructions document available on nvidia.com. Please observe all safety
warnings to avoid injury and prevent damage to system components. Note that
not all warnings are relevant to all models.
General Installation Instructions
Read all installation instructions before connecting the equipment to the power source.

Jewelry Removal Warning
Before you install or remove equipment that is connected to power lines, remove jewelry such as bracelets, necklaces, rings, watches, and so on. Metal objects heat up when connected to power and ground and can melt down, causing serious burns and/or welding the metal object to the terminals.

Over-temperature
This equipment should not be operated in an area with an ambient temperature exceeding the maximum recommended: 55°C (131°F). An airflow of 200LFM at this maximum ambient temperature is required for HCA cards and NICs. To guarantee proper airflow, allow at least 8cm (3 inches) of clearance around the ventilation openings.

During Lightning – Electrical Hazard
During periods of lightning activity, do not work on the equipment or connect or disconnect cables.
Copper Cable Connecting/Disconnecting
Some copper cables are heavy and not flexible; as such, they should be carefully attached to or detached from the connectors. Refer to the cable manufacturer for special warnings and instructions.

Equipment Installation
This equipment should be installed, replaced, or serviced only by trained and qualified personnel.

Equipment Disposal
The disposal of this equipment should be in accordance with all national laws and regulations.

Local and National Electrical Codes
This equipment should be installed in compliance with local and national electrical codes.

Hazardous Radiation Exposure
· Caution: Use of controls or adjustment or performance of procedures other than those specified herein may result in hazardous radiation exposure. For products with optical ports.
· CLASS 1 LASER PRODUCT and reference to the most recent laser standards: IEC 60825-1:1993 + A1:1997 + A2:2001 and EN 60825-1:1994 + A1:1996 + A2:2001
Installation Procedure Overview
The installation procedure of ConnectX-6 adapter cards involves the following
steps:
Step | Procedure | Direct Link
1 | Check the system's hardware and software requirements | System Requirements
2 | Pay attention to the airflow consideration within the host system | Airflow Requirements
3 | Follow the safety precautions | Safety Precautions
4 | Unpack the package | Unpack the package
5 | Follow the pre-installation checklist | Pre-Installation Checklist
6 | (Optional) Replace the full-height mounting bracket with the supplied short bracket | Bracket Replacement Instructions
7 | Install the ConnectX-6 PCIe x8/x16 adapter card in the system | ConnectX-6 PCIe x8/x16 Adapter Cards Installation Instructions
7 | Install the ConnectX-6 2x PCIe x16 Socket Direct adapter card in the system | Socket Direct (2x PCIe x16) Cards Installation Instructions
7 | Install the ConnectX-6 card for Intel liquid-cooled platforms | Cards for Intel Liquid-Cooled Platforms Installation Instructions
8 | Connect cables or modules to the card | Cables and Modules
9 | Identify ConnectX-6 in the system | Identifying Your Card
System Requirements
Hardware Requirements
Unless otherwise specified, NVIDIA products are designed to work in an
environmentally controlled data center with low levels of gaseous and dust
(particulate) contamination. The operating environment should meet severity
level G1 as per ISA 71.04 for gaseous contamination and ISO 14644-1 class 8
for cleanliness level.
For proper operation and performance, please make sure to use a PCIe slot with a corresponding bus width that can supply sufficient power to your card. Refer to the Specifications section of the manual for the power requirements.
Please make sure to install the ConnectX-6 cards in a PCIe slot that is
capable of supplying the required power as stated in Specifications.
PCIe x8/x16
ConnectX-6 Configuration
Cards for liquid-cooled Intel® Server System D50TNP platforms
Socket Direct 2x PCIe x8 in a row (single slot)
Socket Direct 2x PCIe x16 (dual slots)
Hardware Requirements
A system with a PCI Express x8/x16 slot is required for installing the card.
Intel® Server System D50TNP Platform with an available PCI Express x16 slot is
required for installing the card. A system with a custom PCI Express x16 slot
(four special pins) is required for installing the card. Please refer to PCI
Express Pinouts Description for Single-Slot Socket Direct Card for pinout
definitions.
A system with two PCIe x16 slots is required for installing the cards.
Airflow Requirements
ConnectX-6 adapter cards are offered with two airflow patterns: from the
heatsink to the network ports, and vice versa, as shown below. Please refer to
the Specifications section for airflow numbers for each specific card model.
Airflow from the heatsink to the network ports
Airflow from the network ports to the heatsink
All cards in the system should be planned with the same airflow direction.
Software Requirements
· See Operating Systems/Distributions section under the Introduction section.
· Software Stacks – NVIDIA OpenFabric software package MLNX_OFED for Linux,
WinOF-2 for Windows, and VMware. See the Driver Installation section.
Safety Precautions
The adapter is being installed in a system that operates with voltages that
can be lethal. Before opening the case of the system, observe the following
precautions to avoid injury and prevent damage to system components.
· Remove any metallic objects from your hands and wrists.
· Make sure to use only insulated tools.
· Verify that the system is powered off and is unplugged.
· It is strongly recommended to use an ESD strap or other antistatic devices.
Pre-Installation Checklist
· Unpack the ConnectX-6 card: Unpack and remove the ConnectX-6 card. Check against the package contents list that all the parts have been sent. Check the parts for visible damage that may have occurred during shipping. Please note that the cards must be placed on an antistatic surface. For package contents please refer to Package Contents.
Please note that if the card is removed hastily from the antistatic bag, the plastic ziplock may harm the EMI fingers on the networking connector. Carefully remove the card from the antistatic bag to avoid damaging the EMI fingers.
· Shut down your system if active: Turn off the power to the system, and disconnect the power cord. Refer to the system documentation for instructions. Before you install the ConnectX-6 card, make sure that the system is disconnected from power.
· (Optional) Check the mounting bracket on the ConnectX-6 or PCIe Auxiliary Connection Card: If required for your system, replace the full-height mounting bracket that is shipped mounted on the card with the supplied low-profile bracket. Refer to Bracket Replacement Instructions.
Bracket Replacement Instructions
The ConnectX-6 card and PCIe Auxiliary Connection card are usually shipped
with an assembled high-profile bracket. If this form factor is suitable for
your requirements, you can skip the remainder of this section and move to
Installation Instructions. If you need to replace the high-profile bracket
with the short bracket that is included in the shipping box, please follow the
instructions in this section.
Due to risk of damaging the EMI gasket, it is not recommended to replace the
bracket more than three times.
To replace the bracket you will need the following parts:
· The new brackets of the proper height
· The 2 screws saved from the removal of the bracket
Removing the Existing Bracket
1. Using a torque driver, remove the two screws holding the bracket in place.
2. Separate the bracket from the ConnectX-6 card.
Be careful not to put stress on the LEDs on the adapter card.
3. Save the two screws. Installing the New Bracket
1. Place the bracket onto the card until the screw holes line up.
Do not force the bracket onto the adapter card.
2. Screw on the bracket using the screws saved from the bracket removal
procedure above.
Use a torque driver to apply up to 2 lbs-in torque on the screws.
Installation Instructions
This section provides detailed instructions on how to install your adapter
card in a system. Choose the installation instructions according to the
ConnectX-6 configuration you have purchased.
OPNs | Installation Instructions
MCX651105A-EDAT, MCX653105A-HDAT, MCX653106A-HDAT, MCX653105A-ECAT, MCX653106A-ECAT, MCX653105A-EFAT, MCX653106A-EFAT, MCX683105AN-HDAT | PCIe x8/16 Cards Installation Instructions
MCX654105A-HCAT, MCX654106A-HCAT, MCX654106A-ECAT | Socket Direct (2x PCIe x16) Cards Installation Instructions
MCX653105A-HDAL, MCX653106A-HDAL | Cards for Intel Liquid-Cooled Platforms Installation Instructions
Cables and Modules
Cable Installation
1. All cables can be inserted or removed with the unit powered on.
2. To insert a cable, press the connector into the port receptacle until the connector is firmly seated.
   a. Support the weight of the cable before connecting the cable to the adapter card. Do this by using a cable holder or tying the cable to the rack.
   b. Determine the correct orientation of the connector to the card before inserting the connector. Do not try to insert the connector upside down. This may damage the adapter card.
   c. Insert the connector into the adapter card. Be careful to insert the connector straight into the cage. Do not apply any torque, up or down, to the connector cage in the adapter card.
   d. Make sure that the connector locks in place.
   When installing cables make sure that the latches engage.
   Always install and remove cables by pushing or pulling the cable and connector in a straight line with the card.
3. After inserting a cable into a port, the Green LED indicator will light when the physical connection is established (that is, when the unit is powered on and a cable is plugged into the port with the other end of the connector plugged into a functioning port). See LED Interface under the Interfaces section.
4. After plugging in a cable, lock the connector using the latching mechanism particular to the cable vendor. When data is being transferred the Green LED will blink. See LED Interface under the Interfaces section.
5. Care should be taken so as not to impede the air exhaust flow through the ventilation holes. Use cable lengths which allow for routing horizontally around to the side of the chassis before bending upward or downward in the rack.
6. To remove a cable, disengage the locks and slowly pull the connector away from the port receptacle. The LED indicator will turn off when the cable is unseated.
Identifying the Card in Your System
On Linux
Get the device location on the PCI bus by running lspci and locating lines with the string "Mellanox Technologies":
ConnectX-6 Card Configuration | lspci Command Output Example

Single-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
e3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
05:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
05:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
In the output example above, the first two rows indicate that one card is installed in a PCI slot with PCI Bus address 05 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1. The other card is installed in a PCI slot with PCI Bus address 82 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1.
Since the two PCIe cards are installed in two PCIe slots, each card gets a unique PCI Bus and Device number. Each of the PCIe x16 busses sees two network ports; in effect, the two physical ports of the ConnectX-6 Socket Direct adapter are viewed as four net devices by the system.

Single-port PCIe x8/x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port PCIe x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
86:00.0 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
86:00.1 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
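Once the driver is loaded, a quick way to see how these PCI functions map to InfiniBand devices and network interfaces (generic commands; ibdev2netdev is a utility shipped with MLNX_OFED):

ls /sys/class/infiniband    # lists the InfiniBand devices, e.g. mlx5_0, mlx5_1
ibdev2netdev                # prints the InfiniBand-device-to-netdev mapping and port state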
On Windows
1. Open Device Manager on the server. Click Start => Run, and then enter devmgmt.msc.
2. Expand System Devices and locate your NVIDIA ConnectX-6 adapter card.
3. Right-click your adapter's row and select Properties to display the adapter card properties window.
4. Click the Details tab and select Hardware Ids (Windows 2012/R2/2016) from the Property pull-down menu.

PCI Device (Example)

5. In the Value display box, check the fields VEN and DEV (fields are separated by '&'). In the display example above, notice the sub-string "PCI\VEN_15B3&DEV_1003": VEN is equal to 0x15B3, which is the Vendor ID of NVIDIA, and DEV is equal to 1018 (for ConnectX-6), which is a valid NVIDIA PCI Device ID.

If the PCI device does not have an NVIDIA adapter ID, return to Step 2 to check another device.
The list of NVIDIA PCI Device IDs can be found in the PCI ID repository at http://pci-ids.ucw.cz/read/PC/15b3.
PCIe x8/16 Cards Installation Instructions
Installing the Card
Applies to OPNs MCX651105A-EDAT, MCX654105A-HCAT, MCX654106A-HCAT,
MCX683105AN-HDAT, MCX653106A-ECAT and MCX653105A-ECAT.
Please make sure to install the ConnectX-6 cards in a PCIe slot that is
capable of supplying the required power and airflow as stated in
Specifications.
Connect the adapter card in an available PCI Express x16 slot in the chassis. Step 1: Locate an available PCI Express x16 slot and insert the adapter card into the chassis.
Step 2: Applying even pressure at both corners of the card, insert the adapter card in a PCI Express slot until firmly seated.
Do not use excessive force when seating the card, as this may damage the
chassis.
Secure the adapter card to the chassis. Step 1: Secure the bracket to the
chassis with the bracket screw.
Uninstalling the Card
Safety Precautions
The adapter is installed in a system that operates with voltages that can be
lethal. Before uninstalling the adapter card, please observe the following
precautions to avoid injury and prevent damage to system components.
1. Remove any metallic objects from your hands and wrists.
2. It is strongly recommended to use an ESD strap or other antistatic devices.
3. Turn off the system and disconnect the power cord from the server.
Card Removal
Please note that the following images are for illustration purposes only.
1. Verify that the system is powered off and unplugged.
2. Wait 30 seconds.
3. To remove the card, disengage the retention mechanisms on the bracket (clips or screws).
4. Holding the adapter card from its center, gently pull the ConnectX-6 and Auxiliary Connections cards out of the PCI Express slot.
Socket Direct (2x PCIe x16) Cards Installation Instructions
The hardware installation section uses the terminology of white and black
harnesses to differentiate between the two supplied cables. Due to supply
chain variations, some cards may be supplied with two black harnesses instead.
To clarify the difference between these two harnesses, one black harness was
marked with a “WHITE” label and the other with a “BLACK” label. The Cabline
harness marked with “WHITE” label should be connected to the connector on the
ConnectX-6 and PCIe card engraved with “White Cable” while the one marked with
“BLACK” label should be connected to the connector on the ConnectX-6 and PCIe
card engraved with “Black Cable”.
The harnesses' minimal bending radius is 10 mm.
Installing the Card
Applies to MCX654105A-HCAT, MCX654106A-HCAT and MCX654106A-ECAT. The
installation instructions include steps that involve a retention clip to be
used while connecting the Cabline harnesses to the cards. Please note
that this is an optional accessory.
Please make sure to install the ConnectX-6 cards in a PCIe slot that is
capable of supplying the required power and airflow as stated in
Specifications. Connect the adapter card with the Auxiliary connection card
using the supplied Cabline CA-II Plus harnesses. Step 1: Slide the black and
white Cabline CA-II Plus harnesses through the retention clip while making
sure the clip opening is facing the plugs.
Step 2: Plug the Cabline CA-II Plus harnesses on the ConnectX-6 adapter card
while paying attention to the color-coding. As indicated on both sides of the
card; plug the black harness to the component side and the white harness to
the print side.
Verify the plugs are locked.
Step 3: Slide the retention clip latches through the cutouts on the PCB. The latches should face the annotation on the PCB.
Step 4: Clamp the retention clip. Verify both latches are firmly locked.
Step 5: Slide the Cabline CA-II Plus harnesses through the retention clip. Make sure that the clip opening is facing the plugs.
Step 6: Plug the Cabline CA-II Plus harnesses on the PCIe Auxiliary Card. As
indicated on both sides of the Auxiliary connection card; plug the black
harness to the component side and the white harness to the print side.
Step 7: Verify the plugs are locked.
Step 8: Slide the retention clip through the cutouts on the PCB. Make sure
latches are facing “Black Cable” annotation as seen in the below picture.
Step 9: Clamp the retention clip. Verify both latches are firmly locked.
Connect the ConnectX-6 adapter and PCIe Auxiliary Connection cards in
available PCI Express x16 slots in the chassis. Step 1: Locate two available
PCI Express x16 slots. Step 2: Applying even pressure at both corners of the
cards, insert the adapter card in the PCI Express slots until firmly seated.
Do not use excessive force when seating the cards, as this may damage the system or
the cards.
Step 3: Applying even pressure at both corners of the cards, insert the Auxiliary Connection card in the PCI Express slots until firmly seated.
Secure the ConnectX-6 adapter and PCIe Auxiliary Connection Cards to the
chassis. Step 1: Secure the brackets to the chassis with the bracket screw.
Uninstalling the Card
Safety Precautions The adapter is installed in a system that operates with
voltages that can be lethal. Before uninstalling the adapter card, please
observe the following precautions to avoid injury and prevent damage to system
components.
1. Remove any metallic objects from your hands and wrists.
2. It is strongly recommended to use an ESD strap or other antistatic devices.
3. Turn off the system and disconnect the power cord from the server.
Card Removal
Please note that the following images are for illustration purposes only.
1. Verify that the system is powered off and unplugged.
2. Wait 30 seconds.
3. To remove the card, disengage the retention mechanisms on the bracket (clips or screws).
4. Holding the adapter card from its center, gently pull the ConnectX-6 and Auxiliary Connections cards out of the PCI Express slot.
Cards for Intel Liquid-Cooled Platforms Installation Instructions
The instructions below apply to ConnectX-6 cards designed for Intel liquid-cooled platforms with an ASIC interposer cooling mechanism. OPNs: MCX653105A-HDAL and MCX653106A-HDAL.
The below figures are for illustration purposes only. The below instructions
should be used in conjunction with the server’s documentation.
Installing the Card
Please make sure the system is capable of supplying the required power as
stated in Specifications.
Pay extra attention to the black bumpers located on the print side of the
card. Failure to do so may harm the bumpers.
Apply the supplied thermal pad (one of the two) on top of the ASIC interposer or onto the coldplate.
The thermal pads are shipped with two protective liners covering the pad on both sides. It is highly important to peel the liners as instructed below prior to applying them to the card.
1. Gently peel the liner from the thermal pad's tacky side.
2. Carefully apply the thermal pad on the cool block (ASIC interposer) while ensuring it thoroughly covers it. Extra care should be taken not to damage the pad. The thermal pad should be applied on the cool block from its tacky (wet) side. The pad should be applied with its non-tacky side facing up.
   OR
   Carefully apply the thermal pad on the coldplate while ensuring it thoroughly covers it. The figure below indicates the position of the thermal pad. Extra care should be taken not to damage the pad. The thermal pad should be applied on the coldplate from its tacky (wet) side. The pad should be applied with its non-tacky side facing up.
3. Ensure the thermal pad is in place and intact.
4. Once the thermal pad is applied to the ASIC interposer, the non-tacky side should be visible on the card's faceplate.
5. Gently peel the liner of the pad's non-tacky side visible on the card's faceplate. Failure to do so may degrade the thermal performance of the product.
Install the adapter into the riser and attach the card to the PCIe x16 slot.
1. Disengage the adapter riser from the blade. Please refer to the blade documentation for instructions.
2. Applying even pressure at both corners of the card, insert the adapter card into the adapter riser until firmly seated. Care must be taken to not harm the black bumpers located on the print side of the card.
Vertically insert the riser that populates the adapter card into the server blade.
1. Applying even pressure on the riser, gently insert the riser into the
server. 2. Secure the riser with the supplied screws. Please refer to the
server blade documentation for more information.
Driver Installation
Please use the relevant driver installation section.
ConnectX-6 Socket Direct cards 2x PCIe x16 (OPNs: MCX654106A-HCAT and
MCX654106A-ECAT) are not supported in Windows and WinOF-2.
· Linux Driver Installation · Windows Driver Installation · VMware Driver Installation
Linux Driver Installation
This section describes how to install and test the MLNX_OFED for Linux package
on a single server with a NVIDIA ConnectX-6 adapter card installed.
Prerequisites
Requirements | Description
Platforms | A server platform with a ConnectX-6 InfiniBand/Ethernet adapter card installed.
Required Disk Space for Installation | 1GB
Operating System | Linux operating system. For the list of supported operating system distributions and kernels, please refer to the MLNX_OFED Release Notes.
Installer Privileges | The installation requires administrator (root) privileges on the target machine.
Downloading NVIDIA OFED
1. Verify that the system has an NVIDIA network adapter installed by running the lspci command. The table below provides output examples per ConnectX-6 card configuration.

ConnectX-6 Card Configuration | lspci Command Output Example

Single-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
e3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
05:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
05:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
In the output example above, the first two rows indicate that one card is installed in a PCI slot with PCI Bus address 05 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1. The other card is installed in a PCI slot with PCI Bus address 82 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1. Since the two PCIe cards are installed in two PCIe slots, each card gets a unique PCI Bus and Device number. Each of the PCIe x16 busses sees two network ports; in effect, the two physical ports of the ConnectX-6 Socket Direct adapter are viewed as four net devices by the system.

Single-port PCIe x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
Dual-port PCIe x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
86:00.0 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
86:00.1 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
2. Download the ISO image to your host. The image's name has the format MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso. The image is available at nvidia.com/en-us/networking => Products => Software => InfiniBand Drivers => NVIDIA MLNX_OFED.
   i. Scroll down to the Download wizard, and click the Download tab.
   ii. Choose your relevant package depending on your host operating system.
   iii. Click the desired ISO/tgz package.
   iv. To obtain the download link, accept the End User License Agreement (EULA).
3. Use the Hash utility to confirm the file integrity of your ISO image. Run the following command and compare the result to the value provided on the download page.
   SHA256 MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso
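On most Linux distributions the hash can be computed with the coreutils sha256sum tool (shown here with the generic image name used above):

sha256sum MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso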
Installing MLNX_OFED
Installation Script
The installation script, mlnxofedinstall, performs the following:
· Discovers the currently installed kernel
· Uninstalls any software stacks that are part of the standard operating system distribution or another vendor's commercial stack
· Installs the MLNX_OFED_LINUX binary RPMs (if they are available for the current kernel)
· Identifies the currently installed InfiniBand and Ethernet network adapters and automatically upgrades the firmware
Note: To perform a firmware upgrade using customized firmware binaries, a path can be provided to the folder that contains the firmware binary files, by running --fw-image-dir. Using this option, the firmware version embedded in the MLNX_OFED package will be ignored.
Example:

./mlnxofedinstall --fw-image-dir /tmp/my_fw_bin_files
If the driver detects unsupported cards on the system, it will abort the installation procedure. To avoid this, make sure to add the --skip-unsupported-devices-check flag during installation.
Usage
/mnt/mlnxofedinstall [OPTIONS]
The installation script removes all previously installed OFED packages and re-installs from scratch. You will be prompted to acknowledge the deletion of the old packages.
Pre-existing configuration files will be saved with the extension ".conf.rpmsave".
· If you need to install OFED on an entire (homogeneous) cluster, a common strategy is to mount the ISO image on one of the cluster nodes and then copy it to a shared file system such as NFS. To install on all the cluster nodes, use cluster-aware tools (such as pdsh).
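A minimal sketch of that strategy is shown below; the NFS path and node list are placeholders rather than values from this manual:
# Copy the mounted package to a shared location reachable by all nodes (placeholder path):
mkdir -p /nfs/shared/MLNX_OFED_LINUX
cp -a /mnt/. /nfs/shared/MLNX_OFED_LINUX/
# Run the installer on all cluster nodes in parallel (placeholder node list):
pdsh -w node[001-016] /nfs/shared/MLNX_OFED_LINUX/mlnxofedinstall --force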
· If your kernel version does not match any of the offered pre-built RPMs, you can add your kernel version by using the "mlnx_add_kernel_support.sh" script located inside the MLNX_OFED package.
On RedHat and SLES distributions with an errata kernel installed, there is no need to use the mlnx_add_kernel_support.sh script. The regular installation can be performed, and the weak-updates mechanism will create symbolic links to the MLNX_OFED kernel modules.
If you regenerate kernel modules for a custom kernel (using --add-kernel-support), the package installation will not involve automatic regeneration of the initramfs. In some cases, such as a system with a root filesystem mounted over a ConnectX card, not regenerating the initramfs may even cause the system to fail to reboot. In such cases, the installer will recommend running the following command to update the initramfs:
dracut -f
On some OSs, dracut -f might result in the following error message, which can be safely ignored:
libkmod: kmod_module_new_from_path: kmod_module 'mdev' already exists with different path
The "mlnx_add_kernel_support.sh" script can be executed directly from the mlnxofedinstall script. For further information, please see the '--add-kernel-support' option below.
On Ubuntu and Debian distributions, driver installation uses the Dynamic Kernel Module Support (DKMS) framework. Thus, the drivers' compilation will take place on the host during MLNX_OFED installation. Therefore, using "mlnx_add_kernel_support.sh" is irrelevant on Ubuntu and Debian distributions.
Example: The following command will create a MLNX_OFED_LINUX ISO image for RedHat 7.3 under the /tmp directory.
./MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64/mlnx_add_kernel_support.sh -m /tmp/MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64/ --make-tgz
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.3 under /tmp directory. All Mellanox, OEM, OFED, or Distribution IB packages will be removed. Do you want to continue?[y/N]:y
See log file /tmp/mlnx_ofed_iso.21642.log
Building OFED RPMs. Please wait...
Removing OFED RPMs...
Created /tmp/MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64-ext.tgz
· The script adds the following lines to /etc/security/limits.conf for the userspace components such as MPI:
· soft memlock unlimited
· hard memlock unlimited
· These settings set the amount of memory that can be pinned by a userspace
application to unlimited. If desired, tune the value unlimited to a specific
amount of RAM.
For your machine to be part of the InfiniBand/VPI fabric, a Subnet Manager must be running on one of the fabric nodes. At this point, OFED for Linux has already installed the OpenSM Subnet Manager on your machine. For the list of installation options, run:
./mlnxofedinstall --h
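If no other Subnet Manager is active on the fabric, the OpenSM service that OFED installs can be started locally; a minimal sketch, assuming a systemd-based distribution (the service name may differ on your system):
systemctl start opensmd
# Confirm that a Subnet Manager is now visible on the fabric:
sminfo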
Installation Procedure
This section describes the installation procedure of MLNX_OFED on NVIDIA adapter cards.
a. Log in to the installation machine as root.
b. Mount the ISO image on your machine.
host1# mount -o ro,loop MLNX_OFED_LINUX---.iso /mnt
c. Run the installation script.
/mnt/mlnxofedinstall
Logs dir: /tmp/MLNX_OFED_LINUX-x.x-x.logs
This program will install the MLNX_OFED_LINUX package on your machine. Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed. Those packages are removed due to conflicts with MLNX_OFED_LINUX; do not reinstall them.
Starting MLNX_OFED_LINUX-x.x.x installation ...
........
........
Installation finished successfully.
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...
For unattended installation, use the --force installation option while running the MLNX_OFED installation script:
/mnt/mlnxofedinstall --force
MLNX_OFED for Ubuntu should be installed with the following flags in a chroot environment:
./mlnxofedinstall --without-dkms --add-kernel-support --kernel <kernel version in chroot> --without-fw-update --force
For example:
./mlnxofedinstall --without-dkms --add-kernel-support --kernel 3.13.0-85-generic --without-fw-update --force
Note that the path to the kernel sources (--kernel-sources) should be added if the sources are not in their default location.
In case your machine has the latest firmware, no firmware update will occur and the installation script will print a message similar to the following at the end of the installation:
Device #1:
----------
Device Type: ConnectX-X
Part Number: MCXXXX-XXX
PSID: MT
In case your machine has an unsupported network adapter device, no firmware update will occur and one of the error messages below will be printed. Please contact your hardware vendor for help with firmware updates.
Error message #1:
Device #1:
----------
Device Type: ConnectX-X
Part Number: MCXXXX-XXX
PSID: MT_
Base MAC: 0000e41d2d5cf810
Versions: Current Available
FW XX.XX.XXXX
Status: No matching image found
Error message #2:
The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033)
To obtain firmware for this device, please contact your HW vendor.
d. Case A: If the installation script has performed a firmware update on your network adapter, you need to either restart the driver or reboot your system before the firmware update can take effect. Refer to the table below to find the appropriate action for your specific card.
Adapter | Required Action
· Standard ConnectX-4/ConnectX-4 Lx or higher: Driver Restart
· Adapters with Multi-Host Support: Standard Reboot (Soft Reset)
· Socket Direct Cards: Cold Reboot (Hard Reset)
Case B: If the installation script has not performed a firmware upgrade on your network adapter, restart the driver by running: "/etc/init.d/openibd restart".
e. (InfiniBand only) Run the hca_self_test.ofed utility to verify whether or not the InfiniBand link is up. The utility also checks for and displays additional information such as:
· HCA firmware version
· Kernel architecture
· Driver version
· Number of active HCA ports along with their states
· Node GUID
For more details on hca_self_test.ofed, see the file docs/readme_and_user_manual/hca_self_test.readme.
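Independently of the self-test script, the port state can be cross-checked with the standard InfiniBand tools that MLNX_OFED installs; a small illustration (device and port names differ per system):
ibstat
ibv_devinfo | grep -E 'hca_id|state'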
After installation completion, information about the OFED installation, such as the prefix, kernel version, and installation parameters, can be retrieved by running the command /etc/infiniband/info. Most of the OFED components can be configured or reconfigured after the installation by modifying the relevant configuration files. See the relevant chapters in this manual for details. The list of the modules that will be loaded automatically upon boot can be found in the /etc/infiniband/openib.conf file.
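For example, the following two commands print the recorded installation details and the installed OFED version string, respectively; both utilities are part of the installation described above:
/etc/infiniband/info
ofed_info -s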
Installing OFED will replace the RDMA stack and remove existing 3rd party RDMA
connectors.
Installation Results
Software
· Most of the MLNX_OFED packages are installed under the "/usr" directory, except for the following packages, which are installed under the "/opt" directory:
· fca and ibutils
· iproute2 (rdma tool), installed under /opt/Mellanox/iproute2/sbin/rdma
· The kernel modules are installed under:
· /lib/modules/`uname -r`/updates on SLES and Fedora distributions
· /lib/modules/`uname -r`/extra/mlnx-ofa_kernel on RHEL and other RedHat-like distributions
· /lib/modules/`uname -r`/updates/dkms/ on Ubuntu
Firmware
· The firmware of existing network adapter devices will be updated if the following two conditions are fulfilled:
· The installation script is run in default mode; that is, without the option '--without-fw-update'
· The firmware version of the adapter device is older than the firmware version included with the OFED ISO image
Note: If an adapter's Flash was originally programmed with an Expansion ROM image, the automatic firmware update will also burn an Expansion ROM image.
· In case your machine has an unsupported network adapter device, no firmware update will occur and the error message below will be printed. "The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor."
Installation Logging
While installing MLNX_OFED, the install log for each selected package will be
saved in a separate log file. The path to the directory containing the log
files will be displayed after running the installation script in the following
format: Example:
Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0.IBMM2150110033.logs
Driver Load Upon System Boot
Upon system boot, the NVIDIA drivers will be loaded automatically. To prevent the automatic load of the NVIDIA drivers upon system boot:
a. Add the following lines to the "/etc/modprobe.d/mlnx.conf" file.
blacklist mlx5_core
blacklist mlx5_ib
b. Set "ONBOOT=no" in the "/etc/infiniband/openib.conf" file.
c. If the modules exist in the initramfs file, they can automatically be loaded by the kernel. To prevent this behavior, update the initramfs using the operating system's standard tools.
Note: The process of updating the initramfs will add the blacklists from step a, and will prevent the kernel from loading the modules automatically.
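The exact tool depends on the distribution; as a sketch, the standard commands are typically:
dracut -f                # RHEL, SLES, Fedora
update-initramfs -u      # Ubuntu, Debian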
mlnxofedinstall Return Codes
The table below lists the mlnxofedinstall script return codes and their meanings.
Return Code | Meaning
0 | The installation ended successfully
1 | The installation failed
2 | No firmware was found for the adapter device
22 | Invalid parameter
28 | Not enough free space
171 | Not applicable to this system configuration. This can occur when the required hardware is not present on the system.
172 | Prerequisites are not met. For example, a required software package is not installed or the hardware is not configured correctly.
173 | Failed to start the mst driver
Uninstalling MLNX_OFED
Use the script /usr/sbin/ofed_uninstall.sh to uninstall the MLNX_OFED package.
The script is part of the ofed-scripts RPM.
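A minimal invocation, run as root:
/usr/sbin/ofed_uninstall.sh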
Additional Installation Procedures
Installing MLNX_OFED Using YUM
This type of installation is applicable to RedHat/OL and Fedora operating
systems.
Setting up MLNX_OFED YUM Repository
a. Log into the installation machine as root. b. Mount the ISO image on your
machine and copy its content to a shared location in your network.
mount -o ro,loop MLNX_OFED_LINUX---.iso /mnt
c. Download and install NVIDIA's GPG-KEY: The key can be downloaded via the following link: http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
wget http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
--2018-01-25 13:52:30-- http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
Resolving www.mellanox.com... 72.3.194.0
Connecting to www.mellanox.com|72.3.194.0|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1354 (1.3K) [text/plain]
Saving to: 'RPM-GPG-KEY-Mellanox'
100%[=================================================>] 1,354 --.-K/s in 0s
2018-01-25 13:52:30 (247 MB/s) - 'RPM-GPG-KEY-Mellanox' saved [1354/1354]
d. Install the key.
sudo rpm --import RPM-GPG-KEY-Mellanox
warning: rpmts_HdrFromFdno: Header V3 DSA/SHA1 Signature, key ID 6224c050: NOKEY
Retrieving key from file:///repos/MLNX_OFED/
Userid: "Mellanox Technologies (Mellanox Technologies - Signing Key v2) support@mellanox.com"
From : /repos/MLNX_OFED/
e. Check that the key was successfully imported.
rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n' | grep Mellanox
gpg-pubkey-a9e4b643-520791ba gpg(Mellanox Technologies support@mellanox.com)
f. Create a yum repository configuration file called "/etc/yum.repos.d/mlnx_ofed.repo" with the following content:
[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file:///<path to extracted MLNX_OFED package>/RPMS
enabled=1
gpgkey=file:///<path to the downloaded key RPM-GPG-KEY-Mellanox>
gpgcheck=1
g. Check that the repository was successfully added.
yum repolist
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
repo id          repo name                                  status
mlnx_ofed        MLNX_OFED Repository                       108
rpmforge         RHEL 6Server - RPMforge.net - dag          4,597
repolist: 8,351
Setting up MLNX_OFED YUM Repository Using --add-kernel-support
a. Log into the installation machine as root. b. Mount the ISO image on your
machine and copy its content to a shared location in your network.
mount -o ro,loop MLNX_OFED_LINUX---.iso /mnt
c. Build the packages with kernel support and create the tarball.
/mnt/mlnx_add_kernel_support.sh --make-tgz <optional --kmp> -k $(uname -r) -m /mnt/
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.6 under /tmp directory. Do you want to continue?[y/N]:y
See log file /tmp/mlnx_iso.4120_logs/mlnx_ofed_iso.4120.log
Checking if all needed packages are installed...
Building MLNX_OFED_LINUX RPMS . Please wait...
Creating metadata-rpms for 3.10.0-957.21.3.el7.x86_64 ...
WARNING: If you are going to configure this package as a repository, then please note
WARNING: that it contains unsigned rpms, therefore, you need to disable the gpgcheck
WARNING: by setting 'gpgcheck=0' in the repository conf file.
Created /tmp/MLNX_OFED_LINUX-5.2-0.5.5.0-rhel7.6-x86_64-ext.tgz
d. Open the tarball.
cd /tmp/
tar -xvf /tmp/MLNX_OFED_LINUX-5.2-0.5.5.0-rhel7.6-x86_64-ext.tgz
e. Create a YUM repository configuration file called "/etc/yum.repos.d/mlnx_ofed.repo" with the following content:
[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file:///<path to extracted MLNX_OFED package>/RPMS
enabled=1
gpgcheck=0
f. Check that the repository was successfully added.
yum repolist
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
repo id          repo name                                  status
mlnx_ofed        MLNX_OFED Repository                       108
rpmforge         RHEL 6Server - RPMforge.net - dag          4,597
repolist: 8,351
Installing MLNX_OFED Using the YUM Tool
After setting up the YUM repository for the MLNX_OFED package, perform the following:
a. View the available package groups by invoking:
yum search mlnx-ofed
mlnx-ofed-all.noarch : MLNX_OFED all installer package (with KMP support)
mlnx-ofed-all-user-only.noarch : MLNX_OFED all-user-only installer package (User Space packages only)
mlnx-ofed-basic.noarch : MLNX_OFED basic installer package (with KMP support)
mlnx-ofed-basic-user-only.noarch : MLNX_OFED basic-user-only installer package (User Space packages only)
mlnx-ofed-bluefield.noarch : MLNX_OFED bluefield installer package (with KMP support)
mlnx-ofed-bluefield-user-only.noarch : MLNX_OFED bluefield-user-only installer package (User Space packages only)
mlnx-ofed-dpdk.noarch : MLNX_OFED dpdk installer package (with KMP support)
mlnx-ofed-dpdk-upstream-libs.noarch : MLNX_OFED dpdk-upstream-libs installer package (with KMP support)
mlnx-ofed-dpdk-upstream-libs-user-only.noarch : MLNX_OFED dpdk-upstream-libs-user-only installer package (User Space packages only)
mlnx-ofed-dpdk-user-only.noarch : MLNX_OFED dpdk-user-only installer package (User Space packages only)
mlnx-ofed-eth-only-user-only.noarch : MLNX_OFED eth-only-user-only installer package (User Space packages only)
mlnx-ofed-guest.noarch : MLNX_OFED guest installer package (with KMP support)
mlnx-ofed-guest-user-only.noarch : MLNX_OFED guest-user-only installer package (User Space packages only)
mlnx-ofed-hpc.noarch : MLNX_OFED hpc installer package (with KMP support)
mlnx-ofed-hpc-user-only.noarch : MLNX_OFED hpc-user-only installer package (User Space packages only)
mlnx-ofed-hypervisor.noarch : MLNX_OFED hypervisor installer package (with KMP support)
mlnx-ofed-hypervisor-user-only.noarch : MLNX_OFED hypervisor-user-only installer package (User Space packages only)
mlnx-ofed-kernel-only.noarch : MLNX_OFED kernel-only installer package (with KMP support)
mlnx-ofed-vma.noarch : MLNX_OFED vma installer package (with KMP support)
mlnx-ofed-vma-eth.noarch : MLNX_OFED vma-eth installer package (with KMP support)
mlnx-ofed-vma-eth-user-only.noarch : MLNX_OFED vma-eth-user-only installer package (User Space packages only)
mlnx-ofed-vma-user-only.noarch : MLNX_OFED vma-user-only installer package (User Space packages only)
mlnx-ofed-vma-vpi.noarch : MLNX_OFED vma-vpi installer package (with KMP support)
mlnx-ofed-vma-vpi-user-only.noarch : MLNX_OFED vma-vpi-user-only installer package (User Space packages only)
where:
mlnx-ofed-all: Installs all available packages in MLNX_OFED
mlnx-ofed-basic: Installs basic packages required for running NVIDIA cards
mlnx-ofed-guest: Installs packages required by guest OS
mlnx-ofed-hpc: Installs packages required for HPC
mlnx-ofed-hypervisor: Installs packages required by hypervisor OS
mlnx-ofed-vma: Installs packages required by VMA
mlnx-ofed-vma-eth: Installs packages required by VMA to work over Ethernet
mlnx-ofed-vma-vpi: Installs packages required by VMA to support VPI
bluefield: Installs packages required for BlueField
dpdk: Installs packages required for DPDK
dpdk-upstream-libs: Installs packages required for DPDK using RDMA-Core
kernel-only: Installs packages required for a non-default kernel
Note: MLNX_OFED provides kernel module RPM packages with KMP support for RHEL and SLES. For other operating systems, kernel module RPM packages are provided only for the operating system's default kernel. In this case, the group RPM packages have the supported kernel version in their package's name. Example:
mlnx-ofed-all-3.17.4-301.fc21.x86_64.noarch : MLNX_OFED all installer package for kernel 3.17.4-301.fc21.x86_64 (without KMP support)
mlnx-ofed-basic-3.17.4-301.fc21.x86_64.noarch : MLNX_OFED basic installer package for kernel 3.17.4-301.fc21.x86_64 (without KMP support)
where:
mlnx-ofed-all: MLNX_OFED all installer package
mlnx-ofed-basic: MLNX_OFED basic installer package
mlnx-ofed-vma: MLNX_OFED vma installer package
mlnx-ofed-hpc: MLNX_OFED HPC installer package
mlnx-ofed-vma-eth: MLNX_OFED vma-eth installer package
mlnx-ofed-vma-vpi: MLNX_OFED vma-vpi installer package
knem-dkms: MLNX_OFED DKMS support for mlnx-ofed kernel modules
kernel-dkms: MLNX_OFED kernel-dkms installer package
kernel-only: MLNX_OFED kernel-only installer package
bluefield: MLNX_OFED bluefield installer package
mlnx-ofed-all-exact: MLNX_OFED mlnx-ofed-all-exact installer package
dpdk: MLNX_OFED dpdk installer package
mlnx-ofed-basic-exact: MLNX_OFED mlnx-ofed-basic-exact installer package
dpdk-upstream-libs: MLNX_OFED dpdk-upstream-libs installer package
b. Install the desired group.
apt-get install '<package group>'
Example:
apt-get install mlnx-ofed-all
Installing MLNX_OFED using the "apt-get" tool does not automatically update the firmware. To update the firmware to the version included in the MLNX_OFED package, run:
# apt-get install mlnx-fw-updater
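When the YUM repository configured earlier is used instead of apt-get, the equivalent steps would typically be the following; this is a sketch assuming standard yum usage of the group names listed above, not an excerpt from the original pages:
yum install mlnx-ofed-all
yum install mlnx-fw-updater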
Performance Tuning
Depending on the application of the user's system, it may be necessary to modify the default configuration of the ConnectX®-based network adapters. If tuning is required, please refer to the Performance Tuning Guide for NVIDIA Network Adapters.
Windows Driver Installation
The Windows driver is currently not supported for the following ConnectX-6 OPNs:
· MCX654106A-HCAT
· MCX654106A-ECAT
For Windows, download and install the latest WinOF-2 for Windows software package available via the NVIDIA website at: WinOF-2 webpage. Follow the installation instructions included in the download package (also available from the download page). The snapshots in the following sections are presented for illustration purposes only. The installation interface may vary slightly, depending on the operating system in use.
Software Requirements
Supported operating systems: Windows Server 2022, Windows Server 2019, Windows Server 2016, Windows Server 2012 R2, Windows 11 Client (64 bit only), Windows 10 Client (64 bit only), Windows 8.1 Client (64 bit only)
Package: MLNX_WinOF2-
Note: The operating systems listed above must run with administrator privileges.
Downloading WinOF-2 Driver
To download the .exe file according to your Operating System, please follow
the steps below: 1. Obtain the machine architecture.
a. To go to the Start menu, position your mouse in the bottom-right corner of the Remote Desktop of your screen.
b. Open a CMD console (click Task Manager > File > Run new task and enter CMD).
c. Enter the following command.
echo %PROCESSOR_ARCHITECTURE%
On an x64 (64-bit) machine, the output will be “AMD64”.
2. Go to the WinOF-2 web page at: https://www.nvidia.com/en-us/networking/ >
Products > Software > InfiniBand Drivers (Learn More) > Nvidia WinOF-2.
3. Download the .exe image according to the architecture of your machine (see Step 1). The name of the .exe is in the following format: MLNX_WinOF2-
Installing the incorrect .exe file is prohibited. If you do so, an error
message will be displayed.
For example, if you install a 64-bit .exe on a 32-bit machine, the wizard will
display the following (or a similar) error message: “The installation package
is not supported by this processor type. Contact your vendor”
Installing WinOF-2 Driver
The snapshots in the following sections are for illustration purposes only.
The installation interface may vary slightly, depending on the operating system in use. This section provides instructions for two types of installation procedures, both of which require administrator privileges:
· Attended Installation An installation procedure that requires frequent user
intervention.
· Unattended Installation An automated installation procedure that requires no
user intervention.
Attended Installation
The following is an example of an installation session.
1. Double-click the .exe and follow the GUI instructions to install MLNX_WinOF2.
2. [Optional] Manually configure your setup to contain the logs option (replace "LogFile" with the relevant directory).
MLNX_WinOF2-
The Rshim driver installation will fail if a prior Rshim driver is already installed. The following failure message will be displayed in the log:
"ERROR!!! Installation failed due to following errors: MlxRshim drivers installation disabled and MlxRshim drivers Installed, Please remove the following oem inf files from driver store:"
7. Read and accept the license agreement and click Next.
8. Select the target folder for the installation.
9. The firmware upgrade screen will be displayed in the following cases:
· If the user has an OEM card. In this case, the firmware will not be displayed.
· If the user has a standard NVIDIA® card with an older firmware version, the firmware will be updated accordingly. However, if the user has both an OEM card and an NVIDIA® card, only the NVIDIA® card will be updated.
10. Select a Complete or Custom installation, then follow Step a onward.
a. Select the desired features to install:
· Performance tools: install the performance tools that are used to measure performance in the user environment
· Documentation: contains the User Manual and Release Notes
· Management tools: installation tools used for management, such as mlxstat
· Diagnostic Tools: installation tools used for diagnostics, such as mlx5cmd
b. Click Next to install the desired tools.
11. Click Install to start the installation.
12. In case the firmware upgrade option was checked in Step 7, you will be notified if a firmware upgrade is required.
13. Click Finish to complete the installation.
Unattended Installation
If no reboot options are specified, the installer restarts the computer whenever necessary without displaying any prompt or warning to the user. To control the reboots, use the /norestart or /forcerestart standard command-line options. The following is an example of an unattended installation session.
1. Open a CMD console (click Start > Task Manager > File > Run new task and enter CMD).
2. Install the driver. Run:
MLNX_WinOF2-[Driver/Version]_<revision_version>_All_Arch.exe /S /v/qn
3. [Optional] Manually configure your setup to contain the logs option:
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /S /v/qn /v"/l*vx [LogFile]"
4. [Optional] If you wish to control whether to install the ND provider or not (the MT_NDPROPERTY default value is True):
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /vMT_NDPROPERTY=1
5. [Optional] If you do not wish to upgrade your firmware version (the MT_SKIPFWUPGRD default value is False):
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /vMT_SKIPFWUPGRD=1
6. [Optional] If you do not want to install the Rshim driver, run:
MLNX_WinOF2_All_Arch.exe /v"MT_DISABLE_RSHIM_INSTALL=1"
The Rshim driver installation will fail if a prior Rshim driver is already installed. The following failure message will be displayed in the log:
"ERROR!!! Installation failed due to following errors: MlxRshim drivers installation disabled and MlxRshim drivers Installed, Please remove the following oem inf files from driver store:"
7. [Optional] If you want to enable the default configuration for Rivermax, run:
MLNX_WinOF2_All_Arch.exe /v"MT_RIVERMAX=1 /l log.txt"
8. [Optional] If you want to skip the check for unsupported devices, run:
MLNX_WinOF2-
Firmware Upgrade
If the machine has a standard NVIDIA® card with an older firmware version, the
firmware will be automatically updated as part of the NVIDIA® WinOF-2 package
installation. For information on how to upgrade firmware manually, please
refer to MFT User Manual.
If the machine has a DDA (pass through) facility, firmware update is supported
only in the Host. Therefore, to update the firmware, the following must be
performed:
1. Return the network adapters to the Host.
2. Update the firmware according to the steps in the MFT User Manual.
3. Attach the adapters back to the VM with the DDA tools.
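For reference, a manual update with the MFT command-line tools generally follows the pattern below; the mst device name and firmware image file are placeholders, and the MFT User Manual remains the authoritative procedure:
mst start
mst status
flint -d /dev/mst/mt4123_pciconf0 -i fw-ConnectX6.bin burn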
VMware Driver Installation
This section describes VMware Driver Installation.
Software Requirements
Requirement | Description
Platforms: A server platform with an adapter card based on ConnectX®-6 (InfiniBand/EN) (firmware: fw-ConnectX6)
Operating System: ESXi 6.5
Installer Privileges: The installation requires administrator privileges on the target machine.
Installing NATIVE ESXi Driver for VMware vSphere
Please uninstall all previous driver packages prior to installing the new
version.
To install the driver:
1. Log into the ESXi server with root permissions.
2. Install the driver.
> esxcli software vib install -d <path>
Example:
> esxcli software vib install -d /tmp/MLNX-NATIVE-ESX-ConnectX-4-5_4.16.8.8-10EM-650.0.0.4240417.zip
3. Reboot the machine.
4. Verify the driver was installed successfully.
esxcli software vib list | grep nmlx
nmlx5-core    4.16.8.8-1OEM.650.0.0.4240417    MEL    PartnerSupported    2017-01-31
nmlx5-rdma    4.16.8.8-1OEM.650.0.0.4240417    MEL    PartnerSupported    2017-01-31
After the installation process, all kernel modules are loaded automatically upon boot.
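To additionally confirm that the modules are present after the reboot, something like the following can be used (module names as listed in the output above):
esxcli system module list | grep nmlx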
Removing Earlier NVIDIA Drivers
Please unload the previously installed drivers before removing them.
To remove all the drivers:
1. Log into the ESXi server with root permissions.
2. List all the existing NATIVE ESXi driver modules. (See Step 4 in Installing NATIVE ESXi Driver for VMware vSphere.)
3. Remove each module:
> esxcli software vib remove -n nmlx5-rdma
> esxcli software vib remove -n nmlx5-core
To remove the modules, you must run the commands in the same order as shown in the example above.
4. Reboot the server.
Firmware Programming
1. Download the VMware bootable binary images v4.6.0 from the Firmware Tools (MFT) site.
a. ESXi 6.5 File: mft-4.6.0.48-10EM-650.0.0.4598673.x86_64.vib
b. MD5SUM: 0804cffe30913a7b4017445a0f0adbe1
2. Install the image according to the steps described in the MFT User Manual.
The following procedure requires custom boot image downloading, mounting and booting from a USB device.
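As an illustration only (the MFT User Manual remains the authoritative reference), a standalone .vib such as the one listed above is commonly installed on ESXi with a command of this form; the path is a placeholder:
esxcli software vib install -v /tmp/mft-4.6.0.48-10EM-650.0.0.4598673.x86_64.vib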
Troubleshooting
General Troubleshooting
Server unable to find the adapter:
· Ensure that the adapter is placed correctly
· Make sure the adapter slot and the adapter are compatible
· Install the adapter in a different PCI Express slot
· Use the drivers that came with the adapter or download the latest
· Make sure your motherboard has the latest BIOS
· Try to reboot the server

The adapter no longer works:
· Reseat the adapter in its slot or a different slot, if necessary
· Try using another cable
· Reinstall the drivers, as the network driver files may be damaged or deleted
· Reboot the server

Adapters stopped working after installing another adapter:
· Try removing and re-installing all adapters
· Check that cables are connected properly
· Make sure your motherboard has the latest BIOS

Link indicator light is off:
· Try another port on the switch
· Make sure the cable is securely attached
· Check that you are using the proper cables that do not exceed the recommended lengths
· Verify that your switch and adapter port are compatible

Link light is on, but with no communication established:
· Check that the latest driver is loaded
· Check that both the adapter and its link are set to the same speed and duplex settings

Event message received of insufficient power:
· When [ adapter's current power consumption ] > [ PCIe slot advertised power limit ], a warning message appears in the server's system event logs (e.g., dmesg: "Detected insufficient power on the PCIe slot")
· It is recommended to use a PCIe slot that can supply enough power.
· If a message of the following format appears: "mlx5_core 0003:01:00.0: port_module:254:(pid 0): Port module event[error]: module 0, Cable error, One or more network ports have been powered down due to insufficient/unadvertised power on the PCIe slot", please upgrade your adapter's firmware.
· If the message remains, please consider switching from Active Optical Cable (AOC) or transceiver to Direct Attached Copper (DAC) connectivity.
Linux Troubleshooting
Environment Information
Card Detection
Mellanox Firmware Tool (MFT)
Ports Information
Firmware Version Upgrade
Collect Log File
cat /etc/issue
uname -a
cat /proc/cpuinfo | grep 'model name' | uniq
ofed_info -s
ifconfig -a
ip link show
ethtool
Windows Troubleshooting
Environment Information:
From the Windows desktop choose the Start menu and run: msinfo32. To export system information to a text file, choose the Export option from the File menu. Assign a file name and save.
Mellanox Firmware Tool (MFT):
Download and install MFT: MFT Documentation. Refer to the User Manual for installation instructions. Once installed, open a CMD window and run: WinMFT
mst start
mst status
flint d
Ports Information:
vstat
Firmware Version Upgrade:
Download the latest firmware version using the PSID/board ID from here.
flint d
Collect Log File:
· Event log viewer
· MST device logs:
· mst start
· mst status
· flint d
Updating Adapter Firmware
Each adapter card is shipped with the latest version of qualified firmware at
the time of manufacturing. However, NVIDIA issues firmware updates
occasionally that provide new features and bug fixes. To check that your card
is programmed with the latest available firmware version, download the mlxup
firmware update and query utility. The utility can query for available NVIDIA
adapters and indicate which adapters require a firmware update. If the user
confirms, mlxup upgrades the firmware using embedded images. The latest mlxup
executable and documentation are available in mlxup – Update and Query
Utility.
[server1]# ./mlxup
Querying Mellanox devices firmware ...

Device Type:      ConnectX-6
Part Number:      MCX654106A-HCAT
Description:      ConnectX®-6 VPI adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0 x16, tall bracket
PSID:             MT_2190110032
PCI Device Name:  0000:06:00.0
Base GUID:        e41d2d0300fd8b8a
Versions:         Current        Available
     FW           16.23.1020     16.24.1000
Status:           Update required

Device Type:      ConnectX-6
Part Number:      MCX654106A-HCAT
Description:      ConnectX®-6 VPI adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0 x16, tall bracket
PSID:             MT_2170110021
PCI Device Name:  0000:07:00.0
Base MAC:         0000e41d2da206d4
Versions:         Current        Available
     FW           16.24.1000     16.24.1000
Status:           Up to date

Perform FW update? [y/N]: y
Device #1: Up to date
Device #2: Updating FW ... Done

Restart needed for updates to take effect.
Log File: /var/log/mlxup/mlxup-yyyymmdd.log
Monitoring
The adapter card incorporates the ConnectX IC, which operates in the range of temperatures between 0°C and 105°C. There are three thermal threshold definitions that impact the overall system operation state:
· Warning, 105°C: On managed systems only: when the device crosses the 105°C threshold, a Warning Threshold message will be issued by the management SW, indicating to system administration that the card has crossed the Warning threshold. Note that this temperature threshold does not require nor lead to any action by hardware (such as adapter card shutdown).
· Critical, 115°C: When the device crosses this temperature, the firmware will automatically shut down the device.
· Emergency, 130°C: If the firmware fails to shut down the device upon crossing the Critical threshold, the device will auto-shutdown upon crossing the Emergency (130°C) threshold.
The card's thermal sensors can be read through the system's SMBus. The user can read these thermal sensors and adapt the system airflow in accordance with the readouts and the needs of the above-mentioned IC thermal requirements.
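In addition to the SMBus interface, the IC temperature can usually be queried in-band with the MFT tools; a minimal sketch (the mst device name is a placeholder for your system):
mget_temp -d /dev/mst/mt4123_pciconf0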
Specifications
MCX651105A-EDAT Specifications
Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.
Physical
· Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
· Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)
Protocol Support
· InfiniBand: IBTA v1.4a
· Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
· Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR
Data Rate
· InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100
· Ethernet: 1/10/25/40/50/100 Gb/s
· PCI Express Gen3.0/4.0: SERDES @ 8.0GT/s/16GT/s, x8 lanes (2.0 and 1.1 compatible)
Power and Airflow
· Voltage: 3.3Aux
· Maximum current: 100mA
· Typical Power (b), Passive Cables: 10.1W
· Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
· Maximum power available through QSFP56 port: 5W
· Airflow (LFM) / Ambient Temperature, Passive Cables: Heatsink to Port: TBD; Port to Heatsink: TBD
Environmental
· Temperature: Operational: 0°C to 55°C; Non-operational: -40°C to 70°C (c)
· Humidity: Operational: 10% to 85% relative humidity; Non-operational: 10% to 90% relative humidity
· Altitude (Operational): 3050m
Regulatory
· Safety: CB / cTUVus / CE
· EMC: CE / FCC / VCCI / ICES / RCM / KC
· RoHS: RoHS Compliant
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. The non-operational storage temperature specifications apply to the product without its package.
MCX653105A-HDAT Specifications
Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.
Physical Protocol Support
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)
InfiniBand: IBTA v1.4a Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR
(5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR
(14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s
per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE- CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE- KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE- KR2, 10GBASE-LR,10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASEKR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR
Data Rate
InfiniBand
SDR/DDR/QDR/FDR/EDR/HDR100/HDR
PCI Express Gen3/4:
Ethernet
1/10/25/40/50/100/200 Gb/s
SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)
Power and Airflow
Voltage: 3.3Aux Maximum current: 100mA
Power
Cable
Typical Powerb
Passive Cables
19.3W
Maximum Power
Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Environmental Regulatory
Airflow (LFM) / Ambient Temperature
Cable Type
Passive Cables NVIDIA Active 4.7W Cables
Temperature Humidity
Altitude (Operational) Safety: CB / cTUVus / CE
Operational Non-operational Operational Non-operational 3050m
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant
Airflow Direction
Heatsink to Port
Port to Heatsink
350 LFM / 55°C
250 LFM / 35°C
500 LFM / 55°Cc
250 LFM / 35°C
0°C to 55°C -40°C to 70°C 10% to 85% relative humidity 10% to 90% relative humidityd
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. For engineering samples – add 250LFM d. The non-operational storage temperature specifications apply to the product without its package.
MCX653106A-HDAT Specifications
Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.
Physical Protocol Support
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Dual QSFP56 InfiniBand and Ethernet (copper and optical)
InfiniBand: IBTA v1.4a Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR
(5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR
(14.0625Gb/s per
lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR
(50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE- CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASECR4, 40GBASE- KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE- KR2, 10GBASE-LR,10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR
Data Rate
InfiniBand
SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet
1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)
Power and Airflow
Environmental Regulatory
Voltage: 3.3Aux Maximum current: 100mA
Power
Cable
Typical Powerb
Passive Cables
23.6W
Maximum Power
Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature
Temperature Humidity Altitude (Operational) Safety: CB / cTUVus / CE EMC: CE /
FCC / VCCI / ICES / RCM / KC
Cable Type
Passive Cables NVIDIA Active 4.7W Cables
Operational Non-operational Operational Non-operational 3050m
Airflow Direction
Heatsink to Port
Port to Heatsink
400 LFM / 55°C
300 LFM / 35°C
950 LFM / 55°C 600 LFM / 48°Cd
300 LFM / 35°C
0°C to 55°C -40°C to 70°C 10% to 85% relative humidity 10% to 90% relative humidityc
RoHS: RoHS Compliant
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. For both operational and non-operational states.
MCX653105A-HDAL Specifications
Please make sure to install the ConnectX-6 card in a liquid-cooled Intel® Server System D50TNP platform.
Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm) Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)
Protocol Support
InfiniBand: IBTA v1.4a Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR
(5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR
(14.0625Gb/s per lane), EDR (25Gb/s
per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE- CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASEKR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR,10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASECX, 1000BASE-KX, 10GBASE-SR
Data Rate
InfiniBand
SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet
1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)
Power and Airflow
Voltage: 3.3Aux Maximum current: 100mA
Power
Cable
Typical Powerb
Passive Cables
18.5W
Maximum Power
Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature
Cable Type
Passive Cables
NVIDIA Active 4.7W Cables
Environmental
Temperature
Humidity
Regulatory
Altitude (Operational) Safety: CB / cTUVus / CE
Operational Non-operational Operational Non-operational 3050m
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant
Airflow Direction
Heatsink to Port
TBD
TBD
0°C to 55°C -40°C to 70°C 10% to 85% relative humidity 10% to 90% relative
humidityc
Port to Heatsink TBD TBD
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. For both operational and non-operational states.
MCX653106A-HDAL Specifications
Please make sure to install the ConnectX-6 card in a liquid-cooled Intel® Server System D50TNP platform.
Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm) Connector: Dual QSFP56 InfiniBand and Ethernet (copper and optical)
Protocol Support
InfiniBand: IBTA v1.4a Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR
(5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR
(14.0625Gb/s per lane), EDR (25Gb/s
per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE- CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASEKR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR,10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASECX, 1000BASE-KX, 10GBASE-SR
Data Rate
InfiniBand
SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet
1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)
Power and Airflow
Voltage: 3.3Aux Maximum current: 100mA
Power
Cable
Typical Powerb
Passive Cables
20.85W
Maximum Power
Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature
Cable Type
Environmental
Temperature
Humidity
Passive Cables
NVIDIA Active 4.7W Cables
Operational Non-operational Operational Non-operational
Regulatory
Altitude (Operational) Safety: CB / cTUVus / CE
3050m
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant
Airflow Direction
Heatsink to Port
TBD
TBD
0°C to 55°C -40°C to 70°C 10% to 85% relative humidity 10% to 90% relative
humidityc
Port to Heatsink TBD TBD
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. For both operational and non-operational states.
MCX653105A-ECAT Specifications
Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.
Physical Protocol Support
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)
InfiniBand: IBTA v1.4a Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane)
Ethernet: 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR,10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR
Data Rate
InfiniBand
SDR/DDR/QDR/FDR/EDR/HDR100
Ethernet
1/10/25/40/50/100 Gb/s
PCIe Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)
Power and Airflow
Voltage: 3.3Aux Maximum current: 100mA
Power
Cable
Typical Powerb
Passive Cables
15.6W
Maximum Power
Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Environmental Regulatory
Airflow (LFM) / Ambient Temperature
Cable Type Passive Cables
NVIDIA Active 2.7W Cables
Temperature Humidity
Operational Non-operational Operational Non-operational
Altitude (Operational) Safety: CB / cTUVus / CE
3050m
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant
Heatsink to Port
300 LFM / 55°C
300 LFM / 55°C
0°C to 55°C -40°C to 70°Cc 10% to 85% relative humidity 10% to 90% relative
humidity
Airflow Direction Port to Heatsink
200 LFM / 35°C 200 LFM / 35°C
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. The non- operational storage temperature specifications apply to the product without its package.
MCX653106A-ECAT Specifications
Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.
For power specifications when using a single-port configuration, please refer to MCX653105A-ECAT Specifications
Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm) Connector: Dual QSFP56 InfiniBand and Ethernet (copper and optical)
Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR
(10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane),
EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane) port
Ethernet: 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASELR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR,10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR
Data Rate
InfiniBand
SDR/DDR/QDR/FDR/EDR
Ethernet
1/10/25/40/50/100 Gb/s
Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)
Power and Airflow
Environmental Regulatory
Voltage: 12V, 3.3VAUX Maximum current: 100mA
Power
Cable
Typical Powerb
Passive Cables
21.0W
Maximum Power
Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature
Temperature Humidity
Altitude (Operational) Safety: CB / cTUVus / CE EMC: CE / FCC / VCCI / ICES /
RCM / KC RoHS: RoHS Compliant
Cable Type
Passive Cables NVIDIA Active 2.7W Cables Operational Non-operational
Operational Non-operational 3050m
Airflow Direction
Heatsink to Port
350 LFM / 55°C
550 LFM / 55°C
0°C to 55°C -40°C to 70°Cc 10% to 85% relative humidity 10% to 90% relative
humidity
Port to Heatsink 250 LFM / 35°C 250 LFM / 35°C
Notes: a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product. b. Typical power for ATIS traffic load. c. The non- operational storage temperature specifications apply to the product without its package.
MCX654105A-HCAT Specifications
Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and air
References
- End-to-End Networking Solutions | NVIDIA
- PCI Devices
- IEEE SA - The IEEE Standards Association - Home
- mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
- Firmware Management
- ESPCommunity
- NVIDIA Firmware Tools (MFT)
- Linux InfiniBand Drivers
- mlxup - Mellanox Update and Query Utility
- InfiniBand Trade Association
- LinkX Cables and Transceivers | NVIDIA