NVIDIA ConnectX-6 VPI InfiniBand Adapter Card User Manual

August 14, 2024
NVIDIA

NVIDIA ConnectX-6 VPI InfiniBand Adapter Card

Specifications

  • Model: NVIDIA ConnectX-6 InfiniBand/Ethernet Adapter Cards
  • SKU Numbers:
    • 900-9X603-0016-DT0
    • 900-9X603-0056-DT0
    • 900-9X628-0016-ST0
  • Interfaces: InfiniBand, Ethernet
  • Features:
    • HDR100, EDR IB and 100GbE support
    • Single-port QSFP56
    • PCIe4.0 x8
    • Dual-port QSFP56 (specific SKUs)
    • PCIe3.0/4.0 x16

Hardware Requirements

Ensure your system meets the hardware requirements specified for the NVIDIA ConnectX-6 Adapter Cards.

Airflow Requirement

Proper airflow is crucial for the optimal performance of the adapter cards. Maintain adequate airflow as per the provided specifications.

Installing PCIe x8/16 Cards

  1. Identify the card in your system using the provided guidelines.
  2. Follow the detailed installation instructions for PCIe x8/16 cards as outlined in the manual.
  3. Securely install the card into the appropriate PCIe slot.

Uninstalling the Card

When necessary, follow the steps outlined in the manual to safely uninstall the adapter card from your system.

Socket Direct (2x PCIe x16) Cards Installation

  1. Refer to the specific installation instructions provided for Socket Direct cards.
  2. Carefully install the card ensuring it is securely seated in the designated slot.

FAQs

Q: Who is the intended audience for this manual?
A: This manual is intended for installers and users of NVIDIA ConnectX-6 InfiniBand/Ethernet Adapter Cards who have basic familiarity with InfiniBand and Ethernet network specifications.

Q: How can customers contact technical support?
A: Customers who purchased NVIDIA products directly from NVIDIA can contact technical support through the provided methods in the manual.

About This Manual

This User Manual describes NVIDIA® ConnectX®-6 InfiniBand/Ethernet adapter cards. It provides details as to the interfaces of the board, specifications, required software and firmware for operating the board, and relevant documentation.

Ordering Part Numbers

The table below provides the ordering part numbers (OPN) for the available ConnectX-6 InfiniBand/Ethernet adapter cards.

NVIDIA SKU: 900-9X603-0016-DT0 | Legacy OPN: MCX653105A-EFAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket

NVIDIA SKU: 900-9X603-0056-DT0 | Legacy OPN: MCX653106A-EFAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket

NVIDIA SKU: 900-9X628-0016-ST0 | Legacy OPN: MCX651105A-EDAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe4.0 x8, tall bracket

NVIDIA SKU: 900-9X6AF-0016-ST1 | Legacy OPN: MCX653105A-ECAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 x16, tall bracket

NVIDIA SKU: 900-9X6AF-0056-MT1 | Legacy OPN: MCX653106A-ECAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket

NVIDIA SKU: 900-9X6AF-0018-MT2 | Legacy OPN: MCX653105A-HDAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe3.0/4.0 x16, tall bracket

NVIDIA SKU: 900-9X6AF-0058-ST1 | Legacy OPN: MCX653106A-HDAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket

NVIDIA SKU: 900-9X6B4-0058-DT0 | Legacy OPN: MCX654106A-HCAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket


NVIDIA SKU: 900-9X6AF-0018-SS0 | Legacy OPN: MCX653105A-HDAL | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe3.0/4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6

NVIDIA SKU: 900-9X6AF-0058-SS0 | Legacy OPN: MCX653106A-HDAL | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe3.0/4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6

NVIDIA SKU: 900-9X0BC-001H-ST1 | Legacy OPN: MCX683105AN-HDAT | Lifecycle: Mass Production
Marketing Description: ConnectX®-6 DE adapter card, HDR IB (200Gb/s), single-port QSFP, PCIe4.0 x16, no crypto, tall bracket

EOL’ed (End of Life) Ordering Part Numbers

NVIDIA SKU: 900-9X6B4-0056-DT0 | Legacy OPN: MCX654106A-ECAT | Lifecycle: End of Life
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR InfiniBand and 100GbE), dual-port QSFP56, Socket Direct 2x PCIe 3.0/4.0 x16, tall bracket

NVIDIA SKU: 900-9X6B4-0018-DT2 | Legacy OPN: MCX654105A-HCAT | Lifecycle: End of Life
Marketing Description: ConnectX®-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket

Intended Audience

This manual is intended for the installer and user of these cards. The manual assumes basic familiarity with InfiniBand and Ethernet network and architecture specifications.

Technical Support

Customers who purchased NVIDIA products directly from NVIDIA are invited to contact us through the following methods:


· URL: https://www.nvidia.com > Support
· E-mail: enterprisesupport@nvidia.com

Customers who purchased NVIDIA M-1 Global Support Services, please see your contract for details regarding Technical Support. Customers who purchased NVIDIA products through an NVIDIA-approved reseller should first seek assistance through their reseller.

Related Documentation

· MLNX_OFED for Linux User Manual and Release Notes: User Manual describing OFED features, performance, InfiniBand diagnostics, tools content, and configuration. See MLNX_OFED for Linux Documentation.
· WinOF-2 for Windows User Manual and Release Notes: User Manual describing WinOF-2 features, performance, Ethernet diagnostics, tools content, and configuration. See WinOF-2 for Windows Documentation.
· NVIDIA VMware for Ethernet User Manual: User Manual and release notes describing the various components of the NVIDIA ConnectX® NATIVE ESXi stack. See VMware® ESXi Drivers Documentation.
· NVIDIA Firmware Utility (mlxup) User Manual and Release Notes: NVIDIA firmware update and query utility used to update the firmware. Refer to Firmware Utility (mlxup) Documentation.
· NVIDIA Firmware Tools (MFT) User Manual: User Manual describing the set of MFT firmware management tools for a single node. See MFT User Manual.
· InfiniBand Architecture Specification Release 1.2.1, Vol 2 – Release 1.3: InfiniBand Specifications
· IEEE Std 802.3 Specification: IEEE Ethernet Specifications
· PCI Express Specifications: Industry Standard PCI Express Base and Card Electromechanical Specifications. Refer to PCI-SIG Specifications.
· LinkX Interconnect Solutions: LinkX InfiniBand cables and transceivers are designed to maximize the performance of High-Performance Computing networks, requiring high-bandwidth, low-latency connections between compute nodes and switch nodes. NVIDIA offers one of the industry's broadest portfolios of QDR/FDR10 (40Gb/s), FDR (56Gb/s), EDR/HDR100 (100Gb/s), HDR (200Gb/s) and NDR (400Gb/s) cables, including Direct Attach Copper cables (DACs), copper splitter cables, Active Optical Cables (AOCs) and transceivers, in a wide range of lengths from 0.5m to 10km. In addition to meeting IBTA standards, NVIDIA tests every product in an end-to-end environment, ensuring a Bit Error Rate of less than 1E-15. Read more at LinkX Cables and Transceivers.

Document Conventions
When discussing memory sizes, MB and MBytes are used in this document to mean size in MegaBytes. The use of Mb or Mbits (small b) indicates the size in MegaBits. In this document, PCIe is used to mean PCI Express.
Revision History


A list of the changes made to this document is provided in Document Revision History.

Introduction

Product Overview

This is the user guide for InfiniBand/Ethernet adapter cards based on the ConnectX-6 integrated circuit device. ConnectX-6 connectivity provides the highest-performing, low-latency, and most flexible interconnect solution for PCI Express Gen 3.0/4.0 servers used in enterprise datacenters and high-performance computing environments.
ConnectX-6 Virtual Protocol Interconnect® adapter cards provide up to two ports of 200Gb/s for InfiniBand and Ethernet connectivity, sub-600ns latency and 200 million messages per second, enabling the highest performance and most flexible solution for the most demanding High-Performance Computing (HPC), storage, and datacenter applications.

ConnectX-6 is a groundbreaking addition to the NVIDIA ConnectX series of industry-leading adapter cards. In addition to all the existing innovative features of past ConnectX versions, ConnectX-6 offers a number of enhancements that further improve the performance and scalability of datacenter applications. In addition, specific PCIe stand-up cards are available with a cold plate for insertion into liquid-cooled Intel® Server System D50TNP platforms.
ConnectX-6 is available in two form factors: low-profile stand-up PCIe and Open Compute Project (OCP) Spec 3.0 cards with QSFP connectors. Single-port, HDR, stand-up PCIe adapters are available based on either ConnectX-6 or ConnectX-6 DE (ConnectX-6 Dx enhanced for HPC applications).

Make sure to use a PCIe slot that is capable of supplying the required power and airflow to the ConnectX-6 as stated in Specifications.

Configuration: ConnectX-6 PCIe x8 Card
· MCX651105A-EDAT: ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe4.0 x8, tall bracket

Configuration: ConnectX-6 PCIe x16 Card
· MCX653105A-HDAT: ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe4.0 x16, tall bracket
· MCX653106A-HDAT: ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket
· MCX653105A-ECAT: ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 x16, tall bracket
· MCX653106A-ECAT: ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 x16, tall bracket

Configuration: ConnectX-6 DE PCIe x16 Card
· MCX683105AN-HDAT: ConnectX-6 DE InfiniBand adapter card, HDR, single-port QSFP, PCIe 3.0/4.0 x16, no crypto, tall bracket

Configuration: ConnectX-6 PCIe x16 Cards for liquid-cooled Intel® Server System D50TNP platforms
· MCX653105A-HDAL: ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, PCIe4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6
· MCX653106A-HDAL: ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, PCIe4.0 x16, cold plate for liquid-cooled Intel® Server System D50TNP platforms, tall bracket, ROHS R6

Configuration: ConnectX-6 Dual-slot Socket Direct Cards (2x PCIe x16)
· MCX654105A-HCAT: ConnectX-6 InfiniBand/Ethernet adapter card kit, HDR IB (200Gb/s) and 200GbE, single-port QSFP56, Socket Direct 2x PCIe3.0 x16, tall brackets
· MCX654106A-HCAT: ConnectX-6 InfiniBand/Ethernet adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket
· MCX654106A-ECAT: ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR InfiniBand and 100GbE), dual-port QSFP56, Socket Direct 2x PCIe3.0/4.0 x16, tall bracket

Configuration: ConnectX-6 Single-slot Socket Direct Cards (2x PCIe x8 in a row)
· MCX653105A-EFAT: ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), single-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket
· MCX653106A-EFAT: ConnectX-6 InfiniBand/Ethernet adapter card, 100Gb/s (HDR100, EDR IB and 100GbE), dual-port QSFP56, PCIe3.0/4.0 Socket Direct 2x8 in a row, tall bracket
ConnectX-6 PCIe x8 Card
ConnectX-6 with a single PCIe x8 slot can support a bandwidth of up to 100Gb/s in a PCIe Gen 4.0 slot.


Part Number: MCX651105A-EDAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
Network Connector Type: Single-port QSFP56
PCIe x8 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM


ConnectX-6 PCIe x16 Card
ConnectX-6 with a single PCIe x16 slot can support a bandwidth of up to 100Gb/s in a PCIe Gen 3.0 slot, or up to 200Gb/s in a PCIe Gen 4.0 slot. This form factor is also available for Intel® Server System D50TNP platforms, where an Intel liquid-cooled cold plate is used as the adapter cooling mechanism.

Part Number: MCX653105A-ECAT / MCX653106A-ECAT / MCX653105A-HDAT / MCX653106A-HDAT
Form Factor/Dimensions (all): PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate:
· MCX653105A-ECAT, MCX653106A-ECAT: Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
· MCX653105A-HDAT, MCX653106A-HDAT: Ethernet: 10/25/40/50/100/200 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
Network Connector Type:
· MCX653105A-ECAT, MCX653105A-HDAT: Single-port QSFP56
· MCX653106A-ECAT, MCX653106A-HDAT: Dual-port QSFP56
PCIe x16 through Edge Connector (all): PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS (all): RoHS Compliant
Adapter IC Part Number (all): MT28908A0-XCCF-HVM

ConnectX-6 DE PCIe x16 Card
ConnectX-6 DE (ConnectX-6 Dx enhanced for HPC applications) with a single PCIe x16 slot can support a bandwidth of up to 100Gb/s in a PCIe Gen 3.0 slot, or up to 200Gb/s in a PCIe Gen 4.0 slot.

Part Number: MCX683105AN-HDAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
Network Connector Type: Single-port QSFP56
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28924A0-NCCF-VE


ConnectX-6 for Liquid-Cooled Intel® Server System D50TNP Platforms
The following cards are available with a cold plate for insertion into liquid-cooled Intel® Server System D50TNP platforms.

Part Number: MCX653105A-HDAL / MCX653106A-HDAL
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: Ethernet: 10/25/40/50/100/200 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
Network Connector Type: MCX653105A-HDAL: Single-port QSFP56; MCX653106A-HDAL: Dual-port QSFP56
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM

ConnectX-6 Socket Direct™ Cards
The Socket Direct technology offers improved performance to dual-socket servers by enabling direct access from each CPU in a dual-socket server to the network through its dedicated PCIe interface. Please note that ConnectX-6 Socket Direct cards do not support Multi-Host functionality (i.e., connectivity to two independent CPUs). For a ConnectX-6 Socket Direct card with Multi-Host functionality, please contact NVIDIA.
ConnectX-6 Socket Direct cards are available in two configurations: Dual-slot Configuration (2x PCIe x16) and Single-slot Configuration (2x PCIe x8).
ConnectX-6 Dual-slot Socket Direct Cards (2x PCIe x16)
In order to obtain 200Gb/s speed, NVIDIA offers ConnectX-6 Socket Direct cards that enable 200Gb/s connectivity even for servers with only PCIe Gen 3.0 capability. The adapter's 32-lane PCIe bus is split into two 16-lane buses, with one bus accessible through a PCIe x16 edge connector and the other bus through an x16 Auxiliary PCIe Connection card. The two cards should be installed into two PCIe x16 slots and connected using two Cabline CA-II Plus harnesses, as shown in the figure below.


Part Number: MCX654105A-HCAT / MCX654106A-HCAT / MCX654106A-ECAT
Form Factor/Dimensions: Adapter Card: PCIe Half Height, Half Length / 167.65mm x 68.90mm; Auxiliary PCIe Connection Card: 5.09 in. x 2.32 in. (129.30mm x 59.00mm); Two 35cm Cabline CA-II Plus harnesses
Data Transmission Rate:
· MCX654105A-HCAT, MCX654106A-HCAT: Ethernet: 10/25/40/50/100/200 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100, HDR
· MCX654106A-ECAT: Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
Network Connector Type: MCX654105A-HCAT: Single-port QSFP56; MCX654106A-HCAT, MCX654106A-ECAT: Dual-port QSFP56
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s
PCIe x16 through Auxiliary Card: PCIe Gen 3.0 SERDES @ 8.0GT/s
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM

ConnectX-6 Single-slot Socket Direct Cards (2x PCIe x8 in a row)
The PCIe x16 interface comprises two PCIe x8 interfaces in a row, such that each PCIe x8 interface can be connected to a dedicated CPU in a dual-socket server. In such a configuration, Socket Direct brings lower latency and lower CPU utilization, as the direct connection from each CPU to the network means the interconnect can bypass the QPI (UPI) link and the other CPU, optimizing performance and improving latency. CPU utilization improves because each CPU handles only its own traffic and not traffic from the other CPU.
A system with a custom PCI Express x16 slot that includes special signals is required for installing the card. Please refer to PCI Express Pinouts Description for Single-Slot Socket Direct Card for pinout definitions.
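The sketch below shows one way, on a Linux host, to confirm that each PCIe interface of a Socket Direct card is attached to a different NUMA node (CPU socket). It is an illustrative example rather than part of the original procedure; the PCI addresses are placeholders and should be taken from the lspci output on your own system.

lspci | grep -i mellanox                               # list the ConnectX-6 PCI functions
cat /sys/bus/pci/devices/0000:05:00.0/numa_node        # NUMA node of the first interface
cat /sys/bus/pci/devices/0000:82:00.0/numa_node        # NUMA node of the second interface
# Different values (for example 0 and 1) indicate that each interface connects to a different CPU socket.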


Part Number: MCX653105A-EFAT / MCX653106A-EFAT
Form Factor/Dimensions: PCIe Half Height, Half Length / 167.65mm x 68.90mm
Data Transmission Rate: Ethernet: 10/25/40/50/100 Gb/s; InfiniBand: SDR, DDR, QDR, FDR, EDR, HDR100
Network Connector Type: MCX653105A-EFAT: Single-port QSFP56; MCX653106A-EFAT: Dual-port QSFP56
PCIe x16 through Edge Connector: PCIe Gen 3.0 / 4.0 SERDES @ 8.0GT/s / 16.0GT/s, Socket Direct 2x8 in a row
RoHS: RoHS Compliant
Adapter IC Part Number: MT28908A0-XCCF-HVM

Package Contents

ConnectX-6 PCIe x8/x16 Adapter Cards

Applies to MCX651105A-EDAT, MCX653105A-ECAT, MCX653106A-ECAT, MCX653105A-HDAT, MCX653106A-HDAT, MCX653105A-EFAT, MCX653106A-EFAT,
and MCX683105AN-HDAT.

Category: Cards
· ConnectX-6 adapter card (Qty: 1)

Category: Accessories
· Adapter card short bracket (Qty: 1)
· Adapter card tall bracket (shipped assembled on the card) (Qty: 1)

ConnectX-6 PCIe x16 Adapter Card for liquid-cooled Intel® Server System D50TNP Platforms

Applies to MCX653105A-HDAL and MCX653106A-HDAL.

Category: Cards
· ConnectX-6 adapter card (Qty: 1)

Category: Accessories
· Adapter card short bracket (Qty: 1)
· Adapter card tall bracket (shipped assembled on the card) (Qty: 1)
· Accessory kit with two TIMs (MEB000386) (Qty: 1)

ConnectX-6 Socket Direct Cards (2x PCIe x16)

Applies to MCX654105A-HCAT, MCX654106A-HCAT and MCX654106A-ECAT.

Category: Cards
· ConnectX-6 adapter card (Qty: 1)
· PCIe Auxiliary Card (Qty: 1)

Category: Harnesses
· 35cm Cabline CA-II Plus harness (white) (Qty: 1)
· 35cm Cabline CA-II Plus harness (black) (Qty: 1)
· Retention clip for Cabline harness (optional accessory) (Qty: 2)

Category: Accessories
· Adapter card short bracket (Qty: 1)
· Adapter card tall bracket (shipped assembled on the card) (Qty: 1)
· PCIe Auxiliary card short bracket (Qty: 1)
· PCIe Auxiliary card tall bracket (shipped assembled on the card) (Qty: 1)

Features and Benefits

Make sure to use a PCIe slot that is capable of supplying the required power and airflow to the ConnectX-6 cards as stated in Specifications.

PCI Express (PCIe)

Uses the following PCIe interfaces:
· PCIe x8/x16 configurations: PCIe Gen 3.0 (8GT/s) and Gen 4.0 (16GT/s) through an x8/x16 edge connector.
· 2x PCIe x16 configurations: PCIe Gen 3.0/4.0 SERDES @ 8.0/16.0 GT/s through the edge connector, and PCIe Gen 3.0 SERDES @ 8.0GT/s through the PCIe Auxiliary Connection Card.

200Gb/s InfiniBand/Ethernet Adapter

ConnectX-6 offers the highest throughput InfiniBand/Ethernet adapter, supporting HDR 200Gb/s InfiniBand and 200Gb/s Ethernet and enabling any standard networking, clustering, or storage to operate seamlessly over any converged network leveraging a consolidated software stack.

InfiniBand Architecture Specification v1.3 compliant

ConnectX-6 delivers low latency, high bandwidth, and computing efficiency for performance-driven server and storage clustering applications. ConnectX-6 is InfiniBand Architecture Specification v1.3 compliant.

Up to 200 Gigabit Ethernet

NVIDIA adapters comply with the following IEEE 802.3 standards: 200GbE / 100GbE / 50GbE / 40GbE / 25GbE / 10GbE / 1GbE
· IEEE 802.3bj, 802.3bm 100 Gigabit Ethernet
· IEEE 802.3by, Ethernet Consortium 25, 50 Gigabit Ethernet, supporting all FEC modes
· IEEE 802.3ba 40 Gigabit Ethernet
· IEEE 802.3by 25 Gigabit Ethernet
· IEEE 802.3ae 10 Gigabit Ethernet
· IEEE 802.3ap based auto-negotiation and KR startup
· IEEE 802.3ad, 802.1AX Link Aggregation
· IEEE 802.1Q, 802.1P VLAN tags and priority
· IEEE 802.1Qau (QCN) Congestion Notification
· IEEE 802.1Qaz (ETS)
· IEEE 802.1Qbb (PFC)
· IEEE 802.1Qbg
· IEEE 1588v2
· Jumbo frame support (9.6KB)

InfiniBand HDR100

A standard InfiniBand data rate, where each lane of a 2X port runs a bit rate of 53.125Gb/s with a 64b/66b encoding, resulting in an effective bandwidth of 100Gb/s.

InfiniBand HDR A standard InfiniBand data rate, where each lane of a 4X port runs a bit rate of 53.125Gb/s with a 64b/66b encoding, resulting in an effective bandwidth of 200Gb/s.

Memory Components

· SPI Quad – includes a 256Mbit SPI Quad Flash device (MX25L25645GXDI-08G device by Macronix)
· FRU EEPROM – stores the parameters and personality of the card. The EEPROM capacity is 128Kbit. The FRU I2C address is 0x50 and is accessible through the PCIe SMBus. (Note: Address 0x58 is reserved.)


Overlay Networks

In order to better scale their networks, datacenter operators often create overlay networks that carry traffic from individual virtual machines over logical tunnels in encapsulated formats such as NVGRE and VXLAN. While this solves network scalability issues, it hides the TCP packet from the hardware offloading engines, placing higher loads on the host CPU. ConnectX-6 effectively addresses this by providing advanced NVGRE and VXLAN hardware offloading engines that encapsulate and decapsulate the overlay protocol.
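As an illustrative check that is not part of the original manual, the tunnel-related stateless offloads exposed by the driver can be listed on Linux with ethtool once the adapter is installed and its driver is loaded; the interface name below is a placeholder.

ethtool -k eth0 | grep -i tnl        # list tunnel (VXLAN/NVGRE) offload features of the interface
# Typical entries include tx-udp_tnl-segmentation and tx-udp_tnl-csum-segmentation.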

RDMA and RDMA over Converged Ethernet (RoCE)

ConnectX-6, utilizing IBTA RDMA (Remote Direct Memory Access) and RoCE (RDMA over Converged Ethernet) technology, delivers low latency and high performance over InfiniBand and Ethernet networks. Leveraging datacenter bridging (DCB) capabilities as well as ConnectX-6 advanced congestion control hardware mechanisms, RoCE provides efficient low-latency RDMA services over Layer 2 and Layer 3 networks.

NVIDIA PeerDirect™

PeerDirect™ communication provides high-efficiency RDMA access by eliminating unnecessary internal data copies between components on the PCIe bus (for example, from GPU to CPU), and therefore significantly reduces application run time. ConnectX-6 advanced acceleration technology enables higher cluster efficiency and scalability to tens of thousands of nodes.

CPU Offload

Adapter functionality enables reduced CPU overhead, leaving more CPU resources available for computation tasks.
· Open vSwitch (OVS) offload using ASAP²™
· Flexible match-action flow tables
· Tunneling encapsulation/decapsulation

Quality of Service (QoS)

Support for port-based Quality of Service enabling various application requirements for latency and SLA.

Hardware-based I/O Virtualization

ConnectX-6 provides dedicated adapter resources and guaranteed isolation and protection for virtual machines within the server.

Storage Acceleration

A consolidated compute and storage network achieves significant cost-performance advantages over multi-fabric networks. Standard block and file access protocols can leverage:
· RDMA for high-performance storage access
· NVMe over Fabrics offloads for the target machine
· Erasure Coding
· T10-DIF Signature Handover

SR-IOV

ConnectX-6 SR-IOV technology provides dedicated adapter resources and guaranteed isolation and protection for virtual machines (VMs) within the server.


High-Performance Accelerations

· Tag Matching and Rendezvous Offloads
· Adaptive Routing on Reliable Transport
· Burst Buffer Offloads for Background Checkpointing

Operating Systems/Distributions

ConnectX-6 Socket Direct cards 2x PCIe x16 (OPNs: MCX654105A-HCAT, MCX654106A-HCAT and MCX654106A-ECAT) are not supported in Windows and WinOF-2.
· OpenFabrics Enterprise Distribution (OFED)
· RHEL/CentOS
· Windows
· FreeBSD
· VMware
· OpenFabrics Windows Distribution (WinOF-2)

Connectivity

· Interoperable with 1/10/25/40/50/100/200 Gb/s InfiniBand and Ethernet switches
· Passive copper cable with ESD protection
· Powered connectors for optical and active cable support


Manageability
ConnectX-6 technology maintains support for manageability through a BMC. ConnectX-6 PCIe stand-up adapter can be connected to a BMC using MCTP over SMBus or MCTP over PCIe protocols as if it is a standard NVIDIA PCIe stand-up adapter. For configuring the adapter for the specific manageability solution in use by the server, please contact NVIDIA Support.

Interfaces
InfiniBand Interface
The network ports of the ConnectX®-6 adapter cards are compliant with the InfiniBand Architecture Specification, Release 1.3. InfiniBand traffic is transmitted through the cards’ QSFP56 connectors.
Ethernet Interfaces
The adapter card includes special circuits to protect from ESD shocks to the card/server when plugging copper cables.
The network ports of the ConnectX-6 adapter card are compliant with the IEEE 802.3 Ethernet standards listed in Features and Benefits. Ethernet traffic is transmitted through the QSFP56/QSFP connectors on the adapter card.
PCI Express Interface
ConnectX®-6 adapter cards support PCI Express Gen 3.0/4.0 (1.1 and 2.0 compatible) through x8/x16 edge connectors. The device can be either a master initiating the PCI Express bus operations or a subordinate responding to PCI bus operations. The following lists PCIe interface features:
· PCIe Gen 3.0 and 4.0 compliant, 2.0 and 1.1 compatible
· 2.5, 5.0, 8.0, or 16.0 GT/s link rate x16/x32
· Auto-negotiates to x32, x16, x8, x4, x2, or x1
· Support for MSI/MSI-X mechanisms
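As a quick cross-check of the PCIe link that was actually negotiated, a Linux host exposes the standard sysfs attributes shown below; this is an illustrative sketch, and the PCI address is a placeholder taken from the lspci examples later in this manual.

cat /sys/bus/pci/devices/0000:86:00.0/current_link_speed    # e.g. 16.0 GT/s PCIe for Gen 4.0
cat /sys/bus/pci/devices/0000:86:00.0/current_link_width    # e.g. 16 for a x16 slot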

LED Interface
There are two I/O LEDs per port:
· LED 1 and 2: Bi-color I/O LED which indicates link status. LED behavior is described below for Ethernet and InfiniBand port configurations.
· LED 3 and 4: Reserved for future use.

LED1 and LED2 Link Status Indications – Ethernet Protocol:

LED Color and State: Description
· Off: A link has not been established
· 1Hz blinking yellow: Beacon command for locating the adapter card
· 4Hz blinking yellow: Indicates an error with the link. The error can be one of the following:
  - I2C error: I2C access to the networking ports fails; the LED blinks until the error is fixed
  - Over-current error: Over-current condition of the networking ports; the LED blinks until the error is fixed
· Solid green: Indicates a valid link with no active traffic
· Blinking green: Indicates a valid link with active traffic

LED1 and LED2 Link Status Indications – InfiniBand Protocol:

LED Color and State: Description
· Off: A link has not been established
· 1Hz blinking yellow: Beacon command for locating the adapter card
· 4Hz blinking yellow: Indicates an error with the link. The error can be one of the following:
  - I2C error: I2C access to the networking ports fails; the LED blinks until the error is fixed
  - Over-current error: Over-current condition of the networking ports; the LED blinks until the error is fixed
· Solid amber: Indicates an active link
· Solid green: Indicates a valid (data activity) link with no active traffic
· Blinking green: Indicates a valid link with active traffic

Heatsink Interface
The heatsink is attached to the ConnectX-6 IC in order to dissipate its heat. It is attached either by using four spring-loaded push pins that insert into four mounting holes, or by screws. The ConnectX-6 IC has a thermal shutdown safety mechanism that automatically shuts down the ConnectX-6 card in case of a high-temperature event, improper thermal coupling, or heatsink removal. For the required airflow (LFM) per OPN, please refer to Specifications. For MCX653105A-HDAL and MCX653106A-HDAL cards, the heatsink is compatible with a cold plate for liquid-cooled Intel® Server System D50TNP platforms only.
SMBus Interface
ConnectX-6 technology maintains support for manageability through a BMC. ConnectX-6 PCIe stand-up adapter can be connected to a BMC using MCTP over SMBus protocol as if it is a standard NVIDIA PCIe stand-up adapter. For configuring the adapter for the specific manageability solution in use by the server, please contact NVIDIA Support.
Voltage Regulators
The voltage regulator power is derived from the PCI Express edge connector 12V supply pins. These voltage supply pins feed on-board regulators that provide the necessary power to the various components on the card.


Hardware Installation
Installation and initialization of ConnectX-6 adapter cards require attention to the mechanical attributes, power specification, and precautions for electronic equipment.
Safety Warnings
Safety warnings are provided here in the English language. For safety warnings in other languages, refer to the Adapter Installation Safety Instructions document available on nvidia.com. Please observe all safety warnings to avoid injury and prevent damage to system components. Note that not all warnings are relevant to all models.
General Installation Instructions: Read all installation instructions before connecting the equipment to the power source.

Jewelry Removal Warning: Before you install or remove equipment that is connected to power lines, remove jewelry such as bracelets, necklaces, rings, watches, and so on. Metal objects heat up when connected to power and ground and can melt down, causing serious burns and/or welding the metal object to the terminals.

Over-temperature: This equipment should not be operated in an area with an ambient temperature exceeding the maximum recommended: 55°C (131°F). An airflow of 200LFM at this maximum ambient temperature is required for HCA cards and NICs. To guarantee proper airflow, allow at least 8cm (3 inches) of clearance around the ventilation openings.

During Lightning – Electrical Hazard: During periods of lightning activity, do not work on the equipment or connect or disconnect cables.

Copper Cable Connecting/Disconnecting: Some copper cables are heavy and not flexible; as such, they should be carefully attached to or detached from the connectors. Refer to the cable manufacturer for special warnings and instructions.

Equipment Installation: This equipment should be installed, replaced, or serviced only by trained and qualified personnel.

Equipment Disposal: The disposal of this equipment should be in accordance with all national laws and regulations.

Local and National Electrical Codes: This equipment should be installed in compliance with local and national electrical codes.
Hazardous Radiation Exposure

· Caution: Use of controls or adjustment or performance of procedures other than those specified herein may result in hazardous radiation exposure. (Applies to products with optical ports.)
· CLASS 1 LASER PRODUCT and reference to the most recent laser standards: IEC 60825-1:1993 + A1:1997 + A2:2001 and EN 60825-1:1994 + A1:1996 + A2:2001

Installation Procedure Overview
The installation procedure of ConnectX-6 adapter cards involves the following steps:

Step: Procedure (Direct Link)
1. Check the system's hardware and software requirements (System Requirements)
2. Pay attention to the airflow consideration within the host system (Airflow Requirements)
3. Follow the safety precautions (Safety Precautions)
4. Unpack the package
5. Follow the pre-installation checklist (Pre-Installation Checklist)
6. (Optional) Replace the full-height mounting bracket with the supplied short bracket (Bracket Replacement Instructions)
7. Install the adapter card in the system:
   · ConnectX-6 PCIe x8/x16 adapter card (ConnectX-6 PCIe x8/x16 Adapter Cards Installation Instructions)
   · ConnectX-6 2x PCIe x16 Socket Direct adapter card (Socket Direct (2x PCIe x16) Cards Installation Instructions)
   · ConnectX-6 card for Intel liquid-cooled platforms (Cards for Intel Liquid-Cooled Platforms Installation Instructions)
8. Connect cables or modules to the card (Cables and Modules)
9. Identify ConnectX-6 in the system (Identifying Your Card)

System Requirements
Hardware Requirements
Unless otherwise specified, NVIDIA products are designed to work in an environmentally controlled data center with low levels of gaseous and dust (particulate) contamination. The operating environment should meet severity level G1 as per ISA 71.04 for gaseous contamination and ISO 14644-1 class 8 for cleanliness level.
For proper operation and performance, please make sure to use a PCIe slot with a corresponding bus width that can supply sufficient power to your card. Refer to the Specifications section of the manual for the power requirements.
Please make sure to install the ConnectX-6 cards in a PCIe slot that is capable of supplying the required power as stated in Specifications.


ConnectX-6 Configuration: Hardware Requirements
· PCIe x8/x16: A system with a PCI Express x8/x16 slot is required for installing the card.
· Cards for liquid-cooled Intel® Server System D50TNP platforms: An Intel® Server System D50TNP platform with an available PCI Express x16 slot is required for installing the card.
· Socket Direct 2x PCIe x8 in a row (single slot): A system with a custom PCI Express x16 slot (four special pins) is required for installing the card. Please refer to PCI Express Pinouts Description for Single-Slot Socket Direct Card for pinout definitions.
· Socket Direct 2x PCIe x16 (dual slots): A system with two PCIe x16 slots is required for installing the cards.

Airflow Requirements
ConnectX-6 adapter cards are offered with two airflow patterns: from the heatsink to the network ports, and vice versa, as shown below. Please refer to the Specifications section for airflow numbers for each specific card model.


Airflow from the heatsink to the network ports

Airflow from the network ports to the heatsink

All cards in the system should be planned with the same airflow direction.
Software Requirements
· See the Operating Systems/Distributions section under the Introduction section.
· Software Stacks: NVIDIA OpenFabrics software package MLNX_OFED for Linux, WinOF-2 for Windows, and VMware. See the Driver Installation section.
Safety Precautions
The adapter is being installed in a system that operates with voltages that can be lethal. Before opening the case of the system, observe the following precautions to avoid injury and prevent damage to system components.
· Remove any metallic objects from your hands and wrists.
· Make sure to use only insulated tools.
· Verify that the system is powered off and is unplugged.
· It is strongly recommended to use an ESD strap or other antistatic devices.
Pre-Installation Checklist
· Unpack the ConnectX-6 card: Unpack and remove the ConnectX-6 card. Check against the package contents list that all the parts have been sent. Check the parts for visible damage that may have occurred during shipping. Please note that the cards must be placed on an antistatic surface. For package contents please refer to Package Contents.
  Please note that if the card is removed hastily from the antistatic bag, the plastic ziplock may harm the EMI fingers on the networking connector. Carefully remove the card from the antistatic bag to avoid damaging the EMI fingers.
· Shut down your system if active: Turn off the power to the system, and disconnect the power cord. Refer to the system documentation for instructions. Before you install the ConnectX-6 card, make sure that the system is disconnected from power.
· (Optional) Check the mounting bracket on the ConnectX-6 or PCIe Auxiliary Connection card: If required for your system, replace the full-height mounting bracket that is shipped mounted on the card with the supplied low-profile bracket. Refer to Bracket Replacement Instructions.
Bracket Replacement Instructions
The ConnectX-6 card and PCIe Auxiliary Connection card are usually shipped with an assembled high-profile bracket. If this form factor is suitable for your requirements, you can skip the remainder of this section and move to Installation Instructions. If you need to replace the high-profile bracket with the short bracket that is included in the shipping box, please follow the instructions in this section.
Due to risk of damaging the EMI gasket, it is not recommended to replace the bracket more than three times.
To replace the bracket you will need the following parts:
· The new brackets of the proper height
· The two screws saved from the removal of the bracket
Removing the Existing Bracket

1. Using a torque driver, remove the two screws holding the bracket in place.
2. Separate the bracket from the ConnectX-6 card.
   Be careful not to put stress on the LEDs on the adapter card.
3. Save the two screws.

Installing the New Bracket

1. Place the bracket onto the card until the screw holes line up.
   Do not force the bracket onto the adapter card.
2. Screw on the bracket using the screws saved from the bracket removal procedure above.
   Use a torque driver to apply up to 2 lbs-in torque on the screws.

Installation Instructions
This section provides detailed instructions on how to install your adapter card in a system. Choose the installation instructions according to the ConnectX-6 configuration you have purchased.

OPNs: Installation Instructions
· MCX651105A-EDAT, MCX653105A-HDAT, MCX653106A-HDAT, MCX653105A-ECAT, MCX653106A-ECAT, MCX653105A-EFAT, MCX653106A-EFAT, MCX683105AN-HDAT: PCIe x8/16 Cards Installation Instructions
· MCX654105A-HCAT, MCX654106A-HCAT, MCX654106A-ECAT: Socket Direct (2x PCIe x16) Cards Installation Instructions
· MCX653105A-HDAL, MCX653106A-HDAL: Cards for Intel Liquid-Cooled Platforms Installation Instructions

Cables and Modules
Cable Installation
1. All cables can be inserted or removed with the unit powered on. 2. To insert a cable, press the connector into the port receptacle until the connector is firmly seated.
a. Support the weight of the cable before connecting the cable to the adapter card. Do this by using a cable holder or tying the cable to the rack.
b. Determine the correct orientation of the connector to the card before inserting the connector. Do not try to insert the connector upside down. This may damage the adapter card.
c. Insert the connector into the adapter card. Be careful to insert the connector straight into the cage. Do not apply any torque, up or down, to the connector cage in the adapter card.
d. Make sure that the connector locks in place.
When installing cables make sure that the latches engage.
Always install and remove cables by pushing or pulling the cable and connector in a straight line with the card.
3. After inserting a cable into a port, the Green LED indicator will light when the physical connection is established (that is, when the unit is powered on and a cable is plugged into the port with the other end of the connector plugged into a functioning port). See LED Interface under the Interfaces section.
4. After plugging in a cable, lock the connector using the latching mechanism particular to the cable vendor. When data is being transferred the Green LED will blink. See LED Interface under the Interfaces section.
5. Care should be taken as not to impede the air exhaust flow through the ventilation holes. Use cable lengths which allow for routing horizontally around to the side of the chassis before bending upward or downward in the rack.


6. To remove a cable, disengage the locks and slowly pull the connector away from the port receptacle. LED indicator will turn off when the cable is unseated.

Identifying the Card in Your System

On Linux

Get the device location on the PCI bus by running lspci and locating lines with the string "Mellanox Technologies":

ConnectX-6 Card Configuration: lspci Command Output Example

Single-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
e3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
05:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
05:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
In the output example above, the first two rows indicate that one card is installed in a PCI slot with PCI Bus address 05 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1. The other card is installed in a PCI slot with PCI Bus address 82 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1.
Since the two PCIe cards are installed in two PCIe slots, each card gets a unique PCI Bus and Device number. Each of the PCIe x16 busses sees two network ports; in effect, the two physical ports of the ConnectX-6 Socket Direct adapter are viewed as four net devices by the system.

Single-port PCIe x8/x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port PCIe x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
86:00.0 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
86:00.1 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
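Once the PCI address is known from the output above, the card's capability and the currently negotiated PCIe link can be compared in one view. This is an optional, illustrative check; the PCI address below is a placeholder taken from the example output.

lspci -s 86:00.0 -vv | grep -E "LnkCap:|LnkSta:"
# LnkCap shows the maximum speed/width the card supports; LnkSta shows what was negotiated,
# for example: LnkSta: Speed 16GT/s, Width x16 (run as root if the capability lines are not shown)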

On Windows

1. Open Device Manager on the server. Click Start => Run, and then enter devmgmt.msc.
2. Expand System Devices and locate your NVIDIA ConnectX-6 adapter card.
3. Right-click your adapter's row and select Properties to display the adapter card properties window.
4. Click the Details tab and select Hardware Ids (Windows 2012/R2/2016) from the Property pull-down menu.


PCI Device (Example)

5. In the Value display box, check the fields VEN and DEV (fields are separated by '&'). In the display example above, notice the sub-string "PCI\VEN_15B3&DEV_1003": VEN is equal to 0x15B3, which is the Vendor ID of NVIDIA, and DEV is equal to 1018 (for ConnectX-6), which is a valid NVIDIA PCI Device ID.

If the PCI device does not have a NVIDIA adapter ID, return to Step 2 to check another device.


The list of NVIDIA PCI Device IDs can be found in the PCI ID repository at http://pci-ids.ucw.cz/read/PC/15b3.
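On a Linux host, the numeric vendor and device IDs can also be listed directly, which is convenient when cross-referencing the PCI ID repository above; this generic example is not taken from the original manual.

lspci -nn | grep -i 15b3        # prints each function with its [vendor:device] IDs, e.g. [15b3:....]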

PCIe x8/16 Cards Installation Instructions
Installing the Card
Applies to OPNs MCX651105A-EDAT, MCX653105A-HDAT, MCX653106A-HDAT, MCX683105AN-HDAT, MCX653106A-ECAT and MCX653105A-ECAT.
Please make sure to install the ConnectX-6 cards in a PCIe slot that is capable of supplying the required power and airflow as stated in
Specifications.

Connect the adapter card in an available PCI Express x16 slot in the chassis.
Step 1: Locate an available PCI Express x16 slot and insert the adapter card to the chassis.
Step 2: Applying even pressure at both corners of the card, insert the adapter card in a PCI Express slot until firmly seated.
Do not use excessive force when seating the card, as this may damage the chassis.

Secure the adapter card to the chassis.
Step 1: Secure the bracket to the chassis with the bracket screw.

Uninstalling the Card

Safety Precautions

The adapter is installed in a system that operates with voltages that can be lethal. Before uninstalling the adapter card, please observe the following precautions to avoid injury and prevent damage to system components.
1. Remove any metallic objects from your hands and wrists.
2. It is strongly recommended to use an ESD strap or other antistatic devices.
3. Turn off the system and disconnect the power cord from the server.

Card Removal

Please note that the following images are for illustration purposes only.
1. Verify that the system is powered off and unplugged.
2. Wait 30 seconds.
3. To remove the card, disengage the retention mechanisms on the bracket (clips or screws).
4. Holding the adapter card from its center, gently pull the ConnectX-6 card out of the PCI Express slot.

Socket Direct (2x PCIe x16) Cards Installation Instructions
The hardware installation section uses the terminology of white and black harnesses to differentiate between the two supplied cables. Due to supply chain variations, some cards may be supplied with two black harnesses instead. To clarify the difference between these two harnesses, one black harness was marked with a “WHITE” label and the other with a “BLACK” label. The Cabline harness marked with “WHITE” label should be connected to the connector on the ConnectX-6 and PCIe card engraved with “White Cable” while the one marked with “BLACK” label should be connected to the connector on the ConnectX-6 and PCIe card engraved with “Black Cable”.
The harnesses' minimal bending radius is 10 mm.

Installing the Card
Applies to MCX654105A-HCAT, MCX654106A-HCAT and MCX654106A-ECAT.
The installation instructions include steps that involve a retention clip to be used while connecting the Cabline harnesses to the cards. Please note that this is an optional accessory.
Please make sure to install the ConnectX-6 cards in a PCIe slot that is capable of supplying the required power and airflow as stated in Specifications.

Connect the adapter card with the Auxiliary connection card using the supplied Cabline CA-II Plus harnesses.
Step 1: Slide the black and white Cabline CA-II Plus harnesses through the retention clip while making sure the clip opening is facing the plugs.

Step 2: Plug the Cabline CA-II Plus harnesses on the ConnectX-6 adapter card while paying attention to the color coding. As indicated on both sides of the card, plug the black harness to the component side and the white harness to the print side.
Step 3: Verify the plugs are locked.
Step 4: Slide the retention clip latches through the cutouts on the PCB. The latches should face the annotation on the PCB.
Step 5: Clamp the retention clip. Verify both latches are firmly locked.
Step 6: Slide the Cabline CA-II Plus harnesses through the retention clip. Make sure that the clip opening is facing the plugs.
Step 7: Plug the Cabline CA-II Plus harnesses on the PCIe Auxiliary Card. As indicated on both sides of the Auxiliary connection card, plug the black harness to the component side and the white harness to the print side.
Step 8: Verify the plugs are locked.
Step 9: Slide the retention clip through the cutouts on the PCB. Make sure the latches are facing the "Black Cable" annotation as seen in the picture below.
Step 10: Clamp the retention clip. Verify both latches are firmly locked.

Connect the ConnectX-6 adapter and PCIe Auxiliary Connection cards in available PCI Express x16 slots in the chassis.
Step 1: Locate two available PCI Express x16 slots.
Step 2: Applying even pressure at both corners of the cards, insert the adapter card in the PCI Express slots until firmly seated.
Do not use excessive force when seating the cards, as this may damage the system or the cards.
Step 3: Applying even pressure at both corners of the cards, insert the Auxiliary Connection card in the PCI Express slots until firmly seated.

Secure the ConnectX-6 adapter and PCIe Auxiliary Connection cards to the chassis.
Step 1: Secure the brackets to the chassis with the bracket screw.

Uninstalling the Card
Safety Precautions
The adapter is installed in a system that operates with voltages that can be lethal. Before uninstalling the adapter card, please observe the following precautions to avoid injury and prevent damage to system components.
1. Remove any metallic objects from your hands and wrists.
2. It is strongly recommended to use an ESD strap or other antistatic devices.
3. Turn off the system and disconnect the power cord from the server.

Card Removal

Please note that the following images are for illustration purposes only.
1. Verify that the system is powered off and unplugged.
2. Wait 30 seconds.
3. To remove the card, disengage the retention mechanisms on the bracket (clips or screws).
4. Holding the adapter card from its center, gently pull the ConnectX-6 and Auxiliary Connections cards out of the PCI Express slot.

Cards for Intel Liquid-Cooled Platforms Installation Instructions
The below instructions apply to ConnectX-6 cards designed for Intel liquid-cooled platforms with an ASIC interposer cooling mechanism. OPNs: MCX653105A-HDAL and MCX653106A-HDAL.
The below figures are for illustration purposes only. The below instructions should be used in conjunction with the server’s documentation.
Installing the Card
Please make sure the system is capable of supplying the required power as stated in Specifications.

Pay extra attention to the black bumpers located on the print side of the card. Failure to do so may harm the bumpers.

Apply the supplied thermal pad (one of the two) on top of the ASIC interposer or onto the cold plate.
The thermal pads are shipped with two protective liners covering the pad on both sides. It is highly important to peel the liners as instructed below prior to applying them to the card.
1. Gently peel the liner from the thermal pad's tacky side.
2. Carefully apply the thermal pad on the cool block (ASIC interposer) while ensuring it thoroughly covers it. Extra care should be taken not to damage the pad. The thermal pad should be applied on the cool block from its tacky (wet) side. The pad should be applied with its non-tacky side facing up.
   OR
   Carefully apply the thermal pad on the cold plate while ensuring it thoroughly covers it. The figure below indicates the position of the thermal pad. Extra care should be taken not to damage the pad. The thermal pad should be applied on the cold plate from its tacky (wet) side. The pad should be applied with its non-tacky side facing up.

3. Ensure the thermal pad is in place and intact.
4. Once the thermal pad is applied to the ASIC interposer, the non-tacky side should be visible on the card's faceplate.
5. Gently peel the liner of the pad's non-tacky side visible on the card's faceplate. Failure to do so may degrade the thermal performance of the product.
Install the adapter into the riser and attach the card to the PCIe x16 slot.

1. Disengage the adapter riser from the blade. Please refer to the blade documentation for instructions.
2. Applying even pressure at both corners of the card, insert the adapter card into the adapter riser until firmly seated. Care must be taken not to harm the black bumpers located on the print side of the card.

Vertically insert the riser populated with the adapter card into the server blade.

1. Applying even pressure on the riser, gently insert the riser into the server.
2. Secure the riser with the supplied screws. Please refer to the server blade documentation for more information.

Driver Installation
Please use the relevant driver installation section.
ConnectX-6 Socket Direct cards 2x PCIe x16 (OPNs: MCX654106A-HCAT and MCX654106A-ECAT) are not supported in Windows and WinOF-2.

· Linux Driver Installation · Windows Driver Installation · VMware Driver Installation

Linux Driver Installation
This section describes how to install and test the MLNX_OFED for Linux package on a single server with a NVIDIA ConnectX-6 adapter card installed.

Prerequisites

Requirement: Description
· Platforms: A server platform with a ConnectX-6 InfiniBand/Ethernet adapter card installed.
· Required Disk Space for Installation: 1GB
· Operating System: Linux operating system. For the list of supported operating system distributions and kernels, please refer to the MLNX_OFED Release Notes.
· Installer Privileges: The installation requires administrator (root) privileges on the target machine.
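The prerequisites above can be checked quickly on the target host with standard Linux utilities; the sketch below is illustrative and the exact output depends on your distribution.

uname -r                # kernel version, to be matched against the MLNX_OFED Release Notes
cat /etc/os-release     # distribution name and version
df -h /tmp              # confirm that enough free disk space (at least 1GB) is available
id -u                   # prints 0 when running with root privileges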


Downloading NVIDIA OFED

1. Verify that the system has an NVIDIA network adapter installed by running the lspci command. The below table provides output examples per ConnectX-6 card configuration.
ConnectX-6 Card Configuration: lspci Command Output Example

Single-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
e3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port Socket Direct Card (2x PCIe x16):
[root@mftqa-009 ~]# lspci |grep mellanox -i
05:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
05:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.0 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
82:00.1 Infiniband controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
In the output example above, the first two rows indicate that one card is installed in a PCI slot with PCI Bus address 05 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1. The other card is installed in a PCI slot with PCI Bus address 82 (hexadecimal), PCI Device number 00 and PCI Function numbers 0 and 1. Since the two PCIe cards are installed in two PCIe slots, each card gets a unique PCI Bus and Device number. Each of the PCIe x16 busses sees two network ports; in effect, the two physical ports of the ConnectX-6 Socket Direct adapter are viewed as four net devices by the system.

Single-port PCIe x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
a3:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Dual-port PCIe x16 Card:
[root@mftqa-009 ~]# lspci |grep mellanox -i
86:00.0 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]
86:00.1 Network controller: Mellanox Technologies MT28908A0 Family [ConnectX-6]

2. Download the ISO image to your host. The image's name has the format MLNX_OFED_LINUX-<version>-<OS label>.iso. You can download and install the latest OpenFabrics Enterprise Distribution (OFED) software package available via the NVIDIA web site at nvidia.com/en-us/networking => Products => Software => InfiniBand Drivers => NVIDIA MLNX_OFED.
   i. Scroll down to the Download wizard, and click the Download tab.
   ii. Choose the relevant package depending on your host operating system.
   iii. Click the desired ISO/tgz package.
   iv. To obtain the download link, accept the End User License Agreement (EULA).
3. Use the Hash utility to confirm the file integrity of your ISO image. Run the following command and compare the result to the value provided on the download page.
   SHA256 MLNX_OFED_LINUX-<version>-<OS label>.iso
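For example, on most Linux distributions the checksum can be computed with the sha256sum utility; the file name below is a placeholder for the ISO you actually downloaded.

sha256sum MLNX_OFED_LINUX-<version>-<OS label>.iso
# Compare the printed digest with the value shown on the download page.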
Installing MLNX_OFED
Installation Script
The installation script, mlnxofedinstall, performs the following:
· Discovers the currently installed kernel
· Uninstalls any software stacks that are part of the standard operating system distribution or another vendor's commercial stack

· Installs the MLNX_OFED_LINUX binary RPMs (if they are available for the current kernel)
· Identifies the currently installed InfiniBand and Ethernet network adapters and automatically upgrades the firmware
Note: To perform a firmware upgrade using customized firmware binaries, a path can be provided to the folder that contains the firmware binary files by running --fw-image-dir. Using this option, the firmware version embedded in the MLNX_OFED package will be ignored. Example:
./mlnxofedinstall --fw-image-dir /tmp/my_fw_bin_files
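If the NVIDIA Firmware Tools (MFT) listed under Related Documentation are installed, the firmware that ends up on the adapter can be queried after the installation completes. This is an optional cross-check, not part of the installation script itself.

mst start                 # load the MFT access drivers and enumerate devices
mlxfwmanager --query      # report the PSID and firmware version of each detected adapter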
If the driver detects unsupported cards on the system, it will abort the installation procedure. To avoid this, make sure to add the --skip-unsupported-devices-check flag during installation.

Usage

/mnt/mlnxofedinstall [OPTIONS]

The installation script removes all previously installed OFED packages and re-installs from scratch. You will be prompted to acknowledge the deletion of the old packages.
Pre-existing configuration files will be saved with the extension “.conf.rpmsave”.
· If you need to install OFED on an entire (homogeneous) cluster, a common strategy is to mount the ISO image on one of the cluster nodes and then copy it to a shared file system such as NFS. To install on all the cluster nodes, use cluster-aware tools (such as pdsh).
· If your kernel version does not match any of the offered pre-built RPMs, you can add your kernel version by using the mlnx_add_kernel_support.sh script located inside the MLNX_OFED package.
  On Red Hat and SLES distributions with an errata kernel installed, there is no need to use the mlnx_add_kernel_support.sh script. The regular installation can be performed, and the weak-updates mechanism will create symbolic links to the MLNX_OFED kernel modules.

If you regenerate kernel modules for a custom kernel (using --add-kernel-support), the package installation will not involve automatic regeneration of the initramfs. In some cases, such as a system with a root filesystem mounted over a ConnectX card, not regenerating the initramfs may even cause the system to fail to reboot. In such cases, the installer will recommend running the following command to update the initramfs:
dracut -f
On some OSs, dracut -f might result in the following error message, which can be safely ignored:
libkmod: kmod_module_new_from_path: kmod_module 'mdev' already exists with different path
The "mlnx_add_kernel_support.sh" script can be executed directly from the mlnxofedinstall script. For further information, please see the '--add-kernel-support' option below.
On Ubuntu and Debian distributions, driver installation uses the Dynamic Kernel Module Support (DKMS) framework. Thus, the drivers' compilation takes place on the host during MLNX_OFED installation. Therefore, using "mlnx_add_kernel_support.sh" is irrelevant on Ubuntu and Debian distributions.
Example: The following command will create a MLNX_OFED_LINUX ISO image for RedHat 7.3 under the /tmp directory.

./MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64/mlnx_add_kernel_support.sh -m /tmp/MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64/ --make-tgz
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.3 under /tmp directory.
All Mellanox, OEM, OFED, or Distribution IB packages will be removed.
Do you want to continue?[y/N]:y
See log file /tmp/mlnx_ofed_iso.21642.log
Building OFED RPMs. Please wait...
Removing OFED RPMs...
Created /tmp/MLNX_OFED_LINUX-x.x-x-rhel7.3-x86_64-ext.tgz

· The script adds the following lines to /etc/security/limits.conf for the userspace components such as MPI:
* soft memlock unlimited
* hard memlock unlimited

· These settings set the amount of memory that can be pinned by a userspace application to unlimited. If desired, tune the value unlimited to a specific amount of RAM.
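To confirm that the new limits are in effect, the locked-memory limit can be checked from a fresh login shell; this is a generic Linux check, not part of the installer:
ulimit -l
The command should report "unlimited" (or the specific value you configured in /etc/security/limits.conf).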
For your machine to be part of the InfiniBand/VPI fabric, a Subnet Manager must be running on one of the fabric nodes. At this point, OFED for Linux has already installed the OpenSM Subnet Manager on your machine. For the list of installation options, run:
./mlnxofedinstall --h
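As referenced in the cluster note above, a minimal sketch of a cluster-wide installation using pdsh might look as follows; the node list, NFS path, and ISO name are placeholders and must be adapted to your environment:
pdsh -w node[01-16] 'mount -o ro,loop /nfs/MLNX_OFED_LINUX-<ver>-<OS label>.iso /mnt && /mnt/mlnxofedinstall --force'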
Installation Procedure
This section describes the installation procedure of MLNX_OFED on NVIDIA adapter cards.
a. Log in to the installation machine as root.
b. Mount the ISO image on your machine.
host1# mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
c. Run the installation script.
/mnt/mlnxofedinstall
Logs dir: /tmp/MLNX_OFED_LINUX-x.x-x.logs
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.
Starting MLNX_OFED_LINUX-x.x.x installation ...
........
........
Installation finished successfully.
Attempting to perform Firmware update...
Querying Mellanox devices firmware ...

For unattended installation, use the --force installation option while running the MLNX_OFED installation script:
/mnt/mlnxofedinstall --force
MLNX_OFED for Ubuntu should be installed with the following flags in a chroot environment:
./mlnxofedinstall --without-dkms --add-kernel-support --kernel <kernel version in chroot> --without-fw-update --force
For example:
./mlnxofedinstall --without-dkms --add-kernel-support --kernel 3.13.0-85-generic --without-fw-update --force
Note that the path to kernel sources (--kernel-sources) should be added if the sources are not in their default location.
In case your machine has the latest firmware, no firmware update will occur and the installation script will print a message similar to the following at the end of the installation:
Device #1:
----------
Device Type: ConnectX-X
Part Number: MCXXXX-XXX
PSID: MT
PCI Device Name: 0b:00.0
Base MAC: 0000e41d2d5cf810
Versions: Current Available
FW XX.XX.XXXX
Status: Up to date
In case your machine has an unsupported network adapter device, no firmware update will occur, and one of the error messages below will be printed. Please contact your hardware vendor for help with firmware updates.
Error message #1:
Device #1:
----------
Device Type: ConnectX-X
Part Number: MCXXXX-XXX
PSID: MT_
PCI Device Name: 0b:00.0
Base MAC: 0000e41d2d5cf810
Versions: Current Available
FW XX.XX.XXXX
Status: No matching image found
Error message #2:
The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033)
To obtain firmware for this device, please contact your HW vendor.

d. Case A: If the installation script has performed a firmware update on your network adapter, you need to either restart the driver or reboot your system before the firmware update can take effect. Refer to the table below to find the appropriate action for your specific card.

Adapter | Required Action
Standard ConnectX-4/ConnectX-4 Lx or higher | Driver Restart
Adapters with Multi-Host Support | Standard Reboot (Soft Reset)
Socket Direct Cards | Cold Reboot (Hard Reset)

Case B: If the installation script has not performed a firmware upgrade on your network adapter, restart the driver by running: "/etc/init.d/openibd restart".
e. (InfiniBand only) Run the hca_self_test.ofed utility to verify whether or not the InfiniBand link is up. The utility also checks for and displays additional information such as:
· HCA firmware version
· Kernel architecture
· Driver version
· Number of active HCA ports along with their states
· Node GUID
For more details on hca_self_test.ofed, see the file docs/readme_and_user_manual/hca_self_test.readme.
After installation completion, information about the OFED installation, such as prefix, kernel version, and installation parameters, can be retrieved by running the command /etc/infiniband/info. Most of the OFED components can be configured or reconfigured after the installation by modifying the relevant configuration files. See the relevant chapters in this manual for details. The list of the modules that will be loaded automatically upon boot can be found in the /etc/infiniband/openib.conf file.
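For example, the two paths mentioned above can be queried directly after installation (run as root):
/etc/infiniband/info
cat /etc/infiniband/openib.conf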
Installing OFED will replace the RDMA stack and remove existing 3rd party RDMA connectors.

Installation Results

Software:
· Most of the MLNX_OFED packages are installed under the /usr directory, except for the following packages, which are installed under the /opt directory:
· fca and ibutils
· iproute2 (rdma tool), installed under /opt/Mellanox/iproute2/sbin/rdma
· The kernel modules are installed under (see the quick check following this table):
· /lib/modules/`uname -r`/updates on SLES and Fedora distributions
· /lib/modules/`uname -r`/extra/mlnx-ofa_kernel on RHEL and other RedHat-like distributions
· /lib/modules/`uname -r`/updates/dkms/ on Ubuntu

Firmware:
· The firmware of existing network adapter devices will be updated if the following two conditions are fulfilled:
· The installation script is run in default mode, that is, without the option --without-fw-update
· The firmware version of the adapter device is older than the firmware version included with the OFED ISO image
Note: If an adapter's Flash was originally programmed with an Expansion ROM image, the automatic firmware update will also burn an Expansion ROM image.
· In case your machine has an unsupported network adapter device, no firmware update will occur, and the error message below will be printed:
"The firmware for this device is not distributed inside NVIDIA driver: 0000:01:00.0 (PSID: IBM2150110033) To obtain firmware for this device, please contact your HW vendor."
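As a quick check of which mlx5_core module the system resolves (and from which of the paths listed above), modinfo can be used; this is a generic Linux command, not part of MLNX_OFED:
modinfo mlx5_core | grep -E '^(filename|version)'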

Installation Logging
While installing MLNX_OFED, the install log for each selected package will be saved in a separate log file. The path to the directory containing the log files will be displayed after running the installation script, as in the following example:
Logs dir: /tmp/MLNX_OFED_LINUX-4.4-1.0.0.0.IBMM2150110033.logs


Driver Load Upon System Boot
Upon system boot, the NVIDIA drivers will be loaded automatically. To prevent the automatic load of the NVIDIA drivers upon system boot:
a. Add the following lines to the “/etc/modprobe.d/mlnx.conf” file.

blacklist mlx5_core
blacklist mlx5_ib
b. Set "ONBOOT=no" in the "/etc/infiniband/openib.conf" file.
c. If the modules exist in the initramfs file, they can automatically be loaded by the kernel. To prevent this behavior, update the initramfs using the operating system's standard tools. Note: The process of updating the initramfs will add the blacklists from step a, and will prevent the kernel from loading the modules automatically.
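A minimal sketch of step c, assuming the blacklist file from step a is already in place; dracut is the tool suggested earlier in this chapter for RHEL/SLES-style systems, and update-initramfs is the usual Debian/Ubuntu equivalent:
dracut -f
update-initramfs -u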

mlnxofedinstall Return Codes

The table below lists the mlnxofedinstall script return codes and their meanings.

Return Code | Meaning
0 | The installation ended successfully
1 | The installation failed
2 | No firmware was found for the adapter device
22 | Invalid parameter
28 | Not enough free space
171 | Not applicable to this system configuration. This can occur when the required hardware is not present on the system.
172 | Prerequisites are not met. For example, the required software is not installed or the hardware is not configured correctly.
173 | Failed to start the mst driver
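A short sketch of how these return codes might be consumed by an automation wrapper (illustrative only; --force is the unattended option described earlier):
/mnt/mlnxofedinstall --force
rc=$?
if [ "$rc" -ne 0 ]; then
    echo "MLNX_OFED installation failed with return code $rc" >&2
fi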


Uninstalling MLNX_OFED
Use the script /usr/sbin/ofed_uninstall.sh to uninstall the MLNX_OFED package. The script is part of the ofed-scripts RPM.


Additional Installation Procedures

Installing MLNX_OFED Using YUM
This type of installation is applicable to RedHat/OL and Fedora operating systems.
Setting up MLNX_OFED YUM Repository
a. Log into the installation machine as root.
b. Mount the ISO image on your machine and copy its content to a shared location in your network.
mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt

c. Download and install NVIDIA’s GPG-KEY: The key can be downloaded via the following link: http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox

wget http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
--2018-01-25 13:52:30-- http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox
Resolving www.mellanox.com... 72.3.194.0
Connecting to www.mellanox.com|72.3.194.0|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1354 (1.3K) [text/plain]
Saving to: 'RPM-GPG-KEY-Mellanox'

100%[=================================================>] 1,354 --.-K/s in 0s

2018-01-25 13:52:30 (247 MB/s) - 'RPM-GPG-KEY-Mellanox' saved [1354/1354]

d. Install the key.

sudo rpm --import RPM-GPG-KEY-Mellanox


warning: rpmts_HdrFromFdno: Header V3 DSA/SHA1 Signature, key ID 6224c050: NOKEY
Retrieving key from file:///repos/MLNX_OFED//RPM-GPG-KEY-Mellanox
Importing GPG key 0x6224C050:
Userid: "Mellanox Technologies (Mellanox Technologies - Signing Key v2) <support@mellanox.com>"
From : /repos/MLNX_OFED//RPM-GPG-KEY-Mellanox
Is this ok [y/N]:
e. Check that the key was successfully imported.

rpm -q gpg-pubkey --qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n' | grep Mellanox
gpg-pubkey-a9e4b643-520791ba gpg(Mellanox Technologies <support@mellanox.com>)
f. Create a yum repository configuration file called “/etc/yum.repos.d/mlnx_ofed.repo” with the following content:

[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file:///<path to extracted MLNX_OFED package>/RPMS
enabled=1
gpgkey=file:///<path to the downloaded key RPM-GPG-KEY-Mellanox>
gpgcheck=1
g. Check that the repository was successfully added.

yum repolist
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
repo id        repo name                              status
mlnx_ofed      MLNX_OFED Repository                   108
rpmforge       RHEL 6Server - RPMforge.net - dag      4,597
repolist: 8,351

Setting up MLNX_OFED YUM Repository Using –add-kernel-support
a. Log into the installation machine as root.
b. Mount the ISO image on your machine and copy its content to a shared location in your network.
mount -o ro,loop MLNX_OFED_LINUX-<ver>-<OS label>-<CPU arch>.iso /mnt
c. Build the packages with kernel support and create the tarball.

/mnt/mlnx_add_kernel_support.sh --make-tgz <optional --kmp> -k $(uname -r) -m /mnt/
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.6 under /tmp directory.
Do you want to continue?[y/N]:y
See log file /tmp/mlnx_iso.4120_logs/mlnx_ofed_iso.4120.log
Checking if all needed packages are installed...
Building MLNX_OFED_LINUX RPMS . Please wait...
Creating metadata-rpms for 3.10.0-957.21.3.el7.x86_64 ...
WARNING: If you are going to configure this package as a repository, then please note
WARNING: that it contains unsigned rpms, therefore, you need to disable the gpgcheck
WARNING: by setting 'gpgcheck=0' in the repository conf file.
Created /tmp/MLNX_OFED_LINUX-5.2-0.5.5.0-rhel7.6-x86_64-ext.tgz
d. Open the tarball.

# cd /tmp/
# tar -xvf /tmp/MLNX_OFED_LINUX-5.2-0.5.5.0-rhel7.6-x86_64-ext.tgz

e. Create a YUM repository configuration file called “/etc/yum.repos.d/mlnx_ofed.repo” with the following content:
[mlnx_ofed]
name=MLNX_OFED Repository
baseurl=file:///<path to extracted MLNX_OFED package>/RPMS
enabled=1
gpgcheck=0
f. Check that the repository was successfully added.

yum repolist
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
repo id        repo name                              status
mlnx_ofed      MLNX_OFED Repository                   108
rpmforge       RHEL 6Server - RPMforge.net - dag      4,597
repolist: 8,351

Installing MLNX_OFED Using the YUM Tool
After setting up the YUM repository for the MLNX_OFED package, perform the following:
a. View the available package groups by invoking:

yum search mlnx-ofed
mlnx-ofed-all.noarch : MLNX_OFED all installer package (with KMP support)
mlnx-ofed-all-user-only.noarch : MLNX_OFED all-user-only installer package (User Space packages only)
mlnx-ofed-basic.noarch : MLNX_OFED basic installer package (with KMP support)
mlnx-ofed-basic-user-only.noarch : MLNX_OFED basic-user-only installer package (User Space packages only)
mlnx-ofed-bluefield.noarch : MLNX_OFED bluefield installer package (with KMP support)
mlnx-ofed-bluefield-user-only.noarch : MLNX_OFED bluefield-user-only installer package (User Space packages only)
mlnx-ofed-dpdk.noarch : MLNX_OFED dpdk installer package (with KMP support)
mlnx-ofed-dpdk-upstream-libs.noarch : MLNX_OFED dpdk-upstream-libs installer package (with KMP support)
mlnx-ofed-dpdk-upstream-libs-user-only.noarch : MLNX_OFED dpdk-upstream-libs-user-only installer package (User Space packages only)
mlnx-ofed-dpdk-user-only.noarch : MLNX_OFED dpdk-user-only installer package (User Space packages only)
mlnx-ofed-eth-only-user-only.noarch : MLNX_OFED eth-only-user-only installer package (User Space packages only)
mlnx-ofed-guest.noarch : MLNX_OFED guest installer package (with KMP support)
mlnx-ofed-guest-user-only.noarch : MLNX_OFED guest-user-only installer package (User Space packages only)
mlnx-ofed-hpc.noarch : MLNX_OFED hpc installer package (with KMP support)
mlnx-ofed-hpc-user-only.noarch : MLNX_OFED hpc-user-only installer package (User Space packages only)
mlnx-ofed-hypervisor.noarch : MLNX_OFED hypervisor installer package (with KMP support)
mlnx-ofed-hypervisor-user-only.noarch : MLNX_OFED hypervisor-user-only installer package (User Space packages only)
mlnx-ofed-kernel-only.noarch : MLNX_OFED kernel-only installer package (with KMP support)
mlnx-ofed-vma.noarch : MLNX_OFED vma installer package (with KMP support)
mlnx-ofed-vma-eth.noarch : MLNX_OFED vma-eth installer package (with KMP support)
mlnx-ofed-vma-eth-user-only.noarch : MLNX_OFED vma-eth-user-only installer package (User Space packages only)
mlnx-ofed-vma-user-only.noarch : MLNX_OFED vma-user-only installer package (User Space packages only)
mlnx-ofed-vma-vpi.noarch : MLNX_OFED vma-vpi installer package (with KMP support)
mlnx-ofed-vma-vpi-user-only.noarch : MLNX_OFED vma-vpi-user-only installer package (User Space packages only)

where:
mlnx-ofed-all | Installs all available packages in MLNX_OFED
mlnx-ofed-basic | Installs basic packages required for running NVIDIA cards
mlnx-ofed-guest | Installs packages required by guest OS
mlnx-ofed-hpc | Installs packages required for HPC
mlnx-ofed-hypervisor | Installs packages required by hypervisor OS
mlnx-ofed-vma | Installs packages required by VMA
mlnx-ofed-vma-eth | Installs packages required by VMA to work over Ethernet
mlnx-ofed-vma-vpi | Installs packages required by VMA to support VPI
bluefield | Installs packages required for BlueField
dpdk | Installs packages required for DPDK
dpdk-upstream-libs | Installs packages required for DPDK using RDMA-Core
kernel-only | Installs packages required for a non-default kernel

Note: MLNX_OFED provides kernel module RPM packages with KMP support for RHEL and SLES. For other operating systems, kernel module RPM packages are provided only for the operating system's default kernel. In this case, the group RPM packages have the supported kernel version in their package's name. Example:

mlnx-ofed-all-3.17.4-301.fc21.x86_64.noarch : MLNX_OFED all installer package for kernel 3.17.4-301.fc21.x86_64 (without KMP support)
mlnx-ofed-basic-3.17.4-301.fc21.x86_64.noarch : MLNX_OFED basic installer package for kernel 3.17.4-301.fc21.x86_64 (without KMP support)

where:
mlnx-ofed-all | MLNX_OFED all installer package
mlnx-ofed-basic | MLNX_OFED basic installer package
mlnx-ofed-vma | MLNX_OFED vma installer package
mlnx-ofed-hpc | MLNX_OFED HPC installer package
mlnx-ofed-vma-eth | MLNX_OFED vma-eth installer package
mlnx-ofed-vma-vpi | MLNX_OFED vma-vpi installer package
knem-dkms | MLNX_OFED DKMS support for mlnx-ofed kernel modules
kernel-dkms | MLNX_OFED kernel-dkms installer package
kernel-only | MLNX_OFED kernel-only installer package
bluefield | MLNX_OFED bluefield installer package
mlnx-ofed-all-exact | MLNX_OFED mlnx-ofed-all-exact installer package
dpdk | MLNX_OFED dpdk installer package
mlnx-ofed-basic-exact | MLNX_OFED mlnx-ofed-basic-exact installer package
dpdk-upstream-libs | MLNX_OFED dpdk-upstream-libs installer package

b. Install the desired group.
apt-get install '<package group name>'
Example:
apt-get install mlnx-ofed-all
Installing MLNX_OFED using the "apt-get" tool does not automatically update the firmware. To update the firmware to the version included in the MLNX_OFED package, run:
# apt-get install mlnx-fw-updater
Performance Tuning
Depending on the application of the user's system, it may be necessary to modify the default configuration of the ConnectX-based network adapters. If tuning is required, please refer to the Performance Tuning Guide for NVIDIA Network Adapters.

Windows Driver Installation
The Windows driver is currently not supported for the following ConnectX-6 OPNs:
· MCX654106A-HCAT
· MCX654106A-ECAT
For Windows, download and install the latest WinOF-2 for Windows software package available via the NVIDIA website at: WinOF-2 webpage. Follow the installation instructions included in the download package (also available from the download page). The snapshots in the following sections are presented for illustration purposes only. The installation interface may slightly vary, depending on the operating system in use.

Software Requirements
The following operating systems are supported; all use the MLNX_WinOF2-<version>_All_x64.exe package:
· Windows Server 2022
· Windows Server 2019
· Windows Server 2016
· Windows Server 2012 R2
· Windows 11 Client (64 bit only)
· Windows 10 Client (64 bit only)
· Windows 8.1 Client (64 bit only)

Note: The Operating System listed above must run with administrator privileges.

Downloading WinOF-2 Driver
To download the .exe file according to your operating system, please follow the steps below:
1. Obtain the machine architecture.
a. To go to the Start menu, position your mouse in the bottom-right corner of the Remote Desktop of your screen.
b. Open a CMD console (click Task Manager -> File -> Run new task and enter CMD).
c. Enter the following command.
echo %PROCESSOR_ARCHITECTURE%
On an x64 (64-bit) machine, the output will be “AMD64”.


2. Go to the WinOF-2 web page at: https://www.nvidia.com/en-us/networking/ > Products > Software > InfiniBand Drivers (Learn More) > Nvidia WinOF-2.
3. Download the .exe image according to the architecture of your machine (see Step 1). The name of the .exe file has the following format: MLNX_WinOF2-<version>_All_x64.exe.
Installing the incorrect .exe file is prohibited. If you do so, an error message will be displayed.
For example, if you install a 64-bit .exe on a 32-bit machine, the wizard will display the following (or a similar) error message: “The installation package is not supported by this processor type. Contact your vendor”

Installing WinOF-2 Driver

The snapshots in the following sections are for illustration purposes only. The installation interface may slightly vary, depending on the used operating system. This section provides instructions for two types of installation procedures, and both require administrator privileges:
· Attended Installation An installation procedure that requires frequent user intervention.
· Unattended Installation An automated installation procedure that requires no user intervention.
Attended Installation
The following is an example of an installation session.
1. Double click the .exe and follow the GUI instructions to install MLNX_WinOF2.
2. [Optional] Manually configure your setup to contain the logs option (replace "LogFile" with the relevant directory).
MLNX_WinOF2_All_Arch.exe /v"/l*vx [LogFile]"
3. [Optional] If you do not want to upgrade your firmware version (i.e., MT_SKIPFWUPGRD default value is False), run:

MLNX_WinOF2_All_Arch.exe /v" MT_SKIPFWUPGRD=1"
4. [Optional] If you do not want to install the Rshim driver, run:
MLNX_WinOF2_All_Arch.exe /v" MT_DISABLE_RSHIM_INSTALL=1"
The Rshim driver installation will fail if a prior Rshim driver is already installed. The following failure message will be displayed in the log:
"ERROR!!! Installation failed due to following errors: MlxRshim drivers installation disabled and MlxRshim drivers Installed, Please remove the following oem inf files from driver store: "
5. [Optional] If you want to skip the check for unsupported devices, run:
MLNX_WinOF2_All_Arch.exe /v" SKIPUNSUPPORTEDDEVCHECK=1"
6. Click Next in the Welcome screen.

7. Read and accept the license agreement and click Next.

8. Select the target folder for the installation.

9. The firmware upgrade screen will be displayed in the following cases:
· If the user has an OEM card. In this case, the firmware will not be displayed.
· If the user has a standard NVIDIA® card with an older firmware version, the firmware will be updated accordingly. However, if the user has both an OEM card and a NVIDIA® card, only the NVIDIA® card will be updated.

10. Select a Complete or Custom installation; to customize, follow Step a onward.

a. Select the desired features to install:
· Performance tools – installs the tools that are used to measure performance in the user environment
· Documentation – contains the User Manual and Release Notes
· Management tools – installs tools used for management, such as mlxstat
· Diagnostic Tools – installs tools used for diagnostics, such as mlx5cmd

b. Click Next to install the desired tools.

11. Click Install to start the installation.
12. In case the firmware upgrade option was checked in Step 7, you will be notified if a firmware upgrade is required.


13. Click Finish to complete the installation.
Unattended Installation
If no reboot options are specified, the installer restarts the computer whenever necessary without displaying any prompt or warning to the user.
To control the reboots, use the /norestart or /forcerestart standard command-line options. The following is an example of an unattended installation session.
1. Open a CMD console (click Start -> Task Manager -> File -> Run new task, and enter CMD).

2. Install the driver. Run:
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /S /v/qn
3. [Optional] Manually configure your setup to contain the logs option:
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /S /v/qn /v"/l*vx [LogFile]"
4. [Optional] If you wish to control whether to install the ND provider or not (i.e., MT_NDPROPERTY default value is True):
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /vMT_NDPROPERTY=1
5. [Optional] If you do not wish to upgrade your firmware version (i.e., MT_SKIPFWUPGRD default value is False):
MLNX_WinOF2-[Driver/Version]_All_Arch.exe /vMT_SKIPFWUPGRD=1
6. [Optional] If you do not want to install the Rshim driver, run:
MLNX_WinOF2_All_Arch.exe /v" MT_DISABLE_RSHIM_INSTALL=1"
The Rshim driver installation will fail if a prior Rshim driver is already installed. The following failure message will be displayed in the log:
"ERROR!!! Installation failed due to following errors: MlxRshim drivers installation disabled and MlxRshim drivers Installed, Please remove the following oem inf files from driver store: "
7. [Optional] If you want to enable the default configuration for Rivermax, run:
MLNX_WinOF2_All_Arch.exe /v"MT_RIVERMAX=1 /l*vx C:\Users\log.txt"
8. [Optional] If you want to skip the check for unsupported devices, run:

MLNX_WinOF2_All_Arch.exe /v" SKIPUNSUPPORTEDDEVCHECK=1"

Firmware Upgrade
If the machine has a standard NVIDIA® card with an older firmware version, the firmware will be automatically updated as part of the NVIDIA® WinOF-2 package installation. For information on how to upgrade firmware manually, please refer to MFT User Manual.
If the machine has a DDA (pass through) facility, firmware update is supported only in the Host. Therefore, to update the firmware, the following must be performed:
1. Return the network adapters to the Host. 2. Update the firmware according to the steps in the MFT User Manual. 3. Attach the adapters back to VM with the DDA tools.

VMware Driver Installation
This section describes VMware Driver Installation.

Software Requirements

Requirement | Description
Platforms | A server platform with an adapter card based on ConnectX®-6 (InfiniBand/EN) (firmware: fw-ConnectX6)
Operating System | ESXi 6.5
Installer Privileges | The installation requires administrator privileges on the target machine.


Installing NATIVE ESXi Driver for VMware vSphere

Please uninstall all previous driver packages prior to installing the new version.
To install the driver:
1. Log into the ESXi server with root permissions.
2. Install the driver.

> esxcli software vib install -d <path>/<bundle_file>
Example:

> esxcli software vib install -d /tmp/MLNX-NATIVE-ESX-ConnectX-4-5_4.16.8.8-10EM-650.0.0.4240417.zip
3. Reboot the machine.
4. Verify the driver was installed successfully.

esxcli software vib list | grep nmlx
nmlx5-core    4.16.8.8-1OEM.650.0.0.4240417    MEL    PartnerSupported    2017-01-31
nmlx5-rdma    4.16.8.8-1OEM.650.0.0.4240417    MEL    PartnerSupported    2017-01-31

After the installation process, all kernel modules are loaded automatically upon boot.

Removing Earlier NVIDIA Drivers
Please unload the previously installed drivers before removing them.
To remove all the drivers:


1. Log into the ESXi server with root permissions.
2. List all the existing NATIVE ESXi driver modules. (See Step 4 in Installing NATIVE ESXi Driver for VMware vSphere.)
3. Remove each module:

#> esxcli software vib remove -n nmlx5-rdma
#> esxcli software vib remove -n nmlx5-core
To remove the modules, you must run the command in the same order as shown in the example above.
4. Reboot the server.
Firmware Programming
1. Download the VMware bootable binary images v4.6.0 from the Firmware Tools (MFT) site.
a. ESXi 6.5 File: mft-4.6.0.48-10EM-650.0.0.4598673.x86_64.vib
b. MD5SUM: 0804cffe30913a7b4017445a0f0adbe1
2. Install the image according to the steps described in the MFT User Manual.
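A minimal sketch of step 2 when performed directly on the ESXi host, assuming the .vib file from step 1 has been copied to /tmp (the MFT User Manual remains the authoritative procedure):
> esxcli software vib install -v /tmp/mft-4.6.0.48-10EM-650.0.0.4598673.x86_64.vib
Reboot the host after the installation completes so that the tools are loaded.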
The following procedure requires custom boot image downloading, mounting and booting from a USB device.

Troubleshooting

General Troubleshooting
Server unable to find the adapter:
· Ensure that the adapter is placed correctly
· Make sure the adapter slot and the adapter are compatible
· Install the adapter in a different PCI Express slot
· Use the drivers that came with the adapter or download the latest
· Make sure your motherboard has the latest BIOS
· Try to reboot the server

The adapter no longer works:
· Reseat the adapter in its slot or a different slot, if necessary
· Try using another cable
· Reinstall the drivers, as the network driver files may be damaged or deleted
· Reboot the server

Adapters stopped working after installing another adapter:
· Try removing and re-installing all adapters
· Check that cables are connected properly
· Make sure your motherboard has the latest BIOS

Link indicator light is off:
· Try another port on the switch
· Make sure the cable is securely attached
· Check that you are using the proper cables that do not exceed the recommended lengths
· Verify that your switch and adapter port are compatible

Link light is on, but with no communication established:
· Check that the latest driver is loaded
· Check that both the adapter and its link are set to the same speed and duplex settings

Event message received of insufficient power:
· When [adapter's current power consumption] > [PCIe slot advertised power limit], a warning message appears in the server's system event logs (e.g., dmesg: "Detected insufficient power on the PCIe slot")
· It is recommended to use a PCIe slot that can supply enough power.
· If a message of the following format appears, "mlx5_core 0003:01:00.0: port_module:254:(pid 0): Port module event[error]: module 0, Cable error, One or more network ports have been powered down due to insufficient/unadvertised power on the PCIe slot", please upgrade your adapter's firmware.
· If the message remains, please consider switching from Active Optical Cable (AOC) or transceiver to Direct Attached Copper (DAC) connectivity.


Linux Troubleshooting

Environment Information: cat /etc/issue, uname -a, cat /proc/cpuinfo | grep 'model name' | uniq, ofed_info -s, ifconfig -a, ip link show, ethtool <interface>, ethtool -i <interface>, ibdev2netdev
Card Detection: lspci | grep -i Mellanox
Mellanox Firmware Tool (MFT): Download and install MFT: MFT Documentation. Refer to the User Manual for installation instructions. Once installed, run: mst start, mst status, flint -d <device> q
Ports Information: ibstat, ibv_devinfo
Firmware Version Upgrade: To download the latest firmware version, refer to the NVIDIA Update and Query Utility.
Collect Log File: cat /var/log/messages, dmesg >> system.log, journalctl (applicable on new operating systems), cat /var/log/syslog
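When opening a support case, the commands above can be combined into a single snapshot; a minimal sketch (the output file name is arbitrary):
{ cat /etc/issue; uname -a; ofed_info -s; lspci | grep -i Mellanox; ibdev2netdev; ibstat; } > env_snapshot.txt 2>&1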

Windows Troubleshooting

Environment Information: From the Windows desktop choose the Start menu and run: msinfo32. To export system information to a text file, choose the Export option from the File menu. Assign a file name and save.
Mellanox Firmware Tool (MFT): Download and install MFT: MFT Documentation. Refer to the User Manual for installation instructions. Once installed, open a CMD window and run: WinMFT, mst start, mst status, flint -d <device> q
Ports Information: vstat
Firmware Version Upgrade: Download the latest firmware version using the PSID/board ID from here, then burn it with: flint -d <device> -i <binary image> b
Collect Log File: Event log viewer; MST device logs: mst start, mst status, flint -d <device> dc > dump_configuration.log, mstdump dc > mstdump.log


Updating Adapter Firmware
Each adapter card is shipped with the latest version of qualified firmware at the time of manufacturing. However, NVIDIA issues firmware updates occasionally that provide new features and bug fixes. To check that your card is programmed with the latest available firmware version, download the mlxup firmware update and query utility. The utility can query for available NVIDIA adapters and indicate which adapters require a firmware update. If the user confirms, mlxup upgrades the firmware using embedded images. The latest mlxup executable and documentation are available in mlxup – Update and Query Utility.

[server1]# ./mlxup
Querying Mellanox devices firmware ...

Device Type:      ConnectX-6
Part Number:      MCX654106A-HCAT
Description:      ConnectX®-6 VPI adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0 x16, tall bracket
PSID:             MT_2190110032
PCI Device Name:  0000:06:00.0
Base GUID:        e41d2d0300fd8b8a
Versions:         Current        Available
  FW              16.23.1020     16.24.1000
Status:           Update required

Device Type:      ConnectX-6
Part Number:      MCX654106A-HCAT
Description:      ConnectX®-6 VPI adapter card, HDR IB (200Gb/s) and 200GbE, dual-port QSFP56, Socket Direct 2x PCIe3.0 x16, tall bracket
PSID:             MT_2170110021
PCI Device Name:  0000:07:00.0
Base MAC:         0000e41d2da206d4
Versions:         Current        Available
  FW              16.24.1000     16.24.1000
Status:           Up to date

Perform FW update? [y/N]: y
Device #1: Up to date
Device #2: Updating FW ... Done

Restart needed for updates to take effect.
Log File: /var/log/mlxup/mlxup-yyyymmdd.log

Monitoring
The adapter card incorporates the ConnectX IC, which operates in the range of temperatures between 0°C and 105°C. There are three thermal threshold definitions that impact the overall system operation state:
· Warning (105°C): On managed systems only: when the device crosses the 105°C threshold, a Warning Threshold message will be issued by the management SW, indicating to system administration that the card has crossed the Warning threshold. Note that this temperature threshold does not require nor lead to any action by hardware (such as adapter card shutdown).
· Critical (115°C): When the device crosses this temperature, the firmware will automatically shut down the device.
· Emergency (130°C): If the firmware fails to shut down the device upon crossing the Critical threshold, the device will auto-shutdown upon crossing the Emergency (130°C) threshold.
The card's thermal sensors can be read through the system's SMBus. The user can read these thermal sensors and adapt the system airflow in accordance with the readouts and the needs of the above-mentioned IC thermal requirements.
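On hosts with the NVIDIA Firmware Tools (MFT) installed, one convenient way to read the IC temperature from software is the mget_temp utility; this is a hedged example, and the MST device name below is illustrative (list the actual device names with mst status after running mst start):
mst start
mget_temp -d /dev/mst/mt4123_pciconf0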

Specifications

MCX651105A-EDAT Specifications

Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100
Ethernet: 1/10/25/40/50/100 Gb/s
PCI Express Gen3.0/4.0: SERDES @ 8.0GT/s/16GT/s, x8 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 3.3Aux; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 10.1W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature, Passive Cables: Heatsink to Port: TBD | Port to Heatsink: TBD

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C (Note c)
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. The non-operational storage temperature specifications apply to the product without its package.


MCX653105A-HDAT Specifications

Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet: 1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 3.3Aux; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 19.3W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature:
Passive Cables: Heatsink to Port: 350 LFM / 55°C | Port to Heatsink: 250 LFM / 35°C
NVIDIA Active 4.7W Cables: Heatsink to Port: 500 LFM / 55°C (Note c) | Port to Heatsink: 250 LFM / 35°C

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity (Note d)
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. For engineering samples - add 250LFM.
d. The non-operational storage temperature specifications apply to the product without its package.

MCX653106A-HDAT Specifications

Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Dual QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet: 1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 3.3Aux; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 23.6W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature:
Passive Cables: Heatsink to Port: 400 LFM / 55°C | Port to Heatsink: 300 LFM / 35°C
NVIDIA Active 4.7W Cables: Heatsink to Port: 950 LFM / 55°C or 600 LFM / 48°C (Note d) | Port to Heatsink: 300 LFM / 35°C

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity (Note c)
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. For both operational and non-operational states.

MCX653105A-HDAL Specifications

Please make sure to install the ConnectX-6 card in a liquid-cooled Intel® Server System D50TNP platform.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet: 1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 3.3Aux; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 18.5W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature:
Passive Cables: Heatsink to Port: TBD | Port to Heatsink: TBD
NVIDIA Active 4.7W Cables: Heatsink to Port: TBD | Port to Heatsink: TBD

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity (Note c)
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. For both operational and non-operational states.


MCX653106A-HDAL Specifications

Please make sure to install the ConnectX-6 card in a liquid-cooled Intel® Server System D50TNP platform.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Dual QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane), HDR (50Gb/s per lane) port
Ethernet: 200GBASE-CR4, 200GBASE-KR4, 200GBASE-SR4, 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100/HDR
Ethernet: 1/10/25/40/50/100/200 Gb/s
PCI Express Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 3.3Aux; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 20.85W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature:
Passive Cables: Heatsink to Port: TBD | Port to Heatsink: TBD
NVIDIA Active 4.7W Cables: Heatsink to Port: TBD | Port to Heatsink: TBD

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity (Note c)
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. For both operational and non-operational states.

MCX653105A-ECAT Specifications

Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Single QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane)
Ethernet: 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR/HDR100
Ethernet: 1/10/25/40/50/100 Gb/s
PCIe Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 3.3Aux; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 15.6W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature:
Passive Cables: Heatsink to Port: 300 LFM / 55°C | Port to Heatsink: 200 LFM / 35°C
NVIDIA Active 2.7W Cables: Heatsink to Port: 300 LFM / 55°C | Port to Heatsink: 200 LFM / 35°C

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C (Note c)
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. The non-operational storage temperature specifications apply to the product without its package.

MCX653106A-ECAT Specifications

Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and airflow as stated in the table below.

For power specifications when using a single-port configuration, please refer to MCX653105A-ECAT Specifications.

Physical
Adapter Card Size: 6.6 in. x 2.71 in. (167.65mm x 68.90mm)
Connector: Dual QSFP56 InfiniBand and Ethernet (copper and optical)

Protocol Support
InfiniBand: IBTA v1.4a
Auto-Negotiation: 1X/2X/4X SDR (2.5Gb/s per lane), DDR (5Gb/s per lane), QDR (10Gb/s per lane), FDR10 (10.3125Gb/s per lane), FDR (14.0625Gb/s per lane), EDR (25Gb/s per lane) port, HDR100 (2 lane x 50Gb/s per lane) port
Ethernet: 100GBASE-CR4, 100GBASE-CR2, 100GBASE-KR4, 100GBASE-SR4, 50GBASE-R2, 50GBASE-R4, 40GBASE-CR4, 40GBASE-KR4, 40GBASE-SR4, 40GBASE-LR4, 40GBASE-ER4, 40GBASE-R2, 25GBASE-R, 20GBASE-KR2, 10GBASE-LR, 10GBASE-ER, 10GBASE-CX4, 10GBASE-CR, 10GBASE-KR, SGMII, 1000BASE-CX, 1000BASE-KX, 10GBASE-SR

Data Rate
InfiniBand: SDR/DDR/QDR/FDR/EDR
Ethernet: 1/10/25/40/50/100 Gb/s
PCIe Gen3/4: SERDES @ 8.0GT/s/16GT/s, x16 lanes (2.0 and 1.1 compatible)

Power and Airflow
Voltage: 12V, 3.3VAUX; Maximum current: 100mA
Typical Power (Note b), Passive Cables: 21.0W
Maximum Power: Please refer to ConnectX-6 VPI Power Specifications (requires NVONline login credentials)
Maximum power available through QSFP56 port: 5W
Airflow (LFM) / Ambient Temperature:
Passive Cables: Heatsink to Port: 350 LFM / 55°C | Port to Heatsink: 250 LFM / 35°C
NVIDIA Active 2.7W Cables: Heatsink to Port: 550 LFM / 55°C | Port to Heatsink: 250 LFM / 35°C

Environmental
Temperature: Operational: 0°C to 55°C | Non-operational: -40°C to 70°C (Note c)
Humidity: Operational: 10% to 85% relative humidity | Non-operational: 10% to 90% relative humidity
Altitude (Operational): 3050m

Regulatory
Safety: CB / cTUVus / CE
EMC: CE / FCC / VCCI / ICES / RCM / KC
RoHS: RoHS Compliant

Notes:
a. The ConnectX-6 adapters supplement the IBTA auto-negotiation specification to get better bit error rates and longer cable reaches. This supplemental feature only initiates when connected to another NVIDIA InfiniBand product.
b. Typical power for ATIS traffic load.
c. The non-operational storage temperature specifications apply to the product without its package.

MCX654105A-HCAT Specifications

Please make sure to install the ConnectX-6 card in a PCIe slot that is capable of supplying the required power and air
