Cambricon MLU-X1001 Accelerator Construction Unit of Artificial Intelligence Supercomputing User Manual

June 12, 2024
Cambricon

Product Manual
V0.9.3

Preface

1.1. Copyright Declaration
Disclaimer
Cambricon Technologies Corporation Limited (hereinafter referred to as “Cambricon”) makes no representation, warranty, or guarantee (express, implied, or statutory) regarding the information contained in this document, and expressly disclaims any and all implied warranties of merchantability, title, non-infringement of intellectual property, or fitness for a particular purpose. Cambricon does not assume any liability arising from the application or use of any product or service. Cambricon shall not be liable for any breach of contract, damages, costs, or problems arising from: (1) any use of Cambricon products contrary to this Guide; or (2) customer product design.
Limitation of liability
In no case shall Cambricon be liable for any damage caused by the use of or inability to use this Guide (including but not limited to loss of profits, business disruption, and loss of information), even if Cambricon has been advised that such damage may be suffered. Notwithstanding any damage the customer may suffer for any reason, the total and cumulative liability of Cambricon to the customer for the products described in this Guide shall be limited in accordance with Cambricon's terms and conditions of sale.
Accuracy of information
The information provided in this document is owned by Cambricon, and Cambricon reserves the right to change this document or any products and services without notice. The information contained in this Guide and in all other Cambricon documents cited in this Guide is provided “as is”. Cambricon does not guarantee the accuracy or completeness of the information, text, graphics, links, or other items contained in this Guide. Cambricon may make changes to this Guide or to the products described in this Guide without notice, but does not undertake to update this Guide.
The performance tests and ratings listed in this Guide are measured using specific chips, computer systems, or components. The results of such tests, as shown in this Guide, reflect the general performance of Cambricon products. Any difference in system hardware or software design or configuration may affect actual performance. As mentioned above, Cambricon does not represent, warrant, or guarantee that the products described in this Guide are suitable for any particular purpose. Cambricon does not represent or guarantee that all parameters of each product have been tested. The customer is solely responsible for ensuring that the product is suitable for the customer's planned application and for performing the necessary tests on that application, in order to avoid failure of the application or product.
Weaknesses in customer product design can affect the quality and reliability of Cambricon products and lead to additional or different conditions and/or requirements beyond the scope of this Guide.
Notice of Intellectual Property
Cambricon and the Cambricon logo are trademarks and/or registered trademarks of Cambricon Technologies Corporation Limited in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.
This Guide is copyrighted and protected by copyright laws and treaties worldwide. This Guide may not be reproduced, adapted, modified, published, uploaded, transmitted, or distributed in any way without the prior written permission of Cambricon. Except for the customer's right to use the information in this Guide with the products it describes, Cambricon does not grant any other express or implied rights or licenses under this Guide.
For the avoidance of doubt, Cambricon does not grant the customer any right or license (express or implied) under any patent, copyright, trademark, trade secret, or other Cambricon intellectual property or proprietary right.
Copyright Declaration
© Cambricon Technologies Corporation Limited. All rights reserved.
1.2. Versioning
Table 1.1 Version Record

Document name MLU-X1001 accelerator Product Manual
Version number V0.9.3
Author Cambricon
Date created 2020.10.30

1.3. Update history
V0.2.0
Update time: 2020.07.10
Update:
– Initial version.
V0.9.3
Update time: 2020.10.30
Update:
– Renamed the external interconnection to MLU-Link, updated the HBM rate, and added a warning about the button battery.

Overview

The MLU-X1001 accelerator is a construction unit for artificial intelligence supercomputing. The unit integrates 4 MLU290-M5 intelligent accelerator cards and provides up to 2 POPS of adaptive-precision computing power. Supercomputing systems of 4 to 16 cards can be built using Cambricon MLU-LINK direct chip-to-chip interconnection technology, providing a highly agile, highly reliable, high-performance computing foundation for artificial intelligence computing centers.


Product Specification Overview

3.1 Overview of Product Specification Parameters
The specification parameters of the MLU-X1001 accelerator are as follows:
Table 3.1 MLU-X1001 Specification Parameters

Specification| Value
---|---
Model| MLU-X1001
Core architecture| Cambricon MLUv02
Core frequency| 1.3 GHz
Supported calculation precision| INT16, INT8, INT4, FP32, FP16
Video decoding| Supported
Memory capacity| 192 GB
ECC protection| Yes
System interface| 2 × PCI Express 4.0 x16
MLU-LINK external interfaces| 8 ports
MLU-LINK interface bandwidth| 8 × 100 GB/s
TDP power consumption| 2300 W
Heat dissipation scheme| Air-cooled, compatible with liquid cooling

3.2 Overview of structural specifications
The structural specifications of the MLU-X1001 accelerator are as follows:
Table 3.2 Structural Specification for MLU-X1001

Specification| Value
---|---
Dimensions| 437 mm × 87 mm × 735 mm
Weight| 29 kg
Package dimensions| 1000 mm × 635 mm × 230 mm
Package weight| 39 kg

Cable bending radius:
Table 3.3 Specification for cable bending

Wire diameter| Bending radius L1 (based on the cabinet column)| Bending radius L2 (based on the chassis)
---|---|---
30 AWG| 97.45 mm| 78.5 mm
26 AWG| 121.64 mm| 102.7 mm

[Figure 1]

3.3 Overview of electrical specifications
The electrical specifications of the MLU-X1001 accelerator are as follows:
Table 3.4 Electrical Specification for MLU-X1001

Specification| Value
---|---
System interface| PCIE Gen4 x16
Number of PCIE ports| 2 ports
PCIE bandwidth| 128 GB/s
Number of MLU-LINK ports| 8 ports
MLU-LINK bandwidth| 800 GB/s
BMC management interface| IPMI V2.0
Host management interface| SMBUS
Input voltage| AC 115-127 V, 14.2 A, 60/50 Hz; AC 200-240 V, 14.9 A, 60/50 Hz; DC 240 V, 16 A (China mainland only)
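
The aggregate bandwidth figures in Table 3.1 and Table 3.4 can be cross-checked with simple arithmetic. The Python sketch below is illustrative only. The 16 GT/s per-lane rate is the standard PCI Express 4.0 value, but reading 128 GB/s as the raw, bidirectional total of both x16 ports is an assumption, because this manual does not state how the figure is counted. All function and constant names are hypothetical.

```python
# Cross-check of the aggregate bandwidth figures quoted in this manual.
# Assumption: 128 GB/s is the raw (pre-encoding), bidirectional total of both ports.

PCIE_GEN4_GT_PER_LANE = 16    # GT/s per lane (PCI Express 4.0 standard rate)
PCIE_LANES = 16               # x16 link
PCIE_PORTS = 2                # from Table 3.4
MLU_LINK_PORTS = 8            # from Table 3.4
MLU_LINK_GB_PER_PORT = 100    # GB/s per port, from Table 3.1


def pcie_raw_gbytes_per_s(gt_per_lane: float, lanes: int, ports: int) -> float:
    """Raw bidirectional bandwidth in GB/s, ignoring 128b/130b encoding overhead."""
    per_direction = gt_per_lane * lanes / 8   # Gb/s -> GB/s for one direction
    return per_direction * 2 * ports          # both directions, all ports


def mlu_link_total_gbytes_per_s(ports: int, per_port: float) -> float:
    """Aggregate external MLU-LINK bandwidth in GB/s."""
    return ports * per_port


pcie_total = pcie_raw_gbytes_per_s(PCIE_GEN4_GT_PER_LANE, PCIE_LANES, PCIE_PORTS)
link_total = mlu_link_total_gbytes_per_s(MLU_LINK_PORTS, MLU_LINK_GB_PER_PORT)
print(f"PCIE total: {pcie_total:.0f} GB/s, MLU-LINK total: {link_total:.0f} GB/s")
```

Under these assumptions the two totals come out at 128 GB/s and 800 GB/s, matching the table. Usable PCIE throughput is lower once encoding and protocol overhead are taken into account.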

3.4 Summary of heat dissipation specifications
The heat dissipation specifications of the MLU-X1001 accelerator are as follows:
Table 3.5 Heat dissipation specifications of MLU-X1001

Specification| Value
---|---
Working temperature| 0℃ to 35℃ at altitudes below 900 m
Working humidity| 20% RH to 85% RH
Storage temperature| -40℃ to 75℃
Storage humidity| 5% RH to 95% RH
Noise| SDP @ 23℃, sound power ≤ 7.2 bels
Working altitude| ≤ 3000 m (from 900 m to 3000 m, the supported working temperature drops by 1℃ for every 300 m increase in altitude)
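
The altitude derating rule above can be expressed as a small function. This is a minimal sketch, assuming the derating is applied linearly between 900 m and 3000 m. The manual only gives the rate of 1℃ per 300 m, so stepwise derating is equally possible. The function name is hypothetical.

```python
MAX_TEMP_C = 35.0           # upper working temperature below 900 m
DERATING_START_M = 900.0    # altitude at which derating begins
MAX_ALTITUDE_M = 3000.0     # maximum supported working altitude
DERATING_STEP_M = 300.0     # 1 deg C is lost for every 300 m above 900 m


def max_working_temp_c(altitude_m: float) -> float:
    """Supported maximum working temperature (deg C) at the given altitude.

    Assumption: linear derating between 900 m and 3000 m.
    """
    if altitude_m > MAX_ALTITUDE_M:
        raise ValueError("altitude exceeds the supported 3000 m limit")
    if altitude_m <= DERATING_START_M:
        return MAX_TEMP_C
    return MAX_TEMP_C - (altitude_m - DERATING_START_M) / DERATING_STEP_M


for altitude in (0, 900, 1500, 3000):
    print(f"{altitude} m: up to {max_working_temp_c(altitude):.1f} deg C")
```

At the 3000 m limit the rule leaves 7℃ less headroom than at sea level, so the supported upper working temperature is 28℃.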

Component Profile

4.1 MLUX-BB1
MLUX-BB1 is the baseboard that carries the MLU290-M5 intelligent accelerator cards. Each MLUX-BB1 can carry 4 MLU290-M5 cards. The details are shown in the following figure:
[Figure 2]
Table 4.1 MLUX-BB1 Description

No.| Description
---|---
1| MLU-LINK-0A & 0B
2| MLU-LINK-2A & 2B
3| MLU-LINK-1A & 1B
4| MLU-LINK-3A & 3B
5| PCIE 0
6| PCIE 1
7| IPMI
8| UID
9| COM HUB0
10| COM HUB1
11| COM HUB2
12| AC INDICATOR
13| FRONT PANEL CONN.
14| PDB MGT. CONN.
15| OAM MODULE 0
16| OAM MODULE 2
17| OAM MODULE 1
18| OAM MODULE 3
19| FAN 4
20| FAN 3
21| FAN 2
22| FAN 1
23| FAN 0
24| PCIE SWITCH 0
25| PCIE SWITCH 1
26| 54V POWER BUSBAR
27| HANDLE 0
28| HANDLE 1
29| FRONT PCIE CONN.
29 FRONT PCIE CONN.

4.2 MLUX-PA4
MLUX-PA4 is a PCIE board that is installed in the host server and provides Mini SAS HD interfaces for connection to the MLU-X1001. The details are shown in the following figure:
[Figure 3]
Table 4.2 MLUX-PA4 Description

No.| Description
---|---
1| mini SAS HD CONN.
2| PCIE RETIMER
3| mini SAS HD CONN.

4.3 MLUX-PDB
MLUX-PDB is the power distribution board. The details are shown in the following figure:
[Figure 4]
Table 4.3 MLUX-PDB Description

No.| Description
---|---
1| 54V POWER BUSBAR
2| PSU CONN. 0
3| PSU CONN. 1
4| INTRUSION
5| SSD POWER CONN. 0
6| SSD POWER CONN. 1
7| PDB MGT. CONN.
8| 12V POWER BUSBAR

4.4 MLUX-LINKB
MLUX-LINKB is a passive connection board. The details are shown in the following figure:
[Figure 5]
Table 4.4 MLUX-LINKB Description

No.| Description
---|---
1| SSD MGT. CONN. 0
2| SSD MGT. CONN. 1
3| OCULINK 0
4| OCULINK 1
5| OCULINK 2
6| OCULINK 3
7| IBB CONN.
8| FRONT PCIE CONN.

4.5 MLUX-IBB
MLUX-IBB is the backplane for InfiniBand cards. Each MLUX-IBB can hold two InfiniBand cards. The details are shown in the following figure:
[Figure 6]
Table 4.5 MLUX-IBB Description

No.| Description
---|---
1| IBB CONN.
2| IB SLOT 0
3| IB SLOT 1

4.6 Front panel
The front panel of the chassis is shown as follows:
[Figure 7]
Table 4.6 Description of front panel of chassis

No.| Description
---|---
1| Power button
2| UID button
3| Reset button
4| PSU 0
5| PSU 1
6| SSD 0
7| SSD 1
8| SSD 2
9| SSD 3
10| NIC 0
11| NIC 1

4.7 Back panel
The rear panel of the chassis is shown as follows:
[Figure 8]
Table 4.7 Description of rear panel of chassis

No.| Description
---|---
1| PCIE 0
2| PCIE 1
3| MLU-LINK-0A
4| MLU-LINK-0B
5| MLU-LINK-2A
6| MLU-LINK-2B
7| MLU-LINK-1A
8| MLU-LINK-1B
9| MLU-LINK-3A
10| MLU-LINK-3B
11| IPMI
12| UID
13| COM HUB 0
14| COM HUB 1
15| COM HUB 2
16| AC INDICATOR
17| POWER CORD 0
18| POWER CORD 1

Electrical specifications

5.1 PCIE topology description
The MLU-X1001 accelerator uses 2 miniSAS HD interfaces to connect to the host server, and 2 PCIE switch chips to connect the internal PCIE devices. The PCIE interconnection topology is shown as follows:
[Figure 9]
The PCIE signal rate is 16 Gbps, and the cable loss should be kept within 15 dB @ 8 GHz. A 1-meter 30 AWG cable is recommended.
The pins of the miniSAS HD connectors used by the PCIE interfaces are defined as follows:
Table 5.1 PCIE Interface pin definition

miniSAS HD pin| Description| Pin internal processing
---|---|---
RX[15:0]P/N| PCIE input signal| External AC coupling capacitance
TX[15:0]P/N| PCIE output signal| External AC coupling capacitance
SMCLK| SMBUS interface clock signal| 4.7 kΩ pull-up to 3.3 V
SMDAT| SMBUS interface data signal| 4.7 kΩ pull-up to 3.3 V
PERST#| Reset signal|
REFCLK P/N| PCIE clock signal|
PRESENT| Far-end presence detection signal| 4.7 kΩ pull-up to 3.3 V
5.2 MLU-LINK interface description
The MLU-X1001 accelerator is equipped with 4 MLU290-M5 intelligent accelerator cards, each of which has 6 MLU-LINK ports. Of these, 4 ports are used for internal interconnection and 2 ports are used for external interconnection. The MLU-LINK interconnection topology between the internal cards is as follows:
[Figure 10]
For MLU-LINK interconnection between MLU-X1001 units, refer to the following figure:
[Figure 11]
The signal rate of MLU-LINK is 50 Gbps, and the cable loss should be kept within 10 dB @ 12.5 GHz. A 1-meter 30 AWG cable or a 2-meter 28 AWG cable is recommended.
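
The port counts in this section can be tallied as follows. This is a rough sketch rather than a description of the actual wiring: it assumes every internal link is point-to-point between two cards, and the mapping of card index and port letter to the rear-panel labels (MLU-LINK-0A to MLU-LINK-3B) is a guess based on Table 4.7. The function names are hypothetical.

```python
CARDS = 4                      # MLU290-M5 cards per MLU-X1001
PORTS_PER_CARD = 6             # MLU-LINK ports per card
INTERNAL_PORTS_PER_CARD = 4    # used between cards inside the chassis
EXTERNAL_PORTS_PER_CARD = 2    # routed to the rear panel


def port_budget():
    """Return (total ports, external ports, internal point-to-point links)."""
    assert INTERNAL_PORTS_PER_CARD + EXTERNAL_PORTS_PER_CARD == PORTS_PER_CARD
    total = CARDS * PORTS_PER_CARD
    external = CARDS * EXTERNAL_PORTS_PER_CARD
    # Assumption: each internal link occupies one port on each of two cards.
    internal_links = CARDS * INTERNAL_PORTS_PER_CARD // 2
    return total, external, internal_links


def external_port_labels():
    """Hypothetical mapping of (card, port letter) to rear-panel labels."""
    letters = "AB"[:EXTERNAL_PORTS_PER_CARD]
    return [f"MLU-LINK-{card}{letter}" for card in range(CARDS) for letter in letters]


total, external, internal_links = port_budget()
print(f"{total} ports in total, {external} external, {internal_links} internal links")
print(external_port_labels())
```

The 8 external ports agree with the 8 MLU-LINK ports listed in the specification tables. How the 8 internal links are distributed among the four cards is shown only in the topology figure and is not modelled here.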
The MLU-LINK interface uses QSFP-DD connectors, whose pins are defined as follows:
Table 5.2 MLU-LINK Interface pin definition

QSFP-DD pin| Description| Pin internal processing
---|---|---
RX[8:1]P/N| SERDES signal input with internal AC coupling capacitance| External AC coupling capacitance is not required
TX[8:1]P/N| SERDES signal output with internal AC coupling capacitance| External AC coupling capacitance is not required
SCL| I2C interface clock signal of the optical module| 4.7 kΩ pull-up to 3.3 V
SDA| I2C interface data signal of the optical module| 4.7 kΩ pull-up to 3.3 V
ModPrsL| Optical module presence signal output| 4.7 kΩ pull-up to 3.3 V
ModSelL| Optical module select signal, pulled up internally by default| 1 kΩ pull-down to GND
ResetL| Reset signal, active low| 4.7 kΩ pull-up to 3.3 V
IntL| Interrupt signal of the optical module, open-collector output; a low level indicates an interrupt| 4.7 kΩ pull-up to 3.3 V
InitMode| Initialization mode| 1 kΩ pull-down to GND
VccRx, VccRx1, Vcc1, Vcc2, VccTx, VccTx1| Power supply|
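
The frequencies at which sections 5.1 and 5.2 specify cable loss (8 GHz and 12.5 GHz) correspond to the Nyquist frequency of each link, which is half the symbol rate. The sketch below shows the arithmetic. PCI Express 4.0 uses NRZ signalling (1 bit per symbol). The 2 bits per symbol used for MLU-LINK (PAM4) is an assumption inferred from the 12.5 GHz figure, because this manual does not name the modulation. The function name is hypothetical.

```python
def nyquist_ghz(bit_rate_gbps: float, bits_per_symbol: int) -> float:
    """Nyquist frequency in GHz: half of the symbol (baud) rate."""
    symbol_rate_gbaud = bit_rate_gbps / bits_per_symbol
    return symbol_rate_gbaud / 2


# (link name, bit rate in Gbps, bits per symbol, loss limit in dB from this manual)
LINKS = [
    ("PCIE", 16, 1, 15),       # NRZ: 1 bit per symbol
    ("MLU-LINK", 50, 2, 10),   # assumed PAM4: 2 bits per symbol
]

for name, rate, bits, loss_limit_db in LINKS:
    freq = nyquist_ghz(rate, bits)
    print(f"{name}: loss limit {loss_limit_db} dB at {freq} GHz")
```

When qualifying a cable that is not listed among the compatible models, its insertion loss would therefore be compared with the limit at these two frequencies.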

5.3 Power Interface Description
Input power requirements of the MLU-X1001 accelerator:
Table 5.3 MLU-X1001 Input Power Supply Specifications

Input voltage| Max. input current
---|---
AC 115-127 V, 60/50 Hz| 14.2 A
AC 200-240 V, 60/50 Hz| 14.9 A
DC 240 V (China mainland only)| 16 A

The MLU-X1001 accelerator can reduce its power consumption in response to instantaneous power changes that last longer than the µs level. The power regulator can support power fluctuations at the ms level (e.g. 1.2 × TDP).
Table 5.4 EDPp specifications of MLU-X1001

EDP| Duration
---|---
TBD| TBD
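
For power planning, the 1.2 × TDP excursion example and the input ratings in Table 5.3 reduce to the figures computed below. This is only an arithmetic sketch. The manual does not say whether the listed maximum current applies to each power cord or to the whole unit, and the product of voltage and current is an apparent-power ceiling, not a measured draw. All names are hypothetical.

```python
TDP_W = 2300             # from Table 3.1
EXCURSION_FACTOR = 1.2   # example ms-level excursion given in this section

# (label, minimum voltage, maximum voltage, maximum input current in A), from Table 5.3
INPUT_RATINGS = [
    ("AC 115-127 V", 115, 127, 14.2),
    ("AC 200-240 V", 200, 240, 14.9),
    ("DC 240 V", 240, 240, 16.0),
]


def apparent_power_va(voltage_v: float, current_a: float) -> float:
    """Voltage times current; an upper bound on the power one rated input can carry."""
    return voltage_v * current_a


peak_w = TDP_W * EXCURSION_FACTOR
print(f"Example excursion: {peak_w:.0f} W")

for label, v_min, v_max, amps in INPUT_RATINGS:
    low = apparent_power_va(v_min, amps)
    high = apparent_power_va(v_max, amps)
    print(f"{label}: {low:.0f} to {high:.0f} VA at {amps} A")
```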

BMC management system

The BMC management system of the MLU-X1001 is compatible with the IPMI 2.0 server management standard and provides highly reliable hardware monitoring and management functions.
6.1 BMC function description
The main functions and features of the MLU-X1001 accelerator BMC management system are as follows:
Table 6.1 BMC Functional description

Function| Description
---|---
Remote control| Management through SOL (Serial over LAN) functions
Information management| Management of equipment model, asset information, and version information
Status monitoring| Real-time monitoring of power supply, temperature, working status, and other operating state information
Heat dissipation control| Adjusts fan speed according to ambient temperature, equipment workload, and abnormal conditions
Alarm management| Reports alarm information in real time and handles it accordingly
WEB interface management| Provides a visual WEB interface for query and management
IPMITool management| Supports IPMITool
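
Because the BMC supports IPMI 2.0 and IPMITool, several functions in Table 6.1 map naturally onto standard `ipmitool` subcommands. The sketch below only builds the command lines. The host address, user name, and password are placeholders, the pairing of functions with subcommands is an illustrative assumption, and this manual does not document which sensors or records the MLU-X1001 BMC actually exposes. The helper name `ipmitool_cmd` is hypothetical.

```python
import shlex
from typing import List


def ipmitool_cmd(host: str, user: str, password: str, *args: str) -> List[str]:
    """Build an ipmitool command line for an IPMI 2.0 (lanplus) session."""
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *args]


# Assumed pairing of Table 6.1 functions with standard ipmitool subcommands.
EXAMPLES = {
    "Status monitoring": ("sdr", "list"),
    "Alarm management": ("sel", "list"),
    "Information management": ("fru", "print"),
    "Remote control (SOL)": ("sol", "activate"),
}

for function, args in EXAMPLES.items():
    # 192.0.2.10, admin, and <password> are placeholders, not product defaults.
    cmd = ipmitool_cmd("192.0.2.10", "admin", "<password>", *args)
    print(f"{function}: {shlex.join(cmd)}")
```

To run a command rather than print it, the list can be passed to `subprocess.run(cmd, check=True)` on a machine that has `ipmitool` installed and network access to the BMC.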

Note: The RTC clock is powered by a button battery (Panasonic CR2032). If the battery is replaced incorrectly, there is a risk of explosion.

Heat dissipation specifications

7.1 Description of the heat dissipation environment
The working environment of MLU-X1001 is as follows:
Table 7.1 Working environment of MLU-X1001

Item| Specification
---|---
Working environment temperature| 0~35℃
Relative humidity| 20%~85%, non-condensing
Noise| 62~88 dBA

Note: The MLU-X1001 produces 62~88 dBA of noise during normal operation. Take adequate sound insulation measures in advance.
MLU-X1001 air volume description:

  • MLU-X1001 can provide up to 360 CFM of air volume
  • Do not block the front and rear ventilation areas of the chassis during operation of MLU-X1001
  • When installing the MLU-X1001, minimize the air resistance around the inlet and outlet of the chassis
  • Route the cables according to the instructions to minimize the air resistance in the air duct
  • Install the chassis cover before using the MLU-X1001. If the MLU-X1001 is used without the chassis cover, components may be damaged.
  • If you need to replace a fan, complete the replacement within 25 s to avoid overheating the system.

7.2 Wind resistance curve of MLU-X1001
The system wind resistance curve of the MLU-X1001 is shown below:
[Figure 12]
Table 7.2 Air Volume vs. Pressure Drop of MLU-X1001

Air volume (CFM)| Air pressure (Pa)
---|---
400| 1737
360| 1408
310| 1044
260| 735
0| 0
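
The points in Table 7.2 can be used to estimate the pressure drop at other air volumes. The sketch below fits a single coefficient k in the model P = k × Q², a common approximation for system airflow resistance. The quadratic model is an assumption introduced here for illustration and does not come from this manual. The function names are hypothetical.

```python
# (air volume in CFM, pressure drop in Pa), from Table 7.2
CURVE = [(400, 1737), (360, 1408), (310, 1044), (260, 735), (0, 0)]


def fit_quadratic_coefficient(points) -> float:
    """Least-squares estimate of k for the model P = k * Q**2 (through the origin)."""
    numerator = sum(p * q ** 2 for q, p in points)
    denominator = sum(q ** 4 for q, _ in points)
    return numerator / denominator


def pressure_drop_pa(air_volume_cfm: float, k: float) -> float:
    """Estimated pressure drop in Pa at the given air volume."""
    return k * air_volume_cfm ** 2


k = fit_quadratic_coefficient(CURVE)
print(f"k = {k:.5f} Pa/CFM^2")
for q, measured in CURVE:
    print(f"{q} CFM: table {measured} Pa, model {pressure_drop_pa(q, k):.1f} Pa")
```

With this fit the model stays within roughly 1 Pa of every table entry, so it is a reasonable way to interpolate between the listed points. Extrapolating beyond 400 CFM is not supported by the data.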

Optional components

8.1 PCIE High Speed Cable
The MLU-X1001 uses miniSAS HD high-speed cables for PCIE Gen4 interconnection. Compatible cable models are as follows:
Table 8.1 MLU-X1001 PCIE Compatible Cable

Manufacturer| Model| Specifications
---|---|---
Molex| 2040431030| 1 m, 30 AWG

8.2 MLU-LINK High Speed Cable
The MLU-X1001 uses QSFP-DD high-speed cables for MLU-LINK interconnection. Compatible cable models are as follows:
Table 8.2 MLU-X1001 MLU-LINK Compatible Cable

Manufacturer| Model| Specifications
---|---|---
Molex| 2015911012| 1 m, 30 AWG
Molex| 2015913020| 2 m, 28 AWG
TE| 2366016-4| 1 m, 30 AWG
TE| 2366101-3| 2 m, 28 AWG

8.3 Network
The MLU-X1001 can use InfiniBand network cards or RoCE network cards for cluster interconnection.
Compatible network card models are as follows:
Table 8.3 Network Card Compatibility

Manufacturer| Model| Specifications
---|---|---
Mellanox| MCX653105A-HDAT| Half-height, half-length, single-port, PCIE 4.0

8.4 Hard disk
Compatible NVMe hard disk models for the MLU-X1001 are as follows:
Table 8.4 NVMe Hard Disk Compatibility

Manufacturer| Model| Specifications
---|---|---
HGST| HUSMR7619BHP301| NVMe, 1.92 TB

Cambricon NeuWare development environment

NeuWare fully supports mainstream programming frameworks (e.g. TensorFlow, Caffe, PyTorch, and MXNet). With these frameworks, users can easily develop and deploy deep learning applications on the Cambricon MLU290-M5. NeuWare also provides a complete runtime system and driver software to facilitate rapid system integration.
NeuWare also provides a range of tools for application development, function debugging, and performance tuning. Application development tools include a machine learning library, a runtime library, a compiler, a model retraining tool, and domain-specific SDKs (for example, for video analysis). Function debugging tools meet debugging requirements at different levels, such as the programming framework and the function library. Performance tuning tools include performance profiling tools and system monitoring tools.
[Figure 13]

Compliance

The MLU-X1001 accelerator complies with the regulations listed in this chapter. The compliance marks can be found on the label of each device.
FCC statement
This device complies with Part 15 of the FCC Rules.
Operation is subject to the following two conditions: (1) This device may not cause harmful interference, and (2) this device must accept any interference received, including interference that may cause undesired operation.
This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference in which case the user will be required to correct the interference at his own expense.
Caution: Any changes or modifications not expressly approved by the party responsible for compliance could void the user’s authority to operate this equipment.
Underwriters Laboratories (UL)
UL Listed product logo for the MLU-X1001 accelerator, model name MLU-X1001.

Copyright © 2020 Cambricon Corporation
