RoCE OFED Environment Setup and Testing
I. Downloading the installation package
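1. Mellanox driver download address:
2. On the page that opens, find your platform, e.g. Linux SW/Drivers; CentOS 8 is used as the example here.
3. Toward the bottom of the page, find the matching version and download it.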
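The steps below are written for one package format; the tar xzvf command in the next section suggests a gzip-compressed tarball.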
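II. Installation
1. Upload the downloaded driver package to the server; the upload procedure is not covered here.
2. Extract the uploaded driver package and wait for the extraction to finish: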
[root@localhost ~]# tar xzvf MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/RPM-GPG-KEY-Mellanox
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/uninstall.sh
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/.mlnx
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/.arch
./MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/distro
………………………
3. Run the installer:
[root@localhost ~]# cd MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64/
[root@localhost MLNX_OFED_LINUX-5.0-2.1.8.0-rhel8.1-x86_64]# ./mlnxofedinstall
Logs dir: /tmp/MLNX_OFED_LINUX.10300.logs
General log file: /tmp/MLNX_OFED_LINUX.10300.logs/general.log
Verifying KMP rpms compatibility with
……………………….
Complete!
Wait until the installer reports that it has finished.
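Once the installer prints Complete!, it can be worth confirming which release actually ended up on the system. The following is a minimal sketch, assuming the `ofed_info` utility shipped with MLNX_OFED is on the PATH and that `ofed_info -s` prints a one-line version string. The function name `check_ofed_version` is hypothetical; the version string in the usage note is taken from the package name above.

```shell
# Hypothetical helper: compare the installed MLNX_OFED release with an expected one.
# Assumption: `ofed_info -s` prints a single line containing the version.
check_ofed_version() {
    expected=$1
    # Capture the short version string; fail cleanly if the tool is missing.
    if ! installed=$(ofed_info -s 2>/dev/null); then
        echo "ofed_info is missing or failed; is MLNX_OFED installed?" >&2
        return 2
    fi
    case $installed in
        *"$expected"*)
            echo "OK: $installed" ;;
        *)
            echo "MISMATCH: found '$installed', expected '$expected'" >&2
            return 1 ;;
    esac
}
```

A possible call after sourcing the function: `check_ofed_version 5.0-2.1.8.0`. A non-zero return status means the version differs or the tool could not be run.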
III. Common checks and configuration
1. InfiniBand status (ibstat):
[root@centos222 ~]# ibstat
CA 'mlx4_0'
CA type: MT4099
Number of ports: 1
Firmware version: 2.42.5000
Hardware version: 1
Node GUID: 0xf452140300880760
System image GUID: 0xf452140300880760
Port 1:
State: Active
Physical state: LinkUp
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x00010000
Port GUID: 0xf65214fffe880760
Link layer: Ethernet
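For RoCE, the two fields that matter most in this output are State (it should be Active) and Link layer (it should be Ethernet). As a rough sketch, the filter below reads ibstat output on standard input and prints the device and port of every entry that meets both conditions. The function name `roce_active_ports` is hypothetical, and the parsing assumes the field labels shown above.

```shell
# Hypothetical helper: list "<device> <port>" for every port that is Active
# and whose link layer is Ethernet. Reads `ibstat` output on stdin.
roce_active_ports() {
    awk '
        # "CA <name>" starts a device block; skip the "CA type:" line.
        $1 == "CA" && $2 != "type:" { gsub("\047", "", $2); dev = $2 }
        # "Port <n>:" starts a port block; "Port GUID:" does not match.
        $1 == "Port" && $2 ~ /^[0-9]+:$/ { port = $2; sub(/:$/, "", port); state = "" }
        $1 == "State:" { state = $2 }
        $1 == "Link" && $2 == "layer:" && $3 == "Ethernet" && state == "Active" {
            print dev, port
        }
    '
}
```

Typical use would be `ibstat | roce_active_ports`; an empty result suggests that no port is currently usable for RoCE.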
2. InfiniBand status (ibstatus):
[root@centos222 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid:    fe80:0000:0000:0000:f652:14ff:fe88:0760
base lid:        0x0
sm lid:          0x0
state:          4: ACTIVE
phys state:      5: LinkUp
rate:            10 Gb/sec (1X QDR)
link_layer:      Ethernet
3. Mapping between RDMA devices and network interfaces:
[root@centos222 ~]# ibdev2netdev
mlx4_0 port 1 ==> enp2s0 (Up)
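Later steps (ethtool, nmcli) need the network interface name rather than the RDMA device name. The sketch below pulls that name out of ibdev2netdev output, assuming the line format shown above. The function name `netdev_of` is hypothetical.

```shell
# Hypothetical helper: print the network interface bound to an RDMA device/port.
# Reads `ibdev2netdev` output on stdin; lines are expected to look like
#   mlx4_0 port 1 ==> enp2s0 (Up)
# Usage: netdev_of <device> [port]   (port defaults to 1)
netdev_of() {
    awk -v dev="$1" -v port="${2:-1}" \
        '$1 == dev && $2 == "port" && $3 == port { print $5 }'
}
```

For example, `ibdev2netdev | netdev_of mlx4_0 1` would print enp2s0 on the host shown above.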
4. Link negotiation information for the NIC:
[root@centos222 ~]# ethtool enp2s0
Settings for enp2s0:
Supported ports: [ FIBRE ]
Supported link modes:  1000baseKX/Full
10000baseKR/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes:  1000baseKX/Full
10000baseKR/Full
Advertised pause frame use: Symmetric
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: 10000Mb/s
Duplex: Full
Port: Direct Attach Copper
PHYAD: 0
Transceiver: internal
Auto-negotiation: off
Supports Wake-on: d
Wake-on: d
Current message level: 0x00000014 (20)
link ifdown
Link detected: yes
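Most of this output is informational; the lines usually worth checking are Speed, Duplex and Link detected. A small sketch that condenses them into one line, assuming the labels shown above (the function name `link_summary` is hypothetical):

```shell
# Hypothetical helper: summarise speed, duplex and link state.
# Reads `ethtool <interface>` output on stdin.
link_summary() {
    awk '
        $1 == "Speed:"                    { speed  = $2 }
        $1 == "Duplex:"                   { duplex = $2 }
        $1 == "Link" && $2 == "detected:" { link   = $3 }
        END { printf "speed=%s duplex=%s link=%s\n", speed, duplex, link }
    '
}
```

It could be used as `ethtool enp2s0 | link_summary`.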
5. GIDs supported by the NIC and related information:
[root@centos222 ~]# show_gids
DEV     PORT    INDEX   GID                                     IPv4            VER     DEV
---     ----    -----   ---                                     ------------    ---     ---
mlx4_0  1       0       fe80:0000:0000:0000:f652:14ff:fe88:0760                 v1      enp2s0
n_gids_found=1
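The INDEX and VER columns determine which GID index to pass to the bandwidth test in step 7 (its -x option). The sample above lists only a v1 entry. The sketch below filters show_gids output for RoCE v2 entries of one device and port, assuming the column layout shown here, where the IPv4 column may be empty. The function name `rocev2_gid_indexes` is hypothetical.

```shell
# Hypothetical helper: print "<index> <ipv4-or-dash>" for every RoCE v2 GID
# of the given device and port. Reads `show_gids` output on stdin.
# Usage: rocev2_gid_indexes <device> [port]   (port defaults to 1)
rocev2_gid_indexes() {
    awk -v dev="$1" -v port="${2:-1}" '
        # VER is always the second-to-last column; IPv4 is present only
        # when the row has 7 fields.
        NF >= 6 && $1 == dev && $2 == port && $(NF - 1) == "v2" {
            print $3, (NF == 7 ? $5 : "-")
        }
    '
}
```

For example, `show_gids | rocev2_gid_indexes mlx5_4 1` lists candidate values for -x; an entry with an IPv4 address is usually the one to use for a routed RoCE v2 test.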
6. NIC operating mode:
[root@centos222 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid:    fe80:0000:0000:0000:f652:14ff:fe88:0760
base lid:        0x0
sm lid:          0x0
state:          4: ACTIVE
phys state:      5: LinkUp
rate:            10 Gb/sec (1X QDR)
link_layer:      Ethernet
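The link_layer value can also be read straight from sysfs, which is handy in scripts. A minimal sketch, assuming the standard /sys/class/infiniband/<device>/ports/<port>/link_layer layout; the function name `list_link_layers` and its optional root argument are hypothetical.

```shell
# Hypothetical helper: print "<device> <port> <link_layer>" for every RDMA port.
# The optional argument overrides the sysfs root (useful for testing).
list_link_layers() {
    root=${1:-/sys/class/infiniband}
    for f in "$root"/*/ports/*/link_layer; do
        [ -r "$f" ] || continue        # glob did not match: no RDMA devices
        port_dir=${f%/link_layer}      # .../<device>/ports/<port>
        port=${port_dir##*/}
        dev_dir=${port_dir%/ports/*}   # .../<device>
        dev=${dev_dir##*/}
        echo "$dev $port $(cat "$f")"
    done
}
```

Called without arguments, it reports one line per port, for example a device name, a port number and Ethernet for a port in RoCE mode.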
Check the NIC's current link mode:
[root@centos7221 ~]# connectx_port_config -s
--------------------------------
Port configuration for PCI device: 0000:86:00.0 is:
eth
--------------------------------
[root@centos7221 ~]# connectx_port_config
ConnectX PCI devices :
|----------------------------|
| 1            0000:86:00.0 |
|----------------------------|
Before port change:
eth
|----------------------------|
| Possible port modes:      |
| 1: Infiniband              |
| 2: Ethernet                |
| 3: AutoSense              |
|----------------------------|
Select mode for port 1 (1,2,3):
Select the mode you need.
Note:
ConnectX-3 EN cards support only Ethernet mode; VPI models can also be set to InfiniBand.
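7. Bandwidth test commands explained:
$ ib_send_bw -d mlx5_4 -x 3    # start the receiving (server) side on one host; -x 3 selects GID index 3, which is RoCE v2 here, so check the index value on your own system
$ sudo ib_send_bw -d mlx5_4 192.168.1.1 --report_gbits -F -x 3    # send from the other host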
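The two commands differ only in role: the server side omits the peer address, and the client side adds it together with the reporting flags. The sketch below prints the command line for either role without running it, so the device name and GID index can be reviewed first. It mirrors the flags used above; the function name `bw_cmd` is hypothetical.

```shell
# Hypothetical helper: print (do not run) the ib_send_bw command for one side.
#   bw_cmd server <device> <gid_index>
#   bw_cmd client <device> <gid_index> <server_ip>
bw_cmd() {
    role=$1 dev=$2 gid=$3 server_ip=${4:-}
    case $role in
        server)
            echo "ib_send_bw -d $dev -x $gid" ;;
        client)
            if [ -z "$server_ip" ]; then
                echo "client role needs the server IP address" >&2
                return 1
            fi
            echo "ib_send_bw -d $dev $server_ip --report_gbits -F -x $gid" ;;
        *)
            echo "usage: bw_cmd server|client <device> <gid_index> [server_ip]" >&2
            return 1 ;;
    esac
}
```

For instance, `bw_cmd client mlx5_4 3 192.168.1.1` prints the client command shown above; start the server side first, then the client.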
Note:
# Remember to assign an IP address to the NIC and confirm that the two hosts can ping each other.
[root@centos222 ~]# nmcli c modify enp2s0 ipv4.addresses 10.10.10.222/24 connection.autoconnect yes ipv4.method manual
[root@centos222 ~]# nmcli c down enp2s0
Connection 'enp2s0' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/439)
[root@centos222 ~]# nmcli c reload enp2s0
[root@centos222 ~]# nmcli c up enp2s0
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/440)
[root@centos222 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 6c:92:bf:70:97:cc brd ff:ff:ff:ff:ff:ff
inet 192.168.101.222/24 brd 192.168.101.255 scope global noprefixroute eno1
valid_lft forever preferred_lft forever
inet6 fe80::aad6:ae47:a954:b791/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: enp4s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 6c:92:bf:70:97:cd brd ff:ff:ff:ff:ff:ff
4: enp132s0f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 68:91:d0:61:57:2e brd ff:ff:ff:ff:ff:ff
5: enp132s0f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 68:91:d0:61:57:2f brd ff:ff:ff:ff:ff:ff
6: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether f4:52:14:88:07:60 brd ff:ff:ff:ff:ff:ff
inet 10.10.10.222/24 brd 10.10.10.255 scope global noprefixroute enp2s0
valid_lft forever preferred_lft forever
inet6 fe80::da78:33ac:bf32:8856/64 scope link noprefixroute
valid_lft forever preferred_lft forever
7: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:53:81:7e brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:53:81:7e brd ff:ff:ff:ff:ff:ff
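To confirm that the address landed on the RoCE interface without reading the full listing, the one-line-per-address form of the ip command is easier to filter. A minimal sketch, assuming the output format of `ip -o -4 addr show`; the function name `iface_ipv4` is hypothetical.

```shell
# Hypothetical helper: print the IPv4 address/prefix of one interface.
# Reads `ip -o -4 addr show` output on stdin, where each line looks like
#   6: enp2s0    inet 10.10.10.222/24 brd 10.10.10.255 scope global ...
iface_ipv4() {
    awk -v ifc="$1" '$2 == ifc && $3 == "inet" { print $4 }'
}
```

On the host above, `ip -o -4 addr show | iface_ipv4 enp2s0` should print 10.10.10.222/24. After that, a plain `ping -c 3` to the peer's address on the same subnet confirms that both sides can reach each other.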
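IV. Q&A
Many problems can come up during installation. The most common is missing dependency packages; install the missing packages first, then run the driver installer again.
The following packages are required to install the driver: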
tcl tcsh gcc-gfortran tk python36 perl
On CentOS, install them directly from the online repositories:
yum install tcl tcsh gcc-gfortran tk python36 perl
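Before rerunning the installer, it may help to see which of these packages are actually missing. A minimal sketch, assuming an RPM-based system where `rpm -q <package>` returns non-zero for a package that is not installed; the function name `missing_pkgs` is hypothetical.

```shell
# Hypothetical helper: print every listed package that rpm does not
# report as installed, one per line.
missing_pkgs() {
    for p in "$@"; do
        rpm -q "$p" >/dev/null 2>&1 || echo "$p"
    done
}
```

For example, `missing_pkgs tcl tcsh gcc-gfortran tk python36 perl` prints only the packages that still need to be installed; an empty result means the dependency list is already satisfied.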