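Estimating the Number of DDR Channels

Overall approach: measure the machine's actual peak total memory bandwidth T, determine the single-channel bandwidth t from the DDR type and frequency, then compute: number of channels = T / t

Peak bandwidth is measured with STREAM (the jeffhammond/STREAM repository; see also the official STREAM documentation and community write-ups on interpreting its results). The DDR type and frequency are read with the dmidecode --type memory command.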

Measuring peak bandwidth

Confirm that the tool builds and runs

git clone https://github.com/jeffhammond/STREAM.git
cd STREAM
make stream_c.exe
./stream_c.exe

Get the L3 cache size of the test machine

Method 1:

~/workspace$ getconf -a | grep CACHE
LEVEL1_ICACHE_SIZE                 32768
LEVEL1_ICACHE_ASSOC                8
LEVEL1_ICACHE_LINESIZE             64
LEVEL1_DCACHE_SIZE                 32768
LEVEL1_DCACHE_ASSOC                8
LEVEL1_DCACHE_LINESIZE             64
LEVEL2_CACHE_SIZE                  1048576
LEVEL2_CACHE_ASSOC                 16
LEVEL2_CACHE_LINESIZE              64
LEVEL3_CACHE_SIZE                  28835840
LEVEL3_CACHE_ASSOC                 11
LEVEL3_CACHE_LINESIZE              64
LEVEL4_CACHE_SIZE                  0
LEVEL4_CACHE_ASSOC                 0
LEVEL4_CACHE_LINESIZE              0
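
To use the L3 size in later calculations, the getconf output can be parsed programmatically. The following is a minimal sketch; the function name parse_getconf_cache is hypothetical, and it assumes each line has the "NAME VALUE" shape shown above.

```python
def parse_getconf_cache(text):
    """Parse `getconf -a | grep CACHE` output into {name: int}."""
    sizes = {}
    for line in text.splitlines():
        parts = line.split()
        # Expect exactly "NAME VALUE"; skip blank or valueless lines.
        if len(parts) == 2 and parts[1].isdigit():
            sizes[parts[0]] = int(parts[1])
    return sizes


# A few lines copied from the output above.
sample = """\
LEVEL2_CACHE_SIZE                  1048576
LEVEL3_CACHE_SIZE                  28835840
LEVEL3_CACHE_LINESIZE              64
"""

caches = parse_getconf_cache(sample)
l3_bytes = caches["LEVEL3_CACHE_SIZE"]
print(f"L3 = {l3_bytes} bytes = {l3_bytes / 2**20:.1f} MiB")
```

On this machine the reported value works out to 27.5 MiB.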

Method 2:

sudo dmidecode -t cache

Method 3:

lscpu

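Adjust STREAM_ARRAY_SIZE according to the L3 cache size

You can either edit STREAM_ARRAY_SIZE directly in stream.c, or override it at build time by passing -DSTREAM_ARRAY_SIZE to the compiler through make.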

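Each array should be at least 4 times the size of the L3 cache, so that the test measures memory rather than cache:

STREAM_ARRAY_SIZE * 8 bytes >= LEVEL3_CACHE_SIZE * 4

Note that STREAM_ARRAY_SIZE is a number of elements (8 bytes each here), not a number of bytes. The L3 cache reported above is 28835840 bytes (27.5 MiB), so the minimum is about 14.4 million elements. The run below uses 120000000 elements (about 915.5 MiB per array), which is comfortably above that minimum.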
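
The sizing rule can be checked with a few lines of arithmetic. This is a sketch assuming 8-byte elements (as the STREAM output below reports) and the 4x-cache guideline; the helper names are hypothetical.

```python
ELEMENT_BYTES = 8  # assumption: STREAM built with its default 8-byte elements


def min_stream_array_size(cache_bytes, factor=4, element_bytes=ELEMENT_BYTES):
    """Smallest element count for which one array is `factor` x the cache size."""
    total_bytes = cache_bytes * factor
    return (total_bytes + element_bytes - 1) // element_bytes  # ceiling division


def array_mib(elements, element_bytes=ELEMENT_BYTES):
    """Memory used by one STREAM array, in MiB."""
    return elements * element_bytes / 2**20


l3_bytes = 28835840          # LEVEL3_CACHE_SIZE from getconf
minimum = min_stream_array_size(l3_bytes)
chosen = 120_000_000         # the element count used in the run below

print(f"minimum elements: {minimum}")
print(f"chosen: {chosen} elements = {array_mib(chosen):.1f} MiB per array")
```

The per-array figure agrees with the "Memory per array" line that STREAM prints for 120000000 elements.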

Rebuild and run STREAM again

-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 120000000 (elements), Offset = 0 (elements)
Memory per array = 915.5 MiB (= 0.9 GiB).
Total memory required = 2746.6 MiB (= 2.7 GiB).
Each kernel will be executed 10 times.
 The *best* time for each kernel (excluding the first iteration)
 will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 80
Number of Threads counted = 80
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 31161 microseconds.
   (= 31161 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           44672.0     0.055925     0.042980     0.101943
Scale:          50632.9     0.046314     0.037920     0.062966
Add:            57618.5     0.056867     0.049984     0.069840
Triad:          57517.2     0.060273     0.050072     0.096959
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
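
To take the peak bandwidth T from this output without copying numbers by hand, the result table can be parsed. The sketch below is illustrative only: the regular expression and the function name best_rates are assumptions based on the output format shown above.

```python
import re

# Matches result lines such as "Copy:           44672.0     0.055925 ..."
RESULT_LINE = re.compile(r"^(Copy|Scale|Add|Triad):\s+([0-9.]+)", re.MULTILINE)


def best_rates(stream_output):
    """Return {kernel: best rate in MB/s} from STREAM's result table."""
    return {name: float(rate) for name, rate in RESULT_LINE.findall(stream_output)}


# Result table copied from the run above.
sample = """\
Function    Best Rate MB/s  Avg time     Min time     Max time
Copy:           44672.0     0.055925     0.042980     0.101943
Scale:          50632.9     0.046314     0.037920     0.062966
Add:            57618.5     0.056867     0.049984     0.069840
Triad:          57517.2     0.060273     0.050072     0.096959
"""

rates = best_rates(sample)
kernel, peak_mb_s = max(rates.items(), key=lambda kv: kv[1])
print(f"{kernel}: {peak_mb_s / 1000:.1f} GB/s")
```

For this run the highest rate is the Add kernel, at roughly 57.6 GB/s.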

Get the DDR type and frequency

sudo dmidecode -t memory

# dmidecode 3.1
Getting SMBIOS data from sysfs.
SMBIOS 3.2 present.
# SMBIOS implementations newer than version 3.1.1 are not
# fully supported by this version of dmidecode.

Handle 0x1000, DMI type 16, 23 bytes
Physical Memory Array
    Location: System Board Or Motherboard
    Use: System Memory
    Error Correction Type: Multi-bit ECC
    Maximum Capacity: 7680 GB
    Error Information Handle: Not Provided
    Number Of Devices: 24

Handle 0x1100, DMI type 17, 84 bytes
Memory Device
    Array Handle: 0x1000
    Error Information Handle: Not Provided
    Total Width: 72 bits
    Data Width: 64 bits
    Size: 32 GB
    Form Factor: DIMM
    Set: 1
    Locator: A1
    Bank Locator: Not Specified
    Type: DDR4
    Type Detail: Synchronous Registered (Buffered)
    Speed: 2666 MT/s
    Manufacturer: 00CE00B300CE
    Serial Number: 41137A10
    Asset Tag: 02184251
    Part Number: M393A4K40CB2-CTD
    Rank: 2
    Configured Clock Speed: 2666 MT/s
    Minimum Voltage: 1.2 V
    Maximum Voltage: 1.2 V
    Configured Voltage: 1.2 V
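
The fields that matter for the channel estimate are Type, Data Width and Configured Clock Speed. A minimal parsing sketch follows; parse_memory_device is a hypothetical helper, and it assumes a single populated "Memory Device" record in the format shown above (empty slots report non-numeric values and would need extra handling).

```python
import re


def parse_memory_device(text):
    """Extract DDR type, data width (bits) and configured speed (MT/s)."""

    def field(name):
        # Anchor at line start so "Type" does not match "Type Detail".
        m = re.search(rf"^\s*{re.escape(name)}:\s*(.+)$", text, re.MULTILINE)
        return m.group(1).strip() if m else None

    return {
        "type": field("Type"),
        "data_width_bits": int(field("Data Width").split()[0]),
        "speed_mt_s": int(field("Configured Clock Speed").split()[0]),
    }


# Selected lines copied from the dmidecode output above.
sample = """\
Memory Device
    Total Width: 72 bits
    Data Width: 64 bits
    Size: 32 GB
    Type: DDR4
    Type Detail: Synchronous Registered (Buffered)
    Speed: 2666 MT/s
    Configured Clock Speed: 2666 MT/s
"""

dimm = parse_memory_device(sample)
print(dimm)
```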

DDR standards, data rates and transfer bandwidth

DDR standard   Data rate (MT/s)   Peak bandwidth (per 64-bit channel)
DDR            266                2.1 GB/s
DDR            333                2.6 GB/s
DDR            400                3.2 GB/s
DDR2           533                4.2 GB/s
DDR2           667                5.3 GB/s
DDR2           800                6.4 GB/s
DDR3           1066               8.5 GB/s
DDR3           1333               10.6 GB/s
DDR3           1600               12.8 GB/s
DDR3           1866               14.9 GB/s
DDR4           2133               17.0 GB/s
DDR4           2400               19.2 GB/s
DDR4           2666               21.3 GB/s
DDR4           3200               25.6 GB/s
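
The bandwidth figures in this table appear to follow from the data rate multiplied by the 8-byte (64-bit) data bus width, truncated to one decimal place. The sketch below reproduces them under that assumption; the function name is hypothetical.

```python
BUS_BYTES = 8  # assumption: 64-bit data bus per channel


def channel_bandwidth_gb_s(data_rate_mt_s):
    """Theoretical peak of one channel in GB/s, truncated to one decimal."""
    mb_s = data_rate_mt_s * BUS_BYTES   # MT/s x bytes per transfer = MB/s
    return (mb_s // 100) / 10           # truncate to 0.1 GB/s


for rate in (2133, 2400, 2666, 3200):
    print(f"DDR4 {rate}: {channel_bandwidth_gb_s(rate)} GB/s")
```

For the DDR4 2666 modules on the test machine this gives 21.3 GB/s per channel.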

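Calculate the number of channels

57.6 GB/s / 21.3 GB/s ≈ 2.7, rounded up to 3 channels. Sustained bandwidth measured by STREAM stays below the theoretical peak, so the ratio is rounded up rather than down.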
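
The final step can be written as a short calculation. This is a sketch under stated assumptions: the best STREAM rate (57618.5 MB/s) stands in for T, one DDR4 2666 channel carries 8 bytes per transfer, and the ratio is rounded up because measured bandwidth does not reach the theoretical peak. The function name estimate_channels is hypothetical.

```python
import math


def estimate_channels(measured_mb_s, data_rate_mt_s, bus_bytes=8):
    """Return (T / t, T / t rounded up) for a measured bandwidth T in MB/s."""
    per_channel_mb_s = data_rate_mt_s * bus_bytes  # theoretical t in MB/s
    ratio = measured_mb_s / per_channel_mb_s
    return ratio, math.ceil(ratio)


ratio, channels = estimate_channels(57618.5, 2666)
print(f"T / t = {ratio:.2f} -> {channels} channels")
```

If memory efficiency is much lower than usual (for example, with poor thread placement across sockets), rounding up by one may still undercount, so the result is best read as a lower-bound estimate.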