ABINIT trunk/6.1.1-public/r190 : Benchmarks
The three most demanding routines of ABINIT in the most common situation (a total-energy SCF calculation), namely fourwf.F90, nonlop.F90 and projbd.F90, have been analyzed on different platforms. Tests were run on seven data sets with an increasing number of plane waves, for which the three-dimensional FFT grids range from 20x20x20 to 96x96x96. For fourwf.F90, the results for all seven sets are shown. For nonlop.F90 and projbd.F90, an average over the seven sets is given. In all cases, the CPU time for one routine call has been divided by the size of the data set (the number of 3D FFT points for fourwf.F90, and the number of plane-wave coefficients for nonlop.F90 and projbd.F90). Results are given in microseconds per data element.
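
The normalization described above can be sketched as follows. This is only an illustration of the arithmetic, not the code used to produce the benchmarks; the function names and the 112-microsecond timing are hypothetical.

```python
def grid_points(n: int) -> int:
    """Number of points in an n x n x n FFT grid."""
    return n ** 3


def microseconds_per_element(cpu_seconds_per_call: float, n_elements: int) -> float:
    """CPU time of one routine call divided by the data-set size, in microseconds."""
    if n_elements <= 0:
        raise ValueError("n_elements must be positive")
    return cpu_seconds_per_call * 1e6 / n_elements


def average_over_sets(per_element_times: list[float]) -> float:
    """Plain arithmetic mean over the data sets (as reported for nonlop and projbd)."""
    if not per_element_times:
        raise ValueError("need at least one data set")
    return sum(per_element_times) / len(per_element_times)


# Hypothetical timing: one fourwf call taking 112 microseconds on a 20x20x20 grid.
fourwf_20 = microseconds_per_element(112e-6, grid_points(20))
print(f"{fourwf_20:.3f} microseconds per FFT point")
```

For nonlop.F90 and projbd.F90 the divisor would be the number of plane-wave coefficients of each set instead of the number of grid points, followed by the average over the seven sets.
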
NB: these tests are for the SEQUENTIAL version of the code.

| SLAVES | CPU | freq (MHz) / grid | 20 | 30 | 36 | 48 | 64 | 80 | 96 |
|---|---|---|---|---|---|---|---|---|---|
| testf | 2xQC Xeon X5570 | 2900 | 0.014 | 0.017 | 0.015 | 0.015 | 0.016 | 0.019 | 0.021 |
| green_intel10_serial | 2xQC Xeon L5420 | 2500 | 0.016 | 0.021 | 0.020 | 0.019 | 0.021 | 0.025 | 0.028 |
| green_g95 | 2xQC Xeon L5420 | 2500 | 0.029 | 0.042 | 0.040 | 0.039 | 0.038 | 0.044 | 0.052 |
| coba2_gcc44_noplugs | 1xQC Xeon W3520 | 2670 | 0.015 | 0.019 | 0.019 | 0.018 | 0.018 | 0.021 | 0.023 |
| chum_psc | 2xDC Opteron 2220 | 2800 | 0.018 | 0.024 | 0.024 | 0.024 | 0.028 | 0.031 | 0.038 |
| bigmac_gcc43 | 2xQC Xeon E5462 | 2800 | 0.023 | 0.028 | 0.028 | 0.026 | 0.029 | 0.032 | 0.036 |
| chpit_intel11 | 4x Itanium2 | 1500 | 0.021 | 0.023 | 0.021 | 0.020 | 0.022 | 0.026 | 0.035 |
| buda_gcc43_mpiio | 2xQC Xeon X5570 | 2900 | 0.018 | 0.021 | 0.022 | 0.021 | 0.022 | 0.025 | 0.029 |
| inca_gcc44_sdebug | 1xQC Core2 Q9650 | 3000 | 0.023 | 0.029 | 0.029 | 0.029 | 0.031 | 0.034 | 0.036 |
| fock_xlf | 2xDC Power5 | 1500 | 0.026 | 0.035 | 0.030 | 0.024 | 0.028 | 0.030 | 0.036 |
| ibm6_xlf12 | 2xDC Power6 | 4700 | 0.016 | 0.012 | 0.017 | 0.013 | 0.017 | 0.016 | 0.017 |
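
Since the fourwf.F90 figures are per FFT point, an approximate time per call can be recovered by multiplying back by the number of grid points. The sketch below does this for the testf row of the table above. The variable and function names are illustrative, and the results are only estimates because the tabulated values are rounded to three decimals.

```python
# FFT grid sizes n (for n x n x n grids) and the testf row of the fourwf table,
# in microseconds per FFT point.
GRID_SIZES = [20, 30, 36, 48, 64, 80, 96]
TESTF_US_PER_POINT = [0.014, 0.017, 0.015, 0.015, 0.016, 0.019, 0.021]


def per_call_microseconds(us_per_point: float, n: int) -> float:
    """Estimated time of one call on an n x n x n grid, in microseconds."""
    return us_per_point * n ** 3


per_call = {
    n: per_call_microseconds(t, n)
    for n, t in zip(GRID_SIZES, TESTF_US_PER_POINT)
}

for n, t in per_call.items():
    print(f"{n}x{n}x{n} grid: about {t:.0f} microseconds per call")
```
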

| SLAVES | CPU / freq | nonlop | projbd |
|---|---|---|---|
| testf | 2xQC Xeon X5570 / 2.9GHz | 0.031 | 0.00414 |
| green_intel10_serial | 2xQC Xeon L5420 / 2.5GHz | 0.044 | 0.00629 |
| green_g95 | 2xQC Xeon L5420 / 2.5GHz | 0.080 | 0.01400 |
| coba2_gcc44_noplugs | 1xQC Xeon W3520 / 2.67GHz | 0.040 | 0.00443 |
| chum_psc | 2xDC Opteron 2220 / 2.8GHz | 0.052 | 0.01143 |
| bigmac_gcc43 | 2xQC Xeon E5462 / 2.8GHz | 0.044 | 0.00500 |
| chpit_intel11 | 4x Itanium2 / 1.5GHz | 0.049 | 0.00671 |
| buda_gcc43_mpiio | 2xQC Xeon X5570 / 2.9GHz | 0.036 | 0.00543 |
| inca_gcc44_sdebug | 1xQC Core2 Q9650 / 3.0GHz | 0.064 | 0.01000 |
| fock_xlf | 2xDC Power5 / 1.5GHz | 0.064 | 0.00429 |
| ibm6_xlf12 | 2xDC Power6 / 4.7GHz | 0.047 | 0.00871 |

Brief description of the hardware/software/compilers
- Bull Novascale ("testf") 2 Quad-Core Nehalem Xeon@2.9GHz (5570) ; 1MB cache ; CentOS 5.4; Compilers gcc44
- Dell ("green") 2 Quad-Core Xeon@2.5GHz (5420) ; 1MB cache ; CentOS 5.4; Compilers g95, ifort 10.1
- Sun Galaxy X4200 ("chum") 2 Dual-Core AMD Opteron@2.8 GHz ; CentOS 5.x, Kernel 2.6, Compiler PathScale 3.2
- HP Z400 ("coba2") 1 Quad-Core Xeon@2.67GHz (3520) ; 1MB cache ; CentOS 5.4; Compilers gcc44
- Apple Mac Pro ("bigmac") 2 Quad-Core Xeon@2.8GHz (5462); MacOS 10.5; Compiler gcc43
- HP Integrity ("chpit") 4 IA-64@1.5GHz ; Linux Debian; Compiler ifort 8.1
- Supermicro ("buda") 2 Quad-Core Nehalem Xeon@2.9GHz (5570) ; 1MB cache ; CentOS 5.4; Compiler gcc43
- HP 7900 ("inca") 1 Quad-Core Core2@3.0GHz (Q9650) ; 1MB cache ; CentOS 5.4; Compiler gcc44
- IBM p5-570 ("fock") 2 Power5(dual-core)@1.65GHz ; Linux SLES9, Kernel 2.6; Compiler IBM XLF 9.1
- IBM p6-520 ("ibm6") 2 Power6(dual-core)@4.7GHz ; AIX 6.1; Compiler IBM XLF 12.1
- Apple Xserve ("Max") 2 PowerPC G5 @2.0 GHz, L2 = 512KB ; Mac OS X 10.3.8; Compiler IBM xlf for OSX

