
ABINIT v4.3: Benchmarks



The three most demanding routines of ABINIT in the most common situation (a total-energy SCF calculation), namely fourwf.f, nonlop.f and projbd.f, have been analyzed on different platforms. Tests were run on sets with an increasing number of plane waves, for which the three-dimensional FFT grids range from 20x20x20 to 96x96x96. For the fourwf.f routine, all seven results are shown. For nonlop.f and projbd.f, an average over the seven sets is given. In all cases, the CPU time for one routine call has been divided by the size of the data set (the number of 3D FFT points for fourwf.f, and the number of plane-wave coefficients for nonlop.f and projbd.f). Results are given in microseconds per data point.
NB: these tests are for the SEQUENTIAL version of the code.
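The normalization described above can be reversed to recover an approximate CPU time per routine call. The short Python sketch below illustrates this for fourwf.f on a 96x96x96 grid, using the 0.035 figure reported for the SGI Altix 3700. The function names are hypothetical and are not part of ABINIT.

```python
def per_point_cost_us(cpu_time_s: float, n_points: int) -> float:
    """Cost of one routine call in microseconds per data point."""
    return cpu_time_s * 1.0e6 / n_points


def time_per_call_ms(cost_us: float, n_points: int) -> float:
    """Invert the normalization: milliseconds of CPU time for one call."""
    return cost_us * n_points / 1.0e3


# Number of 3D FFT points for a 96x96x96 grid.
n_fft = 96 ** 3

# Reported fourwf.f cost for the SGI Altix 3700 on the 96 grid (microseconds per point).
cost = 0.035

call_ms = time_per_call_ms(cost, n_fft)
print(f"about {call_ms:.1f} ms of CPU time per fourwf.f call")

# Round trip: normalizing the per-call time again gives back the tabulated value.
roundtrip = per_point_cost_us(call_ms / 1.0e3, n_fft)
```

The same conversion applies to nonlop.f and projbd.f, but it requires the number of plane-wave coefficients of each set, which is not listed on this page.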
Routine fourwf.f (3D FFT - part related to the treatment of the potential)

Microseconds per FFT point, by FFT grid size:

Machine                       Freq (MHz)     20     30     36     48     64     80     96
SGI Altix 3700 IA64-2               1300  0.022  0.024  0.023  0.024  0.028  0.032  0.035
HP/COMPAQ (ES45) EV68               1250  0.026  0.034  0.033  0.034  0.038  0.042  0.054
IBM p5-570 PWR5                     1650  0.029  0.029  0.025  0.024  0.027  0.026  0.032
IBM p690 PWR4                       1300  0.036  0.038  0.035  0.034  0.036  0.039  0.049
IBM p630 PWR4                       1000  0.039  0.042  0.046  0.041  0.046  0.049  0.063
AMD Opteron 246 AMD-64              2000  0.029  0.038  0.039  0.040  0.043  0.049  0.054
AMD Opteron 244 AMD-64              1800  0.033  0.044  0.044  0.044  0.048  0.055  0.060
Apple PPC G5                        1800  0.026  0.046  0.046  0.046  0.047  0.049  0.060
Apple PPC G4                         800  0.117  0.145  0.155  0.166  0.204  0.233  0.275
Intel(Xeon) Xeon                    3060  0.081  0.091  0.098  0.098  0.093  0.102  0.104
Intel(FSB800) PIV                   2800  0.081  0.099  0.094  0.098  0.099  0.110  0.109
Intel(VortX) PIV                    2400  0.096  0.117  0.123  0.120  0.118  0.130  0.132
Intel(Cox) PIII                      933  0.151  0.194  0.201  0.209  0.217  0.255  0.321
HP N4000(Turing) PA-RISC8500         360  0.130  0.164  0.154  0.156  0.157  0.176  0.215
HP C360 PA-RISC8500                  367  0.185  0.192  0.199  0.224  0.248  0.266  0.308
SGI(Spinoza) R14K                    600  0.078  0.096  0.113  0.130  0.146  0.177  0.224
IBM RS6000/44P PWR3+                 375  0.086  0.119  0.127  0.185  0.262  0.292  0.314
Fujitsu VPP/8                        142  0.496  0.343  0.270  0.199  0.166  0.159  0.160
SUN Sunfire V750 USIII               750  0.229  0.246  0.243  0.283  0.323  0.368  0.389
Fujitsu PIII                         600  0.299  0.355  0.353  0.366  0.397  0.438  0.492
Microway EV67                        500  0.078  0.109  0.107  0.111  0.145  0.160  0.201
AlphaStation 7000 EV56               600  0.123  0.166  0.163  0.171  0.216  0.263  0.327
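
Because the table lists the clock frequency next to each timing, the two can be combined into an approximate number of clock cycles spent per FFT point. This derived metric is not part of the original benchmark. It is sketched here only as one way to compare machines with very different clock rates, and it assumes the nominal frequency in the table is the actual clock during the run. The function name is hypothetical.

```python
def cycles_per_point(cost_us: float, freq_mhz: float) -> float:
    """Approximate clock cycles per data point.

    1 MHz corresponds to one cycle per microsecond, so the product of
    microseconds per point and MHz is a cycle count.
    """
    return cost_us * freq_mhz


# (frequency in MHz, fourwf.f cost on the 96 grid in microseconds per point),
# copied from three rows of the fourwf.f table.
rows = {
    "SGI Altix 3700 IA64-2": (1300, 0.035),
    "Intel(Xeon) Xeon": (3060, 0.104),
    "Fujitsu VPP/8": (142, 0.160),
}

cycles = {name: cycles_per_point(cost, freq) for name, (freq, cost) in rows.items()}

for name, value in cycles.items():
    print(f"{name}: about {value:.1f} cycles per FFT point")
```

On this measure the low-frequency vector machine does comparatively well, which the raw microsecond figures alone do not show.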

Routine nonlop.f (Non-local potential - part related to the treatment of the energy)
Routine projbd.f (Orthogonalisation)

Microseconds per plane-wave coefficient, averaged over the seven sets:

Machine                       Freq (MHz)    nonlop    projbd
SGI Altix 3700 IA64-2               1300     0.078     0.016
HP/COMPAQ (ES45) EV68               1250     0.080     0.013
IBM p5-570 PWR5                     1650     0.055    0.0053
IBM p690 PWR4                       1300     0.010     0.010
IBM p630 PWR4                       1000     0.095     0.015
AMD Opteron 246 AMD-64              2000     0.113     0.018
AMD Opteron 244 AMD-64              1800     0.102     0.015
Apple PPC G5                        1800     0.179     0.013
Apple PPC G4                         800     0.524     0.102
Intel(Xeon) Xeon                    3060     0.118     0.013
Intel(FSB800) PIV                   2800     0.144     0.024
Intel(VortX) PIV                    2400     0.156     0.024
Intel(Cox) PIII                      933     0.533     0.132
HP N4000(Turing) PA-RISC8500         360     0.852     0.046
HP C360 PA-RISC8500                  367     0.334     0.072
SGI(Spinoza) R14K                    600     0.303     0.041
IBM RS6000/44P PWR3+                 375     0.233     0.023
Fujitsu VPP/8                        142     0.611     0.049
SUN Sunfire V750 USIII               750     0.544     0.085
Fujitsu PIII                         600     0.923     0.193
Microway EV67                        500     0.309     0.038
AlphaStation 7000 EV56               600     0.508     0.068

Brief description of the hardware

  • SGI Altix 3700, 28 x Intel Itanium2 (1.3 GHz), Linux 2.4.21-sgi230r7, 55 GB RAM
  • HP/COMPAQ ES45, 4 x Alpha EV68 (1.25 GHz), TRU64 5.1B, Cache L1 I/D 64/64 KB, Cache L2 8 MB, 32 GB RAM
  • IBM p5-570, 4 CPUs @ 1.65 GHz, Linux SLES9, Kernel 2.6, IBM XLF 9.1 Fortran compiler
  • IBM Power4 pSeries p690 Turbo 1.3 GHz (regatta), AIX 5.0, Cache L1 I/D 32/128 KB, Cache L2 (N.A.)
  • IBM Power4 pSeries p630 1.0 GHz, AIX 5.0, Cache L1 I/D 32/128 KB, Cache L2 (N.A.)
  • AMD Opteron 246 ("e325") 2 GHz, Fedora Core 2 Linux-64, Cache L2 1 MB, PGI x86-64 5.1
  • AMD Opteron 244 ("hyperion.enge") 1.8 GHz, Suse Linux-64, Cache L2 1 MB, PGI x86-64 5.1
  • SGI ("spinoza") Octane 2 - IRIX 6.5
  • IBM ("dirac") 44P/Power3 - AIX 5.1 - xlf 7.x
  • Intel Xeon ("tsunami") cluster Cenaero bi-Xeon 3.06 GHz, Cache L2 512 KB, 2 GB RAM, PGI 5.1
  • Intel FSB800 ("lowdin") P4 2.8 GHz, Cache L2 512 KB, 1 GB RAM, PGI 5.1
  • Intel ("VortX") P4 2.4 GHz, Cache L2 512 KB, 1 GB RAM (RAMBUS RR400 PC800), PGI 3.2-3
  • Intel ("Cox") P3 933 MHz, Cache L2 256 KB, 1 GB RAM, PGI 3.2-3
  • HP N4000 ("Turing") PA-8600 - HP-UX 11
  • HP C360
  • Fujitsu VPP 8x142MHz theoretical peak performance 2.2 GFlops, vector processor 8 add + 8 mult per clock cycle, one processing element VX-1S
    NB: The CPU test for FUJITSU was performed on the Fujitsu machine (VX-1S) in Mitsubishi Chemical Corp
  • Apple PowerPC G5, 2 x PPC G5 (version = 2.2), Bus speed: 900 MHz, L2 cache size: 512 KB (times 2), L3 cache size: 2 MB (times 2), Memory size: 512 MB, Mac OS X 10.3.6, Compiler IBM xlf for OSX ver.8.
  • Apple PowerPC G4, 2 x PPC G4 (version = 2.1), Bus speed: 800 MHz, L2 cache size: 256 KB (times 2), L3 cache size: 2 MB (times 2), Memory size: 1.25 GB, Mac OS X 10.2.2 (6F21), Compiler Absoft Pro Fortran for OSX ver.8.0
  • Intel PIII 600 MHz/Fujitsu, M/B Intel 440BX AGPset ATX M/B, SCSI Onboard Adaptec AIC7890 Chip ("compatible" with Adaptec 2940U2W), HDD 9.1 GB (Ultra2-Wide SCSI), Memory 1 GB, NIC PCI 10/100 Mbps Intel PILA8460B Management Adaptor, GNU/Linux (Kernel v.2.2.5), F90 v2.0 Fujitsu "Fortran & C" package
  • SUN SunFire V750, 2 x US III 750 MHz, Solaris 2.9, Cache L1 I/D 16/16 KB, Cache L2 8 MB, Workshop 6.0
  • Alpha ev56 ("boop")
  • Alpha ev56 ("deepflow")
  • SGI Octane1 ("Zebulon")