# ABINIT parallelisation input variables

## List and description

This document lists and describes the names (keywords) of the parallelisation input variables that can be used in the main input file of the abinit code.

New users are advised to read the new user's guide before reading the present file. The present file is easier to explore with the help of the tutorial.

Once sufficiently familiar with ABINIT, users might find it useful to read the ~abinit/doc/users/tuning file. For response-function calculations using abinit, please read the response function help file.

##### Copyright (C) 1998-2012 ABINIT group (DCA, XG, RC)

This file is distributed under the terms of the GNU General Public License, see
~abinit/COPYING or
http://www.gnu.org/copyleft/gpl.txt .

For the initials of contributors, see ~abinit/doc/developers/contributors.txt .

Goto: **ABINIT home Page** | **Suggested acknowledgments** | **List of input variables** | **Tutorial home page** | **Bibliography**

Help files: **New user's guide** | **Abinit (main)** | **Abinit (respfn)** | **Mrgddb** | **Anaddb** | **AIM (Bader)** | **Cut3D** | **Optic**

Files that describe other input variables:

- Basic variables, VARBAS
- Developer variables, VARDEV
- File handling variables, VARFIL
- Geometry builder + symmetry related variables, VARGEO
- Ground-state calculation variables, VARGS
- GW variables, VARGW
- Internal variables, VARINT
- Projector-Augmented Wave variables, VARPAW
- Response Function variables, VARRF
- Structural optimization variables, VARRLX
- Wannier90 interface variables, VARW90

**Content of the file: alphabetical list of variables.**

G. gwpara

L. localrdwf

N. ngroup_rf npband npfft npimage npkpt npspinor

P. paral_kgb paral_rf

U. use_gpu_cuda

gwpara

Mnemonics: GW PARAllelization level

Characteristic: GW, PARALLEL

Variable type: integer

Default is 1

**TODO: default should be 2**.

Only relevant if optdriver=3 or 4, that is, screening or sigma calculations.

**gwpara** selects one of the two parallelization levels
available in the GW code. The available options are:

- =1 => parallelisation on k points
- =2 => parallelisation on bands

Additional notes:

In the present status of the code, only the parallelization over bands (**gwpara**=2)
reduces the memory allocated by each processor.

Using **gwpara**=1 requires the same amount of memory as a sequential run,
irrespective of the number of CPUs used.

The required memory can be reduced by opting for an out-of-core solution
(mkmem=0, only coded for optdriver=3),
at the price of a drastic loss of performance.


localrdwf

Mnemonics: LOCAL ReaD WaveFunctions

Characteristic: DEVELOP, PARALLEL

Variable type: integer

Default is 1.

This input variable is used only when running abinit in parallel. If **localrdwf**=1, the input wavefunction disk file, or the KSS/SCR file in the case of GW calculations, is read locally by each processor, while if **localrdwf**=0, only one processor reads it and broadcasts the data to the other processors.

The option **localrdwf**=0 is NOT allowed when parallel I/O is activated (MPI-IO access),
i.e. when accesswff==1.

The option **localrdwf**=0 is NOT allowed when
mkmem==0 (or, for RF, when
mkqmem==0 or mk1mem==0), that
is, when the wavefunctions are stored on disk.
This is still to be coded.
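The two restrictions on **localrdwf**=0 can be expressed as a small consistency check. The following Python sketch is illustrative only: the function name `check_localrdwf` and the representation of the input variables as a plain dictionary are assumptions made for this example and are not part of ABINIT. A variable missing from the dictionary is treated as not triggering the restriction, and the RF-only variables are checked regardless of the calculation type, which is a simplification.

```python
def check_localrdwf(var):
    """Return a list of conflicts for the given input variables.

    `var` is a plain dict mapping variable names to integers
    (a hypothetical representation, not an ABINIT data structure).
    """
    problems = []
    # localrdwf=1 (the default) is never restricted.
    if var.get("localrdwf", 1) != 0:
        return problems
    # Restriction 1: no localrdwf=0 together with MPI-IO access.
    if var.get("accesswff", 0) == 1:
        problems.append("localrdwf=0 is not allowed when accesswff=1")
    # Restriction 2: no localrdwf=0 when wavefunctions are kept on disk.
    # Assumption of this sketch: a missing key means "not zero".
    for name in ("mkmem", "mkqmem", "mk1mem"):
        if var.get(name, 1) == 0:
            problems.append(f"localrdwf=0 is not allowed when {name}=0")
    return problems


if __name__ == "__main__":
    for msg in check_localrdwf({"localrdwf": 0, "accesswff": 1, "mkmem": 0}):
        print(msg)
```

An empty list means that the sketch found no conflict; it does not replace the checks performed by abinit itself.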

In the case of a parallel computer with a unique file system,
both options are equally convenient for the user. However, if I/O
is slow compared to communications between processors
(e.g. for CRAY T3E machines), **localrdwf**=0 should be much more
efficient;
if you really need temporary disk storage, switch to **localrdwf**=1.

In the case of a cluster of nodes with a different file system for
each machine, the input wavefunction file must be available on all
nodes if **localrdwf**=1, while it is needed only on the
master node if **localrdwf**=0.


ngroup_rf

Mnemonics: Number of GROUPs for parallelization over Response Function perturbations

Characteristic: can even be specified separately for each dataset, parameter paral_rf is necessary

Variable type: integer

Default is 0.

This parameter is used in connection with the parallelization over perturbations. It defines the number of groups into which the available processors are divided in order to distribute the perturbation cases. The maximum number of groups is limited by the number of perturbation cases. The size of each group is, in turn, limited by the number of k-points.
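The two limits on the groups can be illustrated with a short Python sketch. The function name `rf_groups_ok` and its arguments `npert` (number of perturbation cases) and `nkpt` (number of k-points) are names chosen for this example. The sketch also assumes that the processors are split evenly between the groups, which is not stated in this file.

```python
def rf_groups_ok(nproc, ngroup_rf, npert, nkpt):
    """Check a candidate ngroup_rf against the two limits quoted above."""
    if ngroup_rf < 1:
        raise ValueError("this sketch expects ngroup_rf >= 1")
    # Limit 1: no more groups than perturbation cases.
    if ngroup_rf > npert:
        return False
    # Assumption of this sketch: processors are split evenly among groups.
    group_size = nproc // ngroup_rf
    # Limit 2: a group cannot be larger than the number of k-points.
    return 1 <= group_size <= nkpt


if __name__ == "__main__":
    print(rf_groups_ok(nproc=16, ngroup_rf=4, npert=6, nkpt=8))
```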


npband

Mnemonics: Number of Processors at the BAND level

Characteristic:

Variable type: integer

Default is 1.

Relevant only for the band/FFT parallelisation (see the paral_kgb input variable).

**npband** gives the number of processors among which the work load over the band level is shared.

**npband**, npfft, npkpt and npspinor are combined to give the total number of processors (nproc) working on the band/FFT/k-point parallelisation.

See npfft, npkpt, npspinor and paral_kgb for additional information on the use of band/FFT/k-point parallelisation.

Note: at present, **npband** has to be a divisor of nband (or equal to it).
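As a rough illustration, the following Python sketch checks a candidate processor distribution before a job is submitted. It assumes that "combined" means multiplied, i.e. that nproc equals the product of npkpt, npspinor, npband and npfft. The function name `check_kgb_grid` is hypothetical and is not an ABINIT routine.

```python
def check_kgb_grid(nproc, nband, npkpt=1, npspinor=1, npband=1, npfft=1):
    """Return a list of problems found in a candidate distribution."""
    errors = []
    # Assumption of this sketch: the four levels multiply to nproc.
    product = npkpt * npspinor * npband * npfft
    if product != nproc:
        errors.append(f"npkpt*npspinor*npband*npfft = {product}, expected {nproc}")
    # npband has to be a divisor of nband (or equal to it).
    if nband % npband != 0:
        errors.append(f"npband={npband} does not divide nband={nband}")
    return errors


if __name__ == "__main__":
    print(check_kgb_grid(nproc=32, nband=64, npkpt=2, npband=8, npfft=2))
```

An empty list means that the distribution passes both checks of the sketch.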


npfft

Mnemonics: Number of Processors at the FFT level

Characteristic:

Variable type: integer

Default is nproc.

Relevant only for the band/FFT/k-point parallelisation (see the paral_kgb input variable).

**npfft** gives the number of processors among which the work load over the FFT level is shared.

**npfft**, npkpt, npband and npspinor are combined to give the total number of processors (nproc) working on the band/FFT/k-point parallelisation.

See npband, npkpt, npspinor, and paral_kgb for additional information on the use of band/FFT/k-point parallelisation.

Note: ngfft is automatically adjusted to **npfft**.
If the number of processors is changed from one calculation to another,
**npfft** may change, and then ngfft also.


npimage

Mnemonics: Number of Processors at the IMAGE level

Characteristic:

Variable type: integer

Default is min(nproc,ndynimage) (see below).

Relevant only when sets of images are activated (see imgmov and nimage).

**npimage** gives the number of processors among which the work load over the image level is shared. It is compatible with all other parallelization levels available for ground-state calculations.

Note on the **npimage** default value: this default value is crude. It is set to the number of dynamic images (ndynimage) if the number of available processors allows this choice. If ntimimage=1, **npimage** is set to min(nproc,nimage).

*See paral_kgb, npkpt, npband, npfft and npspinor for additional information on the use of k-point/band/FFT parallelisation.*
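The default rule described above can be written as a short Python sketch. The function name `default_npimage` is chosen for this example; only the two formulas, min(nproc,ndynimage) and min(nproc,nimage), come from this file.

```python
def default_npimage(nproc, nimage, ndynimage, ntimimage):
    """Default number of processors at the image level, as described above."""
    if ntimimage == 1:
        # Special case quoted in the note: all images are counted.
        return min(nproc, nimage)
    # General case: only the dynamic images are counted.
    return min(nproc, ndynimage)


if __name__ == "__main__":
    print(default_npimage(nproc=4, nimage=10, ndynimage=8, ntimimage=50))
```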


npkpt

Mnemonics: Number of Processors at the K-Point Level

Characteristic:

Variable type: integer

Default is 1.

Relevant only for the band/FFT/k-point parallelisation (see the paral_kgb input variable).

**npkpt** gives the number of processors among which the work load over the k-point/spin-component level is shared.

**npkpt**, npfft, npband and npspinor are combined to give the total number of processors (nproc) working on the band/FFT/k-point parallelisation.

See npband, npfft, npspinor and paral_kgb for additional information on the use of band/FFT/k-point parallelisation.

Note: for the best load balancing and efficiency, **npkpt** should be a divisor of,
or equal to, the number of k-point/spin-components
(nkpt*nsppol).
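To see which values satisfy this recommendation, the divisors of nkpt*nsppol can be enumerated. The Python sketch below is a minimal illustration; the function name `npkpt_candidates` is an assumption of this example, and the upper bound nproc is added only because npkpt cannot exceed the number of available processors.

```python
def npkpt_candidates(nkpt, nsppol, nproc):
    """List the values of npkpt that divide nkpt*nsppol and fit in nproc."""
    total = nkpt * nsppol
    upper = min(total, nproc)
    return [n for n in range(1, upper + 1) if total % n == 0]


if __name__ == "__main__":
    # 6 k-points, spin-polarized (nsppol=2), 8 processors available.
    print(npkpt_candidates(nkpt=6, nsppol=2, nproc=8))
```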


npspinor

Mnemonics: Number of Processors at the SPINOR level

Characteristic:

Variable type: integer

Default is 1.

Can be 1 or 2 (if nspinor=2).

Relevant only for the band/FFT/k-point parallelisation (see the paral_kgb input variable).

**npspinor** gives the number of processors among which the work load over the spinorial components of wave-functions is shared.

**npspinor**, npfft, npband and npkpt are combined to give the total number of processors (nproc) working on the band/FFT/k-point parallelisation.

See npkpt, npband, npfft, and paral_kgb for additional information on the use of band/FFT/k-point parallelisation.


paral_kgb

Mnemonics: activate PARALlelization over K-point, G-vectors and Bands.

Characteristic: cannot be specified separately for each dataset.

Variable type: integer

Default is 0.

**If paral_kgb=1**,
the parallelization over bands, FFTs, and k-point/spin-components is activated
(see npkpt, npfft and
npband). With this parallelization, the work load is split over
three levels of parallelization. Communications occur almost exclusively
along one dimension. This requires the compilation option --enable-mpi="yes".

First, try to parallelise over k points and spin (see npkpt). Otherwise, for unpolarized calculations at the gamma point, parallelise over the two other levels, band and FFT. For nproc<=50, the best speed-up is achieved for npband=nproc and npfft=1 (which is not yet the default). For nproc>=50, the best speed-up is achieved for npband>=4*npfft.
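The rule of thumb for the band and FFT levels can be sketched in Python as follows. This is only an illustration under stated assumptions: npkpt=1 and npspinor=1 (unpolarized calculation at the gamma point), so that npband*npfft=nproc. Since both rules cover nproc=50, the sketch arbitrarily applies the first one there. The function name `suggest_band_fft` is hypothetical, and the sketch does not check that npband divides nband.

```python
def suggest_band_fft(nproc):
    """Return (npband, npfft) pairs compatible with the rule of thumb above."""
    if nproc <= 50:
        # Small processor counts: everything on the band level.
        return [(nproc, 1)]
    pairs = []
    for npfft in range(1, nproc + 1):
        if nproc % npfft != 0:
            continue  # npband*npfft must equal nproc exactly
        npband = nproc // npfft
        if npband >= 4 * npfft:
            pairs.append((npband, npfft))
    return pairs


if __name__ == "__main__":
    print(suggest_band_fft(64))
```

The candidates returned for large nproc are only starting points; the timing of a short test run remains the practical way to choose between them.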

For additional information, download F. Bottin's presentation at the ABINIT workshop 2007.

Suggested acknowledgments :

F. Bottin, S. Leroux, A. Knyazev and G. Zerah,
*Large scale ab initio calculations based on three levels of parallelization*,
Comput. Mat. Science **42**, 329 (2008),
available on arXiv, http://arxiv.org/abs/0707.3405 .

If the total number of processors used is compatible with the three levels of parallelization, the values for npkpt, npfft, npband and bandpp will be filled automatically, although the distribution may not be optimal. To optimize the distribution, use:

**If paral_kgb=-n**, ABINIT will automatically test whether each number of processors between 2 and n is suitable for a parallel calculation, and print the possible values in the log file. A weight is attributed to each possible processor distribution. It is advised to select a processor distribution whose weight is close to 1. The code will then stop after the printing. This test can be done with a sequential as well as with a parallel version of the code. The user can then choose a suitable number of processors on which to run the job. The user must then set paral_kgb=1 again in the input file and enter the corresponding values for npkpt, npfft, npband and bandpp.


paral_rf

Mnemonics: activate PARALlelization over Response Function perturbations

Characteristic: can even be specified separately for each dataset, parameter ngroup_rf is necessary

Variable type: integer

Default is 0.

This parameter activates the parallelization over perturbations, which can be used during RF calculations. This type of parallelization can be used in combination with the parallelization over k-points.

Currently, total energies calculated by groups that do not include the master process are saved in .status_LOGxxxx files.


use_gpu_cuda

Mnemonics: activate USE of GPU accelerators with CUDA (nvidia)

Characteristic:

Variable type: integer

Default is 1 for ground-state calculations (optdriver=0) when ABINIT has been compiled using cuda, 0 otherwise.

Only available if the ABINIT executable has been compiled with the cuda nvcc compiler.

This parameter activates the use of NVidia graphic accelerators (GPU) if present.

If **use_gpu_cuda**=1, some parts of the computation are transmitted to the GPUs.

If **use_gpu_cuda**=0, no computation is done on GPUs, even if present.

Note that, when running ABINIT on GPUs, it is recommended to use the MAGMA external library
(i.e. Lapack on GPUs). The latter is activated at the compilation stage (see the "configure"
step of the ABINIT compilation process). If MAGMA is not used, ABINIT performance on GPUs
can be poor.
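The default value described above depends on two conditions only, which the following Python sketch makes explicit. The function name `default_use_gpu_cuda` and the boolean argument `compiled_with_cuda` are assumptions of this example and do not correspond to ABINIT identifiers.

```python
def default_use_gpu_cuda(optdriver, compiled_with_cuda):
    """Default of use_gpu_cuda: 1 only for ground-state runs of a CUDA build."""
    if compiled_with_cuda and optdriver == 0:
        return 1
    return 0


if __name__ == "__main__":
    # Screening calculation (optdriver=3) with a CUDA-enabled executable.
    print(default_use_gpu_cuda(optdriver=3, compiled_with_cuda=True))
```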

