Optimizing the parallelization

From VASP Wiki
Revision as of 13:52, 11 April 2022 by Huebsch

The best parallelization setup of a VASP calculation needs to be tested for each system, algorithm and computer architecture. Below, we offer general advice on how to optimize the parallelization.

Optimizing the parallelization

Try to get as close as possible to the actual system. This includes both the physical system (atoms, cell size, cutoff, ...) as well as the computational hardware (CPUs, interconnect, number of nodes, ...). If too many parameters are different, the parallel configuration may not be transferable to the production calculation.

A few steps of any repetitive task give a good estimate of the performance of the full calculation. For example, run only a few electronic or ionic self-consistency steps (without reaching full convergence) and compare the various parallelization setups.
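For instance, the run can be truncated with the NELM and NSW tags; a minimal INCAR fragment for such a benchmark (the specific step counts here are arbitrary choices, not recommendations) might be:

```
NELM = 5   ! stop after at most 5 electronic steps
NSW  = 3   ! run only 3 ionic steps
```

Comparing the wall time per step between setups is then usually sufficient to pick the best parallelization.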

Often, VASP yields the best performance by combining multiple parallelization options, because the parallel efficiency of each level drops near its limit:

- Band parallelization (the default): the limit is NBANDS divided by a small integer. Note that VASP will increase NBANDS to match the number of ranks.
- FFT parallelization: choose NCORE as a factor of the number of cores per node to avoid inter-node communication in the FFTs. Recall that the OpenMP and OpenACC versions enforce that NCORE is not set.
- k-point parallelization: efficient but requires additional memory. Given sufficient memory, increase KPAR up to the number of irreducible k points; ideally, KPAR divides the number of k points evenly.
- Separate calculations: use the IMAGES tag to split a VASP run into several independent calculations. Here, the limit is dictated by the number of desired calculations.
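As an illustration, a hypothetical INCAR fragment combining k-point and FFT parallelization for a run on 128 MPI ranks (2 nodes with 64 cores each) and 8 irreducible k points could read as follows (the node and k-point counts are assumptions for this sketch, not taken from the article):

```
KPAR  = 8    ! 8 k-point groups, 128/8 = 16 ranks per group
NCORE = 16   ! 16 cores per band group; a factor of the 64 cores per node
```

The optimal values depend on the system and hardware and should be confirmed by short benchmark runs as described above.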

Related tags and articles

Parallelization, KPAR, NCORE, IMAGES
