NWChem

NWChem
{{{image}}} {{{imagecaption}}}
Developer(s)	Pacific Northwestern National Laboratory
Stable Release	6.1/Latest Release Date here
Operating Systems	Linux, Unix, etc.
Type	Computational Chemistry
License	Educational Community License (ECL) 2.0
Website	http://www.nwchem-sw.org

NWChem provides many methods for computing the properties of molecular and periodic systems using standard quantum mechanical descriptions of the electronic wavefunction or density. Its classical molecular dynamics capabilities provide for the simulation of macromolecules and solutions, including the computation of free energies using a variety of force fields. These approaches may be combined to perform mixed quantum-mechanics and molecular-mechanics simulations.

Applications

There are many modules of NWChem. In the DynAX XStack program, PNNL will extract several of these modules and produce simple, readable, serial kernels. The DynAX team will produce parallel and optimized versions of these kernels and report on those optimizations.

The first of these kernels is the Self Consistent Field (SCF).

SCF Kernel

Click here to view a Powerpoint Presentation on the SCF kernel

Description

Download

Code will be available here shortly. It is licensed under the BSD license.

Optimizations

Additional Work Skipping

The reference implementation has two tests to skip unnecessary work, one at the i,j level and one at the i,j,k,l level (innermost loop). An additional test can be added to the i,j,k level which will allow skipping entire iterations of the inner most loop.

Reference twoel function:

 double twoel(double schwmax) {
   int i, j, k, l;
   
   for (i = 0; i < nbfn; i++) {
   for (j = 0; j < nbfn; j++) {
       if ((g_schwarz[i][j] * schwmax        ) < tol2e) {icut1 += nbfn * nbfn; continue;}
       double  KLTest = tol2e / g_schwarz[i][j];
   
   for (k = 0; k < nbfn; k++) {
   for (l = 0; l < nbfn; l++) {
   
       if (g_schwarz[k][l] < KLTest) {icut2 ++; continue;}
   
       icut3 ++;
       double gg = g(i, j, k, l);
       g_fock[i][j] += (       gg * g_dens[k][l]);
       g_fock[i][k] -= (0.50 * gg * g_dens[j][l]);
   
     } } } }
   
   return (0.50 * contract_matrices(g_fock, g_dens));
 }

The inner-most test, (g_schwarz[k][l] < KLTest), can be modified to work in the k loop under certain conditions. If the maximum value of all entries for a given row of g_schwarz is known, that can be compared to KLTest in the k loop and we can skip the l loop entirely in cases where it would not perform any updates.

This requires one additional line to be inserted between the k and l loops:

   for (k = 0; k < nbfn; k++) {
     if (g_schwarz_max_row_value[k] < KLTest) {icut4 ++; continue;}
   for (l = 0; l < nbfn; l++) {

To enable this additional test, the array g_schwarz_max_row_value should be populated with the maximum value for each row in makesz while it is initializing the g_schwarz matrix.

The additional memory requirements for this work skipping test are minimal as the array only has one dimension of size nbfn.

Symmetry of g()

The g function has several symmetries which can be exploited to reduce the number of times it must be run.

g(i, j, k, l) returns the same value in the following cases:

The i and j parameters are swapped.
The k and l parameters are swapped.
The i,j and k,l pairs are swapped.

Combining all possibly symmetries leads to an overall 8-way symmetry where the following calls all return the same value:

g(i, j, k, l)
g(i, j, l, k)
g(j, i, k, l)
g(j, i, l, k)
g(k, l, i, j)
g(k, l, j, i)
g(l, k, i, j)
g(l, k, j, i)

By calculating gg = g(i, j, k, l) once, we can then simply perform multiple updates to the fock matrix.

update_fock(gg, i, j, k, l)
update_fock(gg, i, j, l, k)
update_fock(gg, j, i, k, l)
update_fock(gg, j, i, l, k)
update_fock(gg, k, l, i, j)
update_fock(gg, k, l, j, i)
update_fock(gg, l, k, i, j)
update_fock(gg, l, k, j, i)

With update() being a macro or inlined function similar to:

update(gg, a, b, c, d) {
  g_fock[a][b] += (gg * g_dens[c][d]);
  g_fock[a][c] -= (0.50 * gg * g_dens[b][d]);
}

When taking advantage of the symmetry, there are special cases which need to be considered. Most obvious is that the iteration range of the loops must be modified so that g() is only called once for each equivalent ordering of its arguments then all possible updates to the fock matrix are made with the result. Secondly, when arguments which can be swapped are equal, there are fewer updates to be made to the fock matrix.

As an example, consider the case where i, j, k, and l are all equal to 0. The reference implementation can only process this once since the loop indices can only all be zero once. If all symmetric updates are performed in this case, the result would be that the exact same update is performed eight times since the arguments to update(0, 0, 0, 0) are unchanged by the swap operations.

To ensure proper operation, there are six separate variants of the twoel loops, one for each possible symmetry restriction:

No restrictions
i = j
k = l
i,j pair is identical to k,l pair
i = j and k = l
i, j, k, and l are all equal

With the special cases taken in account, the loop structure for the fully symmetric case is:

for (i = 0; i < nbfn; i++) {
  for (j = i + 1; j < nbfn; j++) {
    for (k = i; k < nbfn; k++) {
      
      if (k == i) l_start = 1 + j;
      else l_start = 1 + k;
      
      for (l = l_start; l < nbfn; l++) {

        gg = g(i,j,k,l);

        //perform all eight updates
        update_fock(gg, i, j, k, l);
        update_fock(gg, i, j, l, k);
        update_fock(gg, j, i, k, l);
        update_fock(gg, j, i, l, k);
        update_fock(gg, k, l, i, j);
        update_fock(gg, k, l, j, i);
        update_fock(gg, l, k, i, j);
        update_fock(gg, l, k, j, i);
      }
    }
  }
}

The loop structure for the other five cases can be trivially generated by removing the loops which are no longer independent. The variables should then simply be replaced with the independent index. Finally any duplicate identical updates should be removed.

As an example, the i = j case is:

for (i = 0; i < nbfn; i++) {
  // removed j loop
  for (k = i; k < nbfn; k++) {
    for (l = 1 + k; l < nbfn; l++) {
      
      gg = g(i,i,k,l); // Note that first two arguments are both i
      
      //perform four updates
      update_fock(gg, i, i, k, l);
      update_fock(gg, i, i, l, k);
      // duplicate: update_fock(gg, i, i, k, l);
      // duplicate: update_fock(gg, i, i, l, k);
      update_fock(gg, k, l, i, i);
      // duplicate: update_fock(gg, k, l, i, i);
      update_fock(gg, l, k, i, i);
      // duplicate: update_fock(gg, l, k, i, i);
      }
    }
  }
}

Note that the special calculation of the starting index for the l loop is only required in the fully symmetric case. In the i = j case, the initial value is:

if (k == i) l_start = 1 + i;
else l_start = 1 + k;

However, we can substitute i with k when k == i:

if (k == i) l_start = 1 + k;
else l_start = 1 + k;

Which removes the need for the conditional assignment since the initial value is 1 + k in both cases.

For all other symmetry restrictions, the l loop is removed entirely.

NWChem

From Modelado Foundation

Revision as of 17:16, February 5, 2013 by imported>Rishi
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Contents

Applications

SCF Kernel

Description

Download

Optimizations

Additional Work Skipping

Symmetry of g()

NWChem

From Modelado Foundation

Revision as of 17:16, February 5, 2013 by imported>Rishi(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Applications

SCF Kernel

Description

Download

Optimizations

Additional Work Skipping

Symmetry of g()

Revision as of 17:16, February 5, 2013 by imported>Rishi
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)