(12 Nov 92)

Section 4 - Further Information

This section of the manual contains both references and hints on how to do things. The following is a list of the topics covered:

Computational References

    
    HONDO -
    "The General Atomic and Molecular Electronic Structure
       System: HONDO 7.0"  M.Dupuis, J.D.Watts, H.O.Villar,
       G.J.B.Hurst  Comput.Phys.Comm. 52, 415-425(1989)
    "HONDO: A General Atomic and Molecular Electronic
       Structure System"  M.Dupuis, P.Mougenot, J.D.Watts,
       G.J.B.Hurst, H.O.Villar in "MOTECC: Modern Techniques
       in Computational Chemistry"  E.Clementi, Ed.
       ESCOM, Leiden, the Netherlands, 1989, pp 307-361.
    The 2nd of these papers describes many of the algorithms
    in detail, and much of this paper applies also to GAMESS.
    
    sp integrals -
    J.A.Pople, W.J.Hehre  J.Comput.Phys. 27, 161-168(1978)
    
    sp gradient integrals -
    A.Komornicki, K.Ishida, K.Morokuma, R.Ditchfield,
    M.Conrad  Chem.Phys.Lett. 45, 595-602(1977)
    
    spdfg integrals -
    "Numerical Integration Using Rys Polynomials"
        H.F.King and M.Dupuis   J.Comput.Phys. 21,144(1976)
    "Evaluation of Molecular Integrals over Gaussian
                                         Basis Functions"
       M.Dupuis,J.Rys,H.F.King  J.Chem.Phys. 65,111-116(1976)
    "Molecular Symmetry and Closed Shell HF Calculations"
     M.Dupuis and H.F.King   Int.J.Quantum Chem. 11,613(1977)
    "Computation of Electron Repulsion Integrals using
               the Rys Quadrature Method"
        J.Rys,M.Dupuis,H.F.King J.Comput.Chem. 4,154-157(1983)
    
    spd gradient integrals -
    "Molecular Symmetry. II. Gradient of Electronic Energy
     with respect to Nuclear Coordinates"
        M.Dupuis and H.F.King  J.Chem.Phys. 68,3998(1978)
    
    spd hessian integrals -
    "Molecular Symmetry. III. Second derivatives of Electronic
     Energy with respect to Nuclear Coordinates"
        T.Takada, M.Dupuis, H.F.King
        J.Chem.Phys. 75, 332-336 (1981)
    
    RHF/ROHF/TCSCF coupled perturbed Hartree Fock -
    "Single Configuration SCF Second Derivatives on a Cray"
        H.F.King, A.Komornicki in "Geometrical Derivatives of
        Energy Surfaces and Molecular Properties" P.Jorgensen
        J.Simons, Ed. D.Reidel, Dordrecht, 1986, pp 207-214.
    Y.Osamura, Y.Yamaguchi, D.J.Fox, M.A.Vincent, H.F.Schaefer
        J.Mol.Struct. 103, 183-186 (1983)
    M.Duran, Y.Yamaguchi, H.F.Schaefer
        J.Phys.Chem. 92, 3070-3075 (1988)
    
    RHF -
    C.C.J. Roothaan      Rev.Mod.Phys. 23, 69(1951)
    
    UHF -
    J.A. Pople, R.K. Nesbet  J.Chem.Phys. 22, 571 (1954)
    
    GVB and OCBSE-ROHF -
    F.W.Bobrowicz and W.A.Goddard, in Modern Theoretical
    Chemistry, Vol 3, H.F.Schaefer III, Ed., Chapter 4.
    
    ROHF -
    R.McWeeny, G.Diercksen J.Chem.Phys. 49,4852-4856(1968)
    M.F.Guest, V.R.Saunders, Mol.Phys. 28, 819-828(1974)
    J.S.Binkley, J.A.Pople, P.A.Dobosh
       Mol.Phys.  28, 1423-1429 (1974)
    E.R.Davidson  Chem.Phys.Lett.  21,565(1973)
    K.Faegri, R.Manne  Mol.Phys.  31,1037-1049(1976)
    H.Hsu, E.R.Davidson, and R.M.Pitzer
       J.Chem.Phys. 65,609(1976)
    
    Direct SCF -
    J.Almlof, K.Faegri, K.Korsell
       J.Comput.Chem. 3, 385-399 (1982)
    M.Haser, R.Ahlrichs
       J.Comput.Chem. 10, 104-111 (1989)
    
    GUGA CI -
    B.Brooks and H.F.Schaefer  J.Chem. Phys. 70,5092(1979)
    B.Brooks, W.Laidig, P.Saxe, N.Handy, and H.F.Schaefer,
       Physica Scripta 21,312(1980).
    
    2nd order Moller-Plesset -
    P.Carsky, B.A.Hess, Jr., L.J.Schaad
    J.Comput.Chem. 5, 280-287(1984)
    
    MOPAC 6 -
    J.J.P.Stewart  J.Computer-Aided Molecular Design
    4, 1-105 (1990)
    References for parameters for individual atoms may be
    found on the printout from your runs.
    
    DIIS (Direct Inversion in the Iterative Subspace) -
    P.Pulay  J.Comput.Chem. 3, 556-560(1982)
    
    Integral transformation -
    S.T.Elbert, "An improved fully symmetric four-index
    transformation", unpublished preprint, circa 1978.
    S.T.Elbert, in "Numerical Algorithms in Chemistry:
    Algebraic Methods", Proceedings of a Workshop of the
    National Resource for Computations in Chemistry,
    Lawrence Berkeley Laboratory, 1978.
    
    Davidson eigenvector method -
    E.R.Davidson  J.Comput.Phys. 17,87(1975) and "Matrix
    Eigenvector Methods" p. 95 in "Methods in Computational
    Molecular Physics" ed. by G.H.F.Diercksen and S.Wilson
    
    EVVRSP in memory diagonalization -
    S.T.Elbert  Theoret.Chim.Acta  71,169-186(1987)
    
    Energy localization-
    C.Edmiston, K.Ruedenberg  Rev.Mod.Phys.  35, 457-465(1963)
    
    Boys localization-
    J.M.Foster, S.F.Boys  Rev.Mod.Phys.  32, 300-302(1960)
    
    Population localization -
    J.Pipek, P.Z.Mezey  J.Chem.Phys.  90, 4916(1989)
    
    Geometry optimization and saddle point location -
    J.Baker  J.Comput.Chem. 7, 385-395(1986)
    H.B.Schlegel, J.Comput.Chem. 3, 214-218(1982).
    
    Vibrational Analysis in Cartesian coordinates -
    W.D.Gwinn  J.Chem.Phys.  55,477-481(1971)
    
    Normal coordinate analysis -
    J.A.Boatz and M.S.Gordon,
       J.Phys.Chem. 93, 1819-1826(1989).
    
    Modified Virtual Orbitals (MVOs) -
    C.W.Bauschlicher, Jr.  J.Chem.Phys.  72,880-885(1980)
    
    Effective core potentials (ECP) -
    L.R.Kahn, P.Baybutt, D.G.Truhlar
       J.Chem.Phys. 65, 3826-3853 (1976)
    See also the papers listed for SBK and HW basis sets.

Basis Set References and Descriptions

    
    
    
    An excellent review of the relationship between
    the atomic basis used, and the accuracy with which
    various molecular properties will be computed is:
    E.R.Davidson, D.Feller  Chem.Rev. 86, 681-696(1986).
    
    STO-NG      H-Ne        Ref. 1 and 2
                Na-Ar,      Ref. 2 and 3 **
                K,Ca,Ga-Kr  Ref. 4
                Rb,Sr,In-Xe Ref. 5
                Sc-Zn,Y-Cd  Ref. 6
    
    1) W.J.Hehre, R.F.Stewart, J.A.Pople
       J.Chem.Phys. 51, 2657-2664(1969).
    2) W.J.Hehre, R.Ditchfield, R.F.Stewart, J.A.Pople
       J.Chem.Phys. 52, 2769-2773(1970).
    3) M.S.Gordon, M.D.Bjorke, F.J.Marsh, M.S.Korth
       J.Am.Chem.Soc. 100, 2670-2678(1978).
       ** the valence scale factors for Na-Cl are taken
          from this paper, rather than the "official"
          Pople values in Ref. 2.
    4) W.J.Pietro, B.A.Levi, W.J.Hehre, R.F.Stewart,
       Inorg.Chem. 19, 2225-2229(1980).
    5) W.J.Pietro, E.S.Blurock, R.F.Hout,Jr., W.J.Hehre, D.J.
       DeFrees, R.F.Stewart  Inorg.Chem. 20, 3650-3654(1980).
    6) W.J.Pietro, W.J.Hehre J.Comput.Chem. 4, 241-251(1983).
    
    MINI/MIDI    H-Xe       Ref. 9
    
    9) "Gaussian Basis Sets for Molecular Calculations"
       S.Huzinaga, J.Andzelm, M.Klobukowski, E.Radzio-Andzelm,
       Y.Sakai, H.Tatewaki   Elsevier, Amsterdam, 1984.
    
        The MINI bases are three gaussian expansions of each
    atomic orbital.  The exponents and contraction
    coefficients are optimized for each element, and s and p
    exponents are not constrained to be equal.  As a result
    these bases give much lower energies than does STO-3G.
    The valence MINI orbitals of main group elements are
    scaled by factors optimized by John Deisz at North Dakota
    State University.  Transition metal MINI bases are not
    scaled.  The MIDI bases are derived from the MINI sets by
    floating the outermost primitive in each valence orbital,
    and renormalizing the remaining 2 gaussians.  MIDI bases
    are not scaled by GAMESS.  The transition metal bases are
    taken from the lowest SCF terms in the s**1,d**n
    configurations.
    
    
    3-21G       H-Ne           Ref. 10     (also 6-21G)
                Na-Ar          Ref. 11     (also 6-21G)
    K,Ca,Ga-Kr,Rb,Sr,In-Xe     Ref. 12
                Sc-Zn          Ref. 13
                Y-Cd           Ref. 14
    
    10) J.S.Binkley, J.A.Pople, W.J.Hehre
        J.Am.Chem.Soc. 102, 939-947(1980).
    11) M.S.Gordon, J.S.Binkley, J.A.Pople, W.J.Pietro,
        W.J.Hehre  J.Am.Chem.Soc. 104, 2797-2803(1982).
    12) K.D.Dobbs, W.J.Hehre  J.Comput.Chem. 7,359-378(1986)
    13) K.D.Dobbs, W.J.Hehre  J.Comput.Chem. 8,861-879(1987)
    14) K.D.Dobbs, W.J.Hehre  J.Comput.Chem. 8,880-893(1987)
    
    N-31G   references for  4-31G         5-31G        6-31G
                H            15            15           15
                He           23            23           23
                Li           19,24                      19
                Be           20,24                      20
                B            17                         19
                C-F          15            16           16
                Ne           23                         23
                Na-Ga                                   22
                Si                                      21 **
                P-Cl         18                         22
                Ar                                      22
    
    
    15) R.Ditchfield, W.J.Hehre, J.A.Pople
        J.Chem.Phys. 54, 724-728(1971).
    16) W.J.Hehre, R.Ditchfield, J.A.Pople
        J.Chem.Phys. 56, 2257-2261(1972).
    17) W.J.Hehre, J.A.Pople J.Chem.Phys. 56, 4233-4234(1972).
    18) W.J.Hehre, W.A.Lathan J.Chem.Phys. 56,5255-5257(1972).
    19) J.D.Dill, J.A.Pople J.Chem.Phys. 62, 2921-2923(1975).
    20) J.S.Binkley, J.A.Pople J.Chem.Phys. 66, 879-880(1977).
    21) M.S.Gordon  Chem.Phys.Lett. 76, 163-168(1980)
        ** - Note that the built in 6-31G basis for Si is
             not that given by Pople in reference 22.
             The Gordon basis gives a better wavefunction
             for a ROHF calculation in full atomic (Kh)
             symmetry:
             6-31G      Energy       virial
             Gordon   -288.828573   1.999978
             Pople    -288.828405   2.000280
             See the input examples for how to run in Kh.
    22) M.M.Francl, W.J.Pietro, W.J.Hehre, J.S.Binkley,
        M.S.Gordon, D.J.DeFrees, J.A.Pople
        J.Chem.Phys. 77, 3654-3665(1982).
    23) Unpublished, copied out of GAUSSIAN82.
    24) For Li and Be, 4-31G is actually a 5-21G expansion.
    
    
    
    Extended basis sets
    
        6-311G
    28) R.Krishnan, J.S.Binkley, R.Seeger, J.A.Pople
        J.Chem.Phys. 72, 650-654(1980).
    
        valence double zeta "DZV" sets:
    
        "DH" basis - DZV for H, Li-Ne, Al-Ar
    30) T.H.Dunning, Jr., P.J.Hay  Chapter 1 in "Methods of
        Electronic Structure Theory", H.F.Schaefer III, Ed.
        Plenum Press, N.Y. 1977, pp 1-27.
        Note that GAMESS uses inner/outer scale factors of
        1.2 and 1.15 for DH's hydrogen (since at least 1983).
        To get Thom's usual basis, scaled 1.2 throughout:
            HYDROGEN   1.0   x, y, z
               DH  0  1.2   1.2
        "BC" basis - DZV for Ga-Kr
    31) R.C.Binning, Jr., L.A.Curtiss
        J.Comput.Chem. 11, 1206-1216(1990)
    
        valence triple zeta "TZV" sets:
    
        TZV for atoms Li - Ne
    40) T.H. Dunning, J.Chem.Phys. 55 (1971) 716-723.
        "MC" basis - TZV for Mg-Ar
    41) A.D.McLean, G.S.Chandler
        J.Chem.Phys. 72,5639-5648(1980).
        TZV for Ca
    42) A.J.H. Wachters, J.Chem.Phys. 52 (1970) 1033-1036.
        (see Table VI, Contraction 3).
        TZV for Sc-Zn (taken from HONDO 7)
    This is Wachters' (14s9p5d) basis (ref 42) contracted
    to (10s8p3d) with the following modifications
           1. the most diffuse s removed;
           2. additional s spanning 3s-4s region;
           3. two additional p functions to describe the 4p;
           4. (6d) contracted to (411) from ref 43,
           except for Zn where Wachters' (5d)/[41]
              and Hay's diffuse d are used.
    43) A.K. Rappe, T.A. Smedley, and W.A. Goddard III,
        J.Phys.Chem. 85 (1981) 2607-2611
    
    Valence only basis sets (for use with corresponding ECPs)
    
    SBK   CEP-31G splits (available Li-Rn, except La-Yb)
    50) W.J.Stevens, H.Basch, M.Krauss
            J.Chem.Phys. 81, 6026-6033 (1984)
    51) W.J.Stevens, H.Basch, M.Krauss, P.Jasien
            Can.J.Chem. 70, 612-xxx (1992)
    
    HW    -21 splits (sp exponents not shared)
        transition metals (not built in at present, although
        they will work if you type them in).
    53) P.J.Hay, W.R.Wadt  J.Chem.Phys.  82, 270-283 (1985)
        main group (available Na-Xe)
    54) W.R.Wadt, P.J.Hay  J.Chem.Phys.  82, 284-298 (1985)
        see also
    55) P.J.Hay, W.R.Wadt  J.Chem.Phys.  82, 299-310 (1985)
    
    Polarization exponents
    
        STO-NG*
    60) J.B.Collins, P. von R. Schleyer, J.S.Binkley,
        J.A.Pople  J.Chem.Phys. 64, 5142-5151(1976).
    
        3-21G*.   See also reference 12.
    61) W.J.Pietro, M.M.Francl, W.J.Hehre, D.J.DeFrees,  J.A.
        Pople, J.S.Binkley J.Am.Chem.Soc. 104,5039-5048(1982)
    
        6-31G* and 6-31G**.   See also reference 22 above.
    62) P.C.Hariharan, J.A.Pople
        Theoret.Chim.Acta 28, 213-222(1973)
    
    STO-NG*  means d orbitals are used on third row atoms only.
             The original paper (ref 60) suggested z=0.09 for
             Na and Mg, and z=0.39 for Al-Cl.
             At NDSU we prefer to use the same exponents as
             in 3-21G* and 6-31G*, so we know we're looking
             at changes in the sp basis, not the d exponent.
    
    3-21G*   means d orbitals on main group elements in the
             third and higher periods.  Not defined for the
             transition metals, where there are p's already
             in the basis.  Except for alkalis and alkaline
             earths, the 4th and 5th row zetas are from
             Huzinaga, et al (ref 9).  The exponents are
             normally the same as for 6-31G*.
    
    6-31G*   means d orbitals on second and third row atoms.
             We use Mark Gordon's z=0.395 for silicon, as well
             as his fully optimized sp basis (ref 21).
             This is often written 6-31G(d) today.
    
    6-31G**  means the same as 6-31G*, except that p functions
             are added on hydrogens.
             This is often written 6-31G(d,p) today.
    
    6-311G** means p orbitals on H, and d orbitals elsewhere.
             The exponents were derived from correlated atomic
             states, and so are considerably tighter than the
             polarizing functions used in 6-31G**, etc.
             This is often written 6-311G(d,p) today.
    
        The definitions for 6-31G* for C-F are disturbing in
    that they use the same d exponent (0.8) for all of these
    atoms.  Dunning and Hay
    (ref 30) have recommended a better set of exponents for
    second row atoms and a slightly different value for H.
    
        2p, 3p, 2d, 3d polarization sets are usually thought
    of as arising from applying splitting factors to the
    1p and 1d values.  For example, SPLIT2=2.0, 0.5 means
    to double and halve the single value.  The default
    values for SPLIT2 and SPLIT3 are taken from reference
    72, and were derived with correlation in mind.  The
    SPLIT2 values often produce a higher (!) HF energy than
    the singly polarized run, because the exponents are
    split too widely.  SPLIT2=0.4,1.4 will always lower the
    SCF energy (the values are the unpublished personal
    preference of MWS), and for SPLIT3 we might suggest
    3.0,1.0,1/3.
    
        With all this as background, we are ready to present
    the table of polarization exponents built into GAMESS.
    
       Built in polarization exponents, chosen by POLAR=
    in the $BASIS group.  The values are for d functions
    unless otherwise indicated.
    
        Please note that the names associated with each
    column are only generally descriptive.  For example, the
    column marked "Pople" contains a value for Si with which
    John Pople would not agree, and the K-Xe values in
    this column actually come from the Huzinaga
    group.  The exponents for Ga-Kr are from the Binning and
    Curtiss paper, not Thom Dunning.  And so on.
    
    
           POPLE    POPN311   DUNNING   HUZINAGA    HONDO7
           ------   -------   -------   --------    ------
      H    1.1(p)    0.75(p)   1.0(p)     1.0(p)    1.0(p)
      He   1.1(p)    0.75(p)   1.0(p)     1.0(p)    1.0(p)
    
      Li   0.2       0.200                0.076(p)
      Be   0.4       0.255                0.164(p)  0.32
      B    0.6       0.401     0.70       0.388     0.50
      C    0.8       0.626     0.75       0.600     0.72
      N    0.8       0.913     0.80       0.864     0.98
      O    0.8       1.292     0.85       1.154     1.28
      F    0.8       1.750     0.90       1.496     1.62
      Ne   0.8       2.304     1.00       1.888     2.00
    
      Na   0.175                          0.061(p)  0.157
      Mg   0.175                          0.101(p)  0.234
      Al   0.325                          0.198     0.311
      Si   0.395                          0.262     0.388
      P    0.55                           0.340     0.465
      S    0.65                           0.421     0.542
      Cl   0.75                           0.514     0.619
      Ar   0.85                           0.617     0.696
    
      K    0.1                            0.039(p)
      Ca   0.1                            0.059(p)
      Ga   0.207               0.141
      Ge   0.246               0.202
      As   0.293               0.273
      Se   0.338               0.315
      Br   0.389               0.338
      Kr   0.443               0.318
    
      Rb   0.11                           0.034(p)
      Sr   0.11                           0.048(p)
    
    
    A blank means the value equals the "Pople" column.
    
    Common d polarization for all sets:
    
    
        In     Sn     Sb     Te      I     Xe
      0.160  0.183  0.211  0.237  0.266  0.297
        Tl     Pb     Bi     Po     At     Rn
      0.146  0.164  0.185  0.204  0.225  0.247
    
    
    Anion diffuse functions
    
        3-21+G, 3-21++G, etc.
    70) T.Clark, J.Chandrasekhar, G.W.Spitznagel, P. von R.
        Schleyer J.Comput.Chem. 4, 294-301(1983)
    71) G.W.Spitznagel, Diplomarbeit, Erlangen, 1982.
    72) M.J.Frisch, J.A.Pople, J.S.Binkley J.Chem.Phys.
        80, 3265-3269 (1984).
    
        Anions usually require diffuse basis functions to
    properly represent their spatial diffuseness.  The use of
    diffuse sp shells on atoms in the second and third rows is
    denoted by a + sign; adding diffuse s functions on
    hydrogen as well is symbolized by ++.  These designations can be
    applied to any of the Pople bases, e.g.  3-21+G, 3-21+G*,
    6-31++G**.  The following exponents are for L shells,
    except for H.  For H-F, they are taken from ref 70.  For
    Na-Cl, they are taken directly from reference 71.  These
    values may be found in footnote 13 of reference 72.
    For Ga-Br, In-I, and Tl-At these were optimized for the
    atomic ground state anion, using ROHF with a flexible ECP
    basis set, by Ted Packwood at NDSU.
    
    
        H
     0.0360
       Li      Be       B       C       N       O       F
     0.0074  0.0207  0.0315  0.0438  0.0639  0.0845  0.1076
       Na      Mg      Al      Si       P       S      Cl
     0.0076  0.0146  0.0318  0.0331  0.0348  0.0405  0.0483
                       Ga      Ge      As      Se      Br
                     0.0205  0.0222  0.0287  0.0318  0.0376
                       In      Sn      Sb      Te       I
                     0.0223  0.0231  0.0259  0.0306  0.0368
                       Tl      Pb      Bi      Po      At
                     0.0170  0.0171  0.0215  0.0230  0.0294
    
    
    Additional information about diffuse functions and also
    Rydberg type exponents can be found in reference 30.
    
        The following atomic energies are from UHF
    calculations (RHF on 1-S states), with p orbitals not
    symmetry equivalenced, and using the default molecular
    scale factors.  They should be useful in picking a basis
    of the desired energy accuracy, and estimating the correct
    molecular total energies.
    
    
    Atom state   STO-2G        STO-3G       3-21G       6-31G
    H   2-S     -.454397     -.466582     -.496199    -.498233
    He  1-S    -2.702157    -2.807784    -2.835680   -2.855160
    Li  2-S    -7.070809    -7.315526    -7.381513   -7.431236
    Be  1-S   -13.890237   -14.351880   -14.486820  -14.566764
    B   2-P   -23.395284   -24.148989   -24.389762  -24.519492
    C   3-P   -36.060274   -37.198393   -37.481070  -37.677837
    N   4-S   -53.093007   -53.719010   -54.105390  -54.385008
    O   3-P   -71.351689   -73.530122   -74.393657  -74.780310
    F   2-P   -95.015084   -97.986505   -98.845009  -99.360860
    Ne  1-S  -122.360485  -126.132546  -127.803825 -128.473877
    Na  2-S  -155.170019  -159.797148  -160.854065 -161.841425
    Mg  1-S  -191.507082  -197.185978  -198.468103 -199.595219
    Al  2-P  -233.199965  -239.026471  -240.551046 -241.854186
    Si  3-P  -277.506857  -285.563052  -287.344431 -288.828598
    P   4-S  -327.564244  -336.944863  -339.000079 -340.689008
    S   3-P  -382.375012  -393.178951  -395.551336 -397.471414
    Cl  2-P  -442.206260  -454.546015  -457.276552 -459.442939
    Ar  1-S  -507.249273  -521.222881  -524.342962 -526.772151
    
                                                     SCF   *
    Atom state     DH       6-311G        MC         limit
    H   2-S    -.498189     -.499810      --        -0.5
    He  1-S      --        -2.859895      --        -2.861680
    Li  2-S   -7.431736    -7.432026      --        -7.432727
    Be  1-S  -14.570907   -14.571874      --       -14.573023
    B   2-P  -24.526601   -24.527020      --       -24.529061
    C   3-P  -37.685571   -37.686024      --       -37.688619
    N   4-S  -54.397260   -54.397980      --       -54.400935
    O   3-P  -74.802707   -74.802496      --       -74.809400
    F   2-P  -99.395013   -99.394158      --       -99.409353
    Ne  1-S -128.522354  -128.522553      --      -128.547104
    Na  2-S      --           --     -161.845587  -161.858917
    Mg  1-S      --           --     -199.606558  -199.614636
    Al  2-P -241.855079       --     -241.870014  -241.876699
    Si  3-P -288.829617       --     -288.847782  -288.854380
    P   4-S -340.689043       --     -340.711346  -340.718798
    S   3-P -397.468667       --     -397.498023  -397.504910
    Cl  2-P -459.435938       --     -459.473412  -459.482088
    Ar  1-S      --           --     -526.806626  -526.817528
    
    
    * M.W.Schmidt and K.Ruedenberg, J.Chem.Phys. 71,
      3951-3962(1979).  These are ROHF energies computed in Kh symmetry.
    

    How to do RHF, ROHF, UHF, and GVB calculations

    General Considerations

    These four SCF wavefunctions are all based on Fock operator techniques, even though some GVB runs use more than one determinant. Thus all of these have an intrinsic N**4 time dependence, because they are all driven by integrals in the AO basis. This similarity makes it convenient to discuss them all together. In this section we will use the term HF to refer generically to any of these four wavefunctions, including the multi-determinant GVB-PP functions. $SCF is the main input group for all these HF wavefunctions.

    As will be discussed below, in GAMESS the term ROHF refers to high spin open shell SCF only, with other open shell coupling cases possible using the GVB code.

    Analytic gradients are implemented for every HF type calculation possible in GAMESS, and therefore numerical hessians are available for each.

    Analytic hessian calculation is implemented for RHF, ROHF, and any GVB case with NPAIR=0 or NPAIR=1. Analytic hessians are more accurate and much more quickly computed than numerical hessians, but require additional disk storage to perform an integral transformation, as well as more physical memory.

    The second order Moller-Plesset energy correction (MP2) is implemented only for the closed shell case, RHF. Note that since calculation of the first order wavefunction's density matrix is not implemented, properties during an MP2 run are for the underlying RHF (zeroth order) wavefunction.

    Direct SCF is implemented for every possible HF type calculation involving only the energy or gradient (and thus numerical hessians). The direct method may not be used with analytic hessians, or when computing the MP2 energy correction.

    Direct SCF

    Normally, HF calculations proceed by evaluating a large number of two electron repulsion integrals, and storing these on a disk. This integral file is read back in during each HF iteration to form the appropriate Fock operators. In a direct HF, the integrals are not stored on disk, but are instead reevaluated during each HF iteration. Since the direct approach *always* requires more CPU time, the default for DIRSCF in $SCF is .FALSE.

    Even though direct SCF is slower, there are at least three reasons why you may want to consider using it. The first is that it may not be possible to store all of the integrals on the disk drives attached to your computer. Secondly, the index label packing scheme used by GAMESS restricts the basis set size to no more than 361 if the integrals are stored on disk, whereas for direct HF you can (in principle) use up to 2047 basis functions. Finally, what you are really interested in is reducing the wall clock time to obtain your answer, not the CPU time. Workstations have modest hardware (and sometimes software) I/O capabilities. Other environments such as an IBM mainframe shared by many users may also have very poor CPU/wall clock performance for I/O bound jobs such as conventional HF.

    You can estimate the disk storage requirements for conventional HF using a P or PK file by the following formulae:

    nint = 1/sigma * 1/8 * N**4
    Mbytes = nint * x / 1024**2

    Here N is the total number of basis functions in your run, which you can learn from an EXETYP=CHECK run. The 1/8 accounts for permutational symmetry within the integrals. Sigma accounts for the point group symmetry, and is difficult to estimate accurately. Sigma cannot be smaller than 1, and equals 1 in no symmetry (C1) calculations. For benzene, sigma would be almost six, since you generate 6 C's and 6 H's by entering only 1 of each in $DATA. For water sigma is not much larger than one, since most of the basis set is on the unique oxygen, and the C2v symmetry applies only to the H atoms. The factor x is 12 bytes per integral for RHF, and 20 bytes per integral for ROHF, UHF, and GVB. Finally, since integrals very close to zero need not be stored on disk, the actual power dependence is not as bad as N**4, and in fact in the limit of very large molecules can be as low as N**2. Thus plugging in sigma=1 should give you an upper bound to the actual disk space needed. If the estimate exceeds your available disk storage, your only recourse is direct HF.
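
    As a worked example (the molecule and basis size here are purely hypothetical), consider a C1 symmetry RHF run with N=200 basis functions, so that sigma=1 and x=12:

         nint   = 1/1 * 1/8 * 200**4       = 2.0E+08 integrals
         Mbytes = 2.0E+08 * 12 / 1024**2   = about 2,300 Mbytes

    Since integral screening will discard some of these, treat such a number as an upper bound when deciding between conventional and direct HF.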

    What are the economics of direct HF? Naively, if we assume the run takes 10 iterations to converge, we must spend 10 times more CPU time doing integrals, since they are recomputed on each iteration. However, we do not have to waste any CPU time reading blocks of integrals from disk, or in unpacking their indices. We also do not have to waste any wall clock time waiting for a relatively slow mechanical device such as a disk to give us our data.

    There are some less obvious savings too, as first noted by Almlof. First, since the density matrix is known while we are computing integrals, we can use the Schwarz inequality to avoid doing some of the integrals. In a conventional SCF this inequality is used to avoid doing small integrals. In a direct SCF it can be used to avoid doing integrals whose contribution to the Fock matrix is small (density times integral small). Secondly, we can form the Fock matrix by calculating only its change since the previous iteration. The contributions to the change in the Fock matrix are equal to the change in the density times the integrals. Since the change in the density goes to zero as the run converges, we can use the Schwarz screening to avoid more and more integrals as the calculation progresses. The input option FDIFF in $SCF selects formation of the Fock operator by computing only its change from iteration to iteration. The FDIFF option is not implemented for GVB since there are too many density matrices from the previous iteration to store, but is the default for direct RHF, ROHF, and UHF.

    So, in our hypothetical 10 iteration case, we do not spend as much as 10 times more time in integral evaluation. In addition, the run as a whole will not slow down by the full factor by which the integral time is increased. A direct run spends no additional time summing integrals into the Fock operators, and no additional time in the Fock diagonalizations. So, generally speaking, a RHF run with 10-15 iterations will slow down by a factor of 3-4 when run in direct mode. The energy gradient time is unchanged by direct HF, and this is a large time compared to the HF energy time, so geometry optimizations will be slowed down even less. This is really the converse of Amdahl's law: if you slow down only one portion of a program by a large amount, the entire program slows down by a much smaller factor.

    To make this concrete, here are some times for GAMESS benchmark BENCH12, which is a RHF energy for a moderately large molecule. The timings shown were obtained on a DECstation 3100 under Ultrix 3.1, which was running only these tests, so that the wall clock times are meaningful. This system is typical of Unix workstations in that it uses SCSI disks, and the operating system is not terribly good at disk I/O. By default GAMESS stores the integrals on disk in the form of a P supermatrix, because this will save time later in the SCF cycles. By choosing NOPK=1 in $INTGRL, an ordinary integral file can be used, which typically contains many fewer integrals, but takes more CPU time in the SCF. Because the DECstation is not terribly good at I/O, the wall clock time for the ordinary integral file is actually less than when the supermatrix is used, even though the CPU time is longer. The run takes 13 iterations to converge; the times below are in seconds.

    
                               P supermatrix   ordinary file
       # nonzero integrals      8,244,129       6,125,653
       # blocks skipped            55,841          55,841
       CPU time (ints)              709              636
       CPU time (SCF)              1289             1472
       CPU time (job total)        2123             2233
       wall time (job total)       3468             3200
    
     When the same calculation is run in direct mode (integrals are processed like an ordinary integral disk file when running direct), the results are:
    
          iteration 0:         FDIFF=.TRUE.   FDIFF=.FALSE.
       # nonzero integrals       6,117,416      6,117,416
       # blocks skipped             60,208         60,208
          iteration 13:
       # nonzero integrals       3,709,733      6,122,912
       # blocks skipped            105,278         59,415
       CPU time (job total)         6719            7851
       wall time (job total)        6764            7886
    
    Note that elimination of the disk I/O dramatically increases the CPU/wall efficiency. Here's the bottom line on direct HF:

    best direct CPU / best disk CPU = 6719/2123 = 3.2
    best direct wall/ best disk wall= 6764/3200 = 2.1

    Direct SCF is slower than conventional SCF, but not outrageously so! From the data in the tables, we can see that the best direct method spends about 6719-1472 = 5247 seconds doing integrals. This is an increase of about 5247/636 = 8.2 in the time spent doing integrals, in a run which does 13 iterations (14 times spent evaluating integrals to form the Fock operator). 8.2 is less than 14 because the run avoids all CPU charges related to I/O, and makes efficient use of the Schwarz inequality to avoid doing many of the integrals in its final iterations.

    SCF Convergence Accelerators

    Generally speaking, the simpler the HF function, the better its convergence. In our experience, the majority of RHF, ROHF, and UHF runs will converge readily from GUESS=HUCKEL. GVB runs typically require GUESS=MOREAD, although the Huckel guess occasionally works. RHF convergence is the best, closely followed by ROHF. In the current implementation in GAMESS, ROHF converges somewhat better than the closely related unrestricted high spin UHF. For example, the radical cation of diphosphine (test job BENCH10) converges in 12 iterations for ROHF and 15 iterations for UHF, starting from the neutral's closed shell orbitals. GVB calculations require much more care, and cases with NPAIR greater than one are particularly difficult.

    Unfortunately, not all HF runs converge readily. The best way to improve your convergence is to provide better starting orbitals! In many cases, this means to MOREAD orbitals from some simpler HF case. For example, if you want to do a doublet ROHF, and the HUCKEL guess does not seem to converge, do this: Do an RHF on the +1 cation. RHF is typically more stable than ROHF, UHF, or GVB, and cations are often readily convergent. Then MOREAD the cation's orbitals into the neutral calculation which you wanted to do at first.
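
    A sketch of this two-step procedure for a neutral doublet follows; the xx value and the $VEC contents are placeholders you must supply from your own runs.

      ! step 1:  converge the closed shell +1 cation
       $CONTRL SCFTYP=RHF ICHARG=1 MULT=1 $END
      ! step 2:  read the cation's orbitals into the neutral ROHF
       $CONTRL SCFTYP=ROHF ICHARG=0 MULT=2 $END
       $GUESS  GUESS=MOREAD NORB=xx $END
       $VEC
        ...orbitals punched by step 1...
       $END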

    GUESS=HUCKEL does not always guess the correct electronic configuration. It may be useful to use PRTMO in $GUESS during a CHECK run to examine the starting orbitals, and then reorder them with NORDER if that seems appropriate.
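
    For example, a CHECK run of this kind for a doublet might look like the following sketch (PRTMO simply prints the guess orbitals for inspection):

         $CONTRL EXETYP=CHECK SCFTYP=ROHF MULT=2 $END
         $GUESS  GUESS=HUCKEL PRTMO=.TRUE. $END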

    Of course, by default GAMESS uses the convergence procedures which are usually most effective. Still, there are cases which are difficult, so the $SCF group permits you to select several alternative methods for improving convergence. Briefly, these are

    EXTRAP. This extrapolates the three previous Fock matrices, in an attempt to jump ahead a bit faster. This is the most powerful of the old-fashioned accelerators, and normally should be used at the beginning of any SCF run. When an extrapolation occurs, the counter at the left of the SCF printout is set to zero.

    DAMP. This damps the oscillations between several successive Fock matrices. It may help when the energy is seen to oscillate wildly. Thinking about which orbitals should be occupied initially may be an even better way to avoid oscillatory behaviour.

    SHIFT. This shifts the diagonal elements of the virtual part of the Fock matrix up, in an attempt to uncouple the unoccupied orbitals from the occupied ones. At convergence, this has no effect on the orbitals, just their orbital energies, but will produce different (and hopefully better) orbitals during the iterations.

    RSTRCT. This limits mixing of the occupied orbitals with the empty ones, especially the flipping of the HOMO and LUMO to produce undesired electronic configurations or states. This should be used with caution, as it makes it very easy to converge on an excited state, especially if DIIS is also used. If you use this, be sure to check your final orbital energies to see if they are sensible. A lower energy for an unoccupied orbital than for one of the occupied ones is a sure sign of problems.

    DIIS. Direct Inversion in the Iterative Subspace is a modern method, due to Pulay, using stored error and Fock matrices from a large number of previous iterations to interpolate an improved Fock matrix. This method was developed to improve the convergence at the final stages of the SCF process, but turns out to be quite powerful at forcing convergence in the initial stages of SCF as well. By giving ETHRSH as 10.0 in $SCF, you can practically guarantee that DIIS will be in effect from the first iteration. The default is set up to do a few iterations with conventional methods (extrapolation) before engaging DIIS. This is because DIIS can sometimes converge to solutions of the SCF equations that do not have the lowest possible energy. For example, the 3-A-2 small angle state of SiLi2 (see M.S.Gordon and M.W.Schmidt, Chem.Phys.Lett., 132, 294-8(1986)) will readily converge with DIIS to a solution with a reasonable S**2, and an energy about 25 milliHartree above the correct answer. A SURE SIGN OF TROUBLE WITH DIIS IS WHEN THE ENERGY RISES TO ITS FINAL VALUE. However, if you obtain orbitals at one point on a PES without DIIS, the subsequent use of DIIS with MOREAD will probably not introduce any problems. Because DIIS is quite powerful, EXTRAP, DAMP, and SHIFT are all turned off once DIIS begins to work. DEM and RSTRCT will still be in use, however.
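
    For example, following the ETHRSH suggestion above, a sketch which engages DIIS from essentially the first iteration is

         $SCF  DIIS=.TRUE. ETHRSH=10.0 $END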

    DEM. Direct energy minimization should be your last recourse. It explores the "line" between the current orbitals and those generated by a conventional change in the orbitals, looking for the minimum energy on that line. DEM should always lower the energy on every iteration, but is very time consuming, since each of the points considered on the line search requires evaluation of a Fock operator. DEM will be skipped once the density change falls below DEMCUT, as the other methods should then be able to effect final convergence. While DEM is working, RSTRCT is held to be true, regardless of the input choice for RSTRCT. Because of this, it behooves you to be sure that the initial guess is occupying the desired orbitals. DEM is available only for RHF. The implementation in GAMESS resembles that of R.Seeger and J.A.Pople, J.Chem.Phys. 65, 265-271(1976). Simultaneous use of DEM and DIIS resembles the ADEM-DIOS method of H.Sellers, Chem.Phys.Lett. 180, 461-465(1991).

    High Spin Open Shell SCF (ROHF)

    Open shell SCF calculations are performed in GAMESS by both the ROHF code and the GVB code. Note that when the GVB code is executed with no pairs, the run is NOT a true GVB run, and should be referred to in publications and discussion as a ROHF calculation.

    The ROHF module in GAMESS can handle any number of open shell electrons, provided these have a high spin coupling. Some commonly occurring cases are:

    one open shell, doublet:
         $CONTRL SCFTYP=ROHF MULT=2 $END
    
    two open shells, triplet:
         $CONTRL SCFTYP=ROHF MULT=3 $END
    
    m open shells, high spin:
         $CONTRL SCFTYP=ROHF MULT=m+1 $END
    
    John Montgomery of United Technologies is responsible for the current ROHF implementation in GAMESS. The following discussion is due to him: The Fock matrix in the MO basis has the form
                       closed       open        virtual
            closed      F2      |     Fb     | (Fa+Fb)/2
                     -----------------------------------
            open        Fb      |     F1     |    Fa
                     -----------------------------------
            virtual   (Fa+Fb)/2 |     Fa     |    F0
    
    where Fa and Fb are the usual alpha and beta Fock matrices any UHF program produces. The Fock operators for the doubly, singly, and zero occupied blocks can be written as
            F2 = Acc*Fa + Bcc*Fb
            F1 = Aoo*Fa + Boo*Fb
            F0 = Avv*Fa + Bvv*Fb
    
    Some choices found in the literature for these canonicalization coefficients are
    
                              Acc  Bcc  Aoo  Boo  Avv  Bvv
     Guest and Saunders       1/2  1/2  1/2  1/2  1/2  1/2
     Roothaan single matrix  -1/2  3/2  1/2  1/2  3/2 -1/2
     Davidson                 1/2  1/2   1    0    1    0
     Binkley, Pople, Dobosh   1/2  1/2   1    0    0    1
     McWeeny and Diercksen    1/3  2/3  1/3  1/3  2/3  1/3
     Faegri and Manne         1/2  1/2   1    0   1/2  1/2
    
    The choice of the diagonal blocks is arbitrary, as ROHF is converged when the off diagonal blocks go to zero. The exact choice for these blocks can however have an effect on the convergence rate. This choice also affects the MO coefficients, and orbital energies, as the different choices produce different canonical orbitals within the three subspaces. All methods, however, will give identical total wavefunctions, and hence properties such as gradients and hessians.

    The default coupling case in GAMESS is the Roothaan single matrix set. Note that pre-1988 versions of GAMESS produced "Davidson" orbitals. If you would like to fool around with any of these other canonicalizations, the Acc, Aoo, Avv and Bcc, Boo, Bvv parameters can be input as the first three elements of ALPHA and BETA in $SCF.
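
    For example, a sketch selecting the Guest and Saunders canonicalization for a doublet (the Roothaan defaults are normally fine, so this is only for experimentation) is

         $CONTRL SCFTYP=ROHF MULT=2 $END
         $SCF    ALPHA(1)=0.5,0.5,0.5  BETA(1)=0.5,0.5,0.5 $END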

    Other Open Shell SCF Cases (GVB)

    Genuine GVB-PP runs will be discussed later in this section. First, we will consider how to do open shell SCF with the GVB part of the program.

    It is possible to do other open shell cases with the GVB code, which can handle the following cases:

    one open shell, doublet:
         $CONTRL SCFTYP=GVB MULT=2 $END
         $SCF    NCO=xx NSETO=1 NO(1)=1 $END
    two open shells, triplet:
         $CONTRL SCFTYP=GVB MULT=3 $END
         $SCF    NCO=xx NSETO=2 NO(1)=1,1 $END
    two open shells, singlet:
         $CONTRL SCFTYP=GVB MULT=1 $END
         $SCF    NCO=xx NSETO=2 NO(1)=1,1 $END
    
    Note that the first two cases duplicate runs which the ROHF module can do better. All of these cases are really ROHF, since the default for NPAIR in $SCF is 0.

    If you would like to do any cases other than those shown above, you must derive the coupling coefficients ALPHA and BETA, and input them with the occupancies F in the $SCF group.

    Many open shell states with degenerate open shells (for example, in diatomic molecules) can be treated as well. There is a sample of this in the 'Input Examples' section of this manual.

    Mariusz Klobukowski of the University of Alberta has shown how to obtain coupling coefficients for the GVB open shell program for many such open shell states. These can be derived from the values in Appendix A of the book "A General SCF Theory" by Ramon Carbo and Josep M. Riera, Springer-Verlag (1978). The basic rule is

    
           (1)      F(i) = 1/2 * omega(i)
           (2)  ALPHA(i) =       alpha(i)
           (3)   BETA(i) =      - beta(i),
    
    where omega, alpha, and beta are the names used by Ramon in his Tables.

    The variable NSETO should give the number of open shells, and NO should give the degeneracy of each open shell. Thus the 5-S state of carbon would have NSETO=2, and NO(1)=1,3.

    Some specific examples, for the lowest term in each of the atomic P**N configurations are

      
      !   p**1   2-P state
       $CONTRL SCFTYP=GVB  MULT=2   $END
       $SCF    NCO=xx   NSETO=1  NO=3   COUPLE=.TRUE.
            F(1)=  1.0  0.16666666666667
        ALPHA(1)=  2.0  0.33333333333333  0.00000000000000
         BETA(1)= -1.0 -0.16666666666667 -0.00000000000000  $END
      !   p**2   3-P state
       $CONTRL SCFTYP=GVB  MULT=3   $END
       $SCF    NCO=xx   NSETO=1  NO=3   COUPLE=.TRUE.
            F(1)=  1.0  0.333333333333333
        ALPHA(1)=  2.0  0.66666666666667  0.16666666666667
         BETA(1)= -1.0 -0.33333333333333 -0.16666666666667  $END
      !   p**3   4-S state
       $CONTRL SCFTYP=ROHF  MULT=4   $END
      !   p**4   3-P state
       $CONTRL SCFTYP=GVB  MULT=3   $END
       $SCF    NCO=xx   NSETO=1  NO=3   COUPLE=.TRUE.
            F(1)=  1.0  0.66666666666667
        ALPHA(1)=  2.0  1.33333333333333  0.83333333333333
         BETA(1)= -1.0 -0.66666666666667 -0.50000000000000  $END
      !   p**5   2-P state
       $CONTRL SCFTYP=GVB  MULT=2   $END
       $SCF    NCO=xx   NSETO=1  NO=3   COUPLE=.TRUE.
            F(1)=  1.0  0.83333333333333
        ALPHA(1)=  2.0  1.66666666666667  1.33333333333333
         BETA(1)= -1.0 -0.83333333333333 -0.66666666666667  $END
      
    Coupling constants for d**N configurations are
      
      !     d**1   2-D state
       $CONTRL SCFTYP=GVB MULT=2 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.1
               ALPHA(1)= 2.0, 0.20, 0.00
                BETA(1)=-1.0,-0.10, 0.00  $END
      !     d**2   average of 3-F and 3-P states
       $CONTRL SCFTYP=GVB MULT=3 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.2
               ALPHA(1)= 2.0, 0.40, 0.05
                BETA(1)=-1.0,-0.20,-0.05  $END
      !     d**3   average of 4-F and 4-P states
       $CONTRL SCFTYP=GVB MULT=3 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.3
               ALPHA(1)= 2.0, 0.60, 0.15
                BETA(1)=-1.0,-0.30,-0.15  $END
      !     d**4   5-D state
       $CONTRL SCFTYP=GVB MULT=5 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.4
               ALPHA(1)= 2.0, 0.80, 0.30
                BETA(1)=-1.0,-0.40,-0.30 $END
      !     d**5   6-S state
       $CONTRL SCFTYP=ROHF MULT=6 $END
      !     d**6   5-D state
       $CONTRL SCFTYP=GVB MULT=5 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.6
               ALPHA(1)= 2.0, 1.20, 0.70
                BETA(1)=-1.0,-0.60,-0.50 $END
      !     d**7   average of 4-F and 4-P states
       $CONTRL SCFTYP=GVB MULT=3 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.7
               ALPHA(1)= 2.0, 1.40, 0.95
                BETA(1)=-1.0,-0.70,-0.55  $END
      !     d**8   average of 3-F and 3-P states
       $CONTRL SCFTYP=GVB MULT=3 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.8
               ALPHA(1)= 2.0, 1.60, 1.25
                 BETA(1)=-1.0,-0.80,-0.65  $END
      !     d**9   2-D state
       $CONTRL SCFTYP=GVB MULT=2 $END
       $SCF    NCO=xx NSETO=1 NO=5 COUPLE=.TRUE.  F(1)=1.0,0.9
               ALPHA(1)= 2.0, 1.80, 1.60
                BETA(1)=-1.0,-0.90,-0.80 $END
      
    The source for these values is R.Poirier, R.Kari, and I.G.Csizmadia's book "Handbook of Gaussian Basis Sets", Elsevier, Amsterdam, 1985.

    Note that GAMESS can do a proper calculation on the ground terms for the d**2, d**3, d**7, and d**8 configurations only by means of state averaged MCSCF. For d**8, use

       $CONTRL SCFTYP=MCSCF MULT=3 $END
       $DRT    GROUP=C1 FORS=.TRUE. NMCC=xx NDOC=3 NALP=2 $END
       $GUGDIA NSTATE=10 $END
       $GUGDM2 WSTATE(1)=1,1,1,1,1,1,1,0,0,0 $END
      

    True GVB Perfect Pairing Runs

    True GVB runs are obtained by choosing NPAIR nonzero. If you wish to have some open shell electrons in addition to the geminal pairs, you may add the pairs to the end of any of the GVB coupling cases shown above. The GVB module assumes that you have reordered your MOs into the order: NCO doubly occupied orbitals, NSETO sets of open shell orbitals, and NPAIR sets of geminals (use NORDER=1 in the $GUESS group to accomplish this reordering).

    Each geminal consists of two orbitals and contains two singlet coupled electrons (perfect pairing). The first MO of a geminal is probably heavily occupied (such as a bonding MO u), and the second is probably weakly occupied (such as an antibonding, correlating orbital v). If you have more than one pair, you must be careful that the initial MOs are ordered u1, v1, u2, v2..., which is -NOT- the same order that RHF starting orbitals will be found in. Use NORDER=1 to get the correct order.
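
    As a purely hypothetical illustration (the orbital numbers are invented, and the exact IORDER convention should be checked in the $GUESS documentation), suppose an RHF run is to become a two pair GVB with NCO=10, the bonding orbitals are the RHF MOs 11 and 12, and the chosen correlating orbitals are the virtuals 13 and 15. Placing old orbitals 11,13,12,15 into positions 11-14 gives the required u1,v1,u2,v2 order:

         $GUESS  GUESS=MOREAD NORB=xx NORDER=1
                 IORDER(11)=11,13,12,15 $END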

    These pair wavefunctions are actually a limited form of MCSCF. GVB runs are much faster than MCSCF runs, because the natural orbital u,v form of the wavefunction permits a Fock operator based optimization. However, convergence of the GVB run is by no means assured. The same care in selecting the correlating orbitals that you would apply to an MCSCF run must also be used for GVB runs. In particular, look at the orbital expansions when choosing the starting orbitals, and check them again after the run converges.

    GVB runs will be carried out entirely in orthonormal natural u,v form, with strong orthogonality enforced on the geminals. Orthogonal orbitals will pervade your thinking in both initial orbital selection, and the entire orbital optimization phase (the CICOEF values give the weights of the u,v orbitals in each geminal). However, once the calculation is converged, the program will generate and print the nonorthogonal, generalized valence bond orbitals. These GVB orbitals are an entirely equivalent way of presenting the wavefunction, but are generated only after the fact.

    Convergence of GVB runs is by no means as certain as convergence of RHF, UHF, or ROHF. This is especially true when extended bases are used. You can assist convergence by doing a preliminary RHF or UHF calculation (take either the alpha orbitals, or the natural orbitals of the latter), and use these orbitals for GUESS=MOREAD. Generation of MVOs during the preliminary SCF can also be advantageous.

    The total number of electrons in the GVB wavefunction is given by the following formula:

    NE = 2*NCO + sum over open shells i of 2*F(i)*NO(i) + 2*NPAIR

    The charge is obtained by subtracting the total number of protons given in $DATA. The multiplicity is implicit in the choice of alpha and beta constants. Note that ICHARG and MULT must be given correctly in $CONTRL anyway.
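
    As a check on this formula, consider the p**2 carbon example given earlier: NCO=2 (the 1s and 2s shells), one open shell with NO=3 and F=1/3, and NPAIR=0, so that

         NE = 2*2 + 2*(1/3)*3 + 2*0 = 6

    which is indeed the number of electrons in neutral carbon.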

    The Special Case of TCSCF

    The wavefunction with NSETO=0 and NPAIR=1 is called GVB-PP(1) by Goddard, two configuration SCF (TCSCF) by Schaefer or Davidson, and CAS-SCF with two electrons in two orbitals by others. Note that this is just semantics, as all of these are the identical wavefunction. It is also a very important wavefunction, as TCSCF is the minimum treatment required to describe singlet biradicals. This function can be obtained with SCFTYP=MCSCF, but it is usually much faster to use the Fock based SCFTYP=GVB. Due to its importance, the TCSCF function has been singled out for special attention. The DIIS method is implemented for the case NSETO=0 and NPAIR=1. Analytic hessians can be obtained when NPAIR=1, and in this case NSETO may be nonzero.
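
    A minimal TCSCF (GVB-PP(1)) sketch for a singlet biradical, with xx doubly occupied orbitals below the geminal pair, is simply

         $CONTRL SCFTYP=GVB MULT=1 $END
         $SCF    NCO=xx NPAIR=1 $END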

    A caution about symmetry in GVB

    Caution! Some exotic calculations with the GVB program do not permit the use of symmetry. The symmetry algorithm in GAMESS was "derived assuming that the electronic charge density transforms according to the completely symmetric representation of the point group", Dupuis/King, JCP, 68, 3998(1978). This may not be true for certain open shell cases. Consider the following correct input for the singlet-delta state of NH:

    $CONTRL SCFTYP=GVB NOSYM=1 $END
    $SCF NCO=3 NSETO=2 NO(1)=1,1 $END

    for the x**1y**1 state, or for the x**2-y**2 state,

    $CONTRL SCFTYP=GVB NOSYM=1 $END
    $SCF NCO=3 NPAIR=1 $END

    Neither example will work unless NOSYM=1 is entered, as shown. Note that if you are using coupling constants which have been averaged over all degenerate states which comprise a single term, as is done in EXAM15 and EXAM16, the charge density is symmetric, and these runs can exploit symmetry.

    Since GAMESS cannot at present detect that the GVB electronic state is not totally symmetric (or averaged to at least have a totally symmetric density), it is ---your--- responsibility to recognize that NOSYM=1 must be chosen in the input.

    How to do CI and MCSCF calculations

    The majority of the input to a CI calculation is in the $DRT group, with $INTGRL, $CIINP, $TRNSFM, $CISORT, $GUGEM, $GUGDIA, $GUGDM, $GUGDM2, $LAGRAN groups relevant, but hardly ever given. The defaults for all these latter groups are usually correct. Perhaps the most interesting variable outside the $DRT group is NSTATE in $GUGDIA to include excited states in the CI computation.

    The MCSCF part of GAMESS is based on the CI package, so that most of each iteration is just a CI using the current set of orbitals. The final step in each MCSCF iteration is the generation of improved MOs by means of a Newton-Raphson optimization. Thus, all of the input just mentioned for a CI run except $GUGDM is still relevant, with the $MCSCF group being of primary concern to users. The $CIINP group is largely irrelevant, while the $TRFDM2 group is possibly relevant to MCSCF gradient calculations. Once again, the defaults in any of the groups other than $DRT and $MCSCF are probably appropriate for your run. The most interesting variable outside $DRT and $MCSCF is probably WSTATE in $GUGDM2.

    It is very helpful to execute an EXETYP=CHECK run before doing any MCSCF or CI run. The CHECK run will tell you the total number of CSFs, the electronic state, and the charge and multiplicity your $DRT generated. The CHECK run also lets the program feel out the memory that will be required to actually do the run. Thus the CHECK run can potentially prevent costly mistakes, or tell you when a calculation is prohibitively large.

    Specification of the wavefunction

    The configuration state functions (CSFs) to be used in a CI or MCSCF calculation are specified in the $DRT by giving a single reference CSF, and the maximum degree of electron excitation from that CSF.

    The MOs are filled in the order FZC or MCC first, followed by DOC, AOS, BOS, ALP, VAL, EXT, FZV (Aufbau principle). AOS, BOS, and ALP are singly occupied MOs. The open shell singlet couplings NAOS and NBOS must give the same value, and their MOs alternate. An example is the input NFZC=1 NDOC=2 NAOS=2 NBOS=2 NALP=1 NVAL=3 which gives the reference CSF

    FZC,DOC,DOC,AOS,BOS,AOS,BOS,ALP,VAL,VAL,VAL

    This is a doublet state with five singly occupied orbitals (the AOS/BOS pairs are singlet coupled). VAL orbitals are unoccupied only in the reference CSF; they will become occupied as the other CSFs are generated.

    To perform a singles and doubles CI calculation with any such reference, select the electron excitation level IEXCIT=2. For a normal CI, you would give all of the virtual MOs as VALs.
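
    As an illustration, a $DRT sketch of such a singles and doubles CI from the reference CSF above would be (GROUP=C1 is chosen here only for simplicity, and xx must count every remaining virtual MO as a VAL orbital):

         $DRT  GROUP=C1 IEXCIT=2
               NFZC=1 NDOC=2 NAOS=2 NBOS=2 NALP=1 NVAL=xx $END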

    Another example, for a MCSCF run, might be NMCC=3 NDOC=3 NVAL=2, which gives the reference

    MCC,MCC,MCC,DOC,DOC,DOC,VAL,VAL

    MCSCF calculations are usually of the Full Optimized Reaction Space (FORS) type. Choosing FORS=.TRUE. leads to an excitation level of 4, as the 6 valence electrons have only 4 holes available for excitation. MCSCF runs typically have only a small number of VAL orbitals. If you choose an excitation level IEXCIT that is smaller than that needed to generate the FORS space, be sure to set FORS=.FALSE. in $MCSCF or else VERY POOR or even NO CONVERGENCE will result.
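
    For the NMCC=3 NDOC=3 NVAL=2 reference above, a minimal FORS sketch (GROUP=C1 chosen only for illustration) is

         $DRT  GROUP=C1 FORS=.TRUE. NMCC=3 NDOC=3 NVAL=2 $END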

    The next example is a first or second order CI wavefunction; NFZC=3 NDOC=3 NVAL=2 NEXT=-1 leads to

    FZC,FZC,FZC,DOC,DOC,DOC,VAL,VAL,EXT,EXT,...

    FOCI or SOCI is chosen by selecting the appropriate flag; the correct excitation level is automatically generated. Note that the negative number for NEXT causes all remaining MOs to be included in the external orbital space. One way of viewing FOCI and SOCI wavefunctions is as all singles, or all singles and doubles from the entire MCSCF wavefunction as a reference. An equivalent way of saying this is that all CSFs with N electrons (in this case N=6) distributed among the valence orbitals in all possible ways (that is, the FORS MCSCF wavefunction) make up the base wavefunction. To this, FOCI adds all CSFs with N-1 electrons in active and 1 electron in external orbitals. SOCI adds all possible CSFs with N-2 electrons in active orbitals and 2 in external orbitals. SOCI is often prohibitively large, but is also a very accurate wavefunction.
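
    A sketch of the corresponding $DRT input for the NFZC=3 NDOC=3 NVAL=2 NEXT=-1 reference, assuming the SOCI flag in $DRT selects the second order CI (FOCI would be analogous), is

         $DRT  GROUP=C1 SOCI=.TRUE.
               NFZC=3 NDOC=3 NVAL=2 NEXT=-1 $END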

    Use of Symmetry

    The CI part of the program can work with D2h, and any of its Abelian subgroups. However, $DRT allows the input of some (but not all) higher point groups. For these non-Abelian groups, the program automatically assigns the orbitals to an irrep in the highest possible Abelian subgroup. For the other non-Abelian groups, you must at present select GROUP=C1. Note that when you are computing a Hessian matrix, many of the displaced geometries are asymmetric, hence you must choose C1 in $DRT (however, be sure to use the highest symmetry possible in $DATA!).

    As an example, consider a molecule with Cs symmetry, and these two reference CSFs

    ...MCC...DOC DOC VAL VAL
    ...MCC...DOC AOS BOS VAL

    Suppose that the 2nd and 3rd active MOs have symmetries a' and a". Both of these generate singlet wavefunctions, with 4 electrons in 4 active orbitals, but the former constructs 1-A' CSFs, while the latter generates 1-A" CSFs. On the other hand, if the 2nd and 3rd orbitals had the same symmetry type, the references are equivalent.

    Selection of Orbitals

    The first step is to partition the orbital space into core, active, and external sets, in a manner which is sensible for your chemical problem. This is a bit of an art, and the user is referred to the references quoted at the end of this section. Having decided what MCSCF to perform, you now must consider the more pedantic problem of what orbitals to begin the MCSCF calculation with.

    You should always start an MCSCF run with orbitals from some other run, by means of GUESS=MOREAD. Do not expect to be able to use HCORE, MINGUESS, or EXTGUESS! Example 6 is a poor example, converging only because methylene has so much symmetry, and the basis is so small. If you are just beginning your MCSCF problem, use orbitals from some appropriate converged SCF run. Thus, realistic examples of beginning a MCSCF calculation are examples 8 and 9. Once you get an MCSCF to converge, you can and should use these MCSCF MOs (which will be Schmidt orthogonalized) at other nearby geometries.

    Starting from SCF orbitals can take a little bit of care. Most of the time (but not always) the orbitals you want to correlate will be the highest occupied orbitals in the SCF. Fairly often, however, the correlating orbitals you wish to use will not be the lowest unoccupied virtuals of the SCF. You will soon become familiar with NORDER=1 in $GUESS, as well as the nearly mandatory GUESS=MOREAD. You should also explore SYMLOC in $LOCAL, as the canonical SCF MOs are often spread out over regions of the molecule other than "where the action is".
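
    As an illustration of such reordering, input along these lines reads the converged SCF orbitals (from a $VEC group) and moves MO 12 down into position 9; the IORDER array and the NORB count are assumptions to be checked against the $GUESS documentation, and the orbital numbers are purely hypothetical.

      $GUESS  GUESS=MOREAD NORB=36 NORDER=1 IORDER(9)=12,9,10,11 $END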

    Convergence of MCSCF is by no means guaranteed. Poor convergence can invariably be traced back to a poor initial selection of orbitals. My advice is, before you even start: "Look at the orbitals. Then look at the orbitals again". Later, if you have any trouble: "Look at the orbitals some more". Even if you don't have any trouble, look at the orbitals to see if they converged to what you expected, and have reasonable occupation numbers. MCSCF is by no means the sort of "black box" that RHF is these days, so LOOK VERY CAREFULLY AT YOUR RESULTS. Since orbitals are very hard to look at on the screen, this means printing them out on paper and inspecting the individual MO expansion coefficients! Better yet, plot them and look at their nodal characteristics.

    The Iterative Process

    Each MCSCF iteration consists of the following steps: transformation of the integrals to the current MO basis, sorting of these integrals, generation of the Hamiltonian matrix, optimization of the CI coefficients by a Davidson method diagonalization, generation of the second order density matrix, and finally orbital improvement by the Newton-Raphson method. On the first iteration at the first geometry, you will receive the normal amount of printout from each of these stages, while subsequent iterations produce only a single line summarizing the iteration.

    Each iteration contains MICIT Newton-Raphson orbital improvements. After the first such NR step, a micro-iteration consists solely of an integral transformation over the new MOs, followed immediately by a NR orbital improvement, reusing the old second order density matrix. This avoids the CI diagonalization step, so it may be of some use in MCSCF calculations with large CSF expansions. If the number of CSFs is small, there is little point in using additional micro-iterations.

    Miscellaneous Hints

    Old versions of the program required that you give correct input data in groups like $TRNSFM; nowadays this is not required, and usually only $DRT and $MCSCF are given. Old versions also forced you to assign each orbital a symmetry code number; this is now done automatically.

    Note that the partial transformation is coded only for small in-memory jobs, and even then may be slower than the full transformation. This is particularly true on vector machines, as the full transformation is vectorized by matrix multiply calls. Thus the default is the full integral transformation.

    A very common MCSCF wavefunction, the simplest possible function describing a singlet diradical, has 2 electrons in 2 active MOs. While this wavefunction can be obtained in a MCSCF run (using NDOC=1 NVAL=1), it can be obtained much faster by use of the GVB code, with one GVB pair. This GVB-PP(1) wavefunction is also known in the literature as two configuration SCF (TCSCF). The two configuration form of this GVB is equivalent to the three configurations obtained in this MCSCF, as optimization in natural form (20 and 02) causes the coefficient of the 11 CSF to vanish.
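
    A minimal sketch of the GVB route follows; NPAIR is assumed to be the $SCF keyword counting the GVB pairs, and the NCO value is of course molecule dependent.

      $CONTRL SCFTYP=GVB RUNTYP=ENERGY $END
      $SCF    NCO=7 NPAIR=1 $END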

    One way to improve upon the SCF orbitals as starting MOs is to generate modified virtual orbitals (MVOs). MVOs are obtained by diagonalizing the Fock operator of a very positive ion, within the virtual orbital space only. As implemented in GAMESS, MVOs can be obtained at the end of any RHF, ROHF, or GVB run by setting MVOQ in $SCF nonzero, at the cost of a single SCF cycle. (Generating MVOs in a run feeding in converged orbitals is more expensive, as the 2 electron integrals must be recomputed). Generating MVOs never changes any of the occupied orbitals, but produces more valence-like LUMOs.
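
    As a sketch, MVOs intended as MCSCF starting orbitals might be generated by adding something like the following to an ordinary closed shell run; the choice of a +6 cation is a common one, but is only illustrative.

      $CONTRL SCFTYP=RHF RUNTYP=ENERGY $END
      $SCF    MVOQ=6 $END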

    It is easy to generate large numbers of CSFs with the $DRT input (especially SOCI), which produces a bottleneck in the CI diagonalization step. If the number of CSFs is small (say under 10,000), then the most time consuming step in a CI run is the integral transformation, and in a MCSCF run is both the integral transformation and the Newton-Raphson orbital improvement. However, larger numbers of CSFs will require a great deal of time in the diagonalization step, as well as external disk storage for the pieces of the Hamiltonian matrix. The largest CI calculation performed at NDSU had about 85,000 CSFs, although we have heard of a CISD with over 200,000 CSFs that required 8 IBM 3380 disks to hold the Hamiltonian matrix. The largest MCSCF performed at NDSU had about 20,000 CSFs, although we have heard of one with about 35,000 CSFs. MCSCF typically is smaller, simply because this kind of run must perform a CI diagonalization once every iteration.

    MCSCF Implementation

    The MCSCF code in GAMESS is adapted from the HONDO7 program, and was written by Michel Dupuis of IBM. The implementation is of the type termed "unfolded two-step Newton-Raphson" by Roos. This means that the orbital and CI coefficient optimizations are separated. The latter are obtained in a conventional CI step, and the former by means of a NR orbital improvement (using the augmented hessian matrix). This means convergence is not quite second order, but also that only the orbital-orbital hessian terms must be computed.

    There are three different methods for obtaining the two electron contributions to the Lagrangian matrix and the orbital hessian in GAMESS (collectively called the NR matrices). These methods compute exactly the same NR matrices, so the overall number of iterations is the same for all methods.

    The computational problem in forming the NR matrices is the combination of the second order density matrix, and the transformed two electron integrals. Both of these are 4 index matrices, but the DM2 is much smaller since its indices run only over occupied orbitals. The TEIs are much more numerous, as two of the indices run over all orbitals, including unoccupied ones.

    METHOD=TEI is driven by the two electron integrals. After reading all of the DM2 into memory, this method begins to read the TEI file, and combines these integrals with the available DM2 elements. This method uses the least memory, reads the DM2 and TEI files only once each, and is usually very much slower than the other two methods.

    METHOD=DM2 is the reverse strategy, with all the integrals in core, and is driven by reading DM2 elements from disk. Because the number of TEIs is large, the implementation actually divides the TEIs into four sets, (ij/kl), (aj/kl), (ab/kl), and (aj/bl), where i,j,k,l run over the internal orbitals, and a and b are external orbitals. The program must have enough memory to hold all the TEIs belonging to a particular set in memory. Thus, construction of the NR matrices requires 4 reads of the TEI file, and 4 reads of the DM2. METHOD=DM2 requires more memory than METHOD=TEI, and has 4 times the I/O, but is much faster.

    METHOD=FORMULA tries to combine the merits of both methods. This method constructs a Newton formula tape on the first MCSCF iteration at the first geometry computed. This formula tape contains integer pointers telling how TEIs should be combined with DM2 elements to form the NR matrices. If all of the TEIs will not fit into memory, the formula tape must be sorted after it is generated. The formula tape can be saved for use in later runs on the same molecule, see FMLFIL in $MCSCF. This can be a very large disk file, which may diminish your enthusiasm for saving it.

    Once the formula tape is constructed, the "passes" (say M in number) begin. A pass consists of reading as large a block of the TEI file as will fit into the available memory, then reading the related portion of the formula tape and the entire DM2 file for information about how to efficiently combine the TEIs and the DM2. When all of this TEI block is processed, the next block of TEIs is read to begin the next "pass". An iteration thus performs one complete read through the TEI and formula files, and M reads of the DM2. METHOD=FORMULA uses as much memory as the program has available (above a certain minimum, by adjusting the chunk size used for the TEI file), has an I/O requirement somewhat larger than METHOD=TEI, and is perhaps a bit faster than METHOD=DM2. It does require additional disk space for the Newton formula tape.

    Unlike METHOD=DM2 and METHOD=TEI, METHOD=FORMULA requires a canonical order file of transformed integrals. The other two methods use Steve Elbert's reverse canonical ordering, which allows the integral transformation to run 14% faster. Thus, METHOD=FORMULA may lose in the transformation whatever it gains over METHOD=DM2 in the Newton-Raphson step.

    As all three methods have advantages and disadvantages, all three are available. The amount of memory required for each method will be included in your output (including EXETYP=CHECK jobs) to assist you in selecting a method.

    References

    There are several review articles about second order MCSCF methodologies. Of these, Roos' is a nice overview of the subject, while the other 3 are more detailed.

    1. "The Multiconfiguration (MC) SCF Method"
      B.O.Roos, in "Methods in Computational Molecular Physics", edited by G.H.F.Diercksen and S.Wilson, D.Reidel Publishing, Dordrecht, Netherlands, 1983, pp 161-187.

    2. "Optimization and Characterization of a MCSCF State"
      Olsen, Yeager, Jorgensen in "Advances in Chemical Physics", vol.54, edited by I.Prigogine and S.A.Rice, Wiley Interscience, New York, 1983, pp 1-176.

    3. "Matrix Formulated Direct MCSCF and Multiconfiguration Reference CI Methods"
      H.-J.Werner, in "Advances in Chemical Physics", vol.69, edited by K.P.Lawley, Wiley Interscience, New York, 1987, pp 1-62.

    4. "The MCSCF Method" R.Shepard, ibid., pp 63-200.

    Some papers particularly germane to the implementation in GAMESS are

    1. "General second order MCSCF theory: A Density Matrix Directed Algorithm"
      B.H.Lengsfield, III, J.Chem.Phys. 73,382-390(1980).

    2. "The use of the Augmented Matrix in MCSCF Theory"
      D.R.Yarkony, Chem.Phys.Lett. 77,634-635(1981).

    The next set of references discuss the choice of orbitals for the active space. This is a matter of some importance, and needs to be understood well by the GAMESS user.

    1. "The CASSCF Method and its Application in Electronic Structure Calculations"
      B.O.Roos, in "Advances in Chemical Physics", vol.69, edited by K.P.Lawley, Wiley Interscience, New York, 1987, pp 339-445.

    2. "Are Atoms Intrinsic to Molecular Electronic Wavefunctions?"
      K.Ruedenberg, M.W.Schmidt, M.M.Dombek, S.T.Elbert, Chem.Phys. 71, 41-49, 51-64, 65-78 (1982).

    The final references are simply some examples of FORS MCSCF applications, the latter done with GAMESS.

    1. D.F.Feller, M.W.Schmidt, K.Ruedenberg,
      J.Am.Chem.Soc. 104, 960-967(1982).

    2. M.W.Schmidt, P.N.Truong, M.S.Gordon,
      J.Am.Chem.Soc. 109, 5217-5227(1987).

    Geometry Searches and Internal Coordinates

    Stationary points are places on the potential energy surface with zero gradient vector (first derivative of the energy with respect to nuclear coordinates). These include minima (whether relative or global), better known to chemists as reactants, products, and intermediates; as well as transition states (also known as saddle points). The two types of stationary points have a precise mathematical definition, depending on the curvature of the potential energy surface at these points. If all of the eigenvalues of the hessian matrix (second derivative of the energy with respect to nuclear coordinates) are positive, the stationary point is a minimum. If there is one, and only one, negative curvature, the stationary point is a transition state. Points with more than one negative curvature exist, but are not of particularly great interest in chemistry. Because vibrational frequencies are basically the square roots of these curvatures, a minimum has all real frequencies, and a saddle point has one imaginary frequency.

    GAMESS locates minima by geometry optimization, RUNTYP=OPTIMIZE, and transition states by saddle point searches, RUNTYP=SADPOINT. In many ways these are similar, and in fact nearly identical FORTRAN code is used for both. The term "geometry search" is used here to describe features which are common to both procedures. The input to control both RUNTYPs is found in the $STATPT group.

    As will be noted in the symmetry section below, an OPTIMIZE run does not always find a minimum, and a SADPOINT run may not find a transition state, even though the gradient is brought to zero. You can prove you have located a minimum or saddle point only by examining the local curvatures of the potential energy surface. This can be done by following the geometry search with a RUNTYP=HESSIAN job, which should be a matter of routine.

    Quasi-Newton Searches

    Geometry searches are most effectively done by what is called a quasi-Newton-Raphson procedure. These methods assume a quadratic potential surface, and require the exact gradient vector and an approximation to the hessian. It is the approximate nature of the hessian that makes the method "quasi". The rate of convergence of the geometry search depends on how quadratic the real surface is, and the quality of the hessian. The latter is something you have control over, and is discussed in the next section.

    GAMESS contains two different implementations of the quasi-Newton procedure for finding stationary points, namely METHOD=SCHLEGEL and METHOD=BAKER. For minima searches, the two procedures vary mainly in the algorithm that is used to approximately update the hessian as the search proceeds. For saddle point runs, Baker's method treats the unique direction of negative curvature in a more sophisticated way than the Schlegel algorithm. For both kinds of searches, Baker's method seems to be the more robust, boasting the ability (in some cases) to locate transition states even when started in a region of all positive local curvatures.

    You should read the papers by Baker and Schlegel to understand how these methods are designed to work. See section 2 of the paper by Dunning's group for the Argonne approach to transition state searches. A good summary of various search procedures is given by Bell and Crighton, and in the review by Schlegel. These papers are:

    1. J.Baker J.Comput.Chem. 7,385-395(1986)

    2. H.B.Schlegel J.Comput.Chem. 3,214(1982)

    3. T.H.Dunning, L.B.Harding, R.A.Blair, R.A.Eades, R.L.Shepard J.Phys.Chem. 90, 344-356, (1986).

    4. S.Bell, J.S.Crighton J.Chem.Phys. 80, 2464-2475, (1984).

    5. H.B.Schlegel Advances in Chemical Physics (Ab Initio Methods in Quantum Chemistry, Part I), volume 67, K.P.Lawley, Ed. Wiley, New York, 1987, pp 249-286.

    The Hessian

    Although quasi-Newton methods require only an approximation to the true hessian, the choice of this matrix has a great effect on the convergence of the geometry search. The choice of the hessian is controlled by the variable HESS in the $STATPT group.

    There is a procedure contained within GAMESS for guessing a diagonal, positive definite hessian matrix, HESS=GUESS. This guess is more sophisticated when internal coordinates are defined, as Badger's rules can be used to estimate stretching force constants. This option often works well for minima, but cannot possibly help you find transition states (because it is positive definite). Therefore, GUESS cannot be selected for SADPOINT runs.

    Two options for providing a more accurate hessian are HESS=READ and HESS=CALC. With CALC, the true hessian is obtained by direct calculation at the initial geometry, and then the geometry search begins, all in one run. The READ option allows you to feed in a hessian obtained by an earlier RUNTYP=HESSIAN job, in a $HESS group. This two-step READ procedure may actually be preferable, as you get a chance to see the frequencies; if the local curvatures look good, you can then commit to the geometry search. Be sure to include the $GRAD group (if the exact gradient is available) in the HESS=READ job so that GAMESS can take its first step immediately.
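
    A sketch of the second job in this two-step procedure is given below; the hessian (and gradient) are pasted in from the PUNCH file of the preceding RUNTYP=HESSIAN job, and the NSTEP value is only illustrative.

      $CONTRL SCFTYP=RHF RUNTYP=OPTIMIZE $END
      $STATPT HESS=READ NSTEP=50 $END
      $HESS
      ...copied from the PUNCH file of the RUNTYP=HESSIAN job...
      $END
      $GRAD
      ...copied from the same PUNCH file...
      $END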

    Note also that you can compute the hessian at a lower basis set and/or wavefunction level, and read this into a higher level geometry search. This trick works because the hessian is 3Nx3N for N atoms, no matter what atomic basis is used. The gradient from the lower level is of course worthless, as the geometry search must work with the exact gradient.

    You get what you pay for here. HESS=GUESS is free, but may lead to significantly more steps in the geometry search. The other two options are more expensive at the beginning, but may pay back by rapid convergence to the stationary point.

    The procedure for updating the hessian as the geometry search progresses frequently improves the hessian for a few steps (especially for HESS=GUESS), and then breaks down. The symptoms of this are a nice lowering of the energy or RMS gradient for maybe 10 steps, followed by crazy steps. You can help by putting the best coordinates into $DATA, and resubmitting the job, which will then make a fresh determination of the hessian.

    Internal Coordinates

    Normally you should use internal coordinates (see NZVAR in $CONTRL and $ZMAT) in a geometry search. Using internal coordinates automatically eliminates the six rotational and translational degrees of freedom, which can be troublesome. As just mentioned, the GUESS hessian is able to use empirical information about bond stretches when internals are used. On the other hand, there are many possible choices for the internal coordinates, and some of these may lead to much poorer convergence of the geometry search than others.

    Optimizations in Cartesian space are relatively uncoupled, whereas internal coordinates are frequently strongly coupled. Because of this, Jerry Boatz calls them "infernal coordinates"! A very common example to illustrate this might be a bond length in a ring, and an angle on the opposite side of the ring. Clearly, changing one changes the other simultaneously. A more mathematical definition of "coupled" might be to say that there is a large off diagonal element in the hessian. In this case convergence might well be unsatisfactory, especially with the diagonal GUESS hessian, but possibly even with an accurately calculated one.

    Particularly poorly chosen coordinates may not correspond to a quadratic surface, thereby ending all hope that a quasi-Newton method will converge.

    The best procedure is to pick a set of internal coordinates which are as independent as possible. For a straight chain molecule of N atoms, one might do this according to the usual "model builder" Z-matrix input, using N-1 bond lengths, N-2 bond angles, and N-3 torsions. However, you should note that GAMESS does not require this particular mixture of the common types. The only requirement is that you provide a total of 3N-6 coordinates, chosen from these three basic types or several more exotic possibilities. (Of course, we mean 3N-5 throughout for linear molecules).
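
    As a sketch, the "model builder" coordinates for a four atom chain 1-2-3-4 could be given explicitly as below. The IZMAT array and its type codes (1 for a stretch, 2 for a bend, 3 for a torsion) are assumptions that should be checked against the $ZMAT documentation.

      $CONTRL NZVAR=6 $END
      $ZMAT   IZMAT(1)=1,1,2,  1,2,3,  1,3,4,
                       2,1,2,3,  2,2,3,4,
                       3,1,2,3,4  $END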

    For example, the most effective choice of coordinates for the atoms in a four membered ring is to define all four sides, any one of the internal angles, and a dihedral defining the ring pucker. For a six membered ring, the best coordinates seem to be 6 sides, 3 angles, and 3 torsions. The angles should be every other internal angle, so that the molecule can "breathe" freely. The torsions should be arranged so that the central bond of each is placed on alternating sides of the ring, as if they were pi bonds in Kekule benzene. For a five membered ring, we suggest all 5 sides, 2 internal angles, alternating every other one, and 2 dihedrals to fill in. The internal angles of necessity skip two atoms where the ring closes. Larger rings should generalize on the idea of using all sides and only alternating angles.

    Rings and more especially fused rings can be tricky. In fact, these systems may work better in Cartesian coordinates, and also may benefit from decreasing DXMAX to avoid large steps. Rings nearly always benefit from a computed hessian, rather than the diagonal GUESS one, even when looking for a minimum.

    GAMESS also has several other types of internal coordinates, such as linear bends for 3 collinear atoms, out of plane bends, and so on. If you experience difficulty with one set of internals, consider using some of these more exotic types. A little thought about a set of coordinates which are natural for your problem can be the difference between good convergence of the geometry search, and poor or even no convergence.

    In many cases the "model builder" set of coordinates (N-1 bonds, N-2 angles, and N-3 torsions) is actually an appropriate one. If you have entered your molecule in $DATA using one of the Z-matrix options, GAMESS can automatically build a $ZMAT using the "model builder" coordinates given in $DATA. This is done by setting NZVAR to 3N-6, while not entering a $ZMAT. Note that this option will not work if you have used any dummy atoms to enter the molecule. If you want to override the "model builder" choice with some other set, simply leave NZVAR at 3N-6, and provide the appropriate $ZMAT.

    A poor choice of internal coordinates will sometimes cause GAMESS to stop with an unfriendly message about "Cartesian displacement exceeds 3*DXMAX". This is often caused by three angles around an atom summing to nearly 360 degrees (an almost planar center), but can be due to other causes as well. The best fix for this is to look for a set of internals which is better behaved. Sometimes providing a better hessian than GUESS will help, but a willingness to think about alternate coordinates is even better. A little brain time can save a lot of CPU time.

    If the molecule has symmetry, be sure to define coordinates which are symmetrically related.

    Sometimes it is handy to constrain the geometry search by freezing one or more coordinates, via the IFREEZ array. For example, such constrained optimizations might be useful while trying to determine what area of the potential surface contains a saddle point. GAMESS will not actually hold such frozen coordinates truly constant; instead they sometimes drift a small amount. This does not trouble us in the least, as a structure obtained under such a constraint is neither a minimum nor a saddle point anyway. If you are trying to freeze coordinates with an automatically generated $ZMAT, you need to know that the order of the coordinates is

        
              y
              y  x r1
              y  x r2  x a3
              y  x r4  x a5  x w6
              y  x r7  x a8  x w9
        
    and so on, where y and x are whatever atoms and molecular connectivity you happen to be using.
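
    For example, with the coordinates ordered as above, freezing the second bond length (r2) and the first angle (a3) might be requested as follows; the placement of IFREEZ in $STATPT is an assumption to be checked against the input description.

      $STATPT IFREEZ(1)=2,3 $END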

    The Role of Symmetry

    At the end of a successful geometry search, you will have a set of coordinates where the gradient of the energy is zero. However, this does not mean that your newly discovered stationary point is a minimum or saddle point, simply because you requested RUNTYP=OPTIMIZE or SADPOINT!

    This apparent mystery is due to the fact that the gradient vector transforms under the totally symmetric representation of the molecular point group. As a direct consequence, a geometry search is point group conserving. (For a proof of these statements, see J.W.McIver and A.Komornicki, Chem.Phys.Lett., 10,303-306(1971)). In simpler terms, the molecule will remain in whatever point group you select in $DATA, even if the true minimum is in some lower point group. Since a geometry search only explores totally symmetric degrees of freedom, the only way to learn about the curvatures for all degrees of freedom is RUNTYP=HESSIAN.

    As an example, consider disilene, the silicon analog of ethene. It is natural to assume that this molecule is planar like ethene, and an OPTIMIZE run in D2h symmetry will readily locate a stationary point. However, as a calculation of the hessian will readily show, this structure is a transition state (one imaginary frequency), and the molecule is really trans-bent (C2h). A careful worker will always characterize a stationary point as either a minimum, a transition state, or some higher order stationary point (which is not of great interest!) by performing a RUNTYP=HESSIAN.

    The point group conserving properties of a geometry search can be annoying, as in the preceding example, or advantageous. For example, assume you wish to locate the transition state for rotation about the double bond in ethene. A little thought will soon reveal that ethene is D2h, the 90 degrees twisted structure is D2d, and structures in between are D2. Since the saddle point actually has higher symmetry than the rest of the rotational surface, you can locate it by RUNTYP=OPTIMIZE within D2d symmetry. You can readily find this stationary point with the diagonal guess hessian! In fact, if you attempt to do a RUNTYP=SADPOINT within D2d symmetry, there will be no totally symmetric modes with negative curvatures, and it is unlikely that the geometry search will be very well behaved.

    Although a geometry search cannot lower the symmetry, the gain of symmetry is quite possible. For example, if you initiate a water molecule optimization with a trial structure which has unequal bond lengths, the geometry search will come to a structure that is indeed C2v (to within OPTTOL, anyway). However, GAMESS leaves it up to you to realize that a gain of symmetry has occurred.

    In general, Mother Nature usually chooses more symmetrical structures over less symmetrical structures. Therefore you are probably better served to assume the higher symmetry, perform the geometry search, and then check the stationary point's curvatures. The alternative is to start with artificially lower symmetry and see if your system regains higher symmetry. The problem with this approach is that you don't necessarily know which subgroup is appropriate, and you lose the great speedups GAMESS can obtain from proper use of symmetry. It is good to note here that "lower symmetry" does not mean simply changing the name of the point group and entering more atoms in $DATA, instead the nuclear coordinates themselves must actually be of lower symmetry.

    Practical Matters

    Geometry searches do not bring the gradient exactly to zero. Instead they stop when the largest component of the gradient is below the value of OPTTOL, which defaults to a reasonable 0.0001. Analytic hessians usually have residual frequencies below 10 cm**-1 with this degree of optimization. The sloppiest value you probably ever want to try is 0.0005.

    If a geometry search should happen to run out of time, or to exceed NSTEP before meeting the OPTTOL criterion, it can easily be restarted. For RUNTYP=OPTIMIZE, the restart should use the coordinates with the lowest total energy (do a string search on "FINAL"). For RUNTYP=SADPOINT, the restart should use the coordinates which have the smallest gradient (do a string search on "RMS", which means root mean square). Note that these are not necessarily the last geometry considered by the program!

    The "restart" should actually be a normal run, that is you should not try to use the restart options in $CONTRL (which anyway may not work). A geometry search can be restarted by extracting the desired coordinates for $DATA from the printout, and by extracting the corresponding $GRAD group from the PUNCH file. If the $GRAD group is supplied, the program is able to save the time it would ordinarily take to compute the wavefunction and gradient at the initial point, which can be a substantial savings. There is no input to trigger reading of a $GRAD group: if found, it is read and used. Be careful that your $GRAD group actually corresponds to the coordinates in $DATA, as GAMESS has no check for this.

    Sometimes when you are fairly close to the minimum, an OPTIMIZE run will take a first step which raises the energy, with subsequent steps improving the energy and perhaps finding the minimum. The erratic first step is caused by the GUESS hessian, and the subsequent convergence by the updating procedures correcting this guess. It can be helpful to limit the size of this wrong first step, by reducing the radius of the step, DXMAX. Conversely, if you are far from the minimum, sometimes you can decrease the total number of steps by increasing DXMAX.
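
    The convergence and step size controls just mentioned all belong to $STATPT, so a typical adjustment (with purely illustrative values) might read:

      $STATPT NSTEP=50 OPTTOL=0.0001 DXMAX=0.05 $END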

    Saddle Points

    Finding minima is relatively easy. There are extensive tables of bond lengths and angles, so guessing starting geometries is pretty straightforward. Very nasty cases may require computation of an exact hessian, but location of most minima is straightforward.

    In contrast, finding saddle points is a black art. The diagonal guess hessian will never work, so you must provide a computed one. The hessian should be computed at your best guess as to what the transition state should be. It is safest to do this in two steps as outlined above, rather than HESS=CALC. This lets you verify you have guessed a structure with one and only one negative curvature. Guessing a good trial structure is the hardest part of a RUNTYP=SADPOINT!

    The RUNTYP=HESSIAN's normal coordinate analysis is rigorously valid only at stationary points on the surface. This means the frequencies from the hessian at your trial geometry are untrustworthy, in particular the six "zero" frequencies corresponding to translational and rotational degrees of freedom will usually be 300-500 cm**-1, and possibly imaginary. The Sayvetz conditions on the printout will help you distinguish the T&R "contaminants" from the real vibrational modes. If you have defined a $ZMAT, the PURIFY option in $FORCE will help zap out these T&R contaminants, but this is not really necessary.

    If the hessian at your assumed geometry does not have one and only one imaginary frequency (taking into account that the "zero" frequencies can sometimes be 300i!), then it will probably be difficult to find the saddle point. Instead you need to compute a hessian at a better guess for the initial geometry, or else read about mode following below.

    If you need to restart your run, do so with the coordinates which have the smallest RMS gradient. Note that the energy does not necessarily have to decrease in a SADPOINT run, in contrast to an OPTIMIZE run. It is often necessary to do several restarts, involving recomputation of the hessian, before actually locating the saddle point.

    Assuming you do find the T.S., it is always a good idea to recompute the hessian at this structure. As described in the discussion of symmetry, only totally symmetric vibrational modes are probed in a geometry search. Thus it is fairly common to find that at your "T.S." there is a second imaginary frequency, which corresponds to a non-totally symmetric vibration. This means you haven't found the correct T.S., and are back to the drawing board. The proper procedure is to lower the point group symmetry by distorting along the symmetry breaking "extra" imaginary mode, by a reasonable amount. Don't be overly timid in the amount of distortion, or the next run will come back to the invalid structure.

    The real trick here is to find a good guess for the transition structure. The closer you are, the better. It is often difficult to guess these structures. One way around this is to compute a linear least motion (LLM) path. This connects the reactant structure to the product structure by linearly varying each coordinate. If you generate about ten structures intermediate to reactants and products, and compute the energy at each point, you will in general find that the energy first goes up, and then down. The maximum energy structure is a "good" guess for the true T.S. structure. Actually, the success of this method depends on how curved the reaction path is.

    Mode Following

    In certain circumstances, METHOD=BAKER can walk from a region of all positive curvatures to a transition state. This is a tricky business, and should be attempted only after reading Baker's paper carefully. Note that GAMESS retains all degrees of freedom in its hessian, and thus there is no reason to suppose the lowest mode is totally symmetric. Of course, you should attempt to follow only totally symmetric modes with IFOLOW. You can get a printout of the modes in internal coordinate space by an EXETYP=CHECK run, which will help you decide on the value of IFOLOW.
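
    A check run to inspect the modes before committing to the search might therefore look like the sketch below; the placement of METHOD and IFOLOW in $STATPT, and the particular value IFOLOW=2, are only illustrative assumptions.

      $CONTRL SCFTYP=RHF RUNTYP=SADPOINT EXETYP=CHECK $END
      $STATPT METHOD=BAKER IFOLOW=2 HESS=READ $END
      $HESS
      ...from a RUNTYP=HESSIAN job at the starting structure...
      $END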

    Mode following is really not a substitute for the ability to intuit regions of the PES with a single local negative curvature. Baker's case D works well, but case E is much trickier. When you start near a minimum, it matters a great deal which side of the minimum you start from, as the direction of the search depends on the sign of the gradient. Be sure to use your brain when you use IFOLOW!

    MOPAC Calculations in GAMESS

    Parts of MOPAC 6.0 have been included in GAMESS so that the GAMESS user has access to three semiempirical wavefunctions: MNDO, AM1 and PM3. These wavefunctions are quantum mechanical in nature but neglect most two electron integrals, a deficiency that is (hopefully) compensated for by introduction of empirical parameters. The quantum mechanical nature of semiempirical theory makes it quite compatible with the ab initio methodology in GAMESS. As a result, very little of MOPAC 6.0 actually is incorporated into GAMESS. The part that did make it in is the code that evaluates

    1. the one- and two-electron integrals,

    2. the two-electron part of the Fock matrix,

    3. the cartesian energy derivatives, and

    4. the ZDO atomic charges and molecular dipole.

    Everything else is actually GAMESS: coordinate input (including point group symmetry), the SCF convergence procedures, the matrix diagonalizer, the geometry searcher, the numerical hessian driver, and so on. Most of the output will look like an ab initio output.

    It is extremely simple to perform one of these calculations. All you need to do is specify GBASIS=MNDO, AM1, or PM3 in the $BASIS group. Note that this not only picks a particular Slater orbital basis, it also selects a particular "hamiltonian", namely a particular parameter set.
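
    For example, a PM3 geometry optimization needs nothing more than input along these lines:

      $CONTRL SCFTYP=RHF RUNTYP=OPTIMIZE $END
      $BASIS  GBASIS=PM3 $END
      $DATA
      ...coordinates and point group, as in any GAMESS run...
      $END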

    MNDO, AM1, and PM3 will not work with every option in GAMESS. Currently the semiempirical wavefunctions support SCFTYP=RHF, UHF, and ROHF in any combination with RUNTYP=ENERGY, GRADIENT, OPTIMIZE, SADPOINT, HESSIAN, and IRC. Note that all hessian runs are numerical finite differencing. The MOPAC CI and half electron methods are not supported.

    Because the majority of the implementation is GAMESS rather than MOPAC, you will notice a few improvements. Dynamic memory allocation is used, so that GAMESS uses far less memory for a given size of molecule. The starting orbitals for SCF calculations are generated by a Huckel initial guess routine. Spin restricted (high spin) ROHF can be performed. Converged SCF orbitals will be labeled by their symmetry type. Numerical hessians will make use of point group symmetry, so that only the symmetry unique atoms need to be displaced. Infrared intensities will be calculated at the end of hessian runs. We have not at present used the block diagonalizer during intermediate SCF iterations, so the run time for a single geometry point in GAMESS is usually longer than in MOPAC. However, the geometry optimizer in GAMESS can frequently optimize the structure in fewer steps than the procedure in MOPAC. Orbitals and hessians are punched out for convenient reuse in subsequent calculations. Your molecular orbitals can be drawn with the PLTORB graphics program.

    To reduce CPU time, only the EXTRAP convergence accelerator is used by the SCF procedures. For difficult cases, the DIIS, RSTRCT, and/or SHIFT options will work, but may add significantly to the run time. With the Huckel guess, most calculations will converge acceptably without these special options.

    MOPAC parameters exist for the following elements. A quote mark (') means that the element is treated as a "sparkle" rather than as an atom with genuine basis functions. For MNDO:

    
    
     H
    Li  *          B  C  N  O  F
    Na' *         Al Si  P  S Cl
     K' * ...  Zn  * Ge  *  * Br
    Rb' * ...   *  * Sn  *  *  I
    *   * ...  Hg  * Pb  *
    
             For AM1:                         For PM3:
     H                               H
     *  *          B  C  N  O  F     *  Be         *  C  N  O  F
    Na' *         Al Si  P  S Cl    Na' Mg        Al Si  P  S Cl
     K' * ...  Zn  * Ge  *  * Br     K' * ...  Zn Ga Ge As Se Br
    Rb' * ...   *  * Sn  *  *  I    Rb' * ...  Cd In Sn Sb Te  I
    *   * ...  Hg  *  *  *          *   * ...  Hg Tl Pb Bi
    
    Semiempirical calculations are very fast. One of the motives for the MOPAC implementation within GAMESS is to take advantage of this speed. Semiempirical models can rapidly provide reasonable starting geometries for ab initio optimizations. Semiempirical hessian matrices are obtained at virtually no computational cost, and may help dramatically with an ab initio geometry optimization. Simply use HESS=READ in $STATPT to use a MOPAC $HESS group in an ab initio run.

    It is important to exercise caution, as semiempirical methods can be dead wrong! The reasons for this are bad parameters (in certain chemical situations), and the underlying minimal basis set. A good question to ask before using MNDO, AM1 or PM3 is "how well is my system modeled with an ab initio minimal basis set, such as STO-3G?" If the answer is "not very well", there is a good chance that a semiempirical description is equally poor.

    Molecular Properties and Conversion Factors

    These two papers are of general interest: A.D.Buckingham, J.Chem.Phys. 30, 1580-1585(1959). D.Neumann, J.W.Moskowitz J.Chem.Phys. 49, 2056-2070(1968).

    All units are derived from the atomic units for distance and the monopole electric charge, as given below.

    
    
    distance               - 1 au = 5.291771E-09 cm
    
    monopole               - 1 au = 4.803242E-10 esu
                            1 esu = sqrt(g-cm**3)/sec
    
    dipole                 - 1 au = 2.541766E-18 esu-cm
                          1 Debye = 1.0E-18 esu-cm
    
    quadrupole             - 1 au = 1.345044E-26 esu-cm**2
                     1 Buckingham = 1.0E-26 esu-cm**2
    
    octupole               - 1 au = 7.117668E-35 esu-cm**3
    
    electric potential     - 1 au = 9.076814E-02 esu/cm
    
    electric field         - 1 au = 1.715270E+07 esu/cm**2
                      1 esu/cm**2 = 1 dyne/esu
    
    electric field gradient- 1 au = 3.241390E+15 esu/cm**3
    
    The atomic unit for electron density is electron/bohr**3 for the total density, and 1/bohr**3 for an orbital density.

    The atomic unit for spin density is excess alpha spins per unit volume, h/4*pi*bohr**3. Only the expectation value is computed, with no constants premultiplying it.

    IR intensities are printed in Debye**2/amu-Angstrom**2. These can be converted into intensities as defined by Wilson, Decius, and Cross's equation 7.9.25, in km/mole, by multiplying by 42.255. If you prefer 1/atm-cm**2, use a conversion factor of 171.65 instead. A good reference for deciphering these units is A.Komornicki, R.L.Jaffe J.Chem.Phys. 1979, 71, 2150-2155. A reference showing how IR intensities change with basis improvements at the HF level is Y.Yamaguchi, M.Frisch, J.Gaw, H.F.Schaefer, J.S.Binkley, J.Chem.Phys. 1986, 84, 2262-2278.

    Localization Tips

    Three different orbital localization methods are implemented in GAMESS. The energy and dipole based methods normally produce similar results, but see M.W.Schmidt, S.Yabushita, M.S.Gordon, J.Phys.Chem. 88, 382-389 (1984) for an interesting exception. You can find references to the three methods at the beginning of this chapter.

    The method due to Edmiston and Ruedenberg works by maximizing the sum of the orbitals' two electron self repulsion integrals. Most people who think about the different localization criteria end up concluding that this one seems superior. The method requires the two electron integrals, transformed into the molecular orbital basis. Because only the integrals involving the orbitals to be localized are needed, the integral transformation is actually not very time consuming.

    The Boys method maximizes the sum of the distances between the orbital centroids, that is the difference in the orbital dipole moments.

    The population method due to Pipek and Mezey maximizes a certain sum of gross atomic Mulliken populations. This procedure will not mix sigma and pi bonds to yield localized banana bonds. Hence it is rather easy to find cases where this method gives rather different results than the Ruedenberg or Boys approach.

    GAMESS will localize orbitals for any kind of RHF, UHF, ROHF, or MCSCF wavefunctions. The localizations will automatically restrict any rotation that would cause the energy of the wavefunction to be changed (the total wavefunction is left invariant). As discussed below, localizations for GVB or CI functions are not permitted.

    The default is to freeze core orbitals. The localized valence orbitals are scarcely changed if the core orbitals are included, and it is usually convenient to leave them out. Therefore, the default localizations are: RHF functions localize all doubly occupied valence orbitals. UHF functions localize all valence alpha, and then all valence beta orbitals. ROHF functions localize all doubly occupied orbitals, and all singly occupied orbitals, but do not mix these two orbital spaces. MCSCF functions localize all valence MCC type orbitals, and localize all active orbitals, but do not mix these two orbital spaces. For the MCSCF function to remain invariant under localization, you must be using a FORS=.TRUE. wavefunction, and you must set GROUP=C1 in $DRT, as the localized orbitals possess no symmetry.

    In general, GVB functions are invariant only to localizations of the NCO doubly occupied orbitals. Any pairs must be written in natural form, so pair orbitals cannot be localized. The open shells may be degenerate, so in general these should not be mixed. If for some reason you feel you must localize the doubly occupied space, do a RUNTYP=PROP job. Feed in the GVB orbitals, but tell the program it is SCFTYP=RHF, and enter a negative ICHARG so that GAMESS thinks all orbitals occupied in the GVB are occupied in this fictitious RHF. Use NINA or NOUTA to localize the desired doubly occupied orbitals. Orbital localization is not permitted for CI, because we cannot imagine why you would want to do that anyway.

    Boys localization of the core orbitals in molecules having elements from the third or higher row almost never succeeds. Boys localization including the core for second row atoms will often work, since there is only one inner shell on these. The Ruedenberg method should work for any element, although including core orbitals in the integral transformation is more expensive.

    The easiest way to do localization is in the run which determines the wavefunction, by selecting LOCAL=xxx in the $CONTRL group. However, localization may be conveniently done at any time after determination of the wavefunction, by executing a RUNTYP=PROP job. This will require only $CONTRL, $BASIS/$DATA, $GUESS (pick MOREAD), the converged $VEC, possibly $SCF or $DRT to define your wavefunction, and optionally some $LOCAL input.
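
    A sketch of such a RUNTYP=PROP localization job is shown below; the particular choice LOCAL=BOYS and the NORB count are illustrative assumptions, and the converged orbitals from the earlier run are pasted into the $VEC group.

      $CONTRL SCFTYP=RHF RUNTYP=PROP LOCAL=BOYS $END
      $GUESS  GUESS=MOREAD NORB=24 $END
      $LOCAL  SYMLOC=.TRUE. $END
      $DATA
      ...
      $END
      $VEC
      ...converged orbitals from the earlier run...
      $END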

    There is an option to restrict all rotations that would mix orbitals of different symmetries. SYMLOC=.TRUE. yields only partially localized orbitals, but these still possess symmetry. They are therefore very useful as starting orbitals for MCSCF or GVB-PP calculations. Because they still have symmetry, these partially localized orbitals run as efficiently as the canonical orbitals. Because it is much easier for a user to pick out the bonds which are to be correlated, a significant number of iterations can be saved, and convergence to false solutions is less likely.

    Intrinsic Reaction Coordinate (IRC) Methods

    The Intrinsic Reaction Coordinate (IRC) is defined as the minimum energy path connecting the reactants to products via the transition state. In practice, the IRC is found by first locating the transition state for the reaction. The IRC is then found in halves, going forward and backwards from the saddle point, down the steepest descent path in mass weighted Cartesian coordinates. This is accomplished by numerical integration of the IRC equations, by a variety of methods to be described below.

    The IRC is becoming an important part of polyatomic dynamics research, as it is hoped that only knowledge of the PES in the vicinity of the IRC is needed for prediction of reaction rates, at least at threshold energies. The IRC has a number of uses for electronic structure purposes as well. These include proof that a certain transition structure does indeed connect a particular set of reactants and products, as the structure and imaginary frequency normal mode at the saddle point do not always unambiguously identify the reactants and products. The study of the electronic and geometric structure along the IRC is also of interest. For example, one can obtain localized orbitals along the path to determine when bonds break or form.

    The accuracy to which the IRC is determined is dictated by the use one intends for it. Dynamical calculations require a very accurate determination of the path, as derivative information (second derivatives of the PES at various IRC points, and path curvature) is required later. Thus, a sophisticated integration method (such as AMPC4 or RK4), and small step sizes (STRIDE=0.05, 0.01, or even smaller) will be needed. In addition, care should be taken to locate the transition state carefully (perhaps decreasing OPTTOL by a factor of 10), and in the initiation of the IRC run. The latter might require a hessian matrix obtained by double differencing, and certainly the hessian should be PURIFY'd. Note also that EVIB must be chosen carefully, as described below.

    On the other hand, identification of reactants and products allows for much larger step sizes, and cruder integration methods. In this type of IRC one might want to be careful in leaving the saddle point (perhaps STRIDE should be reduced to 0.10 or 0.05 for the first few steps away from the transition state), but once a few points have been taken, larger step sizes can be employed. In general, the defaults in the $IRC group are set up for this latter, cruder quality IRC. The STRIDE value for the GS2 method can usually be safely larger than for other methods, no matter what your interest in accuracy is.

    The simplest method of determining an IRC is linear gradient following, PACE=LINEAR. This method is also known as Euler's method. If you are employing PACE=LINEAR, you can select "stabilization" of the reaction path by the Ishida, Morokuma, Komornicki method. This type of corrector has no apparent mathematical basis, but works rather well since the bisector usually intersects the reaction path at right angles (for small step sizes). The ELBOW variable allows for a method intermediate to LINEAR and stabilized LINEAR, in that the stabilization will be skipped if the gradients at the original IRC point, and at the result of a linear prediction step form an angle greater than ELBOW. Set ELBOW=180 to always perform the stabilization.

    A closely related method is PACE=QUAD, which fits a quadratic polynomial to the gradient at the current and immediately previous IRC point to predict the next point. This pace has the same computational requirement as LINEAR, and is slightly more accurate due to the reuse of the old gradient. However, stabilization is not possible for this pace, thus a stabilized LINEAR path is usually more accurate than QUAD.

    Two rather more sophisticated methods for integrating the IRC equations are the fourth order Adams-Moulton predictor-corrector (PACE=AMPC4) and fourth order Runge-Kutta (PACE=RK4). AMPC4 takes a step towards the next IRC point (prediction), and based on the gradient found at this point (in the near vicinity of the next IRC point) obtains a modified step to the desired IRC point (correction). AMPC4 uses variable step sizes, based on the input STRIDE. RK4 takes several steps part way toward the next IRC point, and uses the gradient at these points to predict the next IRC point. RK4 is the most accurate integration method implemented in GAMESS, and is also the most time consuming.

    The Gonzalez-Schlegel 2nd order method finds the next IRC point by a constrained optimization on the surface of a hypersphere, centered at 1/2 STRIDE along the gradient vector leading from the previous IRC point. By construction, the reaction path between two successive IRC points is thus a circle tangent to the two gradient vectors. The algorithm is much more robust for large steps than the other methods, so you may wish to try STRIDE=0.4. The number of energies and gradients needed to find the next point depends on the difficulty of the constrained optimization, but is normally not very large. The defaults for GCUT and RCUT are meant for use with STRIDE=0.4, so if you decide to use a smaller stride you should reduce these parameters accordingly. Be sure to provide the updated hessian from the previous run when restarting PACE=GS2.

    The number of wavefunction evaluations and energy gradients needed to jump from one point on the IRC to the next is summarized in the following table:

    
    
         PACE      # energies   # gradients
         ----      ----------   -----------
        LINEAR        1             1
    stabilized
        LINEAR        3             2
        QUAD          1             1  (+ reuse of historical
                                                gradient)
        AMPC4         2             2  (see note)
        RK4           4             4
        GS2          2-4           2-4 (equal numbers)
    
    Note that the AMPC4 method sometimes does more than one correction step, with each such correction adding one more energy and gradient to the calculation. You get what you pay for in IRC calculations: the more energies and gradients which are used, the more accurate the path found.

    * * *

    All methods are initiated by jumping from the saddle point, parallel to the normal mode (CMODE) which has an imaginary frequency. The jump taken is designed to lower the energy by an amount EVIB. The actual distance taken is thus a function of the imaginary frequency, as a smaller FREQ will produce a larger initial jump. You can simply provide a $HESS group instead of CMODE and FREQ, which involves less typing. To find out the actual step taken for a given EVIB, use EXETYP=CHECK. The direction of the jump (towards reactants or products) is governed by FORWRD. Note that if you have decided to use small step sizes, you must employ a smaller EVIB to ensure a small first step.
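
    Pulling these pieces together, a crude-quality IRC initiated at the saddle point might use input along the following lines; the STRIDE, NPOINT, and EVIB values are only illustrative, and would be reduced for an accurate path as described above.

      $CONTRL SCFTYP=RHF RUNTYP=IRC $END
      $IRC    PACE=GS2 STRIDE=0.30 NPOINT=10 FORWRD=.TRUE.
              EVIB=0.0005 $END
      $HESS
      ...punched by the RUNTYP=HESSIAN job at the saddle point...
      $END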

    The only method which proves that a properly converged IRC has been obtained is to regenerate the IRC with a smaller step size, and check that the IRC is unchanged. Again, note that the care with which an IRC must be obtained is highly dependent on what use it is intended for.

    A small program which converts the IRC results punched to file IRCDATA into a format close to that required by the POLYRATE VTST dynamics program written in Don Truhlar's group is available. For a copy of this IRCED program, contact Mike Schmidt. The POLYRATE program must be obtained from the Truhlar group.

    
        Some key IRC references are:
    K.Ishida, K.Morokuma, A.Komornicki
          J.Chem.Phys.  66, 2153-2156 (1977)
    K.Muller
          Angew.Chem., Int.Ed.Engl.19, 1-13 (1980)
    M.W.Schmidt, M.S.Gordon, M.Dupuis
          J.Am.Chem.Soc.  107, 2585-2589 (1985)
    B.C.Garrett, M.J.Redmon, R.Steckler, D.G.Truhlar,
    K.K.Baldridge, D.Bartol, M.W.Schmidt, M.S.Gordon
          J.Phys.Chem.  92, 1476-1488(1988)
    K.K.Baldridge, M.S.Gordon, R.Steckler, D.G.Truhlar
          J.Phys.Chem.  93, 5107-5119(1989)
    C.Gonzalez, H.B.Schlegel
          J.Phys.Chem.  94, 5523-5527(1990)
          J.Chem.Phys.  95, 5853-5860(1991)
    

    The IRC Discussion Closes With Some Practical Tips

    The $IRC group has a confusing array of variables, but fortunately very little thought need be given to most of them. An IRC run is restarted by moving the coordinates of the next predicted IRC point into $DATA. If there are symmetry related atoms, you must discard the generated atoms, keeping only the last atom as the unique one to be input. You also must replace the old $IRC group with its punched out replacement. Only NPOINT in $IRC need be selected for the next run. This restart data is written at the end of the PUNCH file, and for redundancy is also near the end of the OUTPUT file. Thus, only the job which initiates the IRC requires much thought about $IRC.

    The symmetry specified in the $DATA deck should be the symmetry of the reaction path. Even if a saddle point should have higher symmetry, use only the lower symmetry in the $DATA deck when initiating the IRC. The reaction path will have a lower symmetry than the saddle point whenever the normal mode with imaginary frequency is not totally symmetric. Be careful that the order and orientation of the atoms corresponds to that used in the run which generated the hessian matrix.

    If you wish to follow an IRC for a different isotope, use the $MASS group. If you wish to follow the IRC in regular Cartesian coordinates, just enter unit masses for each atom. Note that CMODE and FREQ are a function of the atomic masses, so either regenerate FREQ and CMODE, or more simply, provide the correct $HESS group.

    Transition Moments and Spin-Orbit Coupling

    GAMESS can compute transition moments and oscillator strengths for the radiative transition between two CI wavefunctions. The moments are computed using both the "length (dipole) form" and "velocity form". GAMESS can also compute the one electron portion of the "microscopic Breit-Pauli spin orbit operator". The two electron terms can be approximately accounted for by means of effective nuclear charges.

    The orbitals for the CI can be one common set of orbitals used by both CI states. If one set of orbitals is used, the transition moment or spin-orbit coupling can be found for any type of CI wavefunction GAMESS can compute.

    Alternatively, two sets of orbitals (obtained from separate MCSCF orbital optimizations) can be used. Two separate CIs will be carried out (in 1 run). The two MO sets must share a common set of frozen core orbitals, and the CI -must- be of the complete active space type. These restrictions are needed to leave the CI wavefunctions invariant under the necessary transformation to corresponding orbitals. The non-orthogonal procedure implemented in GAMESS is a GUGA driven equivalent to the method of Lengsfield, et al. Note that the FOCI and SOCI methods described by these workers are not available in GAMESS.

    If you would like to use separate orbitals for the states, use the FCORE option in $MCSCF in a SCFTYP=MCSCF optimization of the orbitals. Typically you would optimize the ground state completely, and use these MCSCF orbitals in an optimization of the excited state, under the constraint of FCORE=.TRUE. The core orbitals of such a MCSCF optimization should be declared MCC, rather than FZC, even though they are frozen.

    In the case of transition moments either one or two CI calculations are performed, necessarily on states of the same multiplicity. Thus, only a $DRT1 is read. A spin-orbit coupling run almost always does two CI calculations, as the states are usually of different multiplicities. So, spin-orbit runs might read only $DRT1, but usually read both $DRT1 and $DRT2. The first CI calculation, defined by $DRT1, must be for the state of lower multiplicity, with $DRT2 defining the state of higher multiplicity.

    You will probably have to lower the symmetry in $DRT1 and $DRT2 to C1. You may use full spatial symmetry in the DRT groups only if the two states happen to have the same total spatial symmetry.

    The transition moment and spin orbit coupling driver is a rather restricted path through GAMESS, in that

    1. GUESS=SKIP is forced, regardless of what you put in a $GUESS group. The rest of this group is ignored. The NOCC orbitals are read (as GUESS=MOREAD) in a $VEC1 group, and possibly a $VEC2 group. It is not possible to reorder the orbitals.

    2. $CIINP is not read. The CI is hardwired to consist of DRT generation, integral transformation and sorting, Hamiltonian generation, and diagonalization. This means $DRT1 (and maybe $DRT2), $TRNSFM, $CISORT, $GUGEM, and $GUGDIA input is read, and acted upon.

    3. The density matrices are not generated, and so no properties (other than the transition moment or the spin-orbit coupling) are computed.

    4. There is no restart capability provided.

    5. DRT input is given in $DRT1 and maybe $DRT2.

    6. IROOTS will determine the number of eigenvectors found in each CI calculation, except NSTATE in $GUGDIA will override IROOTS if it is larger.

    References:

    F.Weinhold, J.Chem.Phys. 54,1874-1881(1970)

    B.H.Lengsfield, III, J.A.Jafri, D.H.Phillips, C.W.Bauschlicher, Jr. J.Chem.Phys. 74,6849-6856(1981)

    "Intramediate Quantum Mechanics, 3rd Ed." Hans A. Bethe, Roman Jackiw Benjamin/Cummings Publishing, Menlo Park, CA (1986), chapters 10 and 11.

    For an application, S.Koseki, M.S.Gordon J.Mol.Spectrosc. 123, 392-404(1987)

    * * *

    Special thanks to Bob Cave and Dave Feller for their assistance in performing check spin-orbit coupling runs with the MELDF programs.